DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/10/2025 has been entered.
Status of Claims
Claims 1-21 are pending and are examined on the merits.
Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c) is acknowledged. Priority of US application 63/557,394 filed 02/23/2024 is acknowledged.
Withdrawn Rejections/Objections
The objection to claim 10 in the Office action mailed 10 September 2025 is withdrawn in view of claim amendments filed 10 December 2025.
Claim Rejections - 35 USC § 101
This rejection is maintained from the previous Office action.
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-21 are rejected under 35 USC 101 because the claimed inventions are directed to non-statutory subject matter.
Step 1: Process, Machine, Manufacture, or Composition of Matter
Claims 1-18 are directed to a process, here a "method," with functional steps like “receiving,” “processing,” and “outputting”.
Claims 19 and 21 are directed to a machine or manufacture, here "one or more non-transitory computer storage media", with structural components like “one or more non-transitory computer storage media”.
Claim 20 is directed to a machine or manufacture, here a "system," with structural components like “one or more computers” and “one or more storage devices”.
Step 2A, Prong One: Identification of Judicially Recognized Exceptions
Claims 1, 19 and 20 recite:
Obtaining training data for a protein-ligand generative machine learning model, wherein the training data for the protein-ligand generative machine learning model comprises, for each of a set of example protein-ligand complexes, data characterizing a protein, a ligand, and a target 3D structure for the example protein-ligand complex;
This step recites obtaining and then describing data for model training, which can be performed in the human mind. Therefore, this step equates to an abstract idea of mental processes.
Training, using the training data for the protein-ligand generative machine learning model, the protein-ligand generative machine learning model to generate, by an output layer of the protein-ligand generative machine learning model, the target 3D structures for the set of example protein-ligand complexes, the training comprising, for each example protein-ligand complex:
This step recites the model training with the expected output (target 3D structures for the set of example protein-ligand complexes). Under a broadest reasonable interpretation (BRI), model training requires tuning parameters for the related algorithms, which is directed to mathematical operations. This step also recites generating the target 3D structures for the set of example protein-ligand complexes at an output layer. Under a BRI (according to par. [004]), this process applies a non-linear transformation to a received input to generate an output. Therefore, this step equates to an abstract idea of mathematical concepts.
Processing a model input characterizing the protein for the example protein-ligand complex using a protein generative machine learning model to generate, by an intermediate layer of the protein generative machine learning model, a latent representation characterizing an initial predicted three-dimensional (3D) structure of the protein for the protein-ligand complex, wherein the protein generative machine learning model has been trained to generate, by an output layer of the protein generative machine learning model, predicted 3D structures for a set of example unbound proteins;
This step recites processing the protein part of the protein-ligand complex to generate a latent representation characterizing 3D structures for a set of example unbound proteins. Under a BRI, this process will require mathematical operations according to related algorithms (according to par. [184], the latent representation can include a plurality of embeddings, with each embedding representing, e.g., a respective atom of the protein, a respective group of atoms of the protein, and so on). This step also recites generating predicted 3D structures for a set of example unbound proteins at an output layer. Under a BRI (according to par. [004]), this process applies a non-linear transformation to a received input to generate an output. Therefore, this step equates to an abstract idea of mathematical concepts.
Processing a model input characterizing the ligand for the example protein-ligand complex using a ligand generative machine learning model to generate, by an intermediate layer of the ligand generative machine learning model, a latent representation characterizing an initial predicted 3D structure of the ligand for the example protein-ligand complex, wherein the ligand generative model has been trained to generate, by an output layer of the ligand generative machine learning model, 3D structures for a set of example unbound ligands;
This step recites processing the ligand part of the protein-ligand complex to generate a latent representation characterizing 3D structures for a set of example unbound ligands. Under a BRI, this process requires mathematical operations according to the related algorithms. This step also recites generating 3D structures for a set of example unbound ligands at an output layer. Under a BRI (according to par. [004]), this process applies a non-linear transformation to a received input to generate an output. Therefore, this step equates to an abstract idea of mathematical concepts.
Training the protein-ligand generative machine learning model to process a model input comprising: (i) the latent representation characterizing the initial predicted 3D structure of the protein for the example protein-ligand complex, and (ii) the latent representation characterizing the initial predicted 3D structure of the ligand for the example protein-ligand complex to generate, by an output layer of the protein-ligand generative machine learning model, a model output that defines a predicted 3D structure of the example protein-ligand complex.
This step recites the model training using, as input, the latent representation characterizing the initial predicted 3D structure of the protein and the latent representation characterizing the ligand (according to par. [184], the latent representation can include a plurality of embeddings, with each embedding representing, e.g., a respective atom of the protein, a respective group of atoms of the protein, and so on), and generating a predicted 3D structure of the example protein-ligand complex at an output layer. Under a BRI (according to par. [004]), this process applies a non-linear transformation to a received input (latent representations) to generate an output. Therefore, this step equates to an abstract idea of mathematical concepts.
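For orientation only, the data flow recited in the independent claims can be sketched as follows. This is a minimal illustration under stated assumptions and is not Applicant's implementation: the array sizes, the single tanh layer standing in for each pretrained model's intermediate layer, the linear output layer, the squared-error loss, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def intermediate_latent(features, weights):
    """Hypothetical stand-in for an intermediate layer: one embedding per atom."""
    return np.tanh(features @ weights)

# Assumed sizes: 5 protein atoms, 3 ligand atoms, 4 input features, 8 latent dims.
protein_features = rng.normal(size=(5, 4))
ligand_features = rng.normal(size=(3, 4))
protein_weights = rng.normal(size=(4, 8))  # frozen parameters of the "protein model"
ligand_weights = rng.normal(size=(4, 8))   # frozen parameters of the "ligand model"

protein_latent = intermediate_latent(protein_features, protein_weights)
ligand_latent = intermediate_latent(ligand_features, ligand_weights)

# Model input to the "protein-ligand model": both latent representations, stacked per atom.
complex_input = np.concatenate([protein_latent, ligand_latent], axis=0)

# Trainable output layer mapping each embedding to (x, y, z) coordinates.
output_weights = rng.normal(size=(8, 3)) * 0.1
target_coords = rng.normal(size=(8, 3))    # placeholder for a target 3D structure

def loss(weights):
    """Mean squared error between predicted and target coordinates."""
    return float(np.mean((complex_input @ weights - target_coords) ** 2))

initial_loss = loss(output_weights)
for _ in range(200):
    predicted = complex_input @ output_weights
    gradient = 2.0 * complex_input.T @ (predicted - target_coords) / predicted.size
    output_weights -= 0.1 * gradient       # plain gradient descent on the output layer only
final_loss = loss(output_weights)
```

Only the output layer of the complex model is updated in this sketch; the two upstream models are held fixed, which mirrors the claim language that they "have been trained" beforehand.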
Dependent claims further recite input/output data and the machine learning models. No additional elements are identified in the dependent claims.
Step 2A, Prong Two: Consideration of a Practical Application
The claims result in a process of outputting a predicted 3D structure of the example protein-ligand complex, which is part of an abstract idea of mathematical concepts. The claims do not recite any additional elements that integrate the abstract idea/judicial exception into a practical application.
The judicial exceptions are not integrated into a practical application because the claims do not meet any of the following criteria:
An improvement in the functioning of a computer, or an improvement to other technology or technical field, as discussed in MPEP §§ 2106.04(d)(1) and 2106.05(a);
Applying or using a judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition, as discussed in MPEP § 2106.04(d)(2);
Implementing a judicial exception with, or using a judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim, as discussed in MPEP § 2106.05(b);
Effecting a transformation or reduction of a particular article to a different state or thing, as discussed in MPEP § 2106.05(c); and
Applying or using the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception, as discussed in MPEP § 2106.05(e).
Step 2B: Consideration of Additional Elements and Significantly More
The following additional elements that are not JEs are identified:
One or more non-transitory computer storage media/devices (claims 19-20);
One or more computers (claim 20).
The above additional elements are related to generic computer components. They are all common and conventional.
As discussed above regarding Step 2A/Prong two, all the identified additional elements merely provide a computing environment for executing the abstract idea. They are all well-known and conventional. None of the additional elements is significant enough to be inventive.
Hence, claims 1-21 are not eligible under 35 U.S.C. 101.
Response to Arguments
In the Remarks filed 12/10/2025, Applicant argues (page 12, penultimate para through page 18, 3rd para) that the claims are integrated into a practical application at Step 2A/Prong two due to an improvement to computer functioning.
In response, Applicant’s argument is not persuasive. The generic computer components, such as the non-transitory computer storage media/devices, are tangentially recited as “apply it” or “as a tool”. Applicant did not provide evidence that the claimed “computer system” is anything more than a generic computer that implements the machine learning algorithm, etc., in the conventional and well-known way that computers ordinarily function. To qualify as “a patent-eligible improvement,” the invention must be directed to a specific improvement in the computer’s functionality, not simply to use of the computer “as a tool” to implement an abstract idea. Customedia Techs., LLC v. Dish Network Corp., 951 F.3d 1359, 1363-1364 (Fed. Cir. 2020). Here, the invention falls into the latter category. It focuses on using a general purpose computer to carry out the abstract idea. Consequently, the Examiner does not discern an inventive concept in how the computer system operates.
Further, despite Applicant’s citations to non-limiting and exemplary embodiments from the specification, the instant specification fails to provide any specific definition that expressly limits the scope of these claim elements. As such, the claims only present a general purpose computer as the device to "apply" the abstract sequence analysis embraced by the claims.
In the Remarks (page 13, par. 2-4), Applicant argues that the claimed invention provides an improvement to the functioning of a computer, e.g. by reducing consumption of computational resources such as memory and computing power required to train a generative machine learning model to predict 3D structures of protein-ligand complexes.
In response, Applicant’s argument is not persuasive. As discussed above, Applicant has not pointed to the specific limitations responsible for the asserted improvement. If there are any improvements to computer technology, they are not currently reflected in the claims. The claims recite abstract ideas performed on a generically recited generative machine learning model, which reads on a table or graph analyzed within a computer program.
In the Remarks (page 13, par. 5 through page 16, par. 1), Applicant further argues that the claimed invention provides an improvement in machine learning technology, e.g. by expanding the amount of training data available for training generative machine learning models to predict 3D structures of protein-ligand complexes, addressing the issue of training data scarcity for protein-ligand complexes, and reducing the architectural and computational complexity of generative machine learning models.
In response, Applicant’s argument is not persuasive. The claims recite training a generative machine learning model using data of known protein-ligand pairs. Training data is not a technology, and the steps applying the data are judicial exceptions because mathematical operations are required for the training data. The claims do not even recite a test run using proteins and ligands without reported binding. There is no convincing improvement to machine learning technology. Additionally, an improvement to a technology needs to be captured and reflected in additional elements. The claims, as recited, do not have such additional elements. The generated machine learning models are recited as an abstract idea of mathematical concepts.
In the Remarks (page 17, 2nd-4th paras), Applicant relies on the recently released guidance in Ex parte Desjardins, in which Director Squires emphasized that “Categorically excluding AI innovations from patent protection in the United States jeopardizes America's leadership in this critical emerging technology”. The argument is not specific and is therefore not persuasive.
In the Remarks (page 17, 2nd-4th paras), Applicant also relies on the statement in Ex parte Desjardins in which Director Squires emphasized “that §§ 102, 103 and 112 are the traditional and appropriate tools to limit patent protection to its proper scope”. The argument is not specific enough to be persuasive. Even if 35 USC 101 is not emphasized in that list, a 35 USC 101 analysis is still required by the MPEP guidance.
In the Remarks (page 17, 2nd-4th paras), Applicant further relies on the recently released guidance that “A rejection of a claim should not be made simply because an examiner is uncertain as to the claim's eligibility” (Memorandum issued by the Deputy Commissioner for Patents Charles Kim in August 2025). The argument is not persuasive. The 101 analysis is guided by MPEP 2106.04 and 2106.05, here at Step 2A/Prong two.
Therefore, the 101 rejection is maintained.
Claim Rejections - 35 USC § 103
This rejection is maintained from the previous Office action.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-21 are rejected under 35 U.S.C. 103 as being unpatentable over Qiao et al. ("State-specific protein–ligand complex structure prediction with a multiscale deep generative model." Nature Machine Intelligence 6.2 (2024): 195-208. Previously cited), and further in view of Strokach et al. ("Deep generative modeling for protein design." Current opinion in structural biology 72 (2022): 226-236. Newly cited).
Claim 1 is interpreted as a generative machine learning method for predicting protein-ligand docking. Regarding claim 1, Qiao discloses NeuralPLexer, a “state-specific protein–ligand complex structure prediction with a multiscale deep generative model” (Qiao: title and section Abstract, page 1), which reads on the claim 1 subject matter. Particularly,
Qiao provides “the datasets used for training and testing end-to-end structure prediction were constructed from chains of all monomeric proteins and homomeric complexes in the Protein Data Bank accessed on April 2022. We filtered the retrieved structures by discarding structures with experimental resolution lower than 3.0 Å and chain coverage lower than 95%, and deleting ligands that contain more than 1000 heavy atoms and non-transition-metal single-atom ligands. The holo (i.e., ligand-bound protein, Examiner) structures are obtained from the intersection between the filtered set and the annotated protein-ligand complex database BioLip [102] using their non-redundant index file as well as an extra set of chains from GPCRdb [103], and dropping samples with DNA and RNA ligands; the apo structures are obtained from the filtered set that contains no ligand or only ligands in the artifact list provided by Ref. [102]. After combining the apo (i.e., ligand-free protein, Examiner) and holo structures and removing duplicate chains, we obtained 85140 unique samples. The final PL2019-74k dataset is obtained by removing samples deposited after Jan 2019 and samples with UniProt ID in the PocketMiner dataset [75], resulting in 74477 samples for model training. The contrasting apo-holo pair test set is then obtained by taking the samples of the same PDB code and chain index in the original PocketMiner dataset” (1st para. in Section “Datasets and Training Summary”, page 14), which teaches receiving data characterizing a protein and a ligand for the protein-ligand complex.
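The numeric filtering criteria in the quoted passage can be illustrated with a short sketch. This is a simplified reading under stated assumptions, not Qiao's code: the `Entry` record, the example identifiers, the treatment of "after Jan 2019" as after 31 January 2019, and the rule that an entry with no remaining ligand is labelled apo are all hypothetical, and the BioLip/GPCRdb intersection, the single-atom ligand rule, the PocketMiner exclusion, and de-duplication are omitted.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Entry:
    """Hypothetical record for one protein chain retrieved from a structure database."""
    entry_id: str
    resolution_angstrom: float
    chain_coverage: float                                    # resolved fraction, 0..1
    ligand_heavy_atoms: list = field(default_factory=list)   # heavy atoms per ligand
    deposited: date = date(2018, 1, 1)

def select_training_samples(entries, cutoff=date(2019, 1, 31)):
    """Apply the resolution, coverage, ligand-size and deposition-date filters."""
    selected = []
    for e in entries:
        # "Resolution lower than 3.0 Å" is read as a numeric value above 3.0.
        if e.resolution_angstrom > 3.0 or e.chain_coverage < 0.95:
            continue
        if e.deposited > cutoff:                             # assumed cutoff date
            continue
        ligands = [n for n in e.ligand_heavy_atoms if n <= 1000]
        state = "holo" if ligands else "apo"                 # assumed labelling rule
        selected.append((e.entry_id, state, ligands))
    return selected

entries = [
    Entry("ex-a", 2.1, 0.98, [30], date(2018, 6, 1)),
    Entry("ex-b", 3.4, 0.99, [25], date(2018, 6, 1)),        # resolution too coarse
    Entry("ex-c", 1.8, 0.97, [], date(2017, 3, 1)),
    Entry("ex-d", 2.5, 0.99, [1500], date(2018, 9, 1)),      # oversized ligand removed
    Entry("ex-e", 2.0, 0.99, [20], date(2020, 3, 1)),        # deposited after cutoff
]
samples = select_training_samples(entries)
```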
Qiao provides Fig. 1a (page 2) and Section “H Model training details” (page 33), which teaches training the model using the training data to generate the predicted protein-ligand complex at the output layer.
Qiao provides “NeuralPLexer enables accurate prediction of protein-ligand complex structure and conformational changes. (a) Method overview. To perform predictions, the input protein sequence is first used to retrieve protein language model (PLM) features and structure templates; NeuralPLexer then combines the set of PLM and template features with molecular graph representations of the input ligands to directly sample an ensemble of binding complex structures via a multi-scale generative model. The main network of NeuralPLexer is comprised of a coarse-grained, auto-regressive contact prediction module (CPM) and an atomistic, diffusion-based equivariant structure denoising module (ESDM). SDE: stochastic differential equation. (b-c) Prediction example on a target with large-scale domain motions upon ligand binding (UniProt:P38998). (b) The structure similarities against experimental apo (i.e., ligand-free protein, PDB:3UGK) and holo (i.e., ligand-bound protein, PDB:3UH1) structures measured by TM-score are plotted for AlphaFold2 predictions (grey), ligand-free NeuralPLexer predictions (blue), and ligand-bound NeuralPLexer predictions (red). (c) Visualizations of representative NeuralPLexer-predicted structures (blue for apo, red for holo) are overlaid with the experimental apo structure (grey) and the holo structure (light yellow). (d) Visualization of a prediction example (PDB:7CKI, UniProt:P00953) for which NeuralPLexer achieves high atomic accuracy for both the ATP (blue) and an inhibitor bound to the tryptophan site (magenta) upon an induced-fit structure rearrangement” (Fig. 1 legend, Fig. 1, page 2), wherein step (a) teaches “processing a model input characterizing the protein, using a protein generative machine learning model, to generate a model output that includes data characterizing an initial predicted three-dimensional (3D) structure of the protein” and “processing a model input characterizing the ligand, using a ligand generative machine learning model, to generate a model output that includes data characterizing an initial predicted 3D structure of the ligand”; wherein steps (b-c) teach the claim limitation on predicting the 3D structure of both protein and ligand; and wherein step (d) teaches the claim limitation on outputting the predicted 3D structure of the protein and the ligand.
Qiao does not teach a separate model to predict either the protein 3D structure or the ligand 3D structure. Qiao is not explicit on latent representation of protein/ligand 3D structure.
Strokach provides “many generative models of proteins have been developed that encompass all known protein sequences, model specific protein families, or extrapolate the dynamics of individual proteins. Those generative models can learn protein representations that are often more informative of protein structure and function than hand-engineered features” (page 226, col 1, Section “Abstract” lines 4-9) and Fig. 1 (page 227), which teaches predicting protein 3D structure through generative machine learning models, and the models also characterize the predicted protein.
Strokach provides Fig. 2(b) and 2(c) (page 228), which teaches latent representation of model data (protein, ligand and protein-ligand complex).
It is obvious that Strokach also teaches predicting ligand 3D structure through a generative machine learning model, and that the model also characterizes the predicted ligand, because a ligand can itself be a protein.
Claims 19 and 20 are the non-transitory storage media version and the computer system version of the method of claim 1. Since Qiao provides “we present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures solely using protein sequence and ligand molecular graph inputs” (Section “Abstract”, page 1), which teaches the computational approach, the art applied to claim 1 also teaches claims 19 and 20.
Regarding claim 2, Qiao teaches claim 1. Qiao provides “to perform structure prediction, NeuralPLexer jointly samples the 3D heavy-atom coordinates of the protein x and those of the ligands y from a generative model conditioned on the sequence and graph inputs {s}, {G}. In addition to the primary sequence and graph inputs, we retrieved inputs from readily available transformer protein language models and templates from alternative experimental structures or protein structure prediction networks to provide extra conditioning signals to the generative model. In particular, we use protein sequence embeddings from the ESM-2 [9] language model and template structures generated from AlphaFold2 as auxiliary inputs in this study” (2nd para., page 4), which teaches sampling, by the protein-ligand generative machine learning model, the predicted 3D structure of the complex comprising the protein and the ligand from a distribution over a space of possible 3D protein-ligand structures, and wherein the distribution over the space of possible 3D protein-ligand structures is conditionally generated by the protein-ligand generative machine learning model in accordance with values of a set of protein-ligand generative machine learning model parameters.
Qiao does not teach a separate model to predict either the protein 3D structure or the ligand 3D structure. Qiao is not explicit on latent representation of protein/ligand 3D structure.
Strokach provides “many generative models of proteins have been developed that encompass all known protein sequences, model specific protein families, or extrapolate the dynamics of individual proteins. Those generative models can learn protein representations that are often more informative of protein structure and function than hand-engineered features” (page 226, col 1, Section “Abstract” lines 4-9) and Fig. 1 (page 227), which teaches predicting protein 3D structure through generative machine learning models, and the models also characterize the predicted protein.
Strokach provides Fig. 2(b) and 2(c) (page 228), which teaches latent representation of model data (protein, ligand and protein-ligand complex).
It is obvious that Strokach also teaches predicting ligand 3D structure through a generative machine learning model, and that the model also characterizes the predicted ligand, because a ligand can itself be a protein.
Regarding claim 3, Qiao teaches claim 1. Qiao further provides a latent coordination z (formulas 7 and 9, page 13), which teaches sampling, by the protein generative machine learning model, the initial predicted 3D structure of the protein from a distribution over a space of possible 3D protein structures, wherein the distribution over the space of possible 3D protein structures is conditionally generated by the protein generative machine learning model in accordance with values of a set of protein generative machine learning model parameters; or generating, by the protein generative machine learning model, a latent representation of the initial predicted 3D structure of the protein, and providing the latent representation of the initial predicted 3D structure of the protein as the model output of the protein generative machine learning model.
Regarding claims 4-5, Qiao teaches claim 1. Qiao provides “the datasets used for training and testing end-to-end structure prediction were constructed from chains of all monomeric proteins and homomeric complexes in the Protein Data Bank accessed on April 2022. We filtered the retrieved structures by discarding structures with experimental resolution lower than 3.0 Å and chain coverage lower than 95%, and deleting ligands that contain more than 1000 heavy atoms and non-transition-metal single-atom ligands. The holo structures are obtained from the intersection between the filtered set and the annotated protein-ligand complex database BioLip [102] using their non-redundant index file as well as an extra set of chains from GPCRdb [103], and dropping samples with DNA and RNA ligands; the apo structures are obtained from the filtered set that contains no ligand or only ligands in the artifact list provided by Ref. [102]. After combining the apo and holo structures and removing duplicate chains, we obtained 85140 unique samples. The final PL2019-74k dataset is obtained by removing samples deposited after Jan 2019 and samples with UniProt ID in the PocketMiner dataset [75], resulting in 74477 samples for model training. The contrasting apo-holo pair test set is then obtained by taking the samples of the same PDB code and chain index in the original PocketMiner dataset; the recent receptors test set is obtained by taking the filtered samples deposited in Jan 2019 for which a reference experimental sample with sequence identity > 98%, backbone RMSD > 2 Å as reported by TMAlign [60], and distinct bound ligands can be found from the PDB” (1st para. in Section “Datasets and Training Summary”, page 14), which teaches training datasets that contain (i) an amino acid sequence of the protein, and (ii) a 3D structure of the protein; and the 3D structure of the protein that is included in the training example defines a 3D structure of the protein in an unbound state (apo is the ligand-free protein, according to Fig. 1 legend, page 2).
Regarding claim 6, Qiao teaches claim 1. Qiao further provides a latent coordination z (formulas 7 and 9, page 13), which teaches sampling, by the ligand generative machine learning model, the initial predicted 3D structure of the ligand from a distribution over a space of possible 3D ligand structures, wherein the distribution over the space of possible 3D ligand structures is conditionally generated by the ligand generative machine learning model in accordance with values of a set of ligand generative machine learning model parameters; or generating, by the ligand generative machine learning model, a latent representation of the initial predicted 3D structure of the ligand, and providing the latent representation of the initial predicted 3D structure of the ligand as the model output of the ligand generative machine learning model.
Regarding claim 7, Qiao teaches claim 1. Qiao provides “Architecture details. (a) Ligand molecules and amino acids are encoded as the collection of atoms, local coordinate frames (depicted as semi-transparent triangles), and stereochemistry-specific pairwise embeddings (depicted as dashed lines) representing their interactions. (b) Information flow in the contact prediction module (CPM) network. The CPM network samples block-wise adjacency matrices among protein and ligand nodes using an autoregressive decoding scheme, where the block adjacency matrices l_k sampled from last step are passed to the network to update the predicted histograms of pairwise distances ("distograms") and contact maps L̂. (c) The forward-time SDE introduces structured drift and noise covariance terms among protein Cα atoms, non-Cα atoms, and ligand atoms. (d) Denoising diffusion process to generate the binding complex 3D atomistic structure. The protein (colored as red-blue from N- to C-terminus) and ligand (colored as grey) structures are jointly generated through a reverse-time, simulated annealing SDE starting from randomly initialized coordinate variables. (e) Information flow in the ESDM neural network. The ESDM network operates on a heterogeneous graph formed by protein atoms (P), ligand atoms (L), protein backbone frames for all residues (B), backbone frames of the selected patches (S), and ligand local frames (F) to predict the denoised atomic coordinates x̂_0, ŷ_0 used in the reverse-time SDE. The heterogeneous graph comprises randomized local edges (orange arrows) and densely connected long-range edges (blue arrows), where the long-range edges and inter-residue local edges are initialized via the CPM embeddings” (Fig. 2 legend, page 4), and “Generating contact maps and pair representations. We define the contact map of a protein-ligand complex based on the pairwise proximities among all residues and selected frames (Methods, Distograms and contact maps). During model inference, the contact maps are modeled as the logits of a categorical posterior distribution; the CPM auto-regressively refines a single realization of the contact map L using feedback from the previous iteration predictions. Specifically, the CPM samples a one-hot adjacency matrix from the last-iteration-predicted contact map and sums all sampled adjacency matrices {l_k}_{k=1}^{N_L} as an additional signal passed to the neural network, until k = N_L when each ligand frame is assigned to a protein patch” (2nd para., page 5), which teaches that processing the model input comprising: (i) the data characterizing the initial predicted 3D structure of the protein, and (ii) the data characterizing the initial predicted 3D structure of the ligand, using the protein-ligand generative machine learning model, to generate the model output that defines the predicted 3D structure of the complex comprising the protein and the ligand comprises: generating a noisy 3D structure of the complex comprising the protein and the ligand; and denoising the noisy 3D structure of the complex comprising the protein and the ligand, over a sequence of one or more denoising iterations and in accordance with values of a set of protein-ligand generative machine learning model parameters, to generate the predicted 3D structure of the complex comprising the protein and the ligand.
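The claim 7 limitation of denoising a noisy 3D structure "over a sequence of one or more denoising iterations" can be pictured with a toy sketch. It is illustrative only and rests on stated assumptions: the blending update rule, the `predict_clean` callable standing in for a trained denoising network, and the array sizes are hypothetical, and the sketch does not reproduce the reverse-time SDE that Qiao describes.

```python
import numpy as np

def denoise(noisy_coords, predict_clean, num_steps=10):
    """Iteratively move joint protein-ligand coordinates toward a denoised estimate."""
    coords = noisy_coords.copy()
    for step in range(num_steps):
        estimate = predict_clean(coords)         # hypothetical denoiser output
        blend = 1.0 / (num_steps - step)         # reaches 1.0 on the final iteration
        coords = coords + blend * (estimate - coords)
    return coords

rng = np.random.default_rng(0)
reference = rng.normal(size=(8, 3))              # e.g. 5 protein atoms + 3 ligand atoms
noisy = reference + rng.normal(scale=2.0, size=reference.shape)

# An idealized denoiser that always returns the reference structure (assumption).
result = denoise(noisy, predict_clean=lambda coords: reference)
```

Because protein and ligand atoms share one coordinate array, each iteration updates both together, which is the sense in which the structure is denoised jointly.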
Regarding claim 8, Qiao teaches claim 7. Qiao provides “Architecture details. (a) Ligand molecules and amino acids are encoded as the collection of atoms, local coordinate frames (depicted as semi-transparent triangles), and stereochemistry-specific pairwise embeddings (depicted as dashed lines) representing their interactions. (b) Information flow in the contact prediction module (CPM) network. The CPM network samples block-wise adjacency matrices among protein and ligand nodes using an autoregressive decoding scheme, where the block adjacency matrices lk sampled from last step are passed to the network to update the predicted histograms of pairwise distances ("distograms") and contact maps ˆL. (c) The forward-time SDE introduces structured drift and noise covariance terms among protein Ca atoms, non-Ca atoms, and ligand atoms. (d) Denoising diffusion process to generate the binding complex 3D atomistic structure. The protein (colored as red-blue from N- to C-terminus) and ligand (colored as grey) structures are jointly generated through a reverse-time, simulated annealing SDE starting from randomly initialized coordinate variables. (e) Information flow in the ESDM neural network. The ESDM network operates on a heterogeneous graph formed by protein atoms (P), ligand atoms (L), protein backbone frames for all residues (B), backbone frames of the selected patches (S), and ligand local frames (F) to predict the denoised atomic coordinates ˆx0; ˆy0 used in the reverse-time SDE. The heterogeneous graph comprises randomized local edges (orange arrows) and densely connected long-range edges (blue arrows), where the long-range edges and inter-residue local edges are initialized via the CPM embeddings” (Fig. 2 legend, page 4), which teaches performing a denoising process over a set of structure parameters that jointly parameterize the 3D structure of the complex comprising the protein and the ligand.
Regarding claim 9, Qiao teaches claim 8. Qiao provides “Architecture details. (a) Ligand molecules and amino acids are encoded as the collection of atoms, local coordinate frames (depicted as semi-transparent triangles), and stereochemistry-specific pairwise embeddings (depicted as dashed lines) representing their interactions. (b) Information flow in the contact prediction module (CPM) network. The CPM network samples block-wise adjacency matrices among protein and ligand nodes using an autoregressive decoding scheme, where the block adjacency matrices lk sampled from last step are passed to the network to update the predicted histograms of pairwise distances ("distograms") and contact maps ˆL. (c) The forward-time SDE introduces structured drift and noise covariance terms among protein Ca atoms, non-Ca atoms, and ligand atoms. (d) Denoising diffusion process to generate the binding complex 3D atomistic structure. The protein (colored as red-blue from N- to C-terminus) and ligand (colored as grey) structures are jointly generated through a reverse-time, simulated annealing SDE starting from randomly initialized coordinate variables. (e) Information flow in the ESDM neural network. The ESDM network operates on a heterogeneous graph formed by protein atoms (P), ligand atoms (L), protein backbone frames for all residues (B), backbone frames of the selected patches (S), and ligand local frames (F) to predict the denoised atomic coordinates ˆx0; ˆy0 used in the reverse-time SDE. The heterogeneous graph comprises randomized local edges (orange arrows) and densely connected long-range edges (blue arrows), where the long-range edges and inter-residue local edges are initialized via the CPM embeddings” (Fig. 2 legend, page 4), which teaches the protein-ligand generative machine learning model is a diffusion generative model; and performing the denoising process over the set of structure parameters that jointly parameterize the 3D structure of the complex comprising the protein and the ligand comprises performing a reverse diffusion process over the set of structure parameters.
Regarding claim 10, Qiao teaches claim 8. Qiao provides “Architecture details. (a) Ligand molecules and amino acids are encoded as the collection of atoms, local coordinate frames (depicted as semi-transparent triangles), and stereochemistry-specific pairwise embeddings (depicted as dashed lines) representing their interactions. (b) Information flow in the contact prediction module (CPM) network. The CPM network samples block-wise adjacency matrices among protein and ligand nodes using an autoregressive decoding scheme, where the block adjacency matrices lk sampled from last step are passed to the network to update the predicted histograms of pairwise distances ("distograms") and contact maps ˆL. (c) The forward-time SDE introduces structured drift and noise covariance terms among protein Ca atoms, non-Ca atoms, and ligand atoms. (d) Denoising diffusion process to generate the binding complex 3D atomistic structure. The protein (colored as red-blue from N- to C-terminus) and ligand (colored as grey) structures are jointly generated through a reverse-time, simulated annealing SDE starting from randomly initialized coordinate variables. (e) Information flow in the ESDM neural network. The ESDM network operates on a heterogeneous graph formed by protein atoms (P), ligand atoms (L), protein backbone frames for all residues (B), backbone frames of the selected patches (S), and ligand local frames (F) to predict the denoised atomic coordinates ˆx0; ˆy0 used in the reverse-time SDE. The heterogeneous graph comprises randomized local edges (orange arrows) and densely connected long-range edges (blue arrows), where the long-range edges and inter-residue local edges are initialized via the CPM embeddings” (Fig. 2 legend, page 4), which teaches the protein-ligand generative machine learning model is a flow based generative model; and performing the denoising process over the set of structure parameters that jointly parameterize the 3D structure of the complex comprising the protein and the ligand comprises determining, using differential equations specified by the flow based generative model, denoising trajectories for the set of structure parameters.
Regarding claim 11, Qiao teaches claim 8. Qiao provides “Architecture details. (a) Ligand molecules and amino acids are encoded as the collection of atoms, local coordinate frames (depicted as semi-transparent triangles), and stereochemistry-specific pairwise embeddings (depicted as dashed lines) representing their interactions. (b) Information flow in the contact prediction module (CPM) network. The CPM network samples block-wise adjacency matrices among protein and ligand nodes using an autoregressive decoding scheme, where the block adjacency matrices lk sampled from last step are passed to the network to update the predicted histograms of pairwise distances ("distograms") and contact maps ˆL. (c) The forward-time SDE introduces structured drift and noise covariance terms among protein Ca atoms, non-Ca atoms, and ligand atoms. (d) Denoising diffusion process to generate the binding complex 3D atomistic structure. The protein (colored as red-blue from N- to C-terminus) and ligand (colored as grey) structures are jointly generated through a reverse-time, simulated annealing SDE starting from randomly initialized coordinate variables. (e) Information flow in the ESDM neural network. The ESDM network operates on a heterogeneous graph formed by protein atoms (P), ligand atoms (L), protein backbone frames for all residues (B), backbone frames of the selected patches (S), and ligand local frames (F) to predict the denoised atomic coordinates ˆx0; ˆy0 used in the reverse-time SDE. The heterogeneous graph comprises randomized local edges (orange arrows) and densely connected long-range edges (blue arrows), where the long-range edges and inter-residue local edges are initialized via the CPM embeddings” (Fig. 2 legend, page 4), which teaches the set of structure parameters that jointly parametrize the 3D structure of the complex comprising the protein and the ligand comprise a plurality of backbone torsion angles of the protein.
Regarding claim 12, Qiao teaches claim 8. Qiao further provides “RosettaLigand [S20] runs are launched with a configuration modified from the standard protocol. We set the receptor Calpha constraint parameter to 100.0 to enable a fully flexible receptor; the ligand coordinates are initialized using the aligned-ground-truth conformation as obtained by TM-Align [S21], with randomized torsion angles using the BCL [S22] library as described in the standard protocol” (5th para., page 35), which teaches the set of structure parameters that jointly parametrize the 3D structure of the complex comprising the protein and the ligand comprise a plurality of side chain torsion angles of the protein.
Regarding claim 13, Qiao teaches claim 8. Qiao further provides “The diffusion processes are equivariant to translations and rotations, as detailed in Supplementary Information B.2” (3rd para., page 5), and “RosettaLigand [S20] runs are launched with a configuration modified from the standard protocol. We set the receptor Calpha constraint parameter to 100.0 to enable a fully flexible receptor; the ligand coordinates are initialized using the aligned-ground-truth conformation as obtained by TM-Align [S21], with randomized torsion angles using the BCL [S22] library as described in the standard protocol” (5th para., page 35), which teaches the set of structure parameters that jointly parametrize the 3D structure of the complex comprising the protein and the ligand comprise a plurality of translational, rotational, and torsional parameters of the ligand.
Regarding claim 14, Qiao teaches claim 7. Qiao further provides “The Equivariant Structure Denoising Module (ESDM) of the NeuralPLexer network predicts denoised three-dimensional structures ˆZ0 using the noise input coordinates Zt and graph representations of the binding complex” (3rd para., page 31), which teaches denoising the noisy 3D structure of the complex comprising the protein and the ligand comprises, at each denoising iteration in the sequence of one or more denoising iterations: receiving a current noisy 3D structure of the complex comprising the protein and the ligand; processing a network input that is derived from the current noisy 3D structure of the complex comprising the protein and the ligand using a denoising neural network to generate a denoising output; and updating the current noisy 3D structure of the complex comprising the protein and the ligand using the denoising output of the denoising neural network.
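For illustration only, the three per-iteration steps recited in claim 14 (receive the current noisy structure, process it with a denoising network, update the structure with the denoising output) can be sketched as below. The sketch is not taken from Qiao or from the instant application: the function names, the toy "denoiser" that always predicts a fixed reference structure, and the linear update weight are all assumptions chosen to keep the example self-contained.

```python
import random

def make_toy_denoiser(reference):
    """Hypothetical stand-in for a trained denoising network.

    It ignores its input and always predicts `reference`; a real network
    would predict denoised coordinates from the noisy ones.
    """
    def denoiser(noisy_coords, step):
        return [list(atom) for atom in reference]
    return denoiser

def run_denoising(noisy_coords, denoiser, num_steps):
    """Iteratively refine noisy 3D coordinates (assumed update rule)."""
    current = [list(atom) for atom in noisy_coords]  # receive current noisy structure
    for step in range(num_steps):
        predicted = denoiser(current, step)          # denoising output
        weight = 1.0 / (num_steps - step)            # reaches 1.0 on the last iteration
        # Move every coordinate part of the way toward the prediction.
        current = [
            [c + weight * (p - c) for c, p in zip(cur_atom, pred_atom)]
            for cur_atom, pred_atom in zip(current, predicted)
        ]
    return current

# Three made-up atoms perturbed with Gaussian noise, then denoised.
reference = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.0]]
rng = random.Random(0)
noisy = [[c + rng.gauss(0.0, 1.0) for c in atom] for atom in reference]
final = run_denoising(noisy, make_toy_denoiser(reference), num_steps=10)
```

Because the toy denoiser always returns the reference structure and the final update weight is 1.0, the loop ends at the reference coordinates; with a learned network the same loop would instead follow whatever the network predicts at each iteration.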
Regarding claim 15, Qiao teaches claim 14, which also teaches at one or more denoising iterations of the sequence of denoising iterations (Fig. 2 part (b), page 4): providing the current noisy 3D structure of the complex comprising the protein and the ligand for processing at a next denoising iteration in the sequence of denoising iterations.
Regarding claim 16, Qiao teaches claim 14, which also teaches at one or more denoising iterations (Fig. 2 part (b), page 4), updating the current noisy 3D structure of the complex comprising the protein and the ligand using the denoising output of the denoising neural network comprises: generating a current predicted structure of the complex comprising the protein and the ligand based on the denoising output of the denoising neural network; and updating the current noisy 3D structure of the complex comprising the protein and the ligand by applying a diffusion sampling technique to a set of structure parameters that parametrize the current predicted structure of the complex comprising the protein and the ligand.
Regarding claim 17, Qiao teaches claim 16. Qiao further provides “In particular, given a ligand-free receptor protein conformation and the binding site centroid coordinate, we adopt a fine-tuned NeuralPLexer model to jointly predict the structure for a cropped spherical region within 6.0 Å of any ligand atom by inpainting all the amino acid and ligand atomic coordinates conditioning on the uncropped parts of the receptor backbone. The binding pocket structure accuracy relative to the reference experimental structures is measured by the lDDT-BS, the all-atom Local Distance Difference Test [70] metric averaged for residues within 4.0 Å of any ligand atom consistent with CAMEO [71]. Input backbones are obtained using template-free AF2 predictions of 154 selected chains with TM-score>0.8 (that is, high backbone accuracy) but with lDDT-BS<0.9 out of the PDBBind2020 test set, a subset representing cases where AF2 correctly predicts the global protein folding but fails to reproduce the exact bound-state binding site structure. We first assessed the fidelity of these initial AF2 structures for ligand docking by computing the steric clash rate between the ligand and the receptor, defined as the fraction of ligand heavy atoms with a Lennard-Jones energy > 100 kcal/mol using UFF [72] parameters. The stringency of this criterion is reflected by the observation that 9% of the experimental structures are classified as clash-containing because of experimental errors and structures with cross-linking” (2nd para., page 7), which teaches processing data derived from the current noisy 3D structure of the complex comprising the protein and the ligand using the Lennard-Jones energy function to generate an energy of the current noisy 3D structure of the complex comprising the protein and the ligand; and generating the current predicted structure of the complex comprising the protein and the ligand based on both: (i) the denoising output, and (ii) the energy of the complex comprising the protein and the ligand.
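As a rough illustration of the clash criterion quoted above (the fraction of ligand heavy atoms whose Lennard-Jones energy exceeds 100 kcal/mol), the following sketch uses the standard 12-6 Lennard-Jones form, E(r) = 4ε[(σ/r)^12 − (σ/r)^6]. It is not Qiao's implementation: the function names are hypothetical, and the single ε and σ values are placeholder assumptions rather than UFF parameters, which are assigned per atom type.

```python
import math

def lj_pair_energy(r, epsilon, sigma):
    """12-6 Lennard-Jones energy for one atom pair at distance r (r > 0)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)

def clash_fraction(ligand_atoms, receptor_atoms,
                   epsilon=0.1, sigma=3.4, threshold=100.0):
    """Fraction of ligand atoms whose summed LJ energy exceeds `threshold`.

    epsilon (kcal/mol) and sigma (angstroms) are placeholder values,
    not UFF parameters. Coordinates are (x, y, z) tuples in angstroms.
    """
    if not ligand_atoms:
        return 0.0
    clashing = 0
    for lig in ligand_atoms:
        # Sum pairwise energies between this ligand atom and every receptor atom.
        energy = sum(
            lj_pair_energy(math.dist(lig, rec), epsilon, sigma)
            for rec in receptor_atoms
        )
        if energy > threshold:
            clashing += 1
    return clashing / len(ligand_atoms)

# One receptor atom at the origin; one ligand atom far too close, one far away.
receptor = [(0.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
fraction = clash_fraction(ligand, receptor)
```

In this made-up example the atom at 1.0 Å sits deep inside the repulsive wall and is counted as clashing, while the atom at 10.0 Å has a slightly negative energy and is not.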
Regarding claim 18, Qiao teaches claim 17. Qiao further provides “the notion of "frames" in a coordinate-free topological molecular graph is justified by the observation that most bending and stretching modes in molecular vibrations are of high frequency, i.e., most bond lengths and bond angles fall into a small range as predicted by valence bond theory” (last para., page 24), which teaches the energy function to generate the energy of the current noisy 3D structure of the complex comprising the protein and the ligand comprises: determining a respective bond length of each of a plurality of bonds in the complex comprising the protein and the ligand; and generating the energy of the current noisy 3D structure of the complex comprising the protein and the ligand based at least in part on the respective bond lengths of the plurality of bonds, structure of a complex comprising the protein and the ligand; and outputting the predicted 3D structure of the protein and the ligand.
Regarding claim 21, Qiao teaches claim 1. Qiao provides “to perform structure prediction, NeuralPLexer jointly samples the 3D heavy-atom coordinates of the protein x and those of the ligands y from a generative model conditioned on the sequence and graph inputs {s}; {G}. In addition to the primary sequence and graph inputs, we retrieved inputs from readily available transformer protein language models and templates from alternative experimental structures or protein structure prediction networks to provide extra conditioning signals to the generative model. In particular, we use protein sequence embeddings from the ESM-2 [9] language model and template structures generated from AlphaFold2 as auxiliary inputs in this study” (2nd para., page 4), which teaches sampling, by the protein-ligand generative machine learning model, the predicted 3D structure of the complex comprising the protein and the ligand from a distribution over a space of possible 3D protein-ligand structures, and wherein the distribution over the space of possible 3D protein-ligand structures is conditionally generated by the protein-ligand generative machine learning model in accordance with values of a set of protein-ligand generative machine learning model parameters.
Qiao does not teach a separate model to predict either the protein 3D structure or the ligand 3D structure. Qiao is not explicit on a latent representation of the protein/ligand 3D structure.
Strokach provides “many generative models of proteins have been developed that encompass all known protein sequences, model specific protein families, or extrapolate the dynamics of individual proteins. Those generative models can learn protein representations that are often more informative of protein structure and function than hand-engineered features” (page 226, col 1, Section “Abstract” lines 4-9) and Fig. 1 (page 227), which teach predicting protein 3D structure through generative machine learning models, where the models also characterize the predicted protein.
Strokach provides Fig. 2(b) and 2(c) (page 228), which teach a latent representation of model data (protein, ligand and protein-ligand complex).
It is obvious that Strokach also teaches predicting ligand 3D structure through a generative machine learning model, where the model also characterizes the predicted ligand, because a ligand can itself be a protein.
It would have been prima facie obvious to combine Qiao’s protein-ligand complex structure prediction pipeline with Strokach’s ligand-free protein (as well as ligand) structure prediction model (Strokach: page 226, col 1, Section “Abstract” lines 4-9, and Fig. 1, page 227), because Strokach’s ligand-free protein (and protein-free ligand) model would allow training on more input data and consequently more accurate protein (and ligand) structure prediction. One would reasonably expect success because both Qiao and Strokach concern protein structure modeling using generative machine learning models, and Qiao’s pipeline already has modules that predict the input protein (and ligand) structure (Qiao: Fig. 1 legend, Fig. 1, page 2).
Response to Applicant’s Arguments
In the Remarks filed 12/10/2025, Applicant argues (page 19, 2nd para) that Qiao does not teach a latent representation and that Strokach fails to cure the deficiency.
In response, Applicant’s argument is not persuasive. Qiao teaches “latent coordinates” (pages 13, 35). Although Qiao is silent on a “latent representation”, “latent coordinates” are the components of a “latent representation” (a vector). Qiao is explicit (page 12) about “graph-topological representations”, “pair representations”, “molecular representations with both 3D molecular structure and bioactivity information”, “residue-scale graph representation”, “edge representations” and “densely-connected edge representations”. Strokach is explicit (page 230, col 1, para 3-5; page 228, Fig. 2b, 2c) about a “latent representation”. Essentially, Applicant argues here, and stresses in the claim amendments, that the “ligand generative machine learning model” handles the input protein and ligand 3D structures through respective “latent representations”, wherein the “latent representations” for the protein and the ligand are generated first and then serve as input to predict the 3D structure of the protein-ligand complex. Strokach’s variational autoencoders (VAEs) and normalizing flows (NFs) (page 228, Fig. 2b, 2c) teach mapping inputs to and from a latent representation.
In the Remarks, Applicant argues (page 19, 3rd para through page 20, 1st para) that Qiao does not teach training a protein-ligand generative machine learning model by "processing a model input characterizing the protein for the example protein-ligand complex to generate, by an intermediate layer of the protein generative machine learning model, a latent representation characterizing an initial predicted three-dimensional (3D) structure of the protein for the protein-ligand complex".
In response, Applicant’s argument is not persuasive. As discussed above, Qiao teaches “latent coordinates” (pages 13, 35). Although Qiao is silent on a “latent representation”, “latent coordinates” are the components of a “latent representation” (a vector). Qiao is explicit (page 12) about “graph-topological representations”, “pair representations”, “molecular representations with both 3D molecular structure and bioactivity information”, “residue-scale graph representation”, “edge representations” and “densely-connected edge representations”. Strokach is explicit (page 230, col 1, para 3-5; page 228, Fig. 2b, 2c) about a “latent representation”. Essentially, Applicant argues here, and stresses in the claim amendments, that the “ligand generative machine learning model” handles the input protein and ligand 3D structures through respective “latent representations”, wherein the “latent representations” for the protein and the ligand are generated first and then serve as input to predict the 3D structure of the protein-ligand complex. Strokach’s variational autoencoders (VAEs) and normalizing flows (NFs) (page 228, Fig. 2b, 2c) teach mapping inputs to and from a latent representation.
In the Remarks, Applicant argues (page 20, 2nd para) that “Strokach does not describe utilizing separate protein and ligand generative machine learning models to first generate latent representations characterizing initial, unbound 3D structures for the protein and the ligand and to then process the latent representations characterizing those initial, unbound 3D structures using a protein-ligand generative machine learning model to generate the predicted 3D structure for the protein-ligand complex of the protein and the ligand”.
In response, Applicant’s argument is not persuasive. Qiao teaches separate protein and ligand generative machine learning models (page 4, Fig. 2b) to first generate dense representations characterizing initial, unbound 3D structures for the protein and the ligand and to then process the latent representations characterizing those initial, unbound 3D structures using a protein-ligand generative machine learning model to generate the predicted 3D structure for the protein-ligand complex of the protein and the ligand. As discussed above, Strokach’s variational autoencoders (VAEs) and normalizing flows (NFs) (page 228, Fig. 2b, 2c) teach mapping inputs to and from a latent representation, which cures the deficiency of Qiao’s dense representations.
In the Remarks, Applicant argues (page 20, 3rd para through page 21, 1st para) that Strokach does not teach generating latent representations characterizing initial, unbound 3D structures for the protein and the ligand, as argued above. Applicant further argues that Strokach does not teach predicting 3D structures for a set of example unbound proteins, or predicting 3D structures for a set of example unbound ligands.
In response, Applicant’s argument is not persuasive. First, it is obvious that linear or logistic regression models are available to predict protein structure (Qiao: page 227, Fig. 2b); second, it is well known in the scientific and technical communities that AlphaFold 2 has already been used to predict the structures of nearly all catalogued proteins; third, Qiao does teach predicting the protein-ligand complex structure in the bound state (Qiao: page 2, Fig. 1). It is the latent representations of the protein and the ligand (not the 3D structures of the protein and the ligand) that contribute to the prediction of the protein-ligand complex (instant claim 1, last step); the prediction of unbound protein and ligand structures in the instant invention serves no purpose and is not used by a later step.
Therefore, the 103 rejection is maintained.
Conclusion
No claims are allowed.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GUOZHEN LIU whose telephone number is (571)272-0224. The examiner can normally be reached Monday-Friday 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Larry D Riggs, can be reached at (571) 270-3062. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GL/
Patent Examiner
Art Unit 1686
/Anna Skibinsky/
Primary Examiner, AU 1635