DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/10/2025 has been entered.
Applicant's response, filed on 12/10/2025, is fully considered. The following rejections and/or objections are either reiterated or newly applied. They constitute the complete set presently being applied to the instant application.
Status of claims
Claims 1 and 19 are amended. Claims 1-20 are pending. Claims 1-20 are examined below.
Priority
Acknowledgment is made of applicant's claim for domestic priority to Provisional Application No. 63117068, filed on 11/23/2020.
Drawings
The drawings filed 5/13/2021 are accepted.
Withdrawn Rejections/Objections
The rejection of claims 1-20 under 35 U.S.C. §103 over Olafson (US 2019/0259470 A1, published Aug. 22, 2019; 09/13/2022 IDS Document) in view of Elledge (U.S. Patent Publication No. 2019/0055545, published Feb. 21, 2019; cited on the attached “Notice of References Cited” form 892), as applied to claims 1-3, 4-11, 13-15 and 16-20, and further in view of Stojevic (US 2021/0081804 A1, published Mar. 18, 2021; Foreign priority May 30, 2017; cited on the 02/23/2024 “Notice of References Cited” form 892), set forth in the Office action mailed 09/10/2025, is withdrawn in view of the amendments filed 12/10/2025. However, a new rejection under §103 is applied.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Independent claim 1 (page 3, line 4) and claim 19 (page 11, line 13) recite “…to improve the value received by the unselected first machine learning model or the second machine learning model during subsequent benchmark analysis…” The term “improve” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The relationship between the value and the machine learning model is not based on any known standard. It is not clear whether the “value” in claim 1 (page 3, line 4) and claim 19 (page 11, line 13) is the same “value” as the first value or the second value in the previously recited determining and comparing steps. Therefore, it is unclear what encompasses an improved value.
Dependent claims are rejected for being dependent on rejected claims 1 and 19.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Olafson (US 2019/0259470 A1, published Aug. 22, 2019; 09/13/2022 IDS Document) in view of Elledge (U.S. Patent Publication No. 2019/0055545, published Feb. 21, 2019; cited on the 09/13/2022 IDS Document); Stojevic (US 2021/0081804 A1, published Mar. 18, 2021; Foreign priority May 30, 2017; cited on the 02/23/2024 “Notice of References Cited” form 892) and Jamali ("DrugMiner: comparative analysis of machine learning algorithms for prediction of potential druggable proteins." Drug discovery today 21.5 (2016): 718-724; cited on the attached “Notice of References Cited” form 892).
Regarding claims 1, 13 and 19, Olafson teaches a protein mutational database that is described and implemented as ProtaBank. ProtaBank comprises three main functionalities: (1) data deposition tools, (2) data storage, and (3) tools for data searching and analysis (Paragraph [0044]). An example design and workflow for ProtaBank is summarized in FIG. 1 (Paragraph [0044]). As shown in FIG. 1, users can interact with ProtaBank through the web interface or the REST API. Data sent to the server is validated and curated before final submission into the database (Paragraph [0044]). In embodiments, ProtaBank comprises a central repository for storing and sharing the world's published protein sequence mutation data, in much the same way that GenBank is a central repository for nucleic acid sequences (Paragraph [0044]). In some embodiments, ProtaBank collects sequence mutation data for a wide range of properties that address all aspects of protein engineering in drug discovery and development including, for example, stability, expression, binding, activity, solubility, aggregation, and half-life, viscosity, immunogenicity, crystallizability, spectral properties, toxicity, bacterial resistance, and specificity (Paragraph [0044]). The sequence mutation data for a wide range of properties that address all aspects of protein engineering in drug discovery correspond to the design space. Although Olafson does not expressly teach a design space for a peptide, it would have been obvious to substitute the protein taught by Olafson with the peptide of Elledge as discussed below, and there would have been a reasonable expectation of success because both peptides and proteins consist of sequences of amino acids. The application is in drug discovery and development. Therefore, this corresponds to the claim limitation of generating a design space for a peptide for an application.
Olafson teaches that an embodiment of ProtaBank offers several search and analysis tools that allow users to: (1) browse and search for relevant studies queried by publication/study details (title, abstract, author), protein name, PDB ID, UniProt accession number, or protein sequence, (2) identify data and mutants related to a given protein sequence by BLAST search, (3) visualize mutational data mapped onto a three-dimensional (3D) protein structure, and/or (4) compare and correlate data measured using different assays (Paragraph [0070]). Example 2 of the disclosure illustrates the use of example tools. Embodiments of the disclosure include analysis tools which find statistical correlations between various data elements (Paragraph [0070]). In embodiments, users can identify data by keyword search of the study title, abstract, protein name, publication author or date, PDB ID, and UniProt ID. Data from across studies can be queried using a BLAST sequence search to identify relevant sequences, assays, and protein properties (Paragraph [0070]). This corresponds to the claim limitation of identifying a plurality of sequences for the peptide; and updating the plurality of sequences by determining, for each of the plurality of sequences.
Olafson teaches an embodiment of a database and associated tools, herein referred to as ProtaBank, for storing and searching all types of PE data, spanning a wide range of properties, including those related to activity, binding, stability, folding, and solubility (Paragraph [0038]). The database organizes, integrates, annotates and structures mutational data obtained from diverse approaches, including computational and other types of rational design, saturation mutagenesis, directed evolution, and deep mutational scanning. ProtaBank's functionality permits accurate comparisons between different data sets which facilitates sharing PE data with collaborators and improves the usability of PE datasets for data mining and other analysis methods (Paragraph [0038]). ProtaBank's analysis tools help users gain insights into sequence-activity and structure-activity relationships, improve understanding of how proteins function, and leads to the design of proteins with new and improved properties (Paragraph [0038]). This corresponds to the claim limitation of determining, for each of the plurality of sequences, a respective plurality of activities pertaining to the application, wherein the updating produces an updated plurality of sequences each having an updated respective plurality of activities.
Olafson teaches AI training modules and AI machine learning models can include any of the data from the ProtaBank database, include full length native protein sequences, full length mutant protein sequences, differences between the sequences, data associated with full length native sequences and data associated with a full length protein sequences, any of the protein properties found in Table 1, and assay data (Paragraph [0080]). The training modules can be trained to optimize protein sequence in order to effect functional properties of the protein, for example, any of the characteristics found in Table 1, including efficacy, binding affinity, and serum half-life (Paragraph [0080]). Proteins can also be engineered for proper function or activity, such as binding to a drug target or deactivation of disease-causing biomolecules, other properties like expression level, solubility, and serum half-life can also be maintained or improved (Paragraph [0080]). Olafson also teaches in Paragraph [0011] a computer-executed method of engineering proteins, the method comprising: storing a plurality of full length mutant protein sequences and a plurality of characteristic data sets in a database, wherein each characteristic data set is associated with one of the protein mutant sequences, wherein each protein mutant sequence comprises a string representing a sequence of amino acids, and wherein the characteristic data sets include data from assays done with the protein of the respective full length mutant protein sequence; receiving a protein identifier and protein functional data; matching the protein identifier to one or more full length mutant protein sequences stored in the database; generating an AI training set with the matching full length mutant protein sequences; training a machine learning model using the AI training dataset; employing the machine learning model to design one or more synthetic protein sequences and calculate each synthetic proteins predicted functional 
data; and outputting the one or more synthetic protein sequences and predicted functional data (Paragraph [0011]). In some embodiments, the data from assays comprises one or more of experimental assay type, numerical value obtained for the assay, units associated with the numerical value, derived values dependent on other experimental values (Paragraph [0011]). In some embodiments, wherein the machine learning model comprises one or more of a neural network, genetic algorithm, decision tree, gradient boosting, and support vector machines. In specific embodiments, matching comprises comparing the full length protein sequence of the protein identifier to the full length mutant protein sequences in the database and returning a match when the sequences are at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or more than 99% similar (Paragraph [0011]). The percent similarity could also be a user entered value. For example, the user could enter 45% and protein mutant sequences with greater than 45% match would be returned. 
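For illustration only, the user-entered similarity threshold described above can be sketched as follows. This is a minimal sketch and not a characterization of Olafson's actual implementation: the function names, the toy sequences, and the gap-free, position-by-position similarity measure are all assumptions, since the reference does not specify how percent similarity is computed.

```python
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match, relative to the longer sequence.

    Assumption: a simple ungapped, position-by-position comparison.
    """
    if not a and not b:
        return 100.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / max(len(a), len(b))


def match_sequences(query: str, database: List[str], threshold: float) -> List[str]:
    """Return stored sequences whose similarity to the query exceeds the threshold."""
    return [s for s in database if percent_identity(query, s) > threshold]


# Hypothetical toy data; not taken from any cited reference.
query = "MKTAYIAKQR"
database = ["MKTAYIAKQR", "MKTAYLLKQR", "MATGGGGGGG", "QQQQQQQQQQ"]

# A user-entered value of 45% returns only sequences with a greater than 45% match.
hits = match_sequences(query, database, threshold=45.0)
print(hits)
```

In this toy example only the first two stored sequences exceed the 45% threshold; in practice an alignment-based measure (for example, a BLAST search) would likely replace the ungapped comparison.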
In some embodiments, the characteristic data set, the received protein functional data, or both comprises one or more of the following: Activity, Catalytic efficiency (kcat/Km), Catalytic rate constant (kcat), Count/Number, EC50, Energy, Enrichment, Epistasis, Fitness, IC50, Inhibition constant (Ki), Maximal rate (Vmax), Michaelis constant (Km), Relative activity, Specific activity, Association constant (Ka), Binding affinity, Count/Number, Dissociation constant (Kd), ELISA, Energy, Enrichment, Enthalpy of binding (ΔH), Entropy of binding (ΔS), Epistasis, Fitness, Frequency of occurrence, Gibbs free energy of binding (ΔG), Inhibition constant (Ki), Rate constant of association (kon), Rate constant of dissociation (koff), Concentration, Energy, Enrichment, Frequency of occurrence, Minimum inhibitory concentration (MIC), Yield, Antimicrobial resistance, Energy, Enrichment, Frequency of occurrence, Optical density (OD), Bioavailability, EC50, Half-life (t1/2), IC50, Immunogenicity, Toxicity, Concentration, Energy, Fractional increase in solubility, Insoluble fraction, Oligomerization state, Soluble fraction, Energy, Frequency of occurrence, Relative activity, Relative affinity, Relative kcat, Relative kcat/Km, Relative Kd, Brightness, Emission wavelength (λem), Energy, Excitation wavelength (λex), Extinction coefficient, Fluorescence intensity, Maturation half-time, Photobleaching half-time, pKa, Quantum yield, Constant pressure heat capacity of unfolding (ΔCp), Count/Number, Denaturant concentration at midpoint of unfolding transition (Cm), Energy, Enthalpy of unfolding (ΔH), Entropy of unfolding (ΔS), Equilibrium constant (K), Gibbs free energy of folding/unfolding (ΔG), Melting temperature (Tm), Rate of folding (kF), Rate of unfolding (kU), Slope of chevron plot (m), Slope of the denaturant unfolding curve/cooperativity value (m), Temperature of maximum stability, Thermal tolerance, ß-Tanford value, and Φ-value (Paragraph [0011]). 
The Temperature of maximum stability and Maximal rate (Vmax) are activities that are equivalent to threshold levels. Olafson further teaches "Machine learning models may be generated by the AI platform during an initial set-up process or after receiving instructions provided by a user. For example, the AI Platform could generate a machine learning model by training on a subset of proteins found in ProtaBank. Example of training during set up could include training on a subset of a specific type of protein, for example, antibodies, membrane bound proteins, metalloproteins, tyrosine kinases, proteases, globular proteins, and beta-barrel proteins, to generate an AI machine learning model for each type or subset of protein." (para. [0078]). Olafson teaches the generation of multiple machine learning models that corresponds to the recited first and second machine learning models. Olafson also teaches "FIG. 12 illustrates an embodiment of a staged method for engineering proteins using AI and the Protabank database. Stage 1) Generation of potential sequences using various techniques from random mutation to computationally designed combinatorial libraries. Stage 2) Application of a previously trained machine learning model for prediction purposes. Stage 3) Evaluation of a large number of potential sequences for the desired properties of interest. Stage 4) Selection of high performing sequences, either individually or as a library of sequences; for example, as an optimized degenerate codon library. Stage 5) Validation of designed sequences using experimental assays. These new data points could now be combined with prior existing data to generate a more predictive machine learning model and the process iterated until the desired optimal protein sequence mutations are found." (Para. [0094]). Overall, Olafson teaches the process of iteratively generating a more predictive machine learning model until the desired optimal protein sequence mutations are found. 
Each time a model is trained and generated through the described iterative process it would result in a first and second machine learning model. This corresponds to the claim limitation of performing, using at least a first machine learning model and a second machine learning model to process the solution space, one or more trials to identify a candidate drug compound that represents a sequence having at least one level of activity that exceeds one or more threshold levels.
Olafson teaches "The machine learning model can output predictions for novel mutant protein sequences comprising fitness and/or protein functional data, such as the protein functional data found in Table 1 (FIG. 12A). These predicted characteristics are used in a selection process composed of a ranking algorithm to order the corresponding mutant sequences based on desired protein characteristics. Rankings can be accomplished through comparing the predicted characteristics to wanted characteristics and taking the closest matching 10, 20, 30, 40, 50, 60, 70, 80, or 100, for example. The rankings can also be done by taking the synthetic mutant sequences which have predicted characteristic that are within a certain percentage of the wanted characteristic, for example." (Para. [0085]) and Olafson teaches the process of iteratively generating a more predictive machine learning model until the desired optimal protein sequence mutations are found is the selected machine learning model (Para. [0094]). This corresponds to the claim limitation of determining a first value associated with the candidate drug compound identified by the first machine learning model and a second value associated with the candidate drug compound identified by the second machine learning model; comparing the first value and the second value and based on the comparison, selecting the first machine learning model or the second machine learning model to identify the candidate drug compound. The comparison of machine learning models’ value is between the value of the model’s previous version and the current version obtained through retraining.
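By way of illustration only, the two ranking approaches quoted from Paragraph [0085] (taking the closest matching N sequences, or taking sequences within a certain percentage of the wanted characteristic) can be sketched as follows. The function names, sequence labels, and numeric values are hypothetical and are not drawn from Olafson; the sketch assumes a single scalar characteristic per sequence.

```python
from typing import Dict, List, Tuple


def rank_by_closeness(
    predicted: Dict[str, float], wanted: float, top_n: int
) -> List[Tuple[str, float]]:
    """Order sequences by distance between predicted and wanted value; keep top_n.

    Ties are broken by sequence label so the ordering is deterministic.
    """
    ranked = sorted(predicted.items(), key=lambda kv: (abs(kv[1] - wanted), kv[0]))
    return ranked[:top_n]


def within_percentage(predicted: Dict[str, float], wanted: float, pct: float) -> List[str]:
    """Keep sequences whose predicted value is within pct percent of the wanted value."""
    tolerance = abs(wanted) * pct / 100.0
    return sorted(seq for seq, value in predicted.items() if abs(value - wanted) <= tolerance)


# Hypothetical predicted characteristic values for four candidate sequences.
predicted = {"SEQ_A": 7.5, "SEQ_B": 9.5, "SEQ_C": 4.0, "SEQ_D": 10.5}

top_two = rank_by_closeness(predicted, wanted=10.0, top_n=2)
near = within_percentage(predicted, wanted=10.0, pct=10.0)
print(top_two)
print(near)
```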
Olafson also teaches generating multiple machine learning models for each type or subset of protein. Olafson teaches “In some embodiments, the AI Platform includes an AI training module. Machine learning models may be generated by the AI platform during an initial set-up process or after receiving instructions provided by a user. For example, the AI Platform could generate a machine learning model by training on a subset of proteins found in ProtaBank. Example of training during set up could include training on a subset of a specific type of protein, for example, antibodies, membrane bound proteins, metalloproteins, tyrosine kinases, proteases, globular proteins, and beta-barrel proteins, to generate an AI machine learning model for each type or subset of protein.” (Para. [0078]). Generating multiple machine learning models would allow for the models’ performance to be compared with one another.
Jamali also teaches the claim limitation of performing, using at least a first machine learning model and a second machine learning model to process the solution space, one or more trials to identify a candidate drug compound that represents a sequence having at least one level of activity that exceeds one or more threshold levels; determining a first value associated with the candidate drug compound identified by the first machine learning model and a second value associated with the candidate drug compound identified by the second machine learning model; comparing the first value and the second value, and based on the comparison, selecting the first machine learning model or the second machine learning model to identify the candidate drug compound with “We performed a comparative analysis of machine learning algorithms to determine which classifier(s) predicted druggable proteins with appropriate performance in terms of their accuracy, sensitivity, and specificity.” (page 719, col. 1, para. 3) and Figure 1 (Page 720). Figure 1 depicts the proposed approach for drug target prediction that includes feature generation.
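For illustration only, a comparative analysis of two classifiers in terms of accuracy, sensitivity, and specificity, followed by selection of one model based on the comparison, can be sketched as follows. The labels, predictions, model names, and the choice of accuracy as the selection metric are assumptions made for this sketch and are not taken from Jamali.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity from binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def select_model(results: Dict[str, Dict[str, float]], metric: str = "accuracy") -> str:
    """Return the name of the model with the highest value of the chosen metric."""
    return max(sorted(results), key=lambda name: results[name][metric])


# Hypothetical ground truth (1 = druggable) and predictions from two models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "model_1": [1, 1, 1, 0, 0, 0, 1, 1],
    "model_2": [1, 1, 1, 1, 0, 0, 0, 1],
}

results = {name: classification_metrics(y_true, pred) for name, pred in predictions.items()}
selected = select_model(results)
print(results)
print(selected)
```

Here the first value and second value are the metric values of the two models, and the comparison selects the model with the higher value.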
It would have been prima facie obvious to the skilled artisan to modify the teachings of Olafson to include comparing machine learning models, as taught by Jamali, in order to select the model with the appropriate accuracy, sensitivity, and specificity. There would have been a reasonable expectation of success because both Olafson and Jamali teach methods that pertain to using machine learning for predicting protein function.
Olafson teaches that in some embodiments, ProtaBank collects sequence mutation data for a wide range of properties that address all aspects of protein engineering in drug discovery and development including, for example, stability, expression, binding, activity, solubility, aggregation, and half-life, viscosity, immunogenicity, crystallizability, spectral properties, toxicity, bacterial resistance, and specificity (Paragraph [0044]). Olafson also teaches that the systems of the disclosure can include one or more I/O (input/output) devices that allow a user to enter commands and information into the system, and also allow information to be presented to the user and/or other components or devices (Paragraph [0051]). This corresponds to the claim limitation of transmitting information describing the candidate drug compound to a computing device.
Olafson teaches that FIG. 11 is a non-limiting example system for engineering proteins 1100 (Paragraph [0052]). It comprises a computer 1102, a processor 1104, a memory 1106, and a storage repository 1108 (Paragraph [0052]). The storage repository 1108 can comprise a database 1110. Input/Output devices 1112 are connected to the computer 1102 and usable by a user (Paragraph [0052]). A bus (not shown) can allow the various components and devices to communicate with one another. A bus can be one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. A bus can include wired and/or wireless buses (Paragraph [0052]). The components shown in FIG. 11 are not exhaustive, and in some embodiments, one or more of the components shown in FIG. 11 may not be included in a specific embodiment. Further, one or more components shown in FIG. 11 can be rearranged. It should also be understood that in embodiments, the various elements shown here can be located together or located remotely from each other. For example, the database could be stored in a different location, such as on a server, from the processor used by the AI Platform (Paragraph [0052]). This corresponds to the claim limitations of "A tangible, non-transitory computer-readable medium storing instructions" of claim 13 and "A system comprising: a memory device storing instructions; and a processing device communicatively coupled to the memory device" of claim 19.
Olafson teaches the recited generating, based on the updated plurality of sequences each having the updated respective plurality of activities, a solution space within the design space with "An embodiment of the disclosure is a central repository designed to collect and organize protein sequence mutation data for many different properties and to facilitate the creation of ML datasets from the accumulated data is a necessary component for AI protein engineering" (Paragraph [0042]) and with "In some embodiments, ProtaBank collects sequence mutation data for a wide range of properties that address all aspects of protein engineering in drug discovery and development including, for example, stability, expression, binding, activity, solubility, aggregation, and half-life, viscosity, immunogenicity, crystallizability, spectral properties, toxicity, bacterial resistance, and specificity" (Paragraph [0044]). The recited "plurality of activities" is addressed by Olafson's different properties, while the "updated plurality of sequences" is addressed by Olafson's collecting and organizing of protein sequence mutation data.
Olafson teaches the recited the solution space comprises a target subset of the updated plurality of sequences each having the updated respective plurality of activities with "A specific embodiment could have greater than a thousand data points. In embodiments, ProtaBank comprises data including specific properties like binding to a target including large protein variant-drug target binding datasets (e.g., data collected via deep sequencing and screening or selection of combinatorial protein variant libraries)" (Paragraph [0042]). The recited "target subset" is addressed by Olafson's large protein variant-drug target binding datasets.
Olafson does not expressly teach a design space for a peptide.
However, Elledge teaches a design space for a peptide. Elledge teaches "a method for identifying a pathogenic component in a disease, the method comprising: (a) obtaining a biological sample from a plurality of subjects having a common disease, wherein the common disease is suspected of having a pathogenic component, (b) separately contacting each sample of a plurality of reaction samples with each biological sample under conditions that allow formation of at least one antibody-peptide complex, wherein the reaction samples each comprise a display library comprising a plurality of peptides derived from a plurality of pathogens, (c) isolating the at least one antibody-peptide complex formed in each reaction sample from unbound phage, (d) correlating at least one peptide in the at least one antibody-peptide complex in each reaction sample to the pathogen from which it is derived, and (e) identifying a pathogen that is significantly enriched in the plurality of subjects with disease compared to subjects without the disease." (Para. [0023]). Elledge teaches "...a display library comprising a plurality of peptides..." that corresponds to the recited design space for a peptide.
Both Elledge and Olafson are directed to protein search interfaces. Therefore, it would have been obvious to the skilled artisan to modify Olafson by substituting the protein with the peptides as taught by Elledge because Elledge discloses the motivation to generate a library of peptides for use in identifying specific peptides. There would have been a reasonable expectation of success because both peptides and proteins consist of sequences of amino acids. Also, paragraph 4 of the specification of the instant application discloses a method that includes generating a design space for a protein (e.g., peptide).
Olafson teaches based on the comparison and selection of the unselected first machine learning model or the second machine learning model, optimizing an aspect of the first machine learning model or the second machine learning model that is unselected by adjusting a weight, a bias, a level of hidden nodes, or some combination thereof to improve the value received by the unselected first machine learning model or the second machine learning model during subsequent benchmark analysis with “These new data points could now be combined with prior existing data to generate a more predictive machine learning model and the process iterated until the desired optimal protein sequence mutations are found.” (para. [0094]) and “For theoretical and computational scientists, ProtaBank permits easy access to data sets that can be used to benchmark, test, and improve predictive methods.” (para. [0100]). Olafson teaches a machine learning model that is iteratively being modified and retrained, which is in itself a form of optimizing the model, which is shown in Fig. 10 with training weights. The recited “unselected machine learning model” corresponds to the machine learning model that is iteratively being modified and retrained as taught by Olafson. The comparison of machine learning models’ value is between the value of the model’s previously generated version and the newly generated version obtained through retraining.
Stojevic also teaches optimizing a machine learning model. Stojevic teaches “The weights in the neural network, or constituent tensors in a tensor network, are optimised to minimise the difference between outputs 118 and inputs 114.” (para. [0147]) and “The weights in the network are optimised to minimise some desired cost function.” (para. [0025]).
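For illustration only, optimizing weights to minimise a cost function can be sketched with gradient descent on a single linear unit having one weight and one bias. This is a minimal sketch under stated assumptions and not Stojevic's tensor-network method: the mean squared error cost, the learning rate, the step count, and the toy data are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def mse(w: float, b: float, xs: List[float], ys: List[float]) -> float:
    """Mean squared error cost of the linear unit y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(
    xs: List[float], ys: List[float], lr: float = 0.05, steps: int = 2000
) -> Tuple[float, float]:
    """Adjust the weight and bias by gradient descent to reduce the cost."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

initial_cost = mse(0.0, 0.0, xs, ys)
w, b = train(xs, ys)
final_cost = mse(w, b, xs, ys)
print(initial_cost, final_cost, w, b)
```

The same update rule generalizes to networks with hidden nodes, where each weight and bias is adjusted along the negative gradient of the chosen cost.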
It would have been prima facie obvious to the skilled artisan to modify Olafson to include optimizing the neural network as taught by Stojevic to minimize the cost function for the purpose of improving the model’s performance. There would have been a reasonable expectation of success because both Olafson and Stojevic teach methods that pertain to using machine learning for generating molecules.
Regarding claims 2, 14 and 20, Olafson teaches in Figure 10 multiple machine learning models, which is equivalent to the first machine learning model and a second model. Olafson also teaches the process of iteratively generating a more predictive machine learning model until the desired optimal protein sequence mutations are found is the selected machine learning model (Para. [0094]). Olafson also teaches another general embodiment is a computer-executed method of engineering proteins, the method comprising: storing a plurality of full length mutant protein sequences and a plurality of characteristic data sets in a database, wherein each characteristic data set is associated with one of the protein mutant sequences, wherein each protein mutant sequence comprises a string representing a sequence of amino acids, and wherein the characteristic data sets include data from assays done with the protein of the respective full length mutant protein sequence; receiving a protein identifier and protein functional data; matching the protein identifier to one or more full length mutant protein sequences stored in the database; generating an AI training set with the matching full length mutant protein sequences; training a machine learning model using the AI training dataset; employing the machine learning model to design one or more synthetic protein sequences and calculate each synthetic proteins predicted functional data; and outputting the one or more synthetic protein sequences and predicted functional data (Paragraph [0011]). In some embodiments, the data from assays comprises one or more of experimental assay type, numerical value obtained for the assay, units associated with the numerical value, derived values dependent on other experimental values (Paragraph [0011]). In some embodiments, wherein the machine learning model comprises one or more of a neural network, genetic algorithm, decision tree, gradient boosting, and support vector machines. 
In specific embodiments, matching comprises comparing the full length protein sequence of the protein identifier to the full length mutant protein sequences in the database and returning a match when the sequences are at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or more than 99% similar (Paragraph [0011]). The percent similarity could also be a user entered value (Paragraph [0011]). For example, the user could enter 45% and protein mutant sequences with greater than 45% match would be returned. In some embodiments, the characteristic data set, the received protein functional data, or both comprises one or more of the following: Activity, Catalytic efficiency (kcat/Km), Catalytic rate constant (kcat), Count/Number, EC50, Energy, Enrichment, Epistasis, Fitness, IC50, Inhibition constant (Ki), Maximal rate (Vmax), Michaelis constant (Km), Relative activity, Specific activity, Association constant (Ka), Binding affinity, Count/Number, Dissociation constant (Kd), ELISA, Energy, Enrichment, Enthalpy of binding (ΔH), Entropy of binding (ΔS), Epistasis, Fitness, Frequency of occurrence, Gibbs free energy of binding (ΔG), Inhibition constant (Ki), Rate constant of association (kon), Rate constant of dissociation (koff), Concentration, Energy, Enrichment, Frequency of occurrence, Minimum inhibitory concentration (MIC), Yield, Antimicrobial resistance, Energy, Enrichment, Frequency of occurrence, Optical density (OD), Bioavailability, EC50, Half-life (t1/2), IC50, Immunogenicity, Toxicity, Concentration, Energy, Fractional increase in solubility, Insoluble fraction, Oligomerization state, Soluble fraction, Energy, Frequency of occurrence, Relative activity, Relative affinity, Relative kcat, Relative kcat/Km, Relative Kd, Brightness, Emission wavelength (λem), Energy, Excitation wavelength (λex), Extinction coefficient, Fluorescence intensity, Maturation half-time, Photobleaching half-time, pKa, Quantum yield, Constant pressure heat capacity of unfolding (ΔCp), 
Count/Number, Denaturant concentration at midpoint of unfolding transition (Cm), Energy, Enthalpy of unfolding (ΔH), Entropy of unfolding (ΔS), Equilibrium constant (K), Gibbs free energy of folding/unfolding (ΔG), Melting temperature (Tm), Rate of folding (kF), Rate of unfolding (kU), Slope of chevron plot (m), Slope of the denaturant unfolding curve/cooperativity value (m), Temperature of maximum stability, Thermal tolerance, ß-Tanford value, and Φ-value (Paragraph [0011]). The user-entered value of percent similarity of sequences taught by Olafson is interpreted to be equivalent to the query parameter that comprises a sequence parameter. Employing the machine learning model to design one or more synthetic protein sequences and calculate each synthetic proteins predicted functional data is interpreted to be equivalent to a model trained to measure, based on a query parameter, a level of the updated respective plurality of activities. This corresponds to the claim limitation of being performed by a second machine learning model trained to measure, based on a query parameter, a level of the updated respective plurality of activities, wherein the query parameter comprises a sequence parameter.
Regarding claims 4 and 16, Olafson teaches that embodiments of the disclosure use protein feature encodings to add physical or biological knowledge to amino acid sequences to create representations amenable to machine learning (Paragraph [0084]). As the choice of encoding varies based on the size and diversity of the input, as well as the task, several encoding methods can be implemented, allowing users to test and select the encodings most relevant to their problem (Paragraph [0084]). The AI Platform can include the following encodings, for example: one-hot, autoencoders, amino acid property encoders, learned BLOSUM/MSA evolutionary encodings, sequence mutation representation relative to WT, secondary structure/solvent accessible surface area encodings, learned AA embeddings, POOL, Phoenix, and/or structural/graph/topological encodings (Paragraph [0084]). Olafson teaches autoencoders, which are capable of autoencoding. This corresponds to the claim limitation of wherein the generating the solution space within the design space further comprises performing, using the query parameter and the updated plurality of sequences each having the updated respective plurality of activities: uniform manifold approximation and projection (UMAP) for dimension reduction to identify the target subset, linear decomposition, principal component analysis (PCA), kernel PCA, matrix factorization, generalized discriminant analysis, linear discriminant analysis, autoencoding, or some combination thereof.
Regarding claims 3 and 15, Olafson teaches receiving the query parameter; and generating, based on the query parameter and the updated plurality of sequences each having the updated respective plurality of activities, the solution space within the design space, wherein the solution space comprises the target subset of the plurality of sets of the updated plurality of sequences, and each sequence of the updated plurality of sequences in the target subset comprises the updated respective plurality of activities that are modified in view of the query parameter. Olafson teaches "In some embodiments, dataset creation tools are used to generate context-specific data subsets (e.g., only stability, expression, and solubility assays for a group of related proteins). As described above, rigorous collection and structured storage of experimental assay techniques, properties measured, conditions, and other metadata enables the identification and analysis of relevant data. ProtaBank's query APIs can be used to support the creation of ML-friendly curated datasets and develop a set of tools within the ProtaBank AI Platform to assist in preparing and combining data from several studies into a single dataset. In embodiments of the disclosure, Protabank comprises tools to enable the following: data selection based on sequence/structural identity, protein property, and assay condition; data filtering to exclude proline mutations, fold changes, membrane proteins, data with high standard error, and other conditions; data cleaning to transform missing, range based, or categorical data; data harmonization to combine data from multiple studies; automatic tools for unit conversion and sequence/mutation mapping; tools to quantify study/assay overlap and correlation to help users select studies and develop customized harmonization functions; and/or data sampling to create even, non-redundant, and distinct training and test sets for ML." (Para. [0092]) and "In some embodiments, the AI Platform includes dynamic design protocols. Specific embodiments can include a dynamic design protocol that integrates ProtaBank with CPD software platforms, such as Triad, to produce protein designs informed by experimental data. Advanced query/search tools were developed to identify and retrieve data based on protein identity, local or global sequence similarity, and other criteria. This integration allows Triad to directly access this data to identify beneficial mutations (informing design parameters) and to create hybrid score functions that identify variants with good structural properties (using CPD score function terms) and good assay potential (using a data-based scoring term)." (Para. [0093]).
Regarding claims 5 and 17, Olafson teaches in Figure 8 and Figure 3A a user interface for users to submit queries. FIG. 3A shows a screenshot of the web interface in which the “Browse submitted studies” tool was used to filter studies by protein name (“ubiquitin”) (Paragraph [0114]). This search returns a sortable table containing all studies with ubiquitin in the protein name or study title (Paragraph [0114]). Clicking on the study ID at the left brings up the analysis page for that study (Paragraph [0114]). Olafson also teaches that additional embodiments of the disclosure include a graphical tool for specifying residue numbering, support for commonly used biological file formats such as fasta and fastq, tools for entry of equations and special characters, multiowner permissions for users to easily modify publication details or make revisions, and commonly used data reporting formats used in the literature or by the users (Paragraph [0068]). This corresponds to the claim limitation of wherein the receiving the query parameter further comprises receiving the query parameter from a graphical element of a user interface presenting the design space.
Regarding claims 6 and 18, Olafson teaches in specific embodiments, matching comprises comparing the full length protein sequence of the protein identifier to full length mutant protein sequences in the database and returning a match when the sequences are at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or more than 99% similar (Paragraph [0010]). The percent similarity could also be a user entered value (Paragraph [0010]). For example, the user could enter 45% and protein mutant sequences with greater than 45% match would be returned (Paragraph [0010]). In another embodiment, the storage repository further comprises computer executable instructions for execution by the processor comprising: receiving a protein identifier; matching the identifier to one or more full length mutant protein sequences stored in the database; and, outputting the matched full length mutant protein sequences and the data from assays associated with each matched full length mutant protein sequence (Paragraph [0010]). This corresponds to the claim limitation of receiving the query parameter and a desired threshold level of a target activity for the query parameter that the target subset is to exceed in order to be included in the solution space.
Regarding claim 7, Olafson teaches that some embodiments of ProtaBank include a step-by-step study entry page, improving data entry for studies with multiple proteins, multiple chains, or multiple-component-complexes (such as enzyme-substrate, antibody-antigen, protein-ligand), supporting SMILES (simplified molecular-input line-entry system) string inputs, and a multiple sequence viewer can be added to assist users during the data input process (Paragraph [0068]). Additional embodiments of the disclosure are a graphical tool for specifying residue numbering, support for commonly used biological file formats such as fasta and fastq, tools for entry of equations and special characters, multiowner permissions for users to easily modify publication details or make revisions, and commonly used data reporting formats used in the literature or by the users (Paragraph [0068]). This corresponds to the claim limitation of wherein the application comprises at least one of: anti-infective, anti-cancer, antimicrobial, anti-viral, anti-fungal, anti-inflammatory, anti-cholinergic, anti-dopaminergic, anti-serotonergic, anti-noradrenergic, immunomodulatory, neuromodulatory, a physiological effect caused by a signaling peptide, anti-prionic, functional biomaterials comprising adhesives, sealants, binders, chelates, diagnostic reporters, or some combination thereof, and structural biomaterials comprising biopolymers, encapsulation films, flocculants, desiccants, or some combination thereof.
Regarding claim 8, Olafson teaches in some embodiments, ProtaBank includes integrations between ProtaBank and the Research Collaboratory for Structural Bioinformatics (RCSB) PDB (the regional center in the USA) to allow users to easily navigate between structure data and mutation data for a particular protein or family of proteins (Paragraph [0069]). In this way users can seamlessly explore protein structure, sequence, and mutation space across these major databases. In some embodiments, standardized formats are used for the description of protein sequence, structure, and properties (Paragraph [0069]). The description of protein sequence, structure, and properties corresponds to protein characteristics. This corresponds to the claim limitation of receiving a selection of a sequence from the target subset; and providing information pertaining to the sequence, wherein the information comprises at least classes of: protein characteristics, protein-to-protein interactions, protein-ligand interactions, protein homology and phylogeny, sequence and structure motifs, chemical and physical stability, attributes expressed in solubility data, related structures, related drugs, chemical synthesis, biological synthesis, intellectual property data, clinical data, market data, pharmacological associations, systems biology, protein folding, or some combination thereof.
Regarding claim 9, Olafson teaches in Figures 5, 6 and 7 graphs that correspond to a topographical map. FIG. 5 is a graph which plots Cm vs. Tm for Gβ1 data (Paragraph [0018]). A plot of all Gβ1 mutant sequences for which both a Tm and Cm were measured (circles) gives a moderate correlation (r=0.45, dotted line) (Paragraph [0018]). If only data obtained under similar assay conditions is included (restricting Cm data to guanidinium chloride denaturation, pH 5-7, 20-30° C., and Tm data to pH 5-7, no denaturant added) (triangles), a very strong correlation (r=0.80, solid line) between these two measures of stability is observed (Paragraph [0018]). FIGS. 6A and 6B are graphs comparing predicted with experimentally measured ΔΔG values in ProtaBank (Paragraph [0019]). The ΔΔG predictor values (ΔΔGscreen) were plotted against experimental ΔΔG values reported in the literature (ΔΔGliterature) (Paragraph [0019]). FIG. 6A is a graph of an unfiltered search of ProtaBank database identifying 343 mutant sequence pairs with both predicted and experimental ΔΔG values. FIG. 6B is a graph showing a search filtered by the mutations and background sequences from the Olson et al. study yielding 82 pairs, reproducing their data (Paragraph [0019]). FIG. 7 compares fitness and proximity to the binding site for Gβ1 point mutants (Paragraph [0020]). The ProtaBank visualizer was used to map the Olson et al. fitness data to the Gβ1 structure and make the two images shown here. Gβ1 is displayed bound to the Fc domain (PDB ID: 1FCC) (Paragraph [0020]). In FIG. 7A, the Gβ1 backbone is shaded by median deviation from the wild-type value, with large deviations in blue, medium in white, and small to no deviations in red. In FIG. 7B, the backbone is shaded by proximity to the binding partner (Paragraph [0020]). The structural analysis shows that most of the Gβ1 residues near the binding interface are particularly sensitive to mutation (Paragraph [0020]).
This corresponds to the claim limitation of providing the solution space to the computing device for presentation as a topographical map in a user interface of the computing device, wherein the topographical map comprises a plurality of indications that each represent a level of activity for a sequence at a given point on the topographical map.
Regarding claim 10, Olafson teaches that this platform integrates: (1) protein sequence mutation data from the public and proprietary ProtaBank databases, (2) AI framework, (3) encoded sequence and structure features using protein domain knowledge for enhanced ML, and (4) a workflow that makes powerful AI drug development approaches highly accessible (Paragraph [0074]). In embodiments, the AI Platform predicts new protein sequence variants to be validated experimentally (Paragraph [0074]). In some embodiments, the AI Platform provides all the tools needed for AI drug development in a single platform (Paragraph [0074]). This corresponds to the claim limitation of causing the candidate drug compound to be manufactured.
Regarding claim 11, Olafson teaches in embodiments, the protein mutational database is described and implemented as ProtaBank. ProtaBank comprises three main functionalities: (1) data deposition tools (2) data storage, and (3) tools for data searching and analysis. An example design and workflow for ProtaBank is summarized in FIG. 1. As shown in FIG. 1, users can interact with ProtaBank through the web interface or the REST API. Data sent to the server is validated and curated before final submission into the database. In embodiments, ProtaBank comprises a central repository for storing and sharing the world's published protein sequence mutation data, in much the same way that GenBank is a central repository for nucleic acid sequences. In some embodiments, ProtaBank collects sequence mutation data for a wide range of properties that address all aspects of protein engineering in drug discovery and development including, for example, stability, expression, binding, activity, solubility, aggregation, and half-life, viscosity, immunogenicity, crystallizability, spectral properties, toxicity, bacterial resistance, and specificity. In some embodiments, ProtaBank employs standard formats that facilitate comparison of results across different datasets. Example, standard formats include standardized assay conditions for stability measurements including temperature, concentration, and pH. In some embodiments, ProtaBank comprises search and analysis tools and collection utilities. In additional embodiments, the search and analysis tools and collection utilities are used to create datasets for machine learning. In some embodiments, ProtaBank integrates a company's proprietary protein sequence mutation data with the ProtaBank public data while securely protecting the proprietary protein sequence mutation data. 
In some embodiments, ProtaBank provides an organization-wide centralized repository to track, persist, and maintain a company's valuable protein sequence mutation data for later use in AI and other traditional protein engineering projects. Immunogenicity and aggregation correspond to immunomodulatory activity and self-aggregation. This corresponds to the claim limitation of wherein the updated respective plurality of activities comprises immunomodulatory activity, receptor binding activity, self-aggregation, cell-penetrating activity, anti-viral activity, peptidergic activity, or some combination thereof.
Olafson does not teach determining one or more metrics of the machine learning model that performs the one or more trials, wherein the one or more metrics comprise memory usage, graphic processing unit temperature, power usage, processor usage, central processing unit temperature, or some combination thereof; and comparing the one or more metrics to one or more second metrics of a second machine learning model that performs the one or more trials of claim 12. However, these limitations are taught by Stojevic.
Regarding claim 12, Stojevic teaches the recited determining one or more metrics of the first machine learning model and the second machine learning model that perform the one or more trials, wherein the one or more metrics comprise memory usage, graphic processing unit temperature, power usage, processor usage, central processing unit temperature, or some combination thereof; and comparing the one or more metrics. Stojevic teaches "Given how memory intensive deep neural networks typically are, substantial effort has been made to reduce number of parameters these networks require without significantly reducing their accuracy." (Para. [0472]) and "When comparing the MERA network to the fully connected model, FC-1 we see a considerable drop in the number of parameters required with only a modest drop in the accuracy of the network" (Para. [0467]).
It would have been prima facie obvious to combine the teachings of Olafson and Stojevic. A person of ordinary skill in the art would have been motivated to modify the teachings of Olafson to determine and compare machine learning model metrics as taught by Stojevic in order to select the model that requires fewer parameters and thereby reduce memory usage. Furthermore, there would have been a reasonable expectation of success, since both Olafson and Stojevic teach methods that pertain to using machine learning for generating molecules.
Response to 35 USC §103
Applicant amended independent claims 1 and 19. It is noted that Applicant’s remarks are based on amended claims.
In Applicant's remarks regarding the claim rejections under 35 U.S.C. §103, filed 12/10/2025 (see pages 12-13), Applicant asserts that the cited references fail to teach or suggest at least the following portion of the amended independent claims:
…based on the comparison and selection of the unselected first machine
learning model or the second machine learning model, optimizing an
aspect of the first machine learning model or the second machine learning
model that is unselected by adjusting a weight, a bias, a level of hidden
nodes, or some combination thereof to improve the value received by the
unselected first machine learning model or the second machine learning
model during subsequent benchmark analysis.
Applicant states that Olafson does not teach or suggest selecting between two machine learning models based on comparing scores associated with two candidate drug compounds separately identified by the two machine learning models. Applicant also states that Olafson appears to iteratively generate a new predictive machine learning model until a desired optimal protein sequence mutation is found. Applicant further asserts that Olafson is silent regarding optimizing an aspect of a first machine learning model or a second machine learning model that is unselected by adjusting a weight, a bias, a level of hidden nodes, or some combination thereof. Applicant also states that Olafson does not discuss doing anything to an unselected machine learning model because Olafson appears to modify one machine learning model to iteratively generate a new predictive machine learning model.
In response, Applicant’s remarks are not persuasive. In Olafson, the values compared are those of the model’s previously generated version and of its newly generated version obtained through retraining. The recited “unselected machine learning model” corresponds to the machine learning model of Olafson that is iteratively modified and trained, which is itself a form of optimizing the model, as shown in Fig. 10 with training weights. Also, as indicated in the last Office action, dated 09/10/2025, Stojevic teaches optimizing a machine learning model by adjusting a weight. Stojevic teaches “The weights in the neural network, or constituent tensors in a tensor network, are optimised to minimise the difference between outputs 118 and inputs 114.” (para. [0147]) and “The weights in the network are optimised to minimise some desired cost function.” (para. [0025]).
Olafson also teaches generating multiple machine learning models for each type or subset of protein. Olafson teaches “In some embodiments, the AI Platform includes an AI training module. Machine learning models may be generated by the AI platform during an initial set-up process or after receiving instructions provided by a user. For example, the AI Platform could generate a machine learning model by training on a subset of proteins found in ProtaBank. Example of training during set up could include training on a subset of a specific type of protein, for example, antibodies, membrane bound proteins, metalloproteins, tyrosine kinases, proteases, globular proteins, and beta-barrel proteins, to generate an AI machine learning model for each type or subset of protein.” (Para. [0078]). Generating multiple machine learning models would allow for the models’ performance to be compared with one another.
Additionally, as discussed above in the 103 rejection section, the newly applied art, Jamali, also teaches the claim limitation of performing, using at least a first machine learning model and a second machine learning model to process the solution space, one or more trials to identify a candidate drug compound that represents a sequence having at least one level of activity that exceeds one or more threshold levels; determining a first value associated with the candidate drug compound identified by the first machine learning model and a second value associated with the candidate drug compound identified by the second machine learning model; comparing the first value and the second value, and based on the comparison, selecting the first machine learning model or the second machine learning model to identify the candidate drug compound. Jamali teaches “We performed a comparative analysis of machine learning algorithms to determine which classifier(s) predicted druggable proteins with appropriate performance in terms of their accuracy, sensitivity, and specificity.” (page 719, col. 1, para. 3) and Figure 1 (page 720). Figure 1 depicts the proposed approach for drug target prediction that includes feature generation.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
In accordance with MPEP § 2106, claims found to recite statutory subject matter (Step 1: YES) are then analyzed to determine if the claims recite any concepts that equate to an abstract idea, law of nature or natural phenomenon (Step 2A, Prong 1). In the instant application, the claims recite the following limitations that equate to an abstract idea:
Claims 1, 13 and 19 recite a method comprising: generating a design space for a peptide for an application, wherein the generating comprises: identifying a plurality of sequences for the peptide; and updating the plurality of sequences by determining, for each of the plurality of sequences, a respective plurality of activities pertaining to the application, wherein the updating produces an updated plurality of sequences each having an updated respective plurality of activities; generating, based on the updated plurality of sequences each having the updated respective plurality of activities, a solution space within the design space, wherein the solution space comprises a target subset of the updated plurality of sequences each having the updated respective plurality of activities; performing, using at least a first machine learning model and a second machine learning model to process the solution space, one or more trials to identify a candidate drug compound that represents a sequence having at least one level of activity that exceeds one or more threshold levels; determining a first value associated with the candidate drug compound identified by the first machine learning model and a second value associated with the candidate drug compound identified by the second machine learning model; comparing the first value and the second value and based on the comparison, selecting the first machine learning model or the second machine learning model to identify the candidate drug compound; based on the comparison and selection of the unselected first machine learning model or the second machine learning model, optimizing an aspect of the first machine learning model or the second machine learning model that is unselected by adjusting a weight, a bias, a level of hidden nodes, or some combination thereof to improve the value received by the unselected first machine learning model or the second machine learning model during subsequent benchmark analysis.
Claims 2, 14 and 20 recite wherein the generating the solution space within the design space is performed by a second machine learning model trained to measure, based on a query parameter, a level of the updated respective plurality of activities, wherein the query parameter comprises a sequence parameter.
Claims 3 and 15 recite generating, based on the query parameter and the updated plurality of sequences each having the updated respective plurality of activities, the solution space within the design space, wherein the solution space comprises the target subset of the plurality of sets of the updated plurality of sequences, and each sequence of the updated plurality of sequences in the target subset comprises the updated respective plurality of activities that are modified in view of the query parameter.
Claims 4 and 16 recite wherein the generating the solution space within the design space further comprises performing, using the query parameter and the updated plurality of sequences each having the updated respective plurality of activities: uniform manifold approximation and projection (UMAP) for dimension reduction to identify the target subset, linear decomposition, principal component analysis (PCA), kernel PCA, matrix factorization, generalized discriminant analysis, linear discriminant analysis, autoencoding, or some combination thereof.
Claim 12 recites determining one or more metrics of the machine learning model that performs the one or more trials, wherein the one or more metrics comprise memory usage, graphic processing unit temperature, power usage, processor usage, central processing unit temperature, or some combination thereof; and comparing the one or more metrics to one or more second metrics of a second machine learning model that performs the one or more trials.
The processes of claims 1-4, 12-16 and 19-20, which include generating, identifying, comparing, selecting, determining, adjusting weights and benchmark analysis, are acts of analyzing and evaluating information and then making a judgement, which could be practically performed in the human mind or with pen and paper. Therefore, under the broadest reasonable interpretation, the claims can be practically carried out in the human mind or with pen and paper as claimed, which falls under the "mental processes" grouping of abstract ideas. The "mental processes" abstract idea grouping is defined as concepts performed in the human mind, and examples of mental processes include observations, evaluations, judgments, and opinions (see MPEP § 2106.04(a)(2), subsection III). Although claim 1 recites transmitting the information to a computing device and claims 13 and 19 recite performing the method executed on a computer, there are no additional limitations to indicate that anything other than a generic computer is required. Requiring that the steps are carried out with a generic computer does not negate the mental nature of these steps and equates rather to merely using a computer as a tool to perform the mental process (see MPEP § 2106.04(a)(2), subsection III(C)). The processes of claims 1, 2, 13, 14, 19 and 20 include optimizing a machine learning model and a machine learning model trained to measure or to process, and claims 4 and 16 include algorithms; each of these requires carrying out a series of mathematical calculations. This falls under the “mathematical concepts” grouping of abstract ideas (see MPEP § 2106.04(a)(2), subsection I). As such, claims 1-20 recite an abstract idea (Step 2A, Prong 1: YES).
Claims found to recite a judicial exception under Step 2A, Prong 1 are then further analyzed to determine if the claims as a whole integrate the recited judicial exception into a practical application or not (Step 2A, Prong 2). This judicial exception is not integrated into a practical application because the claims do not recite any additional elements, steps or limitations that further apply, rely on, or use the judicial exception(s) in such a way that amounts to an integration into a practical application. For example, there are no limitations that further reflect an improvement to technology or apply or use the recited judicial exception in some other meaningful way. The instant claims recite the following additional elements (in addition to the judicial exceptions):
Claim 1 recites transmitting information describing the candidate drug compound to a computing device.
Claims 3 and 15 recite receiving the query parameter.
Claims 5 and 17 recite wherein the receiving the query parameter further comprises receiving the query parameter from a graphical element of a user interface presenting the design space.
Claims 6 and 18 recite receiving the query parameter and a desired threshold level of a target activity for the query parameter that the target subset is to exceed in order to be included in the solution space.
Claim 7 recites wherein the application comprises at least one of: anti-infective, anti-cancer, antimicrobial, anti-viral, anti-fungal, anti-inflammatory, anti-cholinergic, anti-dopaminergic, anti-serotonergic, anti-noradrenergic, immunomodulatory, neuromodulatory, a physiological effect caused by a signaling peptide, anti-prionic, functional biomaterials comprising adhesives, sealants, binders, chelates, diagnostic reporters, or some combination thereof, and structural biomaterials comprising biopolymers, encapsulation films, flocculants, desiccants, or some combination thereof.
Claim 8 recites receiving a selection of a sequence from the target subset; and providing information pertaining to the sequence, wherein the information comprises at least classes of: protein characteristics, protein-to-protein interactions, protein-ligand interactions, protein homology and phylogeny, sequence and structure motifs, chemical and physical stability, attributes expressed in solubility data, related structures, related drugs, chemical synthesis, biological synthesis, intellectual property data, clinical data, market data, pharmacological associations, systems biology, protein folding, or some combination thereof.
Claim 9 recites providing the solution space to the computing device for presentation as a topographical map in a user interface of the computing device, wherein the topographical map comprises a plurality of indications that each represent a level of activity for a sequence at a given point on the topographical map.
Claim 10 recites causing the candidate drug compound to be manufactured.
Claim 11 recites wherein the updated respective plurality of activities comprises immunomodulatory activity, receptor binding activity, self-aggregation, cell-penetrating activity, anti-viral activity, peptidergic activity, or some combination thereof.
Claim 13 recites a tangible, non-transitory computer-readable medium storing instructions that, when executed, cause a processing device to perform the recited steps.
Claim 19 recites a system comprising: a memory device storing instructions; and a processing device communicatively coupled to the memory device.
The components of claims 13 and 19 equate to generic computer storage with stored instructions that are executed to implement the abstract idea on a generic computer, or to the data gathering activity of obtaining a query parameter, solution space or sequences. These limitations equate to a generic computer environment. Claims 5-6 and 8-9 recite methods of receiving data that serves as input to the recited judicial exception in the claims. The limitations of claims 7-8 and 11 provide information on what the data represents and do not require that the particular data generating processes be performed. Therefore, these limitations do not change the character of the obtaining data step beyond mere data gathering activity. Claims 2, 4, 12, 14, 16 and 20 do not recite any elements in addition to the judicial exception. As such, as currently recited, the claims do not appear to recite an improvement to technology or to apply or use the recited judicial exception in some other meaningful way. Therefore, claims 1-20 are directed to an abstract idea.
(Step 2A, Prong 2: NO).
Claims found to be directed to a judicial exception are then further evaluated to determine if the claims recite an inventive concept that provides significantly more than the judicial exception itself (Step 2B). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims recite additional elements that equate to well-understood, routine and conventional activities, insignificant extra-solution activity or mere instructions to implement the abstract idea on a generic computer. The instant claims recite the following additional elements:
Claim 1 recites transmitting information describing the candidate drug compound to a computing device.
Claims 3 and 15 recite receiving the query parameter.
Claims 5 and 17 recite wherein the receiving the query parameter further comprises receiving the query parameter from a graphical element of a user interface presenting the design space.
Claims 6 and 18 recite receiving the query parameter and a desired threshold level of a target activity for the query parameter that the target subset is to exceed in order to be included in the solution space.
Claim 7 recites wherein the application comprises at least one of: anti-infective, anti-cancer, antimicrobial, anti-viral, anti-fungal, anti-inflammatory, anti-cholinergic, anti-dopaminergic, anti-serotonergic, anti-noradrenergic, immunomodulatory, neuromodulatory, a physiological effect caused by a signaling peptide, anti-prionic, functional biomaterials comprising adhesives, sealants, binders, chelates, diagnostic reporters, or some combination thereof, and structural biomaterials comprising biopolymers, encapsulation films, flocculants, desiccants, or some combination thereof.
Claim 8 recites receiving a selection of a sequence from the target subset; and providing information pertaining to the sequence, wherein the information comprises at least classes of: protein characteristics, protein-to-protein interactions, protein-ligand interactions, protein homology and phylogeny, sequence and structure motifs, chemical and physical stability, attributes expressed in solubility data, related structures, related drugs, chemical synthesis, biological synthesis, intellectual property data, clinical data, market data, pharmacological associations, systems biology, protein folding, or some combination thereof.
Claim 9 recites providing the solution space to the computing device for presentation as a topographical map in a user interface of the computing device, wherein the topographical map comprises a plurality of indications that each represent a level of activity for a sequence at a given point on the topographical map.
Claim 10 recites causing the candidate drug compound to be manufactured.
Claim 11 recites wherein the updated respective plurality of activities comprises immunomodulatory activity, receptor binding activity, self-aggregation, cell-penetrating activity, anti-viral activity, peptidergic activity, or some combination thereof.
Claim 13 recites a tangible, non-transitory computer-readable medium storing instructions that, when executed, cause a processing device to perform the recited steps.
Claim 19 recites a system comprising: a memory device storing instructions; and a processing device communicatively coupled to the memory device.
Limitations (as listed above) that equate to data gathering and outputting via generic computer components, such as receiving data at a computer or outputting data, amount to insignificant extra-solution activity as set forth by the courts in Mayo, 566 U.S. at 79, 101 USPQ2d at 1968 and OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015). Also, the additional elements include storing and retrieving information in memory. Storing and retrieving information in memory were identified by the courts as well-understood, routine and conventional in Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93. Also, the use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data), or simply adding a general purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation), does not integrate a judicial exception into a practical application or provide significantly more, as identified by the courts in Affinity Labs v. DirecTV, 838 F.3d 1253, 1262, 120 USPQ2d 1201, 1207 (Fed. Cir. 2016) (cellular telephone); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 613, 118 USPQ2d 1744, 1748 (Fed. Cir. 2016) (computer server and telephone unit). Overall, the additional elements, whether considered individually or as an ordered combination, do not amount to significantly more that would transform the claimed judicial exception into a patent-eligible application of the judicial exception. Claims 2, 4, 12, 14, 16 and 20 do not recite additional limitations (in addition to the recited judicial exceptions discussed in detail above). Therefore, the claims do not amount to significantly more than the judicial exception itself
(Step 2B: NO).
As such, claims 1-20 are not patent eligible.
Response to 35 USC §101
Applicant amended independent claims 1 and 19. It is noted that Applicant’s remarks are based on amended claims.
In Applicant's remarks on the claim rejections under 35 U.S.C. §101, filed 12/10/2025 (see pages 13-17), Applicant discusses the September 26, 2025 decision in which USPTO Director John Squires overturned a Patent Trial and Appeal Board decision that had ruled a machine learning claim ineligible under 35 U.S.C. §101. Applicant also discusses Ex Parte YiFang Liu, 2019-002937 (June 26, 2020), page 10. Applicant states that the instant claims relate to training machine learning models and are similar to the claims of the cited cases.
Applicant further states that the current claims provide at least a substantial technical benefit for machine-learning systems: the machine learning models are continuously updated to improve their score relative to other machine learning models by modifying their features (e.g., weights, activation functions, hidden layer numbers, loss, etc.), which improves how the machine learning models perform related to certain respective parameters. In such a manner, computing resources may be saved by only using the most highly rated machine learning model to generate candidate drug compounds.
Applicant refers to the instant specification, paragraphs [0250]-[0251]. Paragraph [0250] states that, at block 1206, the processing device may determine which creator module 151 of the set of creator modules performs better for each respective parameter. The scores of the parameters for each of the set of creator modules 151 may be presented on a display screen of a computing device. The best performing creator modules for each parameter may also be presented on the display screen. Paragraph [0251] states that, at block 1208, the processing device may tune the set of creator modules 151 to cause the set of creator modules 151 to receive higher scores for certain parameters during subsequent benchmark analysis. The tuning may optimize certain weights, activation functions, hidden layer number, loss, and the like of one or more generative modules included in the creator modules.
In response, Applicant’s arguments under Step 2A, Prong 2 regarding improvement have been fully considered and are not persuasive. It is understood that Applicant asserts that the claimed invention provides a technical benefit for machine learning systems because the machine learning models are continuously updated to improve their score relative to other machine learning models by modifying their features (e.g., weights, activation functions, hidden layer numbers, loss, etc.), which improves how the machine learning models perform related to certain respective parameters. Applicant also states that computing resources may be saved by only using the most highly rated machine learning model to generate candidate drug compounds. Applicant’s argument of improvement is a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art. The specification does not sufficiently disclose a need or technical problem, nor does it explain the details of an unconventional technical solution expressed in the claim. From the asserted improvement, it is not clear how the claimed invention improves over existing technology, and it is also not clear how one would gauge the improvement since there are no metrics for comparison between the claimed technology and previous technology. Paragraphs [0250]-[0251] of the specification cited by Applicant also do not provide sufficient details. Overall, one of ordinary skill in the art cannot gauge whether the asserted improvements are delivered by the claims because the specification does not provide sufficient details such that the improvement would be apparent, does not explain the details of an unconventional technical solution expressed in the claim, and does not identify technical improvements realized by the claim over the prior art.
As stated in MPEP 2106.05(a) and MPEP 2106.04(d), the disclosure must provide sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement. Furthermore, if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology. An indication that the claimed invention provides an improvement can include a discussion in the specification that identifies a technical problem and explains the details of an unconventional technical solution expressed in the claim, or identifies technical improvements realized by the claim over the prior art.
Regarding Ex Parte Desjardins, Appeal No. 2024-000567 (PTAB, September 26, 2025, Appeals Review Panel (ARP) Decision), the ARP determined that the specification identified improvements as to how the machine learning model itself operates, including training a machine learning model to learn new tasks while protecting knowledge about previous tasks to overcome the problem of “catastrophic forgetting” encountered in continual learning systems. The ARP also determined that the technological improvements are associated with reduced storage, reduced system complexity and streamlining, and preservation of performance attributes associated with earlier tasks during subsequent computational tasks, as disclosed in the specification.
Regarding the Ex Parte YiFang Liu decision, Appeal 2019-002937 (June 26, 2020), the PTAB determined that the claims provide clear improvements to the way computer-based machine learning models are built and operate. The PTAB also determined that the features recited in the claims improve machine learning systems by enabling the system to provide “better feature selection” and consequently better predictions than the prior systems, as discussed in the Specification.
For the instant claimed subject matter, as mentioned above, Applicant’s argument of improvement is a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art. It is not clear how the claimed invention improves over existing technology, nor is it clear how one would gauge the improvement since there are no metrics for comparison between the claimed technology and previous technology. Overall, the specification does not sufficiently disclose a need or technical problem, does not explain the details of an unconventional technical solution expressed in the claim, and does not identify technical improvements realized by the claim over the prior art (see MPEP 2106.05(a) and MPEP 2106.04(d)).
Conclusion
No claims are allowed.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KETTIP KRIANGCHAIVECH whose telephone number is (571)272-1735. The examiner can normally be reached 8:30am-5:00pm EDT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Larry D. Riggs, can be reached at (571) 270-3062. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/K.K./Examiner, Art Unit 1686
/LARRY D RIGGS II/Supervisory Patent Examiner, Art Unit 1686