DETAILED ACTION
Applicant’s response filed 01/21/2026 has been fully considered. The following rejections and/or objections are either reiterated or newly applied.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Status
Claims 23-26 are newly added by Applicant.
Claims 1-26 are currently pending and are herein under examination.
Claims 1-26 are rejected.
Claim 7 is objected to.
Priority
The instant application claims domestic benefit to U.S. Application No. 62/990,814 filed 03/17/2020. The claim to domestic benefit for claims 1-26 is acknowledged. As such, the effective filing date for claims 1-26 is 03/17/2020.
Information Disclosure Statement
The IDS filed 12/30/2025 follows the provisions of 37 CFR 1.97 and has been considered in full. A signed copy of the list of references cited from this IDS is included with this Office Action.
Withdrawn Rejections
35 USC 112(b)
The rejection of claims 6-8, 10-20 and 22 under 35 U.S.C. 112(b) is withdrawn in view of claim amendments.
35 USC 103
The rejection of claims 1-6, 9-16 and 19-22 under 35 U.S.C. 103 as being unpatentable over Geeleher et al. and Gautier et al. in view of Sakellaropoulos et al. is withdrawn in view of claim amendment. A new ground of rejection is set forth necessitated by claim amendment.
The rejection of claims 7 and 17 under 35 USC 103 for being unpatentable over Geeleher et al. and Gautier et al. in view of Sakellaropoulos et al. and in further view of Correia et al. and Sha et al. is withdrawn in view of claim amendment. A new ground of rejection is set forth necessitated by claim amendment.
The rejection of claims 8 and 18 under 35 USC 103 for being unpatentable over Geeleher et al. and Gautier et al. in view of Sakellaropoulos et al., Correia et al., and Sha et al. and in further view of Roszik et al. is withdrawn in view of claim amendment. A new ground of rejection is set forth necessitated by claim amendment.
Claim Objections
The objections to claims 4, 10, 16 and 22 are withdrawn in view of claim amendments.
Claim 7 is objected to because of the following informality: line 12 recites “non- responders” which should be “non-responders”. Appropriate correction is required.
Claim Rejections - 35 USC § 112
35 USC 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-26 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
This rejection is newly recited as necessitated by claim amendment.
Claim 1, lines 27-29, recites the phrase “the gene expression data … the gene expression data”, which renders the claim indefinite. It is unclear if the phrase refers to the baseline or disease agnostic gene expression or both. To overcome this rejection, clarify which data is being referenced.
Claim 1, line 36, recites “the tumor sample” which renders the claim indefinite. It is unclear which tumor sample is being referenced because line 35 recites “at least one tumor sample”. To overcome this rejection, clarify which tumor sample is being referenced.
Furthermore, claims 2-9, 21, and 23-24 are also rejected because they depend on claim 1, which is rejected, and because they do not resolve the issue of indefiniteness.
Claim 10, line 5, recites “the baseline gene expression data” which lacks antecedent basis. To overcome this rejection, provide antecedent basis.
Claim 10, line 29, recites “the tumor sample” which renders the claim indefinite. It is unclear which tumor sample is being referenced because line 28 recites “at least one tumor sample”. To overcome this rejection, clarify which tumor sample is being referenced.
Furthermore, claims 11-20, 22 and 25-26 are also rejected because they depend on claim 10, which is rejected, and because they do not resolve the issue of indefiniteness.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-26 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea and a natural phenomenon without significantly more.
Any newly recited portions herein are necessitated by claim amendment.
Step 1:
Step 1 asks whether the claims recite statutory subject matter. In the instant application, claims 1-9, 21 and 23-24 recite a method, and claims 10-20, 22 and 25-26 also recite a method. As such, these claims recite statutory subject matter (Step 1: YES).
Step 2A, Prong 1:
Claims that recite statutory subject matter are analyzed under Step 2A, Prong 1 to determine if they recite any concepts that equate to an abstract idea, law of nature or natural phenomenon. The instant claims recite the following limitations that equate to one or more categories of judicial exception:
Claim 1 recites “determining … disease agnostic gene expression data associated with a first plurality of genes, wherein the first plurality of genes are sequenced from a first plurality of tumor samples, wherein the disease agnostic gene expression data comprises an indication that each tumor sample of the first plurality of tumor samples is either a responder or a non-responder to a therapy; … wherein the first plurality of genes and the second plurality of genes comprise at least one gene in common, … wherein the disease agnostic gene expression data relates to a different disease than the baseline gene expression data; determining … based on the disease agnostic gene expression data and the baseline gene expression data, a plurality of features for a neural network of a disease specific predictive machine learning model, wherein determining the plurality of features comprises computationally deriving regulatory-based feature representations using transcription regulator (TR) information that reflects regulatory relationships between transcription regulators and the first plurality of genes and the second plurality of genes, wherein the plurality of features are generated by transforming the gene expression data into a feature space different from raw gene-level expression values to produce derived computational feature representations not explicitly present in the gene expression data and, are generated through automated processing …; training … based on only a first portion of the baseline gene expression data, the neural network of the disease specific predictive machine learning model according to the plurality of features; using … generate a therapy response prediction for at least one tumor sample based on gene expression data associated with the tumor sample;”
Claim 3 recites “wherein one or more of the first plurality of genes comprise one or more of an immune cell type and/or function gene set, a tumor microenvironment component and signaling gene set, or a cancer cell proliferation and DNA repair gene set.”
Claim 4 recites “wherein determining the disease agnostic gene expression data associated with the first plurality of genes comprises: determining, based on the baseline gene expression data, a list of the second plurality of genes; querying one or more databases comprising the first plurality of genes are sequenced from the first plurality of tumor samples with the list of the second plurality of genes to identify one or more gene data sets that comprise at least one gene in the list of the second plurality of genes; and generating, based on the one or more gene data sets, the disease agnostic gene expression data.”
Claim 5 recites “wherein the disease agnostic gene expression data is comprised of gene data from a plurality of different gene data sets associated with different conditions and generated from various data types and/or platforms.”
Claim 6 recites “determining which tumors associated with the second plurality of tumor samples are responders or non-responders to the therapy after each tumor associated with the second plurality of tumor samples has received the therapy; labeling the baseline gene expression levels for each tumor associated with the second plurality of tumor samples, as responder or non-responder; and generating, based on the labeled baseline gene expression levels, the baseline gene expression data.”
Claim 7 recites “wherein determining, based on the disease agnostic gene expression data and the baseline gene expression data, the plurality of features for the neural network of the disease specific predictive machine learning model comprises: determining, from the disease agnostic gene expression data, genes present in two or more of the plurality of different gene data sets as a first set of candidate genes; determining, from the baseline gene expression data, genes of the first set of candidate genes expressed at greater than or equal to 2 Transcripts Per Million (TPM) in at least half of the second plurality of tumor samples as a second set of candidate genes; and determining, from the baseline gene expression data, genes of the second set of candidate genes with an expression level that differs by at least a predetermined fold change between responders and non-responders as a third set of candidate genes, wherein the plurality of features comprises the third set of candidate genes.”
Claim 8 recites “wherein determining, based on the disease agnostic gene expression data and the baseline gene expression data, the plurality of features for the neural network of the disease specific predictive machine learning model comprises: determining, for the third set of candidate genes, a tumor mutational burden (TMB) value for each of the second plurality of tumor samples associated with the third set of candidate genes; and determining, based on the TMB values, a fourth set of candidate genes, wherein the plurality of features comprises the fourth set of candidate genes.”
Claim 9 recites “identifying, based on training the neural network of the disease specific predictive machine learning model according to the plurality of features, a gene signature indicative of a responder.”
Claim 10 recites “providing … to a neural network of a disease specific predictive machine learning model, the baseline gene expression data, wherein the neural network of the disease specific predictive machine learning model is trained on only a portion of baseline gene expression data using a plurality of features derived from disease agnostic gene expression data and the baseline gene expression data, wherein determining the plurality of features comprises computationally transforming the disease agnostic gene expression data and the baseline gene expression data into a reduced feature set by applying a multi-step feature selection pipeline, wherein the disease agnostic gene expression data is associated with a first plurality of genes and comprises an indication that each tumor sample of a first plurality of tumor samples is either a responder or a non-responder to the therapy, … wherein the first plurality of genes and the second plurality of genes comprise at least one gene in common; using … to generate a therapy response prediction for at least one tumor sample based on gene expression data associated with the tumor sample, wherein generating the therapy response prediction comprises computationally deriving regulatory-based feature representations using transcription regulator (TR) information that reflects regulatory relationships between transcription regulators and the first plurality of genes and the second plurality of genes, wherein the plurality of features are generated by transforming the gene expression data into a feature space different from raw gene-level expression values to produce derived computational feature representations not explicitly present in the gene expression data and are generated through automated processing …; and determining … that the subject is a candidate for the therapy.”
Claim 11 recites “recommending use of the therapy for treatment of the subject.”
Claim 12 recites “training the neural network of the disease specific predictive machine learning model.”
Claim 13 recites “wherein training the neural network of the disease specific predictive machine learning model comprises: determining disease agnostic gene expression data associated with the first plurality of genes; determining baseline gene expression data associated with the second plurality of genes, wherein the second plurality of genes are sequenced from the second plurality of tumor samples; determining, based on the disease agnostic gene expression data and the baseline gene expression data, the plurality of features for the neural network of the disease specific predictive machine learning model; training, based on only the first portion of the baseline gene expression data, the neural network of the disease specific predictive machine learning model according to the plurality of features; … ; testing, based on a second portion of the baseline gene expression data, the neural network of the disease specific predictive machine learning model;”
Claim 14 recites “determining one or more gene data sets that comprise at least one gene of the second plurality of genes; and generating, based on the one or more gene data sets, the disease agnostic gene expression data.”
Claim 15 recites “wherein the disease agnostic gene expression data is comprised of gene expression data from a plurality of different gene expression data sets.”
Claim 16 recites “determining baseline gene expression levels for the second plurality of genes in each tumor associated with the second plurality of tumor samples; determining which tumors associated with the second plurality of tumor samples are responders or non-responders to the therapy after each tumor associated with the second plurality of tumor samples has received the therapy; labeling the baseline gene expression levels for each tumor associated with the second plurality of tumor samples, as responder or non-responder; and generating, based on the labeled baseline gene expression levels, the baseline gene expression data.”
Claim 17 recites “determining, from the disease agnostic gene expression data, genes present in two or more of the plurality of different gene data sets as a first set of candidate genes; determining, from the baseline gene expression data, genes of the first set of candidate genes expressed at greater than or equal to 2 Transcripts Per Million (TPM) in at least half of the second plurality of tumor samples as a second set of candidate genes; and determining, from the baseline gene expression data, genes of the second set of candidate genes with an expression level that differs by at least a predetermined fold change between responders and non-responders as a third set of candidate genes, wherein the plurality of features comprises the third set of candidate genes.”
Claim 18 recites “determining, for the third set of candidate genes, a tumor mutational burden (TMB) value for each of the second plurality of tumors associated with the third set of candidate genes; and wherein the plurality of features comprises the fourth set of candidate genes.”
Claim 19 recites “wherein training, based on the first portion of the baseline gene expression data, the neural network of the disease specific predictive machine learning model according to the plurality of features results in determining a gene signature indicative of a responder.”
Claim 20 recites “wherein the therapy is a cancer treatment.”
Claims 21 and 22 recite “wherein the multi-step feature selection pipeline comprises: identifying genes that occur in two or more data sets of the disease agnostic gene expression data; filtering the identified genes based on expression thresholds in the baseline gene expression data; and selecting, from the filtered genes, genes with statistically significant expression differences between responders and non-responders to a therapy.”
Claims 23 and 25 recite “identifying a plurality of transcription regulators based on transcription regulator gene data derived from at least one of the disease agnostic gene expression data or the baseline gene expression data; generating, based on the plurality of transcription regulators and genes used to derive the plurality of features for the disease specific predictive model, a transcription regulator network; and determining, based on the transcription regulator network and the baseline gene expression data associated with the second plurality of genes, an enrichment score for each transcription regulator of the plurality of transcription regulators.”
Claims 24 and 26 recite “wherein determining the plurality of features further comprises selecting one or more transcription regulators for inclusion in the regulatory-based feature representations based on the enrichment scores.”
Limitations reciting a mental process.
The above cited limitations in claims 1, 4, 6-8, 10-11, 13-14, 16-18, and 21-26 are recited at such a high level of generality that they equate to a mental process because they are similar to the concepts of collecting information, analyzing it, and displaying certain results of the collection and analysis in Electric Power Group, LLC, v. Alstom (830 F.3d 1350, 119 USPQ2d 1739 (Fed. Cir. 2016)), which the courts have identified as concepts that can be practically performed in the human mind. The paragraphs below discuss the limitations in these claims that recite a mental process under their broadest reasonable interpretation (BRI).
Regarding claims 1 and 10, the BRI of determining disease agnostic gene expression data includes searching through databases of labeled gene expression data that contain responders and non-responders to a therapy and then compiling the gene expression data. The BRI of determining that a subject is a candidate for the therapy includes analyzing gene expression data to determine if a subject is a suitable candidate.
Regarding claims 1 and 10, computationally deriving regulatory-based feature representations includes a mental process. The BRI of a “regulatory-based feature representation” includes a gene regulatory network (GRN) that contains nodes and edges. Knowledge from databases such as KEGG and ChIP-Seq experiments (transcription regulator information) can be accessed to build a template network which is then fitted with genes that are present in a gene expression dataset (first/second plurality of genes). The GRN can be constructed using mathematical models such as Bayesian networks, which combine probability and graph theory to model properties of a GRN. The results of these GRNs are computational representations of regulatory networks (computational feature representations not explicitly present in the gene expression data). A human is capable of performing the mathematical operations of a Bayesian network or even a simple Boolean network.
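For illustration only, the simplicity of the operations in a Boolean network can be seen in the minimal Python sketch below, which performs synchronous updates of a three-node network. The node names (TR_A, TR_B, GENE_X) and the update rules are hypothetical and are not taken from the instant claims, the specification, or any cited reference.

```python
def step(state, rules):
    """Apply every update rule to the current state at once (synchronous update)."""
    return {node: rule(state) for node, rule in rules.items()}


# Hypothetical regulatory rules, assumed purely for illustration:
#   TR_A holds its value, TR_B is repressed by TR_A,
#   GENE_X is on only when TR_A is on and TR_B is off.
rules = {
    "TR_A": lambda s: s["TR_A"],
    "TR_B": lambda s: not s["TR_A"],
    "GENE_X": lambda s: s["TR_A"] and not s["TR_B"],
}

state0 = {"TR_A": True, "TR_B": True, "GENE_X": False}
state1 = step(state0, rules)  # TR_B switches off; GENE_X still off
state2 = step(state1, rules)  # GENE_X switches on
```

Each update is a lookup of at most two true/false values per node, which is the kind of evaluation that can be carried out with pen and paper.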
Regarding claims 4, 6-8, 11, 14 and 16-18, a human is capable of creating lists, querying databases, identifying genes, determining which subjects are responders or non-responders by analyzing gene expression data, labeling data, compiling gene expression data sets, calculating a tumor mutational burden, determining that gene expression is greater than a TPM threshold, determining a relative increase in expression between two variables, and determining candidate genes based on data analysis. A human can also make a therapeutic recommendation.
Regarding claim 13, the BRI of these limitations includes finding and selecting datasets with particular genes that have associated gene expression values and generating a plurality of features from the selected genes. The BRI of testing the neural network includes analyzing the output of the trained neural network to determine if its results are accurate.
Regarding claims 21-22, the BRI of these limitations includes identifying genes that are shared across data sets, evaluating gene expression data based on thresholds, and making selections based on the threshold.
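For illustration only, the three filtering steps discussed above for claims 7, 17, 21 and 22 can be sketched in Python as follows. The function name, gene names, and toy values are hypothetical. The 2 TPM and "at least half" thresholds follow the language of claims 7 and 17, while the fold-change threshold of 2.0 is an assumption, since those claims recite only "a predetermined fold change". Claims 21 and 22 instead recite statistically significant differences, for which the fold-change test here is only a stand-in.

```python
from statistics import mean


def select_candidate_genes(agnostic_sets, baseline_tpm, labels,
                           min_sets=2, min_tpm=2.0, min_fold=2.0):
    """Return a sorted list of genes that pass all three filters."""
    # Step 1: genes present in at least `min_sets` disease agnostic data sets.
    counts = {}
    for gene_set in agnostic_sets:
        for gene in gene_set:
            counts[gene] = counts.get(gene, 0) + 1
    first = {g for g, n in counts.items() if n >= min_sets}

    # Step 2: genes expressed at >= `min_tpm` in at least half of the samples.
    second = set()
    for gene in first:
        values = baseline_tpm.get(gene)
        if values and 2 * sum(v >= min_tpm for v in values) >= len(values):
            second.add(gene)

    # Step 3: mean expression differs by >= `min_fold` between the two groups.
    third = set()
    for gene in second:
        values = baseline_tpm[gene]
        resp = [v for v, r in zip(values, labels) if r]
        non = [v for v, r in zip(values, labels) if not r]
        if not resp or not non:
            continue  # fold change is undefined without both groups
        high, low = max(mean(resp), mean(non)), min(mean(resp), mean(non))
        if low > 0 and high / low >= min_fold:  # zero means are skipped here
            third.add(gene)
    return sorted(third)


# Toy data: three agnostic data sets, four baseline samples (two responders).
agnostic_sets = [{"GENE_A", "GENE_B", "GENE_C"},
                 {"GENE_A", "GENE_B", "GENE_D"},
                 {"GENE_A", "GENE_C"}]
baseline_tpm = {"GENE_A": [10.0, 12.0, 4.0, 5.0],
                "GENE_B": [3.0, 3.5, 3.0, 2.5],
                "GENE_C": [0.5, 3.0, 0.1, 0.2]}
labels = [True, True, False, False]
selected = select_candidate_genes(agnostic_sets, baseline_tpm, labels)
```

In this toy example GENE_D fails the first filter, GENE_C fails the expression filter, and GENE_B fails the fold-change filter, so only GENE_A is selected. Each step is a count or a comparison against a threshold.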
Regarding claims 23 and 25, identifying transcription regulators from gene expression data includes searching for gene names with associated expression data. Generating a transcription regulator network includes drawing directed nodes and edges with pen and paper.
Regarding claims 24 and 26, a human can make a selection based on analyzing data.
Limitations reciting a mathematical concept.
The above cited limitations in claims 1, 8-10, 12-13 and 18-19 equate to a mathematical concept because these limitations are similar to the concepts of organizing and manipulating information through mathematical correlations in Digitech Image Techs., LLC v. Electronics for Imaging, Inc. (758 F.3d 1344, 111 U.S.P.Q.2d 1717 (Fed. Cir. 2014)), which the courts have identified as mathematical concepts. The paragraphs below discuss the limitations in these claims that recite a mathematical concept under their broadest reasonable interpretation (BRI).
Regarding claims 1 and 10, the BRI of “deriving regulatory-based feature representations” includes mathematical functions and calculations. See Delgado et al. (“Delgado”; Artificial intelligence in medicine 95 (2019): 133-145; newly cited) for a review of computational models for GRN reconstruction and analysis (abstract). Biological knowledge such as pathways can be integrated into mathematical models for network reconstruction (sec. 4.4). The models include differential equations and Boolean networks (sec. 3). These models use gene expression to generate a feature space with edges and nodes, where the edges have weights (Figure 1). Feature selection is also a mathematical operation which uses clustering (sec. 4.1).
Regarding claims 1, 9-10, 12-13 and 19, the BRI of training a neural network on gene expression data includes using gradient descent and backpropagation. Regarding claim 13, the BRI of testing the neural network, which is trained, includes calculating metrics such as accuracy, precision or an F-score. Regarding claims 8 and 18, the BRI of determining a tumor mutational burden value includes performing calculations to derive a numerical value.
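For illustration only, the metrics named above for testing a trained model reduce to counting and arithmetic, as the minimal Python sketch below shows. The function name and the toy labels are hypothetical, and the F-score shown is the F1 score (the harmonic mean of precision and recall), which is one common choice and is assumed here.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = responder)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example: five held-out samples with known and predicted response labels.
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

For the toy labels, three of five predictions are correct (accuracy 0.6), and precision, recall and F1 each equal 2/3.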
Regarding claims 23 and 25, calculating an enrichment score includes calculations such as a running-sum statistic.
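For illustration only, one common form of a running-sum statistic is the unweighted, Kolmogorov-Smirnov-like score used in gene set enrichment analysis, sketched in Python below. The claims do not specify which enrichment calculation is used, so this choice, the function name, and the toy gene names are assumptions.

```python
def running_sum_enrichment(ranked_genes, target_genes):
    """Return the maximum deviation from zero of an unweighted running sum.

    The sum steps up at each ranked gene that is a target of the regulator
    and steps down at every other gene, ending at zero.
    """
    targets = set(target_genes)
    n_hits = sum(1 for g in ranked_genes if g in targets)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0  # score is undefined when all or none of the genes are targets

    up, down = 1.0 / n_hits, 1.0 / n_miss
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += up if gene in targets else -down
        if abs(running) > abs(best):
            best = running
    return best


# Toy example: six genes ranked by expression, two hypothetical regulators.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]
score_top = running_sum_enrichment(ranked, {"G1", "G2"})     # targets at the top
score_bottom = running_sum_enrichment(ranked, {"G5", "G6"})  # targets at the bottom
```

Targets concentrated at the top of the ranking give a score of 1.0 and targets concentrated at the bottom give -1.0. The computation is a sequence of additions and comparisons.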
Limitations reciting organizing human activity.
Regarding the above cited limitation in claim 11 of recommending use of the therapy for treatment of the subject, this limitation equates to organizing human activity because it is similar to managing personal behavior or relationships or interactions between people (See MPEP 2106.04(a)(2).II.C). The BRI of this limitation includes a healthcare professional recommending a therapy to a patient.
Limitations reciting a natural phenomenon.
Claims 1, 6, 9-10, 13, 16 and 19 equate to a natural phenomenon because they are similar to the concept of the natural relationship between a patient’s CYP2D6 metabolizer genotype and the risk that the patient will suffer QTc prolongation after administration of a medication called iloperidone, Vanda Pharmaceuticals Inc. v. West-Ward Pharmaceuticals, 887 F.3d 1117, 1135-36, 126 USPQ2d 1266, 1281 (Fed. Cir. 2018). These claims use expression of specific genes to determine whether a subject is a responder or non-responder to treatment, wherein altered expression of the subject’s genes to the treatment is a natural phenomenon.
Limitations included in the recited judicial exception.
Regarding the above cited limitations in claims 3, 5, 15 and 20, these limitations are included in the recited judicial exception in claims 1, 10 and 13-14 because they further limit the abstract ideas.
As such, claims 1-26 recite an abstract idea and a natural phenomenon (Step 2A, Prong 1: YES).
Additional Elements:
Once the judicial exception(s) has/have been identified, additional elements in the claims are analyzed under Step 2A, Prong 2 then Step 2B. The instant claims recite the following additional elements that are analyzed below under both Step 2A, Prong 2 and Step 2B:
Claim 1 recites “sequencing a second plurality of genes from a second plurality of tumor samples corresponding to one or more patients and generating, based on the sequencing, baseline gene expression data associated with the second plurality of genes; storing, by the computing device at a memory, the baseline gene expression data … wherein the baseline gene expression data comprises an indication that each tumor sample of the second plurality of tumor samples is either a responder or a non-responder to the therapy … by the computing device … by the computing device, the trained neural network to … and storing, by the computing device in the memory, the trained neural network of the disease specific predictive machine learning model.”
Claim 2 recites “wherein the second plurality of genes comprise one or more immune cell marker genes”.
Claim 6 recites “wherein sequencing the second plurality of genes from the second plurality of tumor samples to generate the baseline gene expression data associated with the second plurality of genes comprises: determining baseline gene expression levels for the second plurality of genes in each tumor associated with the second plurality of tumor samples;”
Claim 10 recites “sequencing a plurality of genes from a tumor of a subject and generating, based on the sequencing, gene expression data associated with the plurality of genes for the subject; storing, by a computing device at a memory, the baseline gene expression data; … by the computing device … wherein the baseline gene expression data is associated with a second plurality of genes and comprises an indication that each tumor sample of a second plurality of tumor samples is either a responder or a non-responder to the therapy … and wherein the disease agnostic gene expression data relates to a different disease than the baseline gene expression data; … by the computing device, the trained neural network … by the computing device … by the computing device, based on the neural network of the disease specific predictive machine learning model ...”
Claim 13 recites “outputting, based on the testing, the neural network of the disease specific predictive machine learning model.”
Step 2A, Prong 2:
Claims found to recite a judicial exception under Step 2A, Prong 1 are then further analyzed to determine if the claims as a whole integrate the recited judicial exception into a practical application or not (Step 2A, Prong 2). The judicial exception is not integrated into a practical application because the claims do not recite additional elements that reflect an improvement to a computer, technology, or technical field (MPEP § 2106.04(d)(1) and 2106.05(a)), require a particular treatment or prophylaxis for a disease or medical condition (MPEP § 2106.04(d)(2)), implement the recited judicial exception with a particular machine that is integral to the claim (MPEP § 2106.05(b)), effect a transformation or reduction of a particular article to a different state or thing (MPEP § 2106.05(c)), or provide some other meaningful limitation (MPEP § 2106.05(e)). Rather, the claims include limitations that equate to an equivalent of the words “apply it” and/or to instructions to implement an abstract idea on a computer (MPEP § 2106.05(f)), insignificant extra-solution activity (MPEP § 2106.05(g)), and field of use limitations (MPEP § 2106.05(h)). The paragraphs below discuss the additional elements recited above in the instant claims.
Regarding the limitations in claims 1 and 10 of (i) storing, by the computing device at memory, baseline gene expression data, (ii) storing, by the computing device in memory, the trained neural network, and (iii) performing steps “by the computing device”, there are no limitations requiring the computing device to be anything other than a generic computer. MPEP 2106.05(f)(2) recites “Use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data) or simply adding a general-purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation) does not integrate a judicial exception into a practical application or provide significantly more.” Therefore, these limitations equate to mere instructions to implement an abstract idea on a generic computer, which the courts have established does not render an abstract idea eligible in Alice Corp., 573 U.S. at 223, 110 USPQ2d at 1983.
Regarding the above cited limitation in claim 2, this limitation also equates to storing data in memory because it further limits the second plurality of genes that are stored in memory as part of the baseline gene expression.
Regarding the limitations in claims 1, 6, 10 and 13 of sequencing a second plurality of genes to generate baseline gene expression data, these limitations equate to insignificant extra-solution activity of necessary data gathering because they acquire gene expression data used in the judicial exception of determining a plurality of features.
Regarding the limitation in claim 13 of outputting the neural network, this limitation equates to insignificant extra-solution activity of necessary data outputting.
Regarding claims 1 and 10 of “the trained neural network” and “based on the neural network of the disease specific predictive machine learning model”, the BRI of these limitations includes them being mere instructions to implement an abstract idea on a generic computer. MPEP 2106.05(f) provides the following considerations for determining whether a claim simply recites a judicial exception with the words “apply it” (or an equivalent), such as mere instructions to implement an abstract idea on a computer: (1) whether the claim recites only the idea of a solution or outcome i.e., the claim fails to recite details of how a solution to a problem is accomplished; (2) whether the claim invokes computers or other machinery merely as a tool to perform an existing process; and (3) the particularity or generality of the application of the judicial exception. The two following paragraphs provide analysis under these considerations.
The neural network (NN) and trained NN perform the abstract ideas of “generate a therapy response prediction based on gene expression data” and “determining the subject is a candidate for the therapy”. The NN and trained NN are used to generally apply the abstract ideas without placing any limits on how they function. Rather, these limitations only recite the outcome of generating a therapy response prediction and determining a subject as a candidate. These limitations do not include any details about how the generating/determining is accomplished. See MPEP 2106.05(f).
These limitations also merely indicate a field of use or technological environment in which the judicial exception is performed. Although these limitations limit the identified judicial exceptions, these limitations merely confine the use of the abstract idea to a particular technological environment of neural networks and thus fail to add an inventive concept to the claims. See MPEP 2106.05(h).
As such, claims 1-26 are directed to an abstract idea and a natural phenomenon (Step 2A, Prong 2: NO).
Step 2B:
Claims found to be directed to a judicial exception are then further evaluated to determine if the claims recite an inventive concept that provides significantly more than the judicial exception itself (Step 2B). These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because these claims recite additional elements that equate to instructions to apply the recited exception in a generic way and/or in a generic computing environment (MPEP § 2106.05(f)) and to well-understood, routine and conventional (WURC) limitations (MPEP § 2106.05(d)). The paragraphs below discuss the additional elements recited above in the instant claims.
Regarding the above cited limitation in claims 1 and 10 of performing the abstract ideas “by the computing device”, there are no limitations that the computing device requires anything other than a generic computer. Therefore, these limitations equate to mere instructions to implement an abstract idea on a generic computer, which the courts have established does not render an abstract idea eligible in Intellectual Ventures I LLC v. Capital One Bank (USA), 792 F.3d 1363, 1367, 115 USPQ2d 1636, 1639 (Fed. Cir. 2015).
Regarding the above cited limitations in claims 1 and 10 of storing, by the computing device at memory, baseline gene expression data and the trained neural network, these limitations equate to storing information in memory, which the courts have established as a WURC function of a generic computer in Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015).
Regarding the above cited limitation in claim 2, this limitation also equates to storing data in memory because it further limits the second plurality of genes that are stored in memory as part of the baseline gene expression.
Regarding the above cited limitation in claim 13 of outputting the neural network, the BRI of this limitation includes transmitting the neural network across a network using a computer, which equates to receiving/transmitting data over a network, which the courts have established as a WURC limitation of a generic computer in buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014).
Regarding the limitations in claims 1 and 10 of “the trained neural network” and “based on the neural network of the disease specific predictive machine learning model”, these limitations equate to instructions to “apply” the abstract ideas, which cannot provide an inventive concept. See MPEP 2106.05(f). See above in section Step 2A, Prong 2 for further discussion.
Regarding the limitations in claims 1, 6 and 10 of sequencing a plurality of tumor samples from patients to generate gene expression data and using generic computer components/functions, when viewed in combination these limitations are WURC as taught by the instant specification, George et al. (“George”; Immunity 46, no. 2 (2017): 197-204; newly cited), Lesurf et al. (“Lesurf”; Annals of Oncology 28, no. 5 (2017): 1070-1077; newly cited), and Li et al. (“Li”; Journal of Cellular and Molecular Medicine 24, no. 8 (11 March 2020): 4726-4735; newly cited).
The instant specification teaches using WURC techniques such as RNA-seq and other next generation sequencing techniques [46].
George analyzed a treatment-naive patient by analyzing the primary tumor and treatment-resistant metastatic tumor (summary), and RNA-seq was performed on a pre-treatment tumor of the patient (pg. 198, col. 2, last para.). George also accessed RNA-seq data from untreated primary sarcomas from TCGA (pg. 200, col. 1, last para.).
Lesurf performed RNA-seq on pre-treatment tumor biopsies and predicts patients who will not respond to treatment (abstract) (pg. 1071, col. 2, para. 4). Table 1 shows that the patients are identified as responders or non-responders, i.e., pathologic complete response and residual disease, respectively.
Li uses RNA-seq data from TCGA of 516 pre-treatment primary lower grade gliomas from patients (sec. 2.1 and 3.4). Patient stratification was validated to predict treatment response, indicating that patients were labeled as responders and non-responders (Figure 4).
When these additional elements are considered individually and in combination, they do not provide an inventive concept because they all equate to WURC functions/components of a generic computer, mere instructions to apply an abstract idea on a generic computer, and to a WURC method of sequencing baseline gene expression of tumor samples in combination with generic computer functions/components as taught above by the instant specification, George, Lesurf, and Li. Therefore, these additional elements do not transform the claimed judicial exception into a patent-eligible application of the judicial exception and do not amount to significantly more than the judicial exception itself (Step 2B: No).
As such, claims 1-26 are not patent eligible.
Response to Arguments under 35 USC 101
Applicant's arguments filed 01/21/2026 have been fully considered but they are not persuasive.
Applicant argues that the limitation in claims 1 and 10 of sequencing tumor samples to generate baseline gene expression data does not recite a judicial exception (sec. i on pg. 15-16 of Applicant’s remarks). This is persuasive in part because claim 1 requires sequencing the baseline gene expression, but claim 10 does not recite sequencing baseline gene expression. Rather claim 10 recites “sequencing a plurality of genes from a tumor of a subject and generating, based on the sequencing, gene expression data associated with the plurality of genes for the subject”. Nonetheless, this limitation in claim 10 recites an additional element.
Applicant appears to argue that the following limitations in claim 1 do not recite a mental process: “computationally deriving regulatory-based feature representations using transcription regulator (TR) information that reflects regulatory relationships between transcription regulators and the first plurality of genes and the second plurality of genes, wherein the plurality of features are generated by transforming the gene expression data into a feature space different from raw gene-level expression values to produce derived computational feature representations not explicitly present in the gene expression data and, are generated through automated processing by the computing device” (sec. ii, pg. 16-17 of Applicant’s remarks). Applicant’s arguments are not persuasive for the following reasons:
The BRI of a “regulatory-based feature representation” includes a gene regulatory network (GRN) that contains nodes and edges. Knowledge from databases such as KEGG and ChIP-Seq experiments (transcription regulator information) can be accessed to build a template network which is then fitted with genes that are present in a gene expression dataset (first/second plurality of genes). The GRN can be constructed using mathematical models such as Bayesian networks, which combine probability and graph theory to model properties of a GRN. The results of these GRNs are computational representations of regulatory networks (computational feature representations not explicitly present in the gene expression data).
A human is capable of performing the mathematical operations of a Bayesian network or even a simple Boolean network. Even if a human were not capable of such calculations, these limitations would still equate to a mathematical concept because their BRI includes generating a GRN using mathematical functions that require calculations. See Delgado et al. (Artificial intelligence in medicine 95 (2019): 133-145; newly cited) for a review on computational models for GRN reconstruction and analysis that use feature selection methods, such as filtering networks by gene ontology, KEGG, RNA-seq and ChIP-seq experiments, for dimensionality reduction (abstract) (sec. 2.4) (sec. 4) (sec. 4.1, pg. 139, col. 2, para. 4) (sec. 4.4).
The recitation of “are generated through automated processing by the computing device” is an additional element that equates to mere instructions to apply the judicial exception (i.e., the mathematical models) on a computer.
Applicant argues that training a neural network is not a mental process because there is no recitation of an algorithm such as gradient descent, and that training is rather based on math but does not recite math. Applicant references Subject Matter Eligibility Example 39 (sec. iii, pg. 17-19 of Applicant’s remarks). Applicant’s arguments are not persuasive for the following reasons:
Regarding Example 39, the subject matter eligibility examples are “a teaching tool to assist Office personnel and the public in understanding how the Office applies its eligibility guidance in certain fact-specific situations” (SME front page on USPTO website). In the instant case, the instant claims differ from Example 39 in that Example 39 did not recite a judicial exception and training was performed on images. Moreover, MPEP 2106.04(a)(2)(B-C) discusses how textual replacements can be used for a particular equation and how a claim does not have to recite the word “calculating” for there to be a calculation. In the instant case, “training” is a textual replacement for using a mathematical equation such as gradient descent to train a neural network. Training a neural network on numerical values such as gene expression or correlation coefficients encompasses math. This is similar to claim 2 step (c) of Subject Matter Eligibility Example 47 which identified training a neural network as math.
It is noted that the step of “using the trained neural network to generate a therapy response prediction” has been identified as an additional element that equates to generically applying the abstract idea. See above in section Step 2A, Prong 2. Also, the limitations for determining the plurality of features recite a judicial exception.
Applicant argues an improvement reflected in the claims by transforming gene expression data into regulatory based feature space, training a neural network using those features, and using the trained network to generate a therapy response (pg. 21, para. 1-2 of Applicant’s remarks). Applicant’s argument is not persuasive for the following reasons:
These limitations recite a judicial exception. MPEP 2106.05(a) recites “It is important to note, the judicial exception alone cannot provide the improvement.”
It is also noted that using the trained neural network to generate a therapy response has been identified as mere instructions to implement an abstract idea on a generic computer, which does not provide a practical application. See above in section Step 2A, Prong 2.
Applicant argues an improvement in accuracy, robustness, and predictive capability of machine learning systems used to analyze biological data, referencing Ex Parte Desjardins (pg. 21, last para. of Applicant’s remarks). Applicant’s argument is not persuasive for the following reasons:
Ex Parte Desjardins discussed how the actual training process of machine learning models was improved. This is distinct from the instant claims, which provide specific data that improves machine learning predictions. Rather, the instant claims are more akin to Ex Parte Wang, Appeal No. 2024-001155. In Ex Parte Wang, the PTAB decided “The question on this record, therefore, is whether Appellant’s claim 1 does more than apply established methods of machine learning to a new data environment––it does not” (pg. 31, para. 2). Instant claim 1 provides new data to a generic neural network that is trained in a generic way, as opposed to Ex Parte Desjardins where the actual training method itself was improved to improve machine learning. As of record, Applicant has not provided evidence for how the claims improve the actual training process of a machine learning model.
Applicant argues that claim 1 contains limitations that are not WURC (pg. 22, sec. ii of Applicant’s remarks). Applicant’s argument is not persuasive because it does not explicitly state which additional elements are not WURC and because there is no mention of long-read sequencing or genotyping/identification of disease-causing variants.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-6, 9-16 and 19-26 are rejected under 35 U.S.C. 103 as being unpatentable over Geeleher et al. (“Geeleher”; Genome biology, 15, 1-12; previously cited on PTO892 mailed 04/25/2025) in view of Cree et al. (“Cree”; Current opinion in pharmacology 10, no. 4 (2010): 375-379; newly cited) and Kang et al. (“Kang”; BMC bioinformatics 18, no. 1 (2017): 565; newly cited).
This rejection is newly recited and is necessitated by claim amendment.
The bold and italicized text below are the limitations of the instant claims, and the italicized text serves to map the prior art onto the instant claims.
Claim 1:
determining, by a computing device, disease agnostic gene expression data associated with a first plurality of genes, wherein the first plurality of genes are sequenced from a first plurality of tumor samples, wherein the disease agnostic gene expression data comprises an indication that each tumor sample of the first plurality of tumor samples is either a responder or a non-responder to a therapy;
Geeleher discloses a computer implemented method for prediction of chemotherapeutic response in patients using only before-treatment baseline tumor gene expression data (abstract; pg. 9, col. 2, para. 2).
The broadest reasonable interpretation of “disease agnostic gene expression data” includes using gene expression data from different types of cancer. Geeleher teaches “we identified clinical trial data-sets that had assessed tumor gene expression before drug treatment (using expression microarrays) and had subsequently measured a clear drug response phenotype” (pg. 2, col. 1, para. 2). Geeleher teaches “the four trials were for three different types of cancers treated with either cytotoxic or targeted agents” (pg. 2, col. 1, para. 2). The trials labeled patients as responders or non-responders (pg. 5, col. 1, para. 2).
The following limitation is being interpreted as a product-by-process limitation: “wherein the first plurality of genes are sequenced from a first plurality of tumor samples”. See MPEP 2113.I. This limitation defines a process previously performed to acquire the product of “disease agnostic gene expression data associated with a first plurality of genes”. As such, any method used to acquire the same product will read on this limitation. Thus, the microarray used to acquire the disease clinical trial datasets reads on this limitation.
sequencing a second plurality of genes from a second plurality of tumor samples corresponding to one or more patients and generating, based on the sequencing, baseline gene expression data associated with the second plurality of genes;
Geeleher uses microarray on cell lines to measure gene expression. However, Geeleher does not sequence tumors to acquire gene expression.
It would have been prima facie obvious to have substituted RNA-seq for the microarray of Geeleher because Geeleher teaches their method will be enhanced by RNA-seq (pg. 8, col. 2, para. 1). There would have been a reasonable expectation of success for the substitution because Geeleher states that RNA-seq provides better estimates of gene expression than microarrays (pg. 8, col. 2, para. 1).
Cree evaluates efficacy of anti-cancer agents in cell lines versus human primary tumor tissue, and teaches that primary tumor cell explants obtained directly from human cancers improve accuracy of results during drug development (abstract) (pg. 375, col. 2, para. 2).
It would have been prima facie obvious to have substituted the primary tumor cell explants taught by Cree for the cell lines in Geeleher. There would have been a reasonable expectation of success for the substitution because Cree teaches that primary tumor samples show chemosensitivity that correlates well with clinical outcomes and have been used to develop new anti-cancer drugs and combinations (pg. 375, col. 2, para. 2).
storing, by the computing device at a memory, the baseline gene expression data,
Geeleher teaches the cell line data was downloaded from ArrayExpress (pg. 9, col. 2, para. 3) (pg. 1, col. 2, last para.).
wherein the first plurality of genes and the second plurality of genes comprise at least one gene in common, wherein the baseline gene expression data comprises an indication that each tumor sample of the second plurality of tumor samples is either a responder or a non-responder to the therapy, wherein the disease agnostic gene expression data relates to a different disease than the baseline gene expression data;
Geeleher shows in Figure 1 that gene expression of cell lines and clinical trials is combined and a subset is acquired based on common genes between the two data sets (at least one gene in common) (pg. 10, col. 1, para. 1). The CGP cell lines were characterized as sensitive and resistant (responder or non-responder to therapy) (pg. 10, col. 2, para. 2). Geeleher states it’s advantageous to include cancer types that are not the same as the cancer type of the clinical trial data (relates to a different disease) (pg. 2, col. 2, last para. – pg. 3, col. 1, para. 1).
determining, by the computing device, based on the disease agnostic gene expression data and the baseline gene expression data, a plurality of features for a neural network of a disease specific predictive machine learning model,
Geeleher teaches that the top 1,000 most differentially expressed genes were selected as features and were fitted into a ridge regression model (pg. 10, col. 2, para. 2), wherein the full set of genes was initially filtered by selecting genes that were shared between the CGP cell lines and clinical trial data sets (plurality of features based on baseline and disease agnostic gene expression) (pg. 10, col. 1, para. 1; Figure 1).
wherein determining the plurality of features comprises computationally deriving regulatory-based feature representations using transcription regulator (TR) information that reflects regulatory relationships between transcription regulators and the first plurality of genes and the second plurality of genes, wherein the plurality of features are generated by transforming the gene expression data into a feature space different from raw gene-level expression values to produce derived computational feature representations not explicitly present in the gene expression data and, are generated through automated processing by the computing device;
Geeleher selects features based on gene expression data from cell lines and primary tumors (plurality of features using disease agnostic first plurality of genes) (Figure 1). The feature selection process is performed by a computer (automated processing by the computing device).
However, Geeleher does not derive a regulatory-based feature representation using transcription regulator information or transform the gene expression data into a feature space different from raw gene-level expression values.
Kang predicts treatment response using a neural network based on baseline gene expression and prior knowledge on gene-regulatory interactions that utilize upstream regulatory mechanisms (abstract). Kang converts a gene regulatory network (transcription regulator information that reflects regulatory relationships between transcription regulators) into a neural network (Figure 1). Figure 2 shows the NN gene regulatory network where edges indicate a regulatory interaction between a source node (regulator) and a target node (gene), wherein a regulator can be a protein, mRNA or compound (regulatory-based feature representations) (pg. 2, col. 2, last para.). The connections are edge weights (transforming gene expression data into a feature space different from raw gene-level expression values) (pg. 3, sec. Regularization).
It would have been prima facie obvious to have modified the feature selection method of Geeleher that homogenizes gene expression datasets of the cell lines and primary tumors by further performing feature selection by using the NN gene regulatory network of Kang. The motivation for doing so is taught by Kang who teaches that leveraging prior biological knowledge into biomarker discovery is a promising alternative to data-driven methods and that integrating prior biological knowledge into classification significantly improves robustness and generalizability of predictions to independent datasets (abstract) (pg. 2, col. 1, para. 2). There would have been a reasonable expectation of success to predict treatment response using genes selected as features from the NN gene regulatory network of Kang because stratification of patient response using risk factors has been successful in oncology (Kang at pg. 1, col. 2, para. 1). Kang also teaches that their model identifies reproducible and predictive signatures of response (pg. 9, col. 2, para. 1) (Figure 3).
training, by the computing device, based on only a first portion of the baseline gene expression data, the neural network of the disease specific predictive machine learning model according to the plurality of features; and using, by the computing device, the trained neural network to generate a therapy response prediction for at least one tumor sample based on gene expression data associated with the tumor sample;
Geeleher teaches “A ridge regression model is fitted for baseline gene expression levels in the cell lines against the in vitro drug IC50 estimates” (Figure 1, caption for step 3). Geeleher teaches “we fitted logistic ridge regression models for the 15 most sensitive (which had reliably measured IC50 values) versus the 55 most resistant CGP cell lines (see Materials and methods)” (a portion of the baseline gene expression data) (pg. 6, col. 2, para. 1). The trained ridge regression is then used to predict drug sensitivity using primary tumor samples (Figure 1, step 4) (abstract).
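For illustration only, the ridge regression fit described for step 3 of Figure 1 can be sketched as follows. This is a minimal sketch and not Geeleher's code: it assumes a linear (not logistic) ridge regression without an intercept, solved in closed form, and the names ridge_fit, ridge_predict, and lam are hypothetical. Rows of X stand for cell lines, columns for gene features, and y for drug sensitivity values such as IC50 estimates.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination.

    Assumption: linear ridge regression with no intercept term.
    """
    p = len(X[0])
    # Build the regularized normal equations.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * p
    for i in reversed(range(p)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, p))) / A[i][i]
    return w


def ridge_predict(X, w):
    """Predict one value per row of X using the fitted weights w."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]
```

In this sketch, a larger lam shrinks the fitted weights toward zero, which is the regularizing behavior that distinguishes ridge regression from ordinary least squares.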
However, Geeleher does not train a neural network or use a trained neural network to predict therapy response.
Kang trains a NN on baseline gene expression and uses the trained NN to predict treatment response on test data (pg. 2, col. 2, para. 2-4) (results section of abstract) (Figure 3).
It would have been prima facie obvious to have modified the “base” method of a ridge regression in Geeleher for predicting treatment response based on baseline gene expression by applying the known “improved” technique of a NN in Kang. The NN is an improvement because Kang teaches the NN avoids limitations such as overfitting and undesirable effects in regression tasks when there is overlap between groups of covariates (background), and the NN outperformed other models such as Lasso and SVM (Figure 3). The result of substituting the NN for the ridge regression would have produced predictable results because Kang uses baseline gene expression to predict response to therapy (abstract) and is applicable to oncology (pg. 1, col. 2), wherein Geeleher uses baseline gene expression from cancer samples to predict patient response.
Moreover, Geeleher teaches motivation for using a different machine learning algorithm that may be better suited for RNA-seq data (pg. 8, col. 2, para. 1). Thus, the NN of Kang would be useful.
storing, by the computing device in the memory, the trained neural network of the disease specific predictive machine learning model.
Geeleher stored their method on a website in a Sweave format, which includes the ridge regression (pg. 9, col. 2, para. 2) (pg. 10, col. 2, last para.). However, Geeleher does not train a neural network. Kang trains the neural network (abstract). The combination would result in saving the NN in memory rather than the ridge regression.
Claims 2-3:
Geeleher acquires gene expression for clinical trials (first plurality) and cell lines (second plurality) (Figure 1). However, Geeleher does not teach that the gene expression data contains specific types of genes. Kang shows in Figure 7 gene set enrichment for the nodes in the NN that contain inflammatory response (immune cell markers) (immune function gene set).
It would have been prima facie obvious to have modified the method of Geeleher by using the gene expression data to determine enrichment pathways as taught by Kang. Motivation is taught by Kang who teaches that enrichment analysis increases biological interpretation of results (pg. 8, sec. Biological interpretation of results). There would have been a reasonable expectation of success to perform gene set enrichment on the gene expression data of Geeleher because Kang uses TMOD in R which uses gene expression data as input (pg. 8, col. 2, para. 2).
Claim 4:
Geeleher acquires clinical trial gene expression data sets (disease agnostic gene expression data) and CGP cancer cell lines (baseline gene expression). The datasets were combined and a subset of the common genes between the two datasets was selected (querying a database with a list of the second plurality of genes), wherein the subset of common genes was used to generate features from the CGP cancer cell lines that were used to train the prediction model (Figure 8).
Claims 5 and 15:
Geeleher discloses the clinical trial datasets were collected from four different clinical trials that cover three different cancer types (pg. 2, col. 1, para. 2). The limitation of “generated from various data types and/or platforms” is being interpreted as a product-by-process limitation. Thus, the clinical trial datasets of Geeleher read on this limitation.
Claim 6:
Geeleher discloses cell lines were characterized as sensitive and resistant samples (pg. 10, col. 2, para. 2). Figure 1 shows that the cell lines have associated IC50 data after treatment with drugs such as docetaxel (pg. 2, col. 2, para. 3). Geeleher uses baseline (i.e. before drug treatment) gene expression microarray data acquired from cancer cell lines from the Cancer Genome Project (CGP) (pg. 1, col. 2, last para.). Geeleher also teaches using RNA-seq to acquire gene expression values of samples (pg. 8, col. 2, para. 1).
However, Geeleher teaches cell lines, not tumors, for the second plurality of genes. As discussed above regarding claim 1, Cree uses primary tumors rather than cell lines because it improves accuracy of results during drug development (abstract) (pg. 375, col. 2, para. 2).
Claims 9 and 19:
Geeleher uses the 1,000 genes that were most differentially expressed in the CGP cell lines as features to train the ridge regression to predict responders and non-responders (pg. 10, col. 2, para. 2) (Figure 1). However, Geeleher does not train a neural network. Kang trains a NN and discovers gene signatures through enrichment analysis (Figure 7) (pg. 9, col. 2, para. 1).
Claim 10:
The only difference between claims 1 and 10 is the recitation in claim 10 of “determining, by the computing device, based on the neural network of the disease specific predictive machine learning model, that the subject is a candidate for the therapy.” Geeleher predicts in vivo drug sensitivity of clinical trial cohorts (Figure 1; Figure 3). Kang uses a NN to predict responders in clinical cohorts (abstract) (methods). Geeleher and Kang do not explicitly recite determining that a subject is a candidate for the therapy. However, this limitation would have been obvious over the teachings of Geeleher and Kang because one of ordinary skill in the art would have recognized that predicting a patient to be sensitive to a cancer drug would qualify the patient as a candidate for the therapy.
Claim 11:
Geeleher predicts cancer patient response to drugs (abstract), but does not recommend a drug to a patient based on the prediction. However, this limitation would have been obvious over the teachings of Geeleher because one of ordinary skill in the art would have recognized that a patient predicted to be sensitive to a treatment should be recommended that treatment.
Claim 12:
Geeleher trains a ridge regression, not a neural network. Kang trains a NN for clinical outcome prediction based on baseline gene expression (abstract) (pg. 2, col. 2, para. 2).
Claim 13:
Geeleher discloses that the clinical trial data is comprised of gene expression data (disease agnostic expression data associated with the first plurality of genes) (pg. 2, col. 1, para. 2). CGP cancer cell lines are associated with gene expression data acquired from microarray expression experiments (baseline gene expression data associated with the second plurality of genes of tumor samples) (pg. 1, col. 2, last para.). The clinical trial and CGP cancer cell line datasets were combined and only a subset of the total genes is selected for features based on common genes shared between the datasets (determining based on the disease agnostic and the baseline gene expression data, a plurality of features for the disease specific predictive model) (Figure 1; pg. 10, col. 1, para. 1; pg. 10, col. 2, para. 2). The logistic ridge regression model was trained on the 1,000 genes used as features from the CGP cell lines (training, based on a first portion of the baseline gene expression data, the disease specific prediction model according to the plurality of features) (Figure 1; pg. 10, col. 2, para. 2). Leave-one-out cross validation was performed on the CGP cancer cell lines (testing, based on a second portion of the baseline gene expression data, the disease specific predictive model) (pg. 10, col. 1, para. 3), which resulted in the outputted trained predictive model (outputting, based on the testing, the disease specific predictive model) (Figure 1). However, Geeleher does not teach training or testing a neural network. Kang teaches training and testing a NN and predicts responders and non-responders (pg. 6, col. 1, para. 2) (abstract). The prima facie case for obviousness is recited above in claim 1.
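For illustration only, the train/test split implied by leave-one-out cross validation can be sketched as follows. This is a minimal sketch under stated assumptions and not Geeleher's implementation: it uses a single-feature linear ridge model with no intercept for brevity, and the names fit_1d_ridge, leave_one_out_predictions, and lam are hypothetical. Each sample is held out once (the tested portion) while the model is fitted on the remaining samples (the training portion).

```python
def fit_1d_ridge(xs, ys, lam):
    """Closed-form single-feature ridge weight (assumption: no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def leave_one_out_predictions(xs, ys, lam):
    """Predict each sample from a model fitted on all other samples."""
    preds = []
    for i in range(len(xs)):
        # Training portion: every sample except the held-out one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        w = fit_1d_ridge(train_x, train_y, lam)
        # Testing portion: the single held-out sample.
        preds.append(w * xs[i])
    return preds
```

Comparing the held-out predictions against the measured values gives an estimate of how the model performs on data it was not fitted to.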
Claim 14:
Geeleher uses four clinical trial datasets comprised of gene expression data (pg. 2, col. 1, para. 2). The clinical trial datasets are then combined with the CGP cancer cell lines and filtered to only contain genes shared between the two datasets (Figure 1; pg. 10, col. 1, para. 1), resulting in homogenized expression data (Figure 1; pg. 10, col. 1, para. 2).
Claim 16:
Geeleher states that the CGP cancer cell lines contain baseline gene expression (i.e., before drug treatment) acquired from microarray experiments and sensitivity to 138 drugs in a panel of almost 700 cancer cell lines (determining baseline gene expression levels for each tumor associated with the second plurality of tumors) (pg. 1, col. 2, last para. – pg. 2, col. 1, para. 1). The CGP cell lines also contain sensitivity to a particular drug, measured as the concentration required for 50% cellular growth inhibition (IC50) (pg. 2, col. 1, para. 2; caption of Figure 1). The predicted IC50 values of the cancer cell lines were compared to the measured IC50 values after treatment, and the CGP cell line training data was divided into sensitive or resistant groups (determining which tumors associated with the second plurality of tumor samples are responder or nonresponder to the therapy after each tumor associated with the second plurality of tumors has received therapy; labeling the baseline gene expression levels for each tumor associated with the second plurality of tumor samples, as responders or non-responders) (pg. 10, col. 1, para. 3; pg. 10, col. 2, para. 2).
Claim 20:
Geeleher uses docetaxel, bortezomib, and erlotinib (pg. 9, col. 2, para. 3).
Claims 21-22:
Geeleher combines four clinical trial datasets of gene expression (Figure 1), wherein only genes present in each dataset were kept, which resulted in 10,000 genes (identifying genes that occur in two or more data sets of the disease agnostic gene expression data) (pg. 10, col. 1, para. 1). These genes were then further filtered when predicting in vivo drug sensitivity for the erlotinib clinical trial, where only the 1,000 genes that were most differentially expressed between sensitive and resistant samples were used (filtering the identified genes based on expression thresholds in the baseline gene expression data) (pg. 10, col. 2, para. 2). The genes filtered by gene expression were selected based on a t-test using the rowttests() function in genefilter (selecting, from the filtered genes, genes with statistically significant expression differences between responders and non-responders to a therapy) (pg. 10, col. 2, para. 2).
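For illustration only, the gene selection steps attributed to Geeleher above (keeping genes shared across datasets, then ranking by a t-test between sensitive and resistant samples) can be sketched in Python. This is a minimal sketch under stated assumptions and is not Geeleher's code: the function names, the toy data, and the hand-written equal-variance t-statistic (used here in place of the genefilter routine cited above) are all hypothetical.

```python
import numpy as np

def shared_genes(*gene_lists):
    """Return genes present in every list, in the order of the first list."""
    common = set(gene_lists[0]).intersection(*map(set, gene_lists[1:]))
    return [g for g in gene_lists[0] if g in common]

def pooled_t_statistics(expr, is_sensitive):
    """Row-wise equal-variance two-sample t-statistics.

    expr is a genes x samples array; is_sensitive is a boolean array
    marking the sensitive samples (the rest are treated as resistant).
    """
    a = expr[:, is_sensitive]
    b = expr[:, ~is_sensitive]
    na, nb = a.shape[1], b.shape[1]
    pooled_var = ((na - 1) * a.var(axis=1, ddof=1)
                  + (nb - 1) * b.var(axis=1, ddof=1)) / (na + nb - 2)
    return (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(pooled_var * (1 / na + 1 / nb))

def top_differential_genes(genes, expr, is_sensitive, n_top):
    """Pick the n_top genes with the largest absolute t-statistic."""
    t = pooled_t_statistics(expr, is_sensitive)
    order = np.argsort(-np.abs(t), kind="stable")[:n_top]
    return [genes[i] for i in order]

# Hypothetical toy data: three datasets with partially overlapping genes.
genes = shared_genes(["A", "B", "C", "D"], ["B", "A", "C"], ["C", "A", "B", "E"])
expr = np.array([
    [5.0, 5.2, 4.8, 1.0, 1.2, 0.8],  # A: strongly different between groups
    [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # B: no difference in means
    [3.0, 3.5, 2.5, 2.0, 2.5, 1.5],  # C: moderate difference
])
is_sensitive = np.array([True, True, True, False, False, False])
selected = top_differential_genes(genes, expr, is_sensitive, n_top=2)
```

In this sketch the intersection step stands in for the shared-gene filtering and the ranking step stands in for selecting the most differentially expressed genes; the cutoff `n_top` plays the role of the 1,000-gene limit discussed above.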
Claims 23 and 25:
identifying a plurality of transcription regulators based on transcription regulator gene data derived from at least one of the disease agnostic gene expression data or the baseline gene expression data; generating, based on the plurality of transcription regulators and genes used to derive the plurality of features for the disease specific predictive model, a transcription regulator network; and
Geeleher selects features for the ridge regression using cell lines and primary tumor gene expression data (plurality of features derived from disease agnostic and baseline gene expression data) (Figure 1). However, Geeleher does not generate transcription regulator information by identifying transcription regulator gene data.
Kang generates a gene regulatory network using the STRING DB database (transcription regulators based on transcription regulator gene data) which contains protein-protein and protein-gene interactions (Figure 1) (pg. 2, col. 2, para. 4). Genes of the baseline gene expression data that were not present in the network of regulatory interactions were filtered out (a transcription regulator network) (pg. 5, col. 1, para. 2).
determining, based on the transcription regulator network and the baseline gene expression data associated with the second plurality of genes, an enrichment score for each transcription regulator of the plurality of transcription regulators.
Geeleher discloses baseline gene expression (Figure 1), but does not calculate an enrichment score on transcription regulators based on a transcription regulator network and baseline gene expression.
Kang shows in Figure 7 an enrichment analysis for all nodes in the GRN of the NN, which used baseline gene expression as input (abstract).
It would have been prima facie obvious to have modified the feature selection of Geeleher by using the gene-regulatory interaction NN model of Kang that produces reproducible and predictive signatures of treatment response (Kang pg. 9, col. 2, para. 1). This would have resulted in using a set of predictive gene signatures in Geeleher that can predict treatment response. Motivation is taught by Kang who teaches that integrating prior knowledge, such as regulatory networks, into a classification framework enables detection of predictors more likely to be biologically relevant (pg. 9, col. 2, para. 1). Kang also teaches that this method improves robustness and generalizability of predictions to independent datasets (abstract). There would have been a reasonable expectation of success to use the baseline gene expression of Geeleher in the network construction of Kang to identify biologically relevant predictors because Geeleher also uses baseline gene expression to predict treatment response and uses feature selection.
Claims 24 and 26:
Kang indicates that several nodes in the NN aggregate general immune system-related transcriptions important for discriminatory power (pg. 8, col. 1, last para. – col. 2) (Figure 6). The weights of these nodes were used to perform enrichment (Figure 7), which elucidated distinct patterns across the different disease datasets used as input (pg. 8, col. 2). The gene set enrichment, in view of previous benchmarks, indicates that the model accurately predicts response (pg. 8, col. 2 – pg. 9, col. 1, para. 1). It would have been prima facie obvious to one of ordinary skill in the art to have modified the feature selection of Geeleher to use enriched genes in a specific disease indicative of response prediction as taught by Kang. The motivation for doing so is that Kang teaches that they accurately and consistently predict treatment response (pg. 9, col. 1, para. 1). There is a reasonable expectation of success because the combination requires further feature selection of genes derived from baseline gene expression for response predictions.
Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Geeleher et al. (“Geeleher”; Genome biology, 15, 1-12; previously cited on PTO892 mailed 04/25/2025) in view of Cree et al. (“Cree”; Current opinion in pharmacology 10, no. 4 (2010): 375-379; newly cited) and Kang et al. (“Kang”; BMC bioinformatics 18, no. 1 (2017): 565; newly cited), as applied above to claims 1, 5, 10 and 15, and in further view of Correia et al. (“Correia”; Frontiers in genetics 9 (2018): 278; previously cited on PTO892 mailed 11/19/2024) and Sha et al. (“Sha”; In 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 6461-6464. IEEE, 2015; previously cited on PTO892 mailed 04/25/2025).
This rejection is newly recited and is necessitated by claim amendment.
The limitations of claims 1, 5, 10, and 15 have been taught above by Geeleher, Cree, and Kang.
The bold and italicized text below are the limitations of the instant claims, and the italicized text serves to map the prior art onto the instant claims.
Claims 7 and 17:
determining, from the disease agnostic gene expression data, genes present in two or more of the plurality of different gene data sets as a first set of candidate genes;
Geeleher discloses combining the four clinical trial data sets of gene expression (Figure 1), wherein only genes present in each dataset were kept, which resulted in 10,000 genes (pg. 10, col. 1, para. 1).
determining, from the baseline gene expression data, genes of the first set of candidate genes expressed at greater than or equal to 2 Transcripts Per Million (TPM) in at least half of the second plurality of tumor samples as a second set of candidate genes; and
Geeleher discloses that from the 10,000 commonly shared genes, only genes that were substantially variably expressed were kept, resulting in the removal of the 20% of genes with the lowest variability in expression across all samples (pg. 10, col. 1, para. 1). However, Geeleher, Cree, and Kang do not select genes that have an expression of 2 TPM or greater in at least half of the cancer cell lines.
Correia discloses a study on RNA-seq for measuring gene expression in peripheral blood samples (abstract). Correia uses a threshold of greater than or equal to 1 TPM across at least half of the total number of samples (≥12 for human and porcine, ≥18 for equine, and ≥5 for bovine) in order to remove lowly expressed genes (pg. 4, col. 2, para. 1).
determining, from the baseline gene expression data, genes of the second set of candidate genes with an expression level that differs by at least a predetermined fold change between responders and non-responders as a third set of candidate genes, wherein the plurality of features comprises the third set of candidate genes.
Geeleher states that the 1,000 top differentially expressed genes were selected based off comparing the expression levels in the sensitive and resistant cancer cell lines, wherein these genes were used as the features to train the model (pg. 10, col. 2, para. 2; Figure 1).
It would have been prima facie obvious to have modified the filtered genes of Geeleher by removing genes below at least a 1 TPM threshold in at least half of the samples as taught by Correia because it would have removed any lowly expressed genes. This is advantageous because Sha states that filtering out low-expression genes improves detection of differentially expressed genes (abstract), wherein Geeleher uses differentially expressed genes for features in the predictive model (pg. 10, col. 2, para. 2). There would have been a reasonable expectation of success for filtering low gene expression because Sha states that it results in improved differential gene expression identification.
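For illustration only, the two filters discussed in this rejection (a minimum TPM in at least half of the samples, followed by a fold-change comparison between responders and non-responders) can be sketched in Python. This is a minimal sketch under stated assumptions and is not taken from any cited reference: the function names, the toy TPM matrix, the pseudocount, and the default fold change of 2 are hypothetical, since the record does not fix a particular fold-change value.

```python
import numpy as np

def expressed_gene_mask(tpm, min_tpm=1.0, min_fraction=0.5):
    """True for genes with TPM >= min_tpm in at least min_fraction of samples.

    tpm is a genes x samples array of Transcripts Per Million values.
    """
    n_samples = tpm.shape[1]
    n_passing = (tpm >= min_tpm).sum(axis=1)
    return n_passing >= min_fraction * n_samples

def fold_change_mask(tpm, is_responder, min_fold=2.0, pseudocount=1.0):
    """True for genes whose group means differ by at least min_fold.

    The pseudocount (an assumption) avoids division by zero for genes
    with no expression in one group.
    """
    resp = tpm[:, is_responder].mean(axis=1) + pseudocount
    non = tpm[:, ~is_responder].mean(axis=1) + pseudocount
    ratio = np.maximum(resp, non) / np.minimum(resp, non)
    return ratio >= min_fold

# Hypothetical toy data: 4 genes x 4 samples (2 responders, 2 non-responders).
tpm = np.array([
    [0.2, 0.5, 3.0, 0.1],
    [1.0, 4.0, 0.3, 0.0],
    [10.0, 12.0, 11.0, 9.0],
    [20.0, 22.0, 2.0, 4.0],
])
is_responder = np.array([True, True, False, False])

mask_1tpm = expressed_gene_mask(tpm, min_tpm=1.0)  # threshold of the kind used in Correia
mask_2tpm = expressed_gene_mask(tpm, min_tpm=2.0)  # threshold recited in the claims
mask_fold = fold_change_mask(tpm, is_responder)
features = mask_2tpm & mask_fold
```

The only difference between the two expression masks is the `min_tpm` argument, which is why the sketch exposes the threshold as a parameter rather than hard-coding it.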
Claims 8 and 18 are rejected under 35 USC 103 for being unpatentable over Geeleher et al. (“Geeleher”; Genome biology, 15, 1-12; previously cited on PTO892 mailed 04/25/2025) in view of Cree et al. (“Cree”; Current opinion in pharmacology 10, no. 4 (2010): 375-379; newly cited) and Kang et al. (“Kang”; BMC bioinformatics 18, no. 1 (2017): 565; newly cited), Correia et al. (“Correia”; Frontiers in genetics 9 (2018): 278; previously cited on PTO892 mailed 11/19/2024) and Sha et al. (“Sha”; In 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 6461-6464. IEEE, 2015; previously cited on PTO892 mailed 04/25/2025), as applied above to claims 1, 5, 7, 10, 15 and 17, and in further view of Roszik et al. (“Roszik”; BMC medicine 14 (2016): 1-8; previously cited on PTO892 mailed 11/19/2024).
This rejection is newly recited and is necessitated by claim amendment.
The bold and italicized text below are the limitations of the instant claims, and the italicized text serves to map the prior art onto the instant claims.
The limitations of claims 1, 5, 10, and 15 have been taught above by Geeleher, Cree, and Kang. The limitations of claims 7 and 17 have been taught in the rejection above by Geeleher, Cree, Kang, Correia, and Sha.
Claims 8 and 18:
determining, for the third set of candidate genes, a tumor mutational burden (TMB) value for each of the second plurality of tumors associated with the third set of candidate genes;
Geeleher and Correia disclose the third set of candidate genes as discussed in the rejection above regarding claims 7 and 17. However, Geeleher, Cree, Kang, and Sha do not disclose determining a TMB value for the genes in the clinical trial datasets.
Roszik discloses a method to predict total mutational load (PTML) within tumors from a small set of genes that can be used in clinical next generation panels (abstract). PTML status was then correlated with clinical outcome following distinct immunotherapies (abstract).
and determining, based on the TMB values, a fourth set of candidate genes, wherein the plurality of features comprises the fourth set of candidate genes.
Geeleher discloses filtering genes to use them as features to train a predictive model (Figure 1). However, Geeleher, Cree, Kang and Sha do not disclose determining a fourth set of candidate genes based on TMB values.
Roszik states that the PTML (a gene set of 170 genes) is highly correlated with actual total mutational load (abstract). Roszik uses the 170 genes as features in their PTML algorithm to predict immunotherapy outcomes in melanoma (Figure 3).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the instant invention to have modified the gene filtering process of Geeleher and Correia to include a PTML status using the 170 genes of Roszik because Roszik states that the PTML gene set could predict clinical outcomes of patients treated with immunotherapies, which is advantageous for selecting targeted therapy approaches such as personalized use of immunotherapies (pg. 6, col. 1, para. 1 of Roszik). One of ordinary skill in the art would have had a reasonable expectation of success by combining Roszik with Geeleher and Correia because Roszik states that clinical outcome following immunotherapy has shown an association with tumor mutation load using whole exome sequencing, wherein the method of Geeleher takes into account mutations of genes and was able to accurately predict how sensitive a patient was to a drug based on the mutation (pg. 7, col. 1, para. 1; Figure 5).
Response to Arguments under 35 USC 103
Applicant's arguments filed 01/21/2026 have been fully considered but they are only persuasive in part.
Applicant argues that the claims now require sequencing both the first and second plurality of genes (pg. 23, sec. i of Applicant’s remarks). Applicant’s argument is persuasive in part for the following reasons:
Claim 1 requires sequencing the second plurality of genes (i.e., baseline gene expression), but recites that the first plurality of genes “are sequenced,” which remains a product-by-process limitation. Claim 10 requires sequencing of neither the first nor the second plurality of genes. Rather, claim 10 requires sequencing a “plurality of genes”, but this plurality is not explicitly correlated with the first or second plurality of genes.
Applicant argues that none of the previously cited references disclose deriving regulatory-based feature representations or using the trained NN to generate a therapy prediction response based (pg. 24, sec. ii of Applicant’s remarks). Applicant’s argument is persuasive in part because Geeleher does not teach a neural network and does not derive regulatory-based feature representations. However, a new ground of rejection necessitated by claim amendment has been applied that relies on a new reference to teach the neural network and regulatory-based feature representations.
Applicant’s remarks regarding the feature generation are noted but are not persuasive in view of the new grounds of rejection necessitated by claim amendment (pg. 25, para. 2 – pg. 26 of Applicant’s remarks).
Conclusion
No claims are allowed.
Notable, but not relied upon, prior art includes: Ali et al. (Biophysical reviews 11, no. 1 (2019): 31-39), Kibing et al. (Journal of biomedical informatics 61 (2016): 194-202), Mi et al. (Artificial intelligence in medicine 64, no. 3 (2015): 195-204), and Sun et al. (Oncotarget 7, no. 8 (2016): 9404).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Inquiries
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Noah A. Auger whose telephone number is (703)756-4518. The examiner can normally be reached M-F 7:30-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek, can be reached at (571) 272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/N.A.A./Examiner, Art Unit 1687
/KAITLYN L MINCHELLA/Primary Examiner, Art Unit 1685