Prosecution Insights
Last updated: April 19, 2026
Application No. 18/753,906

DETERMINING PHENOMIC RELATIONSHIPS BETWEEN COMPOUNDS AND CELL PERTURBATIONS UTILIZING MACHINE LEARNING MODELS

Non-Final OA: §101, §102, §112, §DP

Filed: Jun 25, 2024
Examiner: ROSSI, VY BUI
Art Unit: 1685
Tech Center: 1600 — Biotechnology & Organic Chemistry
Assignee: Recursion Pharmaceuticals Inc.
OA Round: 3 (Non-Final)

Grant Probability: 33% (At Risk)
Projected OA Rounds: 3-4
Projected Time to Grant: 4y 7m
Grant Probability with Interview: 80%

Examiner Intelligence

Career Allow Rate: 33% (13 granted / 39 resolved; -26.7% vs Tech Center average). This examiner grants only 33% of cases.
Interview Lift: +46.6% allowance among resolved cases with an interview versus without.
Typical Timeline: 4y 7m average prosecution; 22 applications currently pending.
Career History: 61 total applications across all art units.

Statute-Specific Performance (allowance rate vs Tech Center average)

§101: 27.0% (-13.0% vs TC avg)
§103: 23.2% (-16.8% vs TC avg)
§102: 11.2% (-28.8% vs TC avg)
§112: 23.6% (-16.4% vs TC avg)

Tech Center averages are estimates. Based on career data from 39 resolved cases.

Office Action

Rejections: §101, §102, §112, §DP
DETAILED ACTION

Applicant's response, filed 07/09/2025, has been fully considered. The following rejections and/or objections are either reiterated or newly applied. They constitute the complete set presently being applied to the instant application. Herein, "the prior Office action" refers to the Final Rejection of 04/10/2025.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 07/07/2025 has been entered.

Claim Examination Status

Claims 1-20 are currently pending and under examination herein. Claims 1-20 are rejected. The instant application has been afforded Track I status.

Priority

The previously discussed priority determination assigned an effective filing date of 06/25/2024, the date of filing.

Information Disclosure Statement

The Information Disclosure Statement, filed 10/23/2025, has been considered. Signed copies of the IDS are included with this Office action.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: "Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title."

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
The instant rejection reflects the framework outlined in MPEP 2106.04.

Framework with which to Evaluate Subject Matter Eligibility:
(1) Are the claims directed to a process, machine, manufacture, or composition of matter?
(2A) Prong One: Do the claims recite a judicially recognized exception, i.e., a law of nature, a natural phenomenon, or an abstract idea? Prong Two: If the claims recite a judicial exception under Prong One, is the judicial exception integrated into a practical application?
(2B) If the claims do not integrate the judicial exception, do the claims provide an inventive concept?

Framework Analysis as Pertains to the Instant Claims:

With respect to step (1): the claims are directed to methods, a system, and a computer-readable storage medium employing trained embedding neural networks to generate structure-phenomic relationship predictions for cell perturbations; the answer is therefore "yes."

With respect to step (2A), Prong One, the claims recite abstract ideas. To determine whether the claims recite any concepts that equate to an abstract idea, law of nature, or natural phenomenon, MPEP 2106.03 teaches that abstract ideas include mathematical concepts (mathematical formulas or equations, mathematical relationships, and mathematical calculations), certain methods of organizing human activity, and mental processes (including procedures for collecting, observing, evaluating, and organizing information; see MPEP 2106.04(a)(2)). In the instant application, the claims recite the following limitations that equate to abstract ideas comprising mental steps and mathematical concepts.
With respect to the instant claims, under the step (2A), Prong One evaluation, the claims are found herein to recite abstract ideas that fall into the grouping of mental processes (in particular, procedures for obtaining, analyzing, and organizing cell bioactivity data), which are steps that can be performed with the human mind, pen, and paper, as well as mathematical concepts (in particular, the mathematical relationships and formulas in a structure-phenomics relationship trained embedding neural network, a threshold from a distribution of difference metrics, and a measure of loss for phenomic similarity prediction). The claim limitations directed to abstract ideas are as follows:

Mental processes:
- Claims 1, 10, and 16 recite "generating a compound structure feature representation for a query chemical compound ... generating, utilizing a structure-phenomics relationship trained embedding neural network, a phenomic similarity prediction for the compound structure feature representation and a target perturbation."
- Claims 3 and 11 recite "generating a similarity classification/prediction from a set of classifications comprising: a pheno-similar classification, a pheno-dissimilar classification, and a pheno-independent classification."

Mathematical concepts:
- Claims 1, 7-8, 10-14, and 16-19 recite a "structure-phenomics relationship trained embedding neural network."
- Claim 1 recites "training a structure-phenomics relationship neural network ... generating ... a first/second phenomic embedding ... a predicted phenomic similarity ... modifying parameters ... determining a measure of loss."
- Claims 4 and 17 recite "generating a phenomic image similarity by comparing the first/second phenomic embedding."
- Claim 5 recites "generating a difference metric between the first/second phenomic embedding; and applying a pheno-similarity threshold to the difference metric to generate a pheno-similarity classification between the training compound and the training perturbation."
- Claim 6 recites "determining the pheno-similarity threshold from a distribution of difference metrics between the training perturbation and a plurality of additional perturbations."
- Claim 7 recites "generating, utilizing the structure-phenomics relationship trained embedding neural network, an additional predicted phenomic similarity from the training compound structure feature representation and an additional training perturbation."
- Claim 8 recites "comparing the predicted phenomic similarity with the phenomic image similarity to determine a measure of loss; and to reduce the measure of loss on a subsequent training iteration."
- Claims 13 and 18 recite "generating a difference metric between the first/second phenomic embedding; and applying a pheno-similarity threshold to the difference metric to generate a pheno-similarity classification between the training compound and the training perturbation, wherein the pheno-similarity threshold is determined from a distribution of difference metrics between the training perturbation and a plurality of additional perturbations."
- Claims 14 and 19 recite "generating ... a predicted phenomic similarity from the training compound structure feature representation and the training perturbation; comparing the predicted phenomic similarity with the phenomic image similarity to determine a measure of loss; and modifying parameters of the structure-phenomics relationship neural network to reduce a difference between the predicted phenomic similarity and the phenomic image similarity on a subsequent training iteration based on the measure of loss."

Hence, the claims explicitly recite elements that, individually and in combination, constitute abstract ideas. With respect to step (2A), under the broadest reasonable interpretation (BRI), the instant claims recite a trained embedding neural network to generate structure-phenomic relationship predictions from cell perturbations.
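For illustration only, the difference-metric and threshold-from-distribution limitations recited in claims 5, 6, 13, and 18 can be sketched in minimal pure-Python form. The Euclidean metric, the mean-minus-one-standard-deviation threshold rule, and all function names below are hypothetical choices for the sketch; the claims do not fix any of them.

```python
import statistics

def difference_metric(emb_a, emb_b):
    """One plausible difference metric: Euclidean distance between embeddings."""
    return sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)) ** 0.5

def threshold_from_distribution(target_emb, other_embs, n_stdev=1.0):
    """Set the pheno-similarity threshold from the distribution of difference
    metrics between the target perturbation and additional perturbations
    (claim 6); here, one standard deviation below the mean distance."""
    metrics = [difference_metric(target_emb, e) for e in other_embs]
    return statistics.mean(metrics) - n_stdev * statistics.stdev(metrics)

def classify(metric, threshold):
    """Apply the threshold to a difference metric (claim 5): distances at or
    below the threshold are treated as pheno-similar."""
    return "pheno-similar" if metric <= threshold else "pheno-dissimilar"
```

A distance well below the distribution (an unusually close perturbation) classifies as pheno-similar; anything typical of the background distribution does not.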
The instant claims are therefore directed to judicial exceptions in the abstract-idea groupings, both mathematical concepts (the structure-phenomics relationship trained embedding neural network, the threshold from a distribution of difference metrics, and the measure of loss) and mental processes (generating a compound structure feature representation for a query chemical compound; generating, utilizing a structure-phenomics relationship trained embedding neural network, a phenomic similarity prediction; and generating a similarity classification/prediction), which can be performed in the human mind with pen and paper.

Because the claims recite judicial exceptions, step (2A), Prong Two directs that the claims be examined further to determine whether they integrate the abstract ideas into a practical application (MPEP 2106.04(d)). A claim integrates a judicial exception into a practical application when it applies, relies on, or uses the judicial exception in a manner that imposes a meaningful limit on the judicial exception. This is determined by analyzing the additional elements of the claim (MPEP 2106.04(d)(I); MPEP 2106.05(a)-(h)). If the claim contains no additional elements beyond the judicial exception, the claim fails to integrate the exception into a practical application (MPEP 2106.04(d)(III)).

With respect to the instant recitations, the claims recite the following additional elements considered for practical application:
- Claims 2 and 11 recite that the target perturbation comprises a target gene knockout perturbation or a target compound perturbation.
- Claims 1, 10, and 16 recite "computer-implemented," a processor, a system, and a non-transitory computer-readable medium.
- Claims 9, 15, and 20 recite receiving the query chemical compound and the target perturbation via a user interface of a client device, and providing the phenomic similarity prediction for display via the user interface of the client device.
- Claim 12 recites a first/second task head of the structure-phenomics relationship neural network.

The steps that are "in addition" to the recited judicial exception represent mere instructions or field-of-use limitations (the target gene knockout/compound perturbations of claims 2 and 11; utilizing a trained embedding neural network) to implement the recited judicial exception, and do not impart meaning to the judicial exception such that it is applied in a practical manner. Further, additional elements directed to mere data gathering and handling (receiving the query chemical compound/target perturbation and providing the phenomic similarity prediction for display via a user interface of a client device, in claims 9, 15, and 20) carry out the abstract idea without imposing any meaningful limitation on it. These steps are insignificant extra-solution activity and are insufficient to integrate an abstract idea into a practical application (see MPEP 2106.05(g)). Further, the steps directed to additional non-abstract elements of computer components and device networks (FIGs. 6 and 8 and claims 10 and 16: computer processor, system, non-transitory computer-readable medium) do not describe any specific computational steps by which the "computer parts" perform or carry out the abstract idea, nor do they provide any details of how specific structures of the computer, such as the computer, processor, or system recited in the specification [FIGs. 6 and 8], are used to implement these functions. The claims state nothing more than a generic computer system used as a tool to perform the functions that constitute the abstract idea. Hence, these are mere instructions to apply the abstract idea using a computer system, and therefore the claims do not integrate the abstract idea into a practical application.
The courts have consistently maintained that when, for example, a memory, display, processor, or machine is recited so generically that it represents no more than mere instructions to apply the judicial exception on a computer, such limitations may be viewed as nothing more than generally linking the use of the judicial exception to the technological environment of a computer (see MPEP 2106.05(f)).

As such, the claims are lastly evaluated under the step (2B) analysis, wherein it is determined that, because the claims recite abstract ideas that are not integrated into a practical application, the claims also lack an inventive concept. The judicial exception alone cannot provide the inventive concept or the practical application; identifying whether the additional elements amount to an inventive concept requires considering the additional elements individually and in combination to determine whether they provide significantly more than the judicial exception (MPEP 2106.05(I)(A)). With respect to the instant claims, the additional elements of data gathering, instructions, and field-of-use limitations described above do not rise to the level of significantly more than the judicial exception.

As directed in the Berkheimer memorandum of April 19, 2018, and as set forth in the MPEP, the determination of whether additional elements (or a combination of additional elements) provide significantly more and/or an inventive concept rests on whether the additional elements represent well-understood, routine, conventional activity. Said assessment is a factual determination that an element (or combination of elements) is widely prevalent or in common use in the relevant industry, supported by one or more of the following:
(1) a citation to an express statement in the specification, or to a statement made by the applicant during prosecution, demonstrating the well-understood, routine, or conventional nature of the additional element(s);
(2) a citation to one or more of the court decisions discussed in MPEP 2106.05(d)(II) as noting the well-understood, routine, conventional nature of the additional element(s);
(3) a citation to a publication demonstrating the well-understood, routine, conventional nature of the additional element(s); and/or
(4) a statement that the examiner is taking official notice of the well-understood, routine, conventional nature of the additional element(s).

With respect to the instant recitations, the claims recite the following additional elements considered for an inventive concept:
- Claims 2 and 11 recite that the target perturbation comprises a target gene knockout perturbation or a target compound perturbation.
- Claims 1, 10, and 16 recite "computer-implemented," a processor, a system, and a non-transitory computer-readable medium.
- Claims 9, 15, and 20 recite receiving the query chemical compound and the target perturbation via a user interface of a client device, and providing the phenomic similarity prediction for display via the user interface of the client device.
These additional elements do not contribute significantly more to well-known and conventional systems of chemical search with phenotypic factors related to cell perturbations, which were routinely evaluated by those of ordinary skill in the art as of the effective filing date, as disclosed in the instant specification at [0020], which cites Mendez-Lucio et al. (2022; IDS document; herein Mendez-Lucio) and illustrates that such transformer-based models are used for modeling in drug discovery generally (Mendez-Lucio at Abstract). Mendez-Lucio discloses a molecular foundation model, MolE (a trained embedding neural network), for drug property prediction based on chemical structures (a compound structure feature representation for a query chemical compound). MolE adapts a transformer neural network architecture to molecular graph inputs, including SMILES and Extended Connectivity Fingerprints (ECFPs; molecular fingerprints encoding substructures of molecules as preset chemical groups or as atom environments). This self-supervised model pretrains on learning chemical structures (compound structure feature representation) and uses a massive multi-task approach to learn biological information (phenomic images ... exposed to [a] perturbation ... machine learning embedding), achieving state-of-the-art results for absorption-distribution-metabolism-excretion-toxicity (ADMET) properties (phenome) from the Therapeutics Data Commons [Mendez-Lucio at Abstract]. In some embodiments, the physiological condition of interest is a phenotype.
For instance, in some embodiments, the physiological condition of interest is a physiological manifestation of a compound, a small molecule, and/or a therapeutic, such as toxicity and/or resolution of a disease ... the physiological condition is a phenotype measured using experimental data including, but not limited to, flow cytometry readouts, imaging and microscopy annotations (e.g., H&E slides, IHC slides, radiology images, and/or other medical imaging), and/or cellular constituent data ... the physiological condition of interest is a measure of toxicity [0237-0238].

Data (e.g., the phenomic images of cells exposed to a training compound/perturbation in claims 4, 12, and 17) remain merely measured and manipulated within abstract ideas (generating a phenomic image similarity by comparing the first machine learning embedding and the second machine learning embedding ... generating a difference metric ... determining the pheno-similarity threshold from a distribution of difference metrics) to be used in the judicial exception. Generic recitations of well-understood and conventional steps for utilizing any task-appropriate computer-implemented module, processors, or CRM appear in the specification without any particularity or specificity [FIGs. 6 and 8], as recited below:

"Although FIG. 6 illustrates the sphere system 102 being implemented by a particular component and/or device within the environment, the sphere system 102 can be implemented in whole or in part by other computing devices and/or components in the environment (e.g., additional client device(s)) [0065] ... present disclosure may comprise or utilize a special purpose or general purpose computer including computer hardware, such as, for example, one or more processors and system memory, as discussed in greater detail below [0088] ... those skilled in the art will appreciate that the disclosure may be practiced in network computing environments with many types of computer system configurations, including, personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, tablets, pagers, routers, switches, and the like [0094]."

MPEP 2106.05(a) provides evidence of the routine nature of applying the judicial exception with computers or devices to perform or automate an existing process, which constitutes insignificant extra-solution activity, as recited: "Mere automation of manual processes, such as using a generic computer to process an application for financing a purchase, Credit Acceptance Corp. v. Westlake Services, 859 F.3d 1044, 1055, 123 USPQ2d 1100, 1108-09 (Fed. Cir. 2017), or speeding up a loan-application process by enabling borrowers to avoid physically going to or calling each lender and filling out a loan application, LendingTree, LLC v. Zillow, Inc., 656 Fed. App'x 991, 996-97 (Fed. Cir. 2016) (non-precedential)."

With respect to the instant claims, the steps (generating and relating compound structure representations and property data with bioactive states for structure-phenomic similarity analysis in order to identify other genes/compounds with similar cell bioactivity) and additional elements (processors, systems, target perturbations) involving mathematical algorithm application and automated mental steps do not comprise an inventive concept, whether considered individually or as an ordered combination, that transforms the claimed judicial exception into a patent-eligible application of the judicial exception. Therefore, the claims do not amount to significantly more than the judicial exception itself (Step 2B: No). As such, claims 1-20 are not patent eligible.

Response to Remarks - 101

Applicant's remarks (pp. 14-15), filed 07/09/2025, have been fully considered but are not persuasive with respect to the previously stated rejection of record, with additions necessitated by claim amendments.

First, Applicant asserts that the steps of independent claims 1, 10, and 16 cannot reasonably be performed in the human mind, which is not equipped to generate a phenomic embedding, a predicted phenomic similarity, or a compound structure feature representation. However, this is not persuasive because the simplest embodiment of the claims, directed to comparing feature data (i.e., structure and phenomic laboratory data from target compounds/perturbations) between a known compound and a query compound, is not too complex to be performed in the mind of a chemist with the aid of pencil and paper. Even if more data of a preferred embodiment is provided, the human mind can still, albeit slowly, perform the task of comparing promising similarities between a known compound and a query compound.
An embedding merely links/associates a (first/second) piece of training data (compound/perturbation phenomic image data) for input into a mathematical model, which then outputs a query compound structure representation. A biochemist can compare the visual effects (phenomics) of two different antibiotics (perturbations/compounds) on petri dishes, and then determine that antibiotic A clears more of the bacteria than antibiotic B. Automating the abstract steps of comparing the effects of two interventions as a trained-embedding-neural-network embodiment, even when performed by a computer, does not mean the step is not a mental process capable of being performed by a human mind with pen and paper (see MPEP 2106.05(a)). Furthermore, evidence that an existing process has ever been performed as a mental step, or by pen and paper, is not a requirement for identifying abstract ideas at Eligibility Step 2A, Prong One.

Second, Applicant asserts a practical application of the abstract ideas via an improvement to a technical field: that "generat[ing], utilizing the structure-phenomics relationship neural network, a phenomic similarity prediction for the compound structure feature representation and a target perturbation" is a practical application, at least because it improves drug discovery technology and computer functionality, and "provides new capabilities for accurately predicting phenomic similarity without requiring extensive laboratory testing." However, this is not persuasive because the asserted improvement to a technical field (structure-phenomic relationship similarity predictions) is an improvement to the judicial exception itself: abstract ideas executed with mental processes (predicting, comparing, generating, applying) and mathematical processes (difference metrics, measures of loss, thresholds). Implementation of a task-optimized neural network on a computer does not modify the structure of the computer itself or improve computer technology.
That is, a given generic computer with generic functional capabilities may implement either a non-optimized or an optimized neural network; the optimization may instead constitute an improvement to the process carried out by the computer. An improvement in the process of an abstract idea itself (i.e., enhancement of a trained embedding neural network for compound structure-phenomic similarity) is not equivalent to an improvement in computer functionality or in any other technology or technical field. See Trading Technologies Int'l v. IBG LLC, 921 F.3d 1084, 1093-94, 2019 USPQ2d 138290 (Fed. Cir. 2019) (hereafter "Trading Tech"), wherein the court determined that the claimed user interface simply provided a trader with more information to facilitate market trades, which improved the business process of market trading (i.e., the implemented method) but did not improve computer functionality or any other technology or technical field. The instant claims, like the ineligible claims considered in Trading Tech, are directed to an abstract data-analysis procedure that results in improved information and faster analysis of known data. As quoted above, MPEP 2106.05(a) teaches that applying the judicial exception with computers or devices to perform or automate an existing process constitutes insignificant extra-solution activity, citing Credit Acceptance Corp. v. Westlake Services, 859 F.3d 1044, 1055 (Fed. Cir. 2017), and LendingTree, LLC v. Zillow, Inc., 656 Fed. App'x 991, 996-97 (Fed. Cir. 2016) (non-precedential).

Finally, with respect to the arguments regarding the alleged improvement, it is unclear that the independent claims recite all the necessary and sufficient steps required to achieve that improvement. MPEP 2106.05(a): "An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome. McRO, 837 F.3d at 1314-15, 120 USPQ2d at 1102-03; DDR Holdings, 773 F.3d at 1259, 113 USPQ2d at 1107." Therefore, claims 1-20 are patent ineligible.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. Dependent claims 2-9, 11-15, and 17-20 are also rejected because they depend on, and do not remedy, the deficiencies of their parent claims. Any newly applied rejection, or portion thereof, is necessitated by amendment of the instant application.

Regarding independent claim 1, as also applied to independent claims 10 and 16, which recites:
1. (Currently Amended) A computer-implemented method comprising:
1) training a structure-phenomics relationship neural network to generate phenomic similarity predictions by:
1a) generating, utilizing a trained embedding neural network, a first phenomic embedding from a first phenomic image of a first cell exposed to a training compound;
1b) generating, utilizing the trained embedding neural network, a second phenomic embedding from a second phenomic image of a second cell exposed to a training perturbation;
1c) comparing, within a latent feature space of the trained embedding neural network, the first phenomic embedding and the second phenomic embedding to generate a phenomic image feature space similarity;
1d) generating, utilizing the structure-phenomics relationship neural network, a predicted phenomic feature space similarity between the training compound and the training perturbation from a training compound structure feature representation of the training compound; [[and]]
1e) comparing the phenomic image feature space similarity and the predicted phenomic feature space similarity to generate a phenomic feature space similarity measure of loss; and
1f) modifying parameters of the structure-phenomics relationship neural network utilizing the phenomic feature space similarity measure of loss;

The broadest reasonable interpretation of independent claim 1 is a method with two different neural networks: A) a trained embedding neural network (NN-A) and B) a structure-phenomics relationship neural network (NN-B) for predicting compound similarity. Regarding the term "trained embedding neural network," NN-A is indefinite without active steps by which NN-A is trained (e.g., the initial training set, or the embedding process: what features, such as what aspect of a cell image, what type of cells, or what parameters, differentiate said embedding) in order to show that the instant NN-A is distinguishable from a generic NN.
Further, it is unclear whether the training of NN-A occurred within the metes and bounds of the instant application, and whether any trained embedding NN can train NN-B and produce the same outputs. The minimally sufficient steps and limitations to provide the instant NN-A and its function are not claimed.

Regarding the term "structure-phenomics relationship neural network," NN-B appears to be trained using NN-A until step 1d), whereupon NN-B is also used to train NN-B. Is NN-B an untrained NN, or a trained NN-B being retrained? Further, how does training/retraining of NN-B produce a different phenomic similarity prediction output in the final 3) "generate" limitation, and how does NN-B function (e.g., to what is NN-B comparing said query chemical compound structure feature representation, and based on similarity to what: feature, structure, function, or effect on the same cell type)? Further, how does NN-B generate a target perturbation, and how is a perturbation linked or related to a training or query compound? Looking to dependent claim 2, how does the target gene knockout perturbation or target compound perturbation relate to the query compound: e.g., is the target output with a target perturbation of 1g) another structurally identical target compound, or does the target output functionally counteract the effects of a target gene knockout, such that the prediction output identifies the query compound as a potential target treatment compound because of, or only in the presence of, a specific target perturbation? Ultimately, the structure-to-phenome model, achieved with a first embedding versus a second embedding, is indefinite without any claimed overlapping labels/features, data, or effects. The minimally sufficient steps and limitations to provide the instant NN-B and its function are not claimed.
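For orientation only, the two-network training loop recited in claim 1 (steps 1a-1f) can be sketched in minimal pure-Python form. Everything here is hypothetical: the hand-coded embed function stands in for NN-A (whose training the claim leaves unspecified, as noted above), a single trainable weight stands in for NN-B, and cosine similarity and squared-error loss are assumed metrics that the claim does not fix.

```python
import math

def embed(image):
    """Stand-in for NN-A, the trained embedding neural network: maps a phenomic
    image (here, a list of pixel intensities) to a two-dimensional embedding."""
    return [sum(image) / len(image), max(image) - min(image)]

def cosine(u, v):
    """Similarity of two embeddings in the latent feature space (step 1c)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

class StructurePhenomicsNet:
    """Toy stand-in for NN-B: predicts phenomic similarity from a scalar
    compound-structure feature via a single trainable weight."""
    def __init__(self):
        self.w = 0.0

    def predict(self, structure_feature):
        return self.w * structure_feature

    def train_step(self, structure_feature, image_a, image_b, lr=0.1):
        emb_a = embed(image_a)            # 1a: embed compound image with NN-A
        emb_b = embed(image_b)            # 1b: embed perturbation image with NN-A
        target = cosine(emb_a, emb_b)     # 1c: phenomic image feature space similarity
        pred = self.predict(structure_feature)  # 1d: predicted similarity from structure
        loss = (pred - target) ** 2       # 1e: measure of loss
        # 1f: modify NN-B's parameters utilizing the measure of loss (gradient step)
        self.w -= lr * 2 * (pred - target) * structure_feature
        return loss
```

A claim-style run would call train_step repeatedly over (compound, perturbation) image pairs; the sketch illustrates only the data flow of steps 1a-1f, not any particular architecture.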
Limitations such as those in claim 4 (which adds training-step criteria of reducing the measure of loss) and claim 8 (which adds NN structural criteria with task heads), at a minimum, are needed to distinguish the instant NN-B with particular parameters, weights, relationships, or structures producing subsequently distinct outputs. Regarding claim 1, part 2), which recites "query chemical compound," it is unclear what relationship is predicted between the training compounds and the query compounds, as discussed above, and this very broad limitation may also raise issues under 35 U.S.C. 112(a) regarding the claimed scope of any and all query compounds. Adding limitations to the query compound (e.g., that the model compound inputs/outputs are cancer treatments or carcinogenic toxins) could point toward an improvement to a technology or an asserted invention. Therefore, said limitations are indefinite as claimed. Clarification is requested through clearer claim language.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 102(a)(1): "(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention."

Note on formatting: citations from the instant application are italicized in the following section.

A. Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wolf et al., US 2022/0403335 A1 (IDS cited; herein Wolf). Independent claim 1 recites method steps implemented by the system of independent claim 10 and the non-transitory computer-readable medium of independent claim 16.
Regarding instant claims 1, 10, and 16, the instant application recites: A computer-implemented method comprising: training a structure-phenomics relationship neural network … phenomic feature space similarity measure of loss; generating a compound structure feature representation for a query chemical compound; and generating, utilizing the structure-phenomics relationship neural network machine learning model, a phenomic similarity prediction for the compound structure feature representation and a target perturbation. The prior art to Wolf teaches computer systems and methods for modeling a chemical compound structure fingerprint with physiological conditions (phenomic, cellular bioactivity) [Abstract, claims 1 and 29] (utilizing a structure-phenomics relationship trained embedding neural network). Wolf discloses a method/system/non-transitory computer-readable medium/NTCRM for associating a test compound fingerprint input with a physiological condition of interest as an activation score (phenomic similarity prediction), wherein the physiological condition has different states (phenomes) associated with exposure of cells (exposed to cell perturbations) to reference compounds known to affect the physiological condition. The prior art to Wolf teaches: a method [Wolf at claim 1], system [Wolf at claim 29], and NTCRM [Wolf at claim 30]. The system 100 in some embodiments includes one or more processing units (CPU(s)) 102 (e.g., a processor, a processing core, etc.), one or more network interfaces 104, a user interface 106 including (optionally) a display 108 and an input system 105 (e.g., an input/output interface, a keyboard, a mouse, etc.)
for use by the user, memory (e.g., non-persistent memory 107, persistent memory 109), and one or more communication buses 103 for interconnecting the aforementioned components [0212-0214, FIG 1]…provides a non-transitory computer-readable medium storing one or more computer programs, executable by a computer, for associating a test chemical compound with a physiological condition of interest [Wolf at claim 30 and 0448]. For training of the model 601, in this example, the SMILES representations of compounds represented in activation data structure 504 of FIG. 6 are transformed into an ECFP4 fingerprint representation (Wolf at claim 26), and additionally a graph representation. Subsequently, two models are trained (training the structure-phenomics relationship trained embedding neural network). That is, the model 601 is an ensemble of two different models (trained embedding neural network) in this example: A) a fully connected neural network architecture is used to train on the ECFP4 representation, B) a message-passing neural network (MPNN) is used to train on graph representations.
The untrained model 601 is trained using, for each respective chemical structure (compound structure representation) of each respective compound in the training set, for each respective cellular constituent module (phenome) in the set of cellular constituent modules, a respective difference between: (i) a respective calculated activation score (phenomic similarity prediction) for the respective cellular constituent module upon input of the fingerprint of the chemical structure of the respective compound into the untrained model (generating, utilizing the structure-phenomics relationship trained embedding neural network, a predicted phenomic similarity between the training compound and the training perturbation from a training compound structure feature representation of the training compound); and (ii) the respective numerical activation score of the respective cellular constituent module for the respective compound in the set of cellular constituent modules (obtained from the activation data structure 504), where the training adjusts a plurality of parameters associated with the untrained model 601 responsive to the difference (modifying parameters of the structure-phenomics relationship trained embedding neural network by determining a measure of loss utilizing the first phenomic embedding, the second phenomic embedding, and the predicted phenomic similarity) and where the plurality of parameters comprises 100 or more parameters, thereby obtaining a trained model [0562]. obtaining a fingerprint of a chemical structure of the test chemical compound (query chemical compound) [Wolf at claims 1, 7-9: fingerprint from a simplified molecular-input line-entry system (SMILES) string representation of the test chemical compound] trained embedding neural network [Wolf in claim 11: responsive to inputting the fingerprint of the chemical structure into the corresponding fully connected neural network]. 
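The training operation described here, adjusting a plurality of parameters responsive to the difference between (i) a calculated activation score and (ii) the actual activation score, follows the ordinary gradient-descent pattern. A minimal sketch (a hypothetical two-parameter model and invented fingerprint/score pairs stand in for model 601 and activation data structure 504; this is an illustration of the technique, not Wolf's implementation):

```python
import random

random.seed(0)

# Hypothetical training pairs: (fingerprint feature, actual activation
# score for one cellular constituent module).
training_set = [(0.2, 0.5), (0.6, 1.1), (0.9, 1.6)]

w, b = random.random(), random.random()  # the "plurality of parameters"
lr = 0.1

def calculated_activation(x, w, b):
    # Stand-in for the untrained model's calculated activation score.
    return w * x + b

for epoch in range(200):
    for x, actual in training_set:
        # (i) calculated score minus (ii) actual score.
        diff = calculated_activation(x, w, b) - actual
        # The training adjusts the parameters responsive to the difference.
        w -= lr * 2 * diff * x
        b -= lr * 2 * diff

total_error = sum((calculated_activation(x, w, b) - y) ** 2
                  for x, y in training_set)
```

After training, `total_error` is small: the parameters have been adjusted until the calculated scores track the actual scores, which is the substance of the "thereby obtaining a trained model" limitation.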
cellular constituent data (e.g., abundances of genes and/or perturbation signatures) corresponding to physiological conditions of interest (e.g., phenotypes, diseases, cell states, and/or cellular processes of interest), and using latent representations and machine learning to determine associations (e.g., weights and/or correlations) between modules (e.g., subsets) of cellular constituents and the physiological condition of interest…elucidating molecular mechanisms underlying various physiological conditions, such as disease … inputting the fingerprint of the chemical structure into a model and retrieving, as output from the model, a respective activation score (phenomic similarity prediction) which, if it meets a threshold, identifies the compound (query compound) as associated with the physiological condition (phenome, bioactive/cell state) (Wolf at [0009, 0011]). associating perturbations (e.g. chemical compounds) with phenomic images, experimental data of a physiological condition is a phenotype, from cytometry readouts, imaging and microscopy annotations (medical imaging) (Wolf at [0228, 0237: Referring to FIGS. 3A-3E, one aspect of the present disclosure provides a method 300 of associating a test chemical compound with a physiological condition of interest (embedding a phenomic image) …the physiological condition is a phenotype measured using experimental data including, but not limited to, flow cytometry readouts, imaging and microscopy annotations (e.g., H&E slides, IHC slides, radiology images, and/or other medical imaging), and/or cellular constituent data]).
different states (cell states) associated with the physiological condition (phenomes) are derived by exposing cells to reference compounds known to affect the physiological condition (perturbations) in addition to a control state of unexposed cells (Wolf at claim 4)…an annotated cell state (phenomic image) in the plurality of annotated cell states is an exposure of a cell in the first plurality of cells to a compound under an exposure condition (a duration of exposure, a concentration of the compound, or a combination of a duration of exposure and a concentration of the compound) (Wolf at claims 16-17) (utilizing training compound structure feature representations and phenomic image similarities generated by comparing phenomic images of cells exposed to cell perturbations)… training a candidate cellular constituent model using, for each respective covariate in the plurality of covariates, a difference between (i) a calculated activation against each cellular constituent module represented by the candidate cellular constituent model upon input of a fingerprint of the covariate into the candidate cellular constituent model and (ii) actual activation against each cellular constituent module represented by the candidate cellular constituent model, wherein the training adjusts a plurality of covariate parameters associated with the candidate cellular constituent model responsive to the difference (Wolf at claim 14) (comparing the predicted phenomic similarity with the phenomic image similarity to determine a measure of loss).
the plurality of covariates comprises cell batch, cell donor, cell type, disease status, exposure to a chemical compound…the training of the candidate cellular constituent model is performed using a categorical cross-entropy loss in a multi-task formulation, in which each covariate in the plurality of covariates corresponds to a cost function in a plurality of cost functions and each respective cost function in the plurality of cost functions has a common weighting factor [0038-0039] (modifying parameters of the structure-phenomics relationship trained embedding neural network based on the measure of loss). Regarding instant claim 2, the instant application recites: wherein the target perturbation comprises a target gene knockout perturbation or a target compound perturbation. The prior art to Wolf teaches perturbations: derived by exposing cells to reference compounds (target compound perturbation) known to affect the physiological condition in addition to a control state of unexposed cells (Wolf at claim 4). … physiological condition of interest is characterized by an activation of a set of cellular constituents (e.g., a cellular constituent module) and/or a perturbation signature (e.g., a differential expression profile of a plurality of analytes in response to a perturbation)… [using] any type of analyte (e.g., a gene, a transcript, a carbohydrate, a lipid, an epigenetic feature, a metabolite, a protein, or a combination thereof)…in a respective cellular constituent module [0239-0240]. … a perturbation refers to any exposure of the cell to one or more conditions, such as a treatment by one or more compounds. In some embodiments, a perturbation signature is a change in the expression or abundance level of one or more cellular constituents in the cell induced by a perturbation…. not limited to, gene knockdowns (i.e. gene knockouts), cellular responses to stimuli, tissue growth and regeneration, and/or treatment with or exposure to compounds.
Example perturbagens include, but are not limited to, a small molecule, a biologic, a therapeutic, a protein, a protein combined with a small molecule, an ADC, a nucleic acid, such as an siRNA or interfering RNA, a cDNA over-expressing wild-type and/or mutant shRNA, a cDNA over-expressing wild-type and/or mutant guide RNA (e.g., Cas9 system or other gene editing system) [0335-0336]. Regarding instant claim 3, the instant application recites: wherein generating the phenomic similarity prediction comprises generating a similarity classification from a set of classifications comprising: a pheno-similar classification, a pheno-dissimilar classification, and a pheno-independent classification. The prior art to Wolf teaches an activation score (phenomic similarity prediction) for its corresponding cellular constituent module, for any compound, whether part of the training set or not, by a model 601 capable of reporting out whether its corresponding cellular constituent module is associated with a test compound (similarity classification). If it is, the model outputs a score that indicates that its corresponding cellular constituent module is associated with a test compound… this score is categorical (e.g. a “1” if the corresponding cellular constituent module is associated with a test compound and a “0” if it is not) (a set of classifications)…[or] a probability or likelihood, e.g., on a scale of 0 to 1 where numbers closer to 1 (e.g., 0.85) indicate the likelihood that the corresponding cellular constituent module is associated with a test compound [0565] (a set of classifications comprising: a pheno-similar classification, a pheno-dissimilar classification, and a pheno-independent classification). The instant specific classification categories are generally known by those having skill in the art of statistical analysis/data classification and are not considered a matter of invention given the instant disclosure.
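The examiner's point that mapping a score to labeled categories is routine can be illustrated with a short sketch. The three class names track the instant claim language; the two cutoff values are arbitrary assumptions, not limitations from either document:

```python
def classify_phenomic_similarity(score, similar_cutoff=0.7, dissimilar_cutoff=0.3):
    # Map a similarity score on [0, 1] (cf. Wolf's activation score, where
    # "numbers closer to 1 ... indicate the likelihood" of association) to
    # one of the three classifications recited in instant claim 3.
    # The cutoff values are illustrative assumptions only.
    if score >= similar_cutoff:
        return "pheno-similar"
    if score <= dissimilar_cutoff:
        return "pheno-dissimilar"
    return "pheno-independent"
```

A binary categorical score (Wolf's "1"/"0") is simply the degenerate case in which the two cutoffs coincide.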
Regarding instant claims 4 and 17, the instant application recites: the system …the computing device …The computer-implemented method of claim 1, wherein training the structure-phenomics relationship neural network further comprises modifying the parameters of the structure-phenomics relationship neural network to reduce the phenomic feature space similarity measure of loss on a subsequent training iteration. The prior art to Wolf teaches: system/computing device (FIG 1: processing core 102). In this example, model 601 is an ensemble (trained embedding neural network) of (i) a fully connected network on standard fingerprints of the SMILES strings, where the network architecture is a 3-layer network with ReLU activations and (ii) an MPNN network out of the DGL library. Upon input of the chemical structural information, model 601 provides the activation score of each cellular constituent module 132 that it was trained on [0563]. For training of the model 601, in this example, the SMILES representations of compounds represented in activation data structure 504 of FIG. 6 are transformed into an ECFP4 fingerprint representation (Wolf at claim 26), and additionally a graph representation. Subsequently, two models are trained (training the structure-phenomics relationship trained embedding neural network). That is, the model 601 is an ensemble of two different models (trained embedding neural network) in this example: A) a fully connected neural network architecture is used to train on the ECFP4 representation, B) a message-passing neural network (MPNN) is used to train on graph representations.
The untrained model 601 is trained using, for each respective chemical structure (compound structure representation) of each respective compound in the training set, for each respective cellular constituent module (phenome) in the set of cellular constituent modules, a respective difference between: (i) a respective calculated activation score (phenomic similarity prediction) for the respective cellular constituent module upon input of the fingerprint of the chemical structure of the respective compound into the untrained model and (ii) the respective numerical activation score of the respective cellular constituent module for the respective compound in the set of cellular constituent modules (obtained from the activation data structure 504), where the training adjusts a plurality of parameters associated with the untrained model 601 responsive to the difference, and where the plurality of parameters comprises 100 or more parameters, thereby obtaining a trained model [0562]. Each model 601 provides an activation score for its corresponding cellular constituent module, for any compound, whether part of the training set or not. That is, each model 601 is capable of reporting out whether its corresponding cellular constituent module is associated with a test compound. If it is, the model outputs a score that indicates that its corresponding cellular constituent module is associated with a test compound. In some embodiments, this score is categorical (e.g. a “1” if the corresponding cellular constituent module is associated with a test compound and a “0” if it is not). In some embodiments, this score is a probability or likelihood, e.g., on a scale of 0 to 1 where numbers closer to 1 (e.g., 0.85) indicate the likelihood that the corresponding cellular constituent module is associated with a test compound [0565] (generating a phenomic image similarity by comparing the first machine learning embedding and the second machine learning embedding).
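The parenthetical mapping "generating a phenomic image similarity by comparing the first machine learning embedding and the second machine learning embedding" amounts, in practice, to a vector comparison. A sketch using cosine similarity (the embedding vectors are invented for illustration, and neither the instant claims nor Wolf is tied to this particular metric):

```python
import math

def cosine_similarity(a, b):
    # Compare two embedding vectors; 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings: one phenomic image of a cell exposed to a
# training compound, one of a cell exposed to a training perturbation.
embedding_compound = [0.9, 0.1, 0.4]
embedding_perturbation = [0.8, 0.2, 0.5]

similarity = cosine_similarity(embedding_compound, embedding_perturbation)
```

Any comparable metric (Euclidean distance, Pearson correlation, or Wolf's nonmetric s(x, x′)) would serve the same "comparing" role.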
The respective perturbation signature comprises an identification of a respective plurality of cellular constituents and, for each respective cellular constituent in the respective plurality of cellular constituents, a corresponding significance score that quantifies an association between a change in abundance of the respective cellular constituent (identifying a second machine learning embedding of a second phenomic image of a second cell exposed to a training perturbation) and a change in cell state between a respective first cell state and a respective second cell state. One of the respective first cell state and second cell state is an unperturbed cell state while the other is a respective perturbed cell state caused by exposure of cells to the compound corresponding to the respective perturbation signature [0420]….a measure of differential cellular constituent abundance between an unaltered cell state and an altered cell state. Here, the altered cell state occurs through the cellular transition from the unaltered cell state to the altered cell state. Moreover, at least one of (i) the unaltered cell state, (ii) the altered cell state, and (iii) the transition from the unaltered cell state to the altered cell state is associated with the physiological condition of interest. 
The single-cell transition signature comprises an identification of a reference plurality of cellular constituents and, for each respective cellular constituent in the reference plurality of cellular constituents, a corresponding first significance score that quantifies an association between a change in abundance of the respective cellular constituent and a change in cell state between the unaltered cell state and the altered cell state [0420-0422] (identifying a first machine learning embedding of a first phenomic image of a first cell exposed to a training compound… generating a phenomic image similarity by comparing the first machine learning embedding and the second machine learning embedding). Regarding instant claim 5, the instant application recites: generating the phenomic image feature space similarity by: generating a difference metric between the first phenomic embedding and the second phenomic embedding; and applying a pheno-similarity threshold to the difference metric to generate a pheno-similarity classification between the training compound and the training perturbation. The prior art to Wolf teaches wherein a first output of the corresponding fully connected neural network and a second output of the corresponding message passing neural network are combined, responsive to inputting the fingerprint of the chemical structure into the corresponding fully connected neural network and the corresponding message passing neural network, to determine an activation score in the one or more calculated activation scores (phenomic similarity prediction) for the corresponding cellular constituent module in the set of cellular constituent modules [Wolf in claim 11] (generating the phenomic image similarity). a measure of differential cellular constituent abundance between an unaltered cell state and an altered cell state. Here, the altered cell state occurs through the cellular transition from the unaltered cell state to the altered cell state.
Moreover, at least one of (i) the unaltered cell state, (ii) the altered cell state, and (iii) the transition from the unaltered cell state to the altered cell state is associated with the physiological condition of interest. The single-cell transition signature comprises an identification of a reference plurality of cellular constituents and, for each respective cellular constituent in the plurality of reference cellular constituents, a corresponding first significance score that quantifies an association between a change in abundance of the respective cellular constituent and a change in cell state between the unaltered cell state and the altered cell state [0420-0422] (generating a difference metric between the first machine learning embedding and the second machine learning embedding). the model is an unsupervised clustering model. In some embodiments, the model is a supervised clustering model… The clustering problem can be described as one of finding natural groupings in a dataset. To identify natural groupings, two issues can be addressed. First, a way to measure similarity (or dissimilarity) between two samples can be determined. This metric (e.g., similarity measure) can be used to ensure that the samples in one cluster are more like one another than they are to samples in other clusters. Second, a mechanism for partitioning the data into clusters using the similarity measure can be determined. One way to begin a clustering investigation can be to define a distance function and to compute the matrix of distances between all pairs of samples in the training set. If distance is a good measure of similarity, then the distance between reference entities in the same cluster can be significantly less than the distance between the reference entities in different clusters (difference metric). However, clustering need not use a distance metric. For example, a nonmetric similarity function s(x, x′) can be used to compare two vectors x and x′.
s(x, x′) can be a symmetric function whose value is large when x and x′ are somehow “similar.” Once a method for measuring “similarity” or “dissimilarity” between points in a dataset has been selected, clustering can use a criterion function that measures the clustering quality of any partition of the data. Partitions of the data set that extremize the criterion function (difference metric) can be used to cluster the data [0204] (generating a difference metric between the first machine learning embedding and the second machine learning embedding; and applying a pheno-similarity threshold to the difference metric to generate a pheno-similarity classification between the training compound and the training perturbation) … wherein the correlation model includes a graph clustering, wherein the graph clustering is Leiden clustering on a Pearson-correlation-based distance metric or a Louvain clustering (Wolf at claims 20-22) (difference metric). the set of cellular constituent modules is a plurality of cellular constituent modules, a first subset of the plurality of cellular constituent modules, including the first cellular constituent module, is associated with the physiological condition of interest (training perturbation), a second subset of the plurality of cellular constituent modules is not associated with the physiological condition of interest (not with training perturbation), and the test chemical compound (training compound) is identified with the physiological condition of interest when the respective calculated activation score (phenomic similarity prediction) for the first cellular constituent module satisfies the first threshold criterion and the respective calculated activation score for a cellular constituent module in the second subset of the plurality of cellular constituent modules satisfies a second threshold criterion…(Wolf at claims 1, 13, 19, and 20) (applying a pheno-similarity threshold to the difference metric to generate a pheno-similarity
classification between the training compound and the training perturbation). Regarding instant claim 6, instant application recites: determining the pheno-similarity threshold from a distribution of difference metrics between the training perturbation and a plurality of additional perturbations. The prior art to Wolf teaches: the set of cellular constituent modules is a plurality of cellular constituent modules, a first subset of the plurality of cellular constituent modules, including the first cellular constituent module, is associated with the physiological condition of interest (training perturbation), a second subset of the plurality of cellular constituent modules is not associated with the physiological condition of interest (not with training perturbation), and the test chemical compound (training compound) is identified with the physiological condition of interest when the respective calculated activation score (phenomic similarity prediction) for the first cellular constituent module satisfies the first threshold criterion and the respective calculated activation score for a cellular constituent module in the second subset of the plurality of cellular constituent modules satisfies a second threshold criterion…(Wolf at claims 1, 13, 19, and 20) (training perturbation and a plurality of additional perturbations). [For example]: the trained model 601 for Module 78 was used to predict cellular constituent activation scores for cellular constituent module 78 (Module 78) for a random subset of 200,000 compounds sampled from five million compounds in a public database. From this, the top 50 compounds predicted to highly activate cellular constituent module 78 were selected and compared to a set of compounds in a database including compounds from the LINCS L1000 dataset and Synthetic Hit analogs derived from the chemical structure of a known compound, referred to herein as the Known Piperidine-Containing Compound (“KPCC”). 
The distribution of this comparison is illustrated in FIG. 10E. At the tail end of the distribution, the predictions obtained for the trained model 601 for cellular constituent module 78 identified a compound that significantly exceeded all compounds in LINCS and the Synthetic Hits. This approach highlights a method for optimizing chemical structure against specific desired cellular processes [0570]. Regarding instant claim 7, the instant application recites: wherein training the structure-phenomics relationship neural network machine learning model further comprises generating, utilizing the structure-phenomics relationship neural network machine learning model, an additional predicted phenomic feature space similarity from the training compound structure feature representation and an additional training perturbation. The prior art to Wolf teaches [for example]: the trained model 601 for Module 78 was used to predict cellular constituent activation scores for cellular constituent module 78 (Module 78) for a random subset of 200,000 compounds sampled from five million compounds in a public database… the top 50 compounds predicted to highly activate cellular constituent module 78 were selected and compared to a set of compounds in a database…derived from the chemical structure of a known compound (compound structure feature representation)… The distribution of this comparison [FIG. 10E], with the tail end of the distribution showing the predictions obtained for the trained model 601 for cellular constituent module 78, identified a compound that significantly exceeded all compounds in LINCS and the Synthetic Hits. This approach highlights a method for optimizing chemical structure against specific desired cellular processes (phenomic) [0570] (generating a training compound structure feature representation for the training compound).
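Claim 6's determination of the pheno-similarity threshold "from a distribution of difference metrics" parallels the distribution comparison Wolf illustrates in FIG. 10E, and in practice resembles a percentile cutoff on a null distribution. A sketch under stated assumptions (the Gaussian null, the 5% cutoff, and all function names are inventions for illustration):

```python
import random

random.seed(42)

# Hypothetical difference metrics between a training perturbation and a
# plurality of additional perturbations (an empirical null distribution).
null_distances = [random.gauss(1.0, 0.2) for _ in range(1000)]

def percentile(values, q):
    # Simple nearest-rank percentile by sorting; q is in [0, 100].
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(round(q / 100.0 * (len(ordered) - 1))))
    return ordered[idx]

# Treat the 5th percentile of the null distances as the cutoff below
# which a pair is called pheno-similar (the 5% choice is arbitrary).
threshold = percentile(null_distances, 5)

def is_pheno_similar(distance):
    return distance < threshold
```

A pair whose difference metric falls well inside the null distribution is not called similar; only an unusually small distance clears the threshold.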
the set of cellular constituent modules is a plurality of cellular constituent modules, a first subset of the plurality of cellular constituent modules, including the first cellular constituent module, is associated with the physiological condition of interest (training perturbation), a second subset of the plurality of cellular constituent modules is not associated with the physiological condition of interest (not with training perturbation), and the test chemical compound (training compound) is identified with the physiological condition of interest when the respective calculated activation score (phenomic similarity prediction) …(Wolf at claims 1, 13, and 20) (generating, utilizing the structure-phenomics relationship trained embedding neural network, a predicted phenomic similarity from the training compound structure feature representation and the training perturbation). Regarding instant claim 8, instant application recites: wherein training the structure-phenomics relationship neural network machine learning model further comprises: modifying parameters of a first task head of the structure-phenomics relationship neural network based on the predicted phenomic feature space similarity corresponding to the training perturbation; and modifying parameters of a second task head of the structure-phenomics relationship neural network based on the additional predicted phenomic feature space similarity corresponding to the additional training perturbation. The prior art to Wolf teaches a machine learning loss function based on differences of probability distributions. 
training a candidate cellular constituent model using, for each respective covariate in the plurality of covariates, a difference between (i) a calculated activation against each cellular constituent module represented by the candidate cellular constituent model upon input of a fingerprint of the covariate into the candidate cellular constituent model and (ii) actual activation against each cellular constituent module represented by the candidate cellular constituent model, wherein the training adjusts a plurality of covariate parameters associated with the candidate cellular constituent model responsive to the difference (Wolf at claim 14) (comparing the predicted phenomic similarity with the phenomic image similarity to determine a measure of loss). the plurality of covariates comprises cell batch, cell donor, cell type, disease status, exposure to a chemical compound…the training of the candidate cellular constituent model is performed using a categorical cross-entropy loss in a multi-task formulation, in which each covariate in the plurality of covariates corresponds to a cost function in a plurality of cost functions and each respective cost function in the plurality of cost functions has a common weighting factor [0038-0039] (modifying parameters of the structure-phenomics relationship trained embedding neural network based on the measure of loss). Regarding instant claims 9, 15, and 20, the instant application recites: receiving the query chemical compound and the target perturbation via a user interface of a client device; and providing the phenomic similarity prediction for display via the user interface of the client device.
The prior art to Wolf teaches wherein: neural network ensemble output is responsive to inputting the fingerprint of the chemical structure into the corresponding fully connected neural network and the corresponding message passing neural network, to determine an activation score (phenomic similarity prediction) in the one or more calculated activation scores for the corresponding cellular constituent module in the set of cellular constituent modules [Wolf at claim 11]. The system 100 in some embodiments includes one or more processing units (CPU(s)) 102 (e.g., a processor, a processing core, etc.), one or more network interfaces 104, a user interface 106 including (optionally) a display 108 and an input system 105 (e.g., an input/output interface, a keyboard, a mouse, etc.) for use by the user, memory (e.g., non-persistent memory 107, persistent memory 109), and one or more communication buses 103 for interconnecting the aforementioned components [0212-0214, FIG 1]. Regarding instant claim 11, the instant application recites: receive the target perturbation by receiving a target gene knockout perturbation or a target compound perturbation; and generate the phenomic similarity prediction by generating a similarity classification from a set of classifications comprising: a pheno-similar classification, a pheno-dissimilar classification, and a pheno-independent classification. The instant specific use of classification categories is generally known by those having skill in the art of statistical analysis/data classification and is not considered a matter of invention given the instant disclosure. The prior art to Wolf teaches: derived by exposing cells to reference compounds (target compound perturbation) known to affect the physiological condition in addition to a control state of unexposed cells (Wolf at claim 4). … a perturbation refers to any exposure of the cell to one or more conditions, such as a treatment by one or more compounds.
In some embodiments, a perturbation signature is a change in the expression or abundance level of one or more cellular constituents in the cell induced by a perturbation…. not limited to, gene knockdowns (i.e. gene knockouts), cellular responses to stimuli, tissue growth and regeneration, and/or treatment with or exposure to compounds. Example perturbagens include, but are not limited to, a small molecule, a biologic, a therapeutic, a protein, a protein combined with a small molecule, an ADC, a nucleic acid, such as an siRNA or interfering RNA, a cDNA over-expressing wild-type and/or mutant shRNA, a cDNA over-expressing wild-type and/or mutant guide RNA (e.g., Cas9 system or other gene editing system) [0335-0336]. …an activation score (phenomic similarity prediction) for its corresponding cellular constituent module, for any compound, whether part of the training set or not, by a model 601 capable of reporting out whether its corresponding cellular constituent module is associated with a test compound (similarity classification). If it is, the model outputs a score that indicates that its corresponding cellular constituent module is associated with a test compound… this score is categorical (e.g. a “1” if the corresponding cellular constituent module is associated with a test compound and a “0” if it is not) (a set of classifications)…[or] a probability or likelihood, e.g., on a scale of 0 to 1 where numbers closer to 1 (e.g., 0.85) indicate the likelihood that the corresponding cellular constituent module is associated with a test compound [0565] (a set of classifications comprising: a pheno-similar classification, a pheno-dissimilar classification, and a pheno-independent classification).
Regarding instant claim 12, instant application recites: wherein the at least one non-transitory computer-readable storage medium stores additional instructions that, when executed by the at least one processor, cause the system to train the structure-phenomics relationship neural network by: modifying parameters of a first task head of the structure-phenomics relationship neural network based on the predicted phenomic feature space similarity corresponding to the training perturbation; and modifying parameters of a second task head of the structure-phenomics relationship neural network based on an additional predicted phenomic feature space similarity corresponding to an additional training perturbation.

The prior art to Wolf teaches: system/computing device (FIG 1: processing core 102). Activation scores corresponding to each respective cellular constituent module for each respective compound serve as labels (e.g., numerical activation scores indicating an actual presence or absence of association between modules and compounds) for training a multi-task model (task head of the structure-phenomics relationship neural network) to identify associations (e.g., weights and/or correlations) between modules and compounds [0382]…In an example embodiment, the training the model is performed using a categorical cross-entropy loss in a multi-task formulation, in which each covariate in the plurality of covariates corresponds to a cost function in the plurality of cost functions and each respective cost function in the plurality of cost functions has a common weighting factor [0400]. Activation scores corresponding to each respective perturbation signature for each respective compound serve as labels (e.g., numerical activation scores indicating an actual presence or absence of association between perturbation signatures and compounds) for training a multi-task model to identify associations (e.g., weights and/or correlations) between perturbation signatures and compounds.
For instance, as described above, in some embodiments, a first subset of the plurality of perturbation signatures is associated with the physiological condition of interest, and a second subset of the plurality of perturbation signatures is not associated with the physiological condition of interest. Thus, in some such embodiments, an actual presence of association can be included in the training dataset using the first subset of the plurality of perturbation signatures as labels, and an actual absence of association can be included in the training dataset using the second subset of the plurality of perturbation signatures as labels. [0444] (additional predicted phenomic feature space similarity corresponding to an additional training perturbation). In this example, model 601 is an ensemble (trained embedding neural network) of (i) a fully connected network on standard fingerprints of the SMILES strings, where the network architecture is a 3-layer network with ReLU activations and (ii) an MPNN network out of the DGL library. Upon input of the chemical structural information, model 601 provides the activation score of each cellular constituent module 132 that it was trained on [0563]. For training of the model 601, in this example, the SMILES representations of compounds represented in activation data structure 504 of FIG. 6 are transformed into an ECFP4 fingerprint representation (Wolf at claim 26), and additionally a graph representation. Subsequently, two models are trained (training the structure-phenomics relationship trained embedding neural network). That is, the model 601 is an ensemble of two different models (trained embedding neural network) in this example: A) a fully connected neural network architecture is used to train on ECFP4 representation, B) a message-passing neural network (MPNN) is used to train on graph representations.
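The ensembling step attributed to model 601 above (a fully connected network over ECFP4 fingerprints plus an MPNN over graph representations) reduces to combining two members' scores. A minimal sketch, in which both scoring functions are hypothetical stubs standing in for trained networks; none of these names or formulas come from Wolf:

```python
# Toy stand-ins for the two ensemble members described for model 601.
# Both functions are hypothetical stubs, not trained networks.

def fcn_score(fingerprint):
    # Stub for the fully connected network over an ECFP4-style bit
    # vector: here, simply the fraction of set bits.
    return sum(fingerprint) / len(fingerprint)

def mpnn_score(graph_edges, num_atoms):
    # Stub for the message-passing network over a molecular graph:
    # here, edge density as a placeholder "activation".
    return len(graph_edges) / num_atoms

def ensemble_activation(fingerprint, graph_edges, num_atoms):
    # Ensemble output: average the two members' scores.
    return 0.5 * (fcn_score(fingerprint) + mpnn_score(graph_edges, num_atoms))

fp = [1, 0, 1, 1]          # 4-bit toy fingerprint
edges = [(0, 1), (1, 2)]   # toy molecular graph with 3 atoms
print(round(ensemble_activation(fp, edges, 3), 4))  # prints 0.7083
```

In practice the two members would be trained jointly on the same activation labels, as Wolf's [0562]-[0563] example describes; the averaging here is only the simplest way to combine them.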
The untrained model 601 is trained using, for each respective chemical structure (compound structure representation) of each respective compound in the training set, for each respective cellular constituent module (phenome) in the set of cellular constituent modules, a respective difference between: (i) a respective calculated activation score (phenomic similarity prediction) for the respective cellular constituent module upon input of the fingerprint of the chemical structure of the respective compound into the untrained model and (ii) the respective numerical activation score of the respective cellular constituent module for the respective compound in the set of cellular constituent modules (obtained from the activation data structure 504), where the training adjusts a plurality of parameters associated with the untrained model 601 responsive to the difference, and where the plurality of parameters comprises 100 or more parameters, thereby obtaining a trained model [0562]. Regarding instant claims 13 and 18, instant application recites: cause the system to train the structure-phenomics relationship neural network by: generating a difference metric between the first phenomic embedding and the second phenomic embedding; and applying a pheno-similarity threshold to the difference metric to generate a pheno-similarity classification between the training compound and the training perturbation, wherein the pheno-similarity threshold is determined from a distribution of difference metrics between the training perturbation and a plurality of additional perturbations. The prior art to Wolf teaches: a measure of differential cellular constituent abundance between an unaltered cell state and an altered cell state. Here, the altered cell state occurs through the cellular transition from the unaltered cell state to the altered cell state. 
Moreover, at least one of (i) the unaltered cell state, (ii) the altered cell state, and (iii) the transition from the unaltered cell state to the altered cell state is associated with the physiological condition of interest. The single-cell transition signature comprises an identification of a reference plurality of cellular constituents and, for each respective cellular constituent in the reference plurality of cellular constituents, a corresponding first significance score that quantifies an association between a change in abundance of the respective cellular constituent and a change in cell state between the unaltered cell state and the altered cell state [0420-0422] (generating a difference metric between the first phenomic embedding and the second phenomic embedding). The set of cellular constituent modules is a plurality of cellular constituent modules, a first subset of the plurality of cellular constituent modules, including the first cellular constituent module, is associated with the physiological condition of interest (training perturbation), a second subset of the plurality of cellular constituent modules is not associated with the physiological condition of interest (not with training perturbation), and the test chemical compound (training compound) is identified with the physiological condition of interest when the respective calculated activation score (phenomic similarity prediction) for the first cellular constituent module satisfies the first threshold criterion and the respective calculated activation score for a cellular constituent module in the second subset of the plurality of cellular constituent modules satisfies a second threshold criterion…(Wolf at claims 1, 13, and 20) (training perturbation and a plurality of additional perturbations).
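The threshold logic mapped to instant claims 13 and 18 above (a pheno-similarity threshold drawn from a distribution of difference metrics between the training perturbation and additional perturbations) can be sketched as follows. The Euclidean metric and the nearest-rank 20th percentile are illustrative assumptions; neither claim set fixes a particular metric or percentile:

```python
import math

# Illustrative sketch of the claims-13/18 threshold logic. The metric
# and percentile rule are assumptions for illustration only.

def difference_metric(a, b):
    # Euclidean distance between two embedding vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def threshold_from_distribution(distances, pct=20):
    # Nearest-rank percentile of the observed distance distribution,
    # used as the pheno-similarity cutoff.
    ranked = sorted(distances)
    k = max(0, math.ceil(pct / 100 * len(ranked)) - 1)
    return ranked[k]

target = [0.0, 0.0]                          # training perturbation embedding
others = [[0.1, 0.0], [1.0, 1.0], [0.2, 0.2],
          [3.0, 0.0], [0.0, 0.5]]            # additional perturbations
cutoff = threshold_from_distribution(
    [difference_metric(target, p) for p in others])

compound = [0.05, 0.05]                      # training compound embedding
pheno_similar = difference_metric(compound, target) <= cutoff
print(pheno_similar)
```

The point of drawing the cutoff from a distribution, rather than fixing it globally, is that it adapts the notion of "similar" to how tightly the additional perturbations cluster around the target.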
[For example]: the trained model 601 for Module 78 was used to predict cellular constituent activation scores for cellular constituent module 78 (Module 78) for a random subset of 200,000 compounds…the top 50 compounds predicted to highly activate cellular constituent module 78 were selected and compared to a set of compounds in a database including compounds …derived from the chemical structure of a known compound…The distribution of this comparison is illustrated in FIG. 10E. At the tail end of the distribution, the predictions obtained for the trained model 601 for cellular constituent module 78 identified a compound that significantly exceeded all compounds in LINCS and the Synthetic Hits. This approach highlights a method for optimizing chemical structure against specific desired cellular processes [0570].

Regarding instant claims 14 and 19, instant application recites: cause the system to train the structure-phenomics relationship neural network by modifying the parameters of the structure-phenomics relationship neural network to reduce a difference between the predicted phenomic feature space similarity and the phenomic image feature space similarity on a subsequent training iteration based on the phenomic feature space similarity measure of loss.

The prior art to Wolf teaches a machine learning loss function based on compound structures and differences of probability distributions: …constituent model and (ii) actual activation against each cellular constituent module represented by the candidate cellular constituent model, wherein the training adjusts a plurality of covariate parameters associated with the candidate cellular constituent model responsive to the difference (Wolf at claim 14) (comparing the predicted phenomic similarity with the phenomic image similarity to determine a measure of loss).
the plurality of covariates comprises cell batch, cell donor, cell type, disease status, exposure to a chemical compound…the training the candidate cellular constituent model is performed using a categorical cross-entropy loss in a multi-task formulation, in which each covariate in the plurality of covariates corresponds to a cost function in the plurality of cost functions and each respective cost function in the plurality of cost functions has a common weighting factor [0038-0039] (modifying parameters of the structure-phenomics relationship trained embedding neural network based on the measure of loss).

Response to Remarks – 102

Applicant's remarks (pp. 17-18), filed 07/09/2025, have been fully considered in light of the previously stated rejection of record. The Applicant argued, for instance: while Wolf describes outputting a calculated activation score, Wolf does not appear to describe comparing phenomic embeddings within a latent feature space to generate a phenomic image feature space similarity. Moreover, Wolf does not appear to describe generating the phenomic embeddings from phenomic images of cells exposed to a training compound or training perturbation. While Wolf discusses cellular constituent modules arranged in a latent representation (see Wolf at Figure 14B, paragraph [0206]) and latent representations of physiologically relevant chemical structures (see Wolf at paragraph [0150]), these latent representations are not phenomic embeddings. More particularly, Wolf does not teach utilizing a trained embedding neural network to generate a phenomic embedding from a phenomic image of a cell. Additionally, Wolf does not teach comparing phenomic embeddings within a latent feature space of the trained embedding neural network to generate a phenomic image feature space similarity, as more particularly recited by currently amended independent claims 1, 10, and 16.
The Applicant asserted, regarding currently amended independent claims 1, 10, and 16, that the prior art to Wolf did not teach “comparing phenomic embeddings within a latent feature space to generate a phenomic image feature space similarity”. However, Wolf discloses using a first plurality of cells with annotated cell states (first embedding) corresponding to vectors which are arranged in a latent representation dimensioned (latent space) by candidate cellular constituents, and then a second plurality of cells representing a plurality of covariates (second embedding) informative of a target physiological condition (phenomic embeddings) entered into an activation data structure. Combining the cellular constituent count data structure and the latent representation (latent space) using the cellular constituents or the representation as a common dimension (phenomic image feature space similarity), to determine activation weightings. A candidate cellular constituent model (structure-phenomics relationship neural network) is trained using a difference between (i) a prediction of an absence or presence of each covariate in the plurality of covariates in each cellular constituent module represented in the activation data structure upon input of the activation data structure into the candidate model and (ii) actual absence or presence of each covariate in each cellular constituent module (similarity measure of loss/difference metric). This training adjusts a plurality of covariate parameters associated with the candidate cellular constituent model responsive to the difference (modifying parameters of the structure-phenomics relationship neural network). [0082, 0192: weighting factors/thresholds].

Moreover, Wolf also discloses generating the phenomic embeddings from phenomic images of cells exposed to a training compound or training perturbation as follows: In some embodiments, the physiological condition of interest is a phenotype.
For instance, in some embodiments, the physiological condition of interest is a physiological manifestation of a compound, a small molecule, and/or a therapeutic, such as toxicity and/or resolution of a disease. In some embodiments, the physiological condition is a phenotype measured using experimental data including, but not limited to, flow cytometry readouts, imaging and microscopy annotations (e.g., H&E slides, IHC slides, radiology images, and/or other medical imaging), and/or cellular constituent data [0237 and 0241: perturbation signatures]. Finally, Wolf teaches an ensemble of two different models (trained embedding neural network) in this example: A) a fully connected neural network architecture is used to train on ECFP4 representation, B) a message-passing neural network (MPNN) is used to train on graph representations [Wolf claims 10-12 and 14]. The untrained model 601 is trained using, for each respective chemical structure (compound structure representation) of each respective compound in the training set, for each respective cellular constituent module (phenome) in the set of cellular constituent modules, a respective difference between: (i) a respective calculated activation score (phenomic similarity prediction) for the respective cellular constituent module upon input of the fingerprint of the chemical structure of the respective compound into the untrained model and (ii) the respective numerical activation score of the respective cellular constituent module for the respective compound in the set of cellular constituent modules (obtained from the activation data structure 504), where the training adjusts a plurality of parameters associated with the untrained model 601 responsive to the difference, and where the plurality of parameters comprises 100 or more parameters, thereby obtaining a trained model [0562; Wolf claims 1, 9-12 and 14].
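The training step quoted from Wolf [0562], adjusting parameters responsive to the difference between a calculated activation score and the actual (label) activation score, is in effect gradient descent on that difference. A one-parameter toy sketch under that reading (the linear model and learning rate are assumptions; Wolf's model comprises 100 or more parameters):

```python
# Toy sketch of the [0562] training step: adjust parameters responsive
# to the difference between (i) the calculated activation score and
# (ii) the actual activation score. The one-weight linear model and
# learning rate are illustrative assumptions only.

def train(pairs, weight=0.0, lr=0.1, epochs=200):
    """pairs: (fingerprint_feature, actual_activation_score) tuples."""
    for _ in range(epochs):
        for feature, actual in pairs:
            calculated = weight * feature        # (i) calculated score
            difference = calculated - actual     # (i) minus (ii)
            weight -= lr * difference * feature  # adjust responsive to it
    return weight

# Toy labels generated by a "true" weight of 2.0.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train(data)
print(round(w, 3))  # prints 2.0
```

The recovered weight converges to the value that zeroes the difference on every training pair, which is the defining property of the training Wolf describes, whatever the model's actual parameter count.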
Therefore, Applicant’s assertions were not persuasive, as set forth in the above rejection, with additions necessitated by claim amendments.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969). A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13. The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.

Note: references to the instant application are italicized in the following Double Patenting section.

A. Instant independent claims 1, 10 and 16 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of Recursion et al. US Patent Application 19/331,878.
The reference claims of ‘878 are obvious variants (generating, utilizing a perturbation machine learning model, a plurality of compound perturbation concentration embeddings from functional response data measured from applying a query compound to a plurality of cells; generating, utilizing the perturbation machine learning model, a target perturbation embedding from functional response data measured from applying a target perturbation to a cell; comparing the plurality of compound perturbation concentration embeddings to the target perturbation embedding to generate a plurality of concentration-target embedding similarity measures; generating, from the plurality of concentration-target embedding similarity measures, a potency metric for the query compound) of the claims of the instant application. This is a provisional nonstatutory double patenting rejection.

B. Instant independent claims 1, 10 and 16 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of Recursion et al. US Patent Application 19/351,052.
The reference claims of ‘052 are obvious variants (inputting the compound-protein machine learning representation comprising the plurality of binding scores between the target compound and the plurality of proteins into the target machine learning model; or generating a biological perturbation program prediction by inputting the compound-protein machine learning representation comprising the plurality of binding scores between the target compound and the plurality of proteins into the target machine learning model; selecting the plurality of proteins as a subset of proteins from a larger set of proteins based on biological similarity; generating, utilizing the compound-protein interaction machine learning model, the plurality of binding scores between the target compound and the subset of proteins from the larger set of proteins selected based on the biological similarity; and generating the compound-protein machine learning representation comprising the plurality of binding scores between the target compound and the subset of proteins selected based on the biological similarity) of the claims of the instant application. This is a provisional nonstatutory double patenting rejection.

C. Instant independent claims 1, 10 and 16 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of Recursion et al. US Patent Application 18/990,261.
The reference claims of ‘261 are obvious variants (perturbation pairs: a first perturbation and a second perturbation; generating a tuned map prediction machine learning model by comparing a similarity prediction for the perturbation pair; comparing the similarity prediction with the ground truth similarity to determine a measure of loss; and adjusting parameters of the map prediction machine learning model to reduce the measure of loss for a subsequent tuning iteration; and generating an additional similarity prediction for an unobserved perturbation pair utilizing the tuned map prediction machine learning model) of the claims of the instant application. This is a provisional nonstatutory double patenting rejection.

Conclusion

No claims are allowed. A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

E-mail Communications Authorization

Per updated USPTO Internet usage policies, Applicant and/or applicant’s representative is encouraged to authorize the USPTO examiner to discuss any subject matter concerning the above application via Internet e-mail communications. See MPEP 502.03. To approve such communications, Applicant must provide written authorization for e-mail communication by submitting the following form via EFS-Web or Central Fax (571-273-8300): PTO/SB/439.
Applicant is encouraged to do so as early in prosecution as possible, so as to facilitate communication during examination. Written authorizations submitted to the Examiner via e-mail are NOT proper. Written authorizations must be submitted via EFS-Web or Central Fax (571-273-8300). A paper copy of e-mail correspondence will be placed in the patent application when appropriate. E-mails from the USPTO are for the sole use of the intended recipient, and may contain information subject to the confidentiality requirement set forth in 35 USC § 122. See also MPEP 502.03.

Inquiries

Papers related to this application may be submitted to Technology Center 1600 by facsimile transmission. Papers should be faxed to Technology Center 1600 via the PTO Fax Center. The faxing of such papers must conform to the notices published in the Official Gazette, 1096 OG 30 (November 15, 1988), 1156 OG 61 (November 16, 1993), and 1157 OG 94 (December 28, 1993) (See 37 CFR § 1.6(d)). The Central Fax Center Number is (571) 273-8300. Any inquiry concerning this communication or earlier communications from the examiner should be directed to Vy Rossi, whose telephone number is (703) 756-4649. The examiner can normally be reached on Monday-Friday from 8:30AM to 5:30PM ET. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Olivia Wise, can be reached at (571) 272-2249. Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to (571) 272-0547. Patent applicants with problems or questions regarding electronic images that can be viewed in the Patent Application Information Retrieval system (PAIR) can now contact the USPTO’s Patent Electronic Business Center (Patent EBC) for assistance. Representatives are available to answer your questions daily from 6 am to midnight (EST). The toll-free number is (866) 217-9197.
When calling, please have your application serial or patent number, the type of document you are having an image problem with, the number of pages, and the specific nature of the problem. The Patent Electronic Business Center will notify applicants of the resolution of the problem within 5-7 business days. Applicants can also check PAIR to confirm that the problem has been corrected. The USPTO’s Patent Electronic Business Center is a complete service center supporting all patent business on the Internet. The USPTO’s PAIR system provides Internet-based access to patent application status and history information. It also enables applicants to view the scanned images of their own application file folder(s) as well as general patent information available to the public.

/VR/
Examiner, Art Unit 1685

/MARY K ZEMAN/
Primary Examiner, Art Unit 1686

Prosecution Timeline

Jun 25, 2024 — Application Filed
Nov 14, 2024 — Non-Final Rejection (§101, §102, §112)
Feb 04, 2025 — Interview Requested
Feb 13, 2025 — Examiner Interview Summary
Feb 20, 2025 — Response Filed
Apr 04, 2025 — Final Rejection (§101, §102, §112)
Jun 13, 2025 — Interview Requested
Jun 24, 2025 — Examiner Interview Summary
Jul 09, 2025 — Request for Continued Examination
Jul 15, 2025 — Response after Non-Final Action
Feb 02, 2026 — Non-Final Rejection (§101, §102, §112)
Apr 10, 2026 — Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12507960 — USING BIOMARKER INFORMATION FOR HEART FAILURE RISK COMPUTATION (granted Dec 30, 2025; 2y 5m to grant)
Patent 12508077 — Method and System for Simulating Surgical Procedures (granted Dec 30, 2025; 2y 5m to grant)
Patent 12482539 — Robustness of Hydrolases by Combining High-pressure Molecular Dynamics Simulation and Free Energy Calculation (granted Nov 25, 2025; 2y 5m to grant)
Patent 12462941 — PAN-CANCER TUMOR MICROENVIRONMENT CLASSIFICATION BASED ON IMMUNE ESCAPE MECHANISMS AND IMMUNE INFILTRATION (granted Nov 04, 2025; 2y 5m to grant)
Patent 12440169 — DETERMINING PROSPECTIVE RISK OF HEART FAILURE HOSPITALIZATION (granted Oct 14, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 33%
With Interview: 80% (+46.6%)
Median Time to Grant: 4y 7m
PTA Risk: High
Based on 39 resolved cases by this examiner. Grant probability derived from career allow rate.
