Office Action Analysis: 17792521 — APPLICATION OF PATHOGENICITY MODEL AND TRAINING THEREOF

Office Action

§101 §102 §103 §112
DETAILED ACTION

CLAIM STATUS
Claims 1-41 are rejected.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
This application is a 371 of the PCT application PCT/GB2021/050086. The instant application claims foreign priority to United Kingdom Patent Applications 2000649.0, 2013386.4, and 2013387.2, filed 01/16/2020, 08/26/2020, and 08/26/2020, respectively. Foreign priority is acknowledged. As such, the effective filing date of claims 1-41 is 01/16/2020.

Information Disclosure Statement
The Information Disclosure Statements filed on 1/31/2024 and 09/22/25 are in compliance with the provisions of 37 CFR 1.97 and have been considered in full. A signed copy of the list of references cited from each IDS is included with this Office Action.
Nucleotide and/or Amino Acid Sequence Disclosures
REQUIREMENTS FOR PATENT APPLICATIONS CONTAINING NUCLEOTIDE AND/OR AMINO ACID SEQUENCE DISCLOSURES

Items 1) and 2) provide general guidance related to requirements for sequence disclosures.
1) 37 CFR 1.821(c) requires that patent applications which contain disclosures of nucleotide and/or amino acid sequences that fall within the definitions of 37 CFR 1.821(a) must contain a "Sequence Listing," as a separate part of the disclosure, which presents the nucleotide and/or amino acid sequences and associated information using the symbols and format in accordance with the requirements of 37 CFR 1.821 - 1.825. This "Sequence Listing" part of the disclosure may be submitted:
a) In accordance with 37 CFR 1.821(c)(1) via the USPTO patent electronic filing system (see Section I.1 of the Legal Framework for Patent Electronic System (https://www.uspto.gov/PatentLegalFramework), hereinafter "Legal Framework") as an ASCII text file, together with an incorporation-by-reference of the material in the ASCII text file in a separate paragraph of the specification as required by 37 CFR 1.823(b)(1) identifying:
i) the name of the ASCII text file;
ii) the date of creation; and
iii) the size of the ASCII text file in bytes;
b) In accordance with 37 CFR 1.821(c)(1) on read-only optical disc(s) as permitted by 37 CFR 1.52(e)(1)(ii), labeled according to 37 CFR 1.52(e)(5), with an incorporation-by-reference of the material in the ASCII text file according to 37 CFR 1.52(e)(8) and 37 CFR 1.823(b)(1) in a separate paragraph of the specification identifying:
i) the name of the ASCII text file;
ii) the date of creation; and
iii) the size of the ASCII text file in bytes;
c) In accordance with 37 CFR 1.821(c)(2) via the USPTO patent electronic filing system as a PDF file (not recommended); or
d) In accordance with 37 CFR 1.821(c)(3) on physical sheets of paper (not recommended).
2) When a “Sequence Listing” has been submitted as a PDF file as in 1(c) above (37 CFR 1.821(c)(2)) or on physical sheets of paper as in 1(d) above (37 CFR 1.821(c)(3)), 37 CFR 1.821(e)(1) requires a computer readable form (CRF) of the “Sequence Listing” in accordance with the requirements of 37 CFR 1.824.
a) If the "Sequence Listing" required by 37 CFR 1.821(c) is filed via the USPTO patent electronic filing system as a PDF, then 37 CFR 1.821(e)(1)(ii) or 1.821(e)(2)(ii) requires submission of a statement that the "Sequence Listing" content of the PDF copy and the CRF copy (the ASCII text file copy) are identical.
b) If the "Sequence Listing" required by 37 CFR 1.821(c) is filed on paper or read-only optical disc, then 37 CFR 1.821(e)(1)(ii) or 1.821(e)(2)(ii) requires submission of a statement that the "Sequence Listing" content of the paper or read-only optical disc copy and the CRF are identical.
Specific deficiencies and the required response to this Office Action are as follows:
Specific deficiency - This application fails to comply with the requirements of 37 CFR 1.821 - 1.825 because it does not contain a "Sequence Listing" as a separate part of the disclosure or a CRF of the “Sequence Listing.”
Required response - Applicant must provide:
A "Sequence Listing" part of the disclosure; together with 
An amendment specifically directing its entry into the application in accordance with 37 CFR 1.825(a)(2);
A statement that the "Sequence Listing" includes no new matter as required by 37 CFR 1.821(a)(4); and
A statement that indicates support for the amendment in the application, as filed, as required by 37 CFR 1.825(a)(3).
If the "Sequence Listing" part of the disclosure is submitted according to item 1) a) or b) above, Applicant must also provide:
A substitute specification in compliance with 37 CFR 1.52, 1.121(b)(3) and 1.125 inserting the required incorporation-by-reference paragraph, consisting of:
A copy of the previously-submitted specification, with deletions shown with strikethrough or brackets and insertions shown with underlining (marked-up version);
A copy of the amended specification without markings (clean version); and
A statement that the substitute specification contains no new matter.
If the "Sequence Listing" part of the disclosure is submitted according to item 1) c) or d) above, applicant must also provide:
A CRF in accordance with 37 CFR 1.821(e)(1) or 1.821(e)(2) as required by 1.825(a)(5); and
A statement according to item 2) a) or b) above.
Specific deficiency – Nucleotide and/or amino acid sequences appearing in the specification are not identified by sequence identifiers in accordance with 37 CFR 1.821(d).
Required response – Applicant must provide:
A substitute specification in compliance with 37 CFR 1.52, 1.121(b)(3) and 1.125 inserting the required sequence identifiers, consisting of:
A copy of the previously-submitted specification, with deletions shown with strikethrough or brackets and insertions shown with underlining (marked-up version); 
A copy of the amended specification without markings (clean version); and
A statement that the substitute specification contains no new matter.

Specification
The disclosure is objected to because of the following informalities: The specification includes sequence disclosures without corresponding SEQ ID NOs; see above.
Appropriate correction is required.

Claim Objections
Claim 19 is objected to because of the following informality: “each subsets” is an improper plural. Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitations are: “a processing component,” and “a prediction component” in claim 30.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. The following are the corresponding structures found in the specification for each 112(f) invocation:
Processing component: Specification ¶ 82 indicates that the methods and processes described in the specification are performed in a computer environment; therefore, this limitation corresponds to a computer-implemented means-plus-function limitation. For a computer-implemented means-plus-function limitation, the structure is the algorithm for performing the function coupled with a computer or microprocessor. In this instance, the instant specification merely reiterates the function of the processing component without providing the algorithm to achieve this function. The component is mentioned in ¶ 11 of the specification, but that paragraph describes it in the same language as claim 30 and does not provide further structure.
Prediction component: Specification ¶ 82 indicates that the methods and processes described in the specification are performed in a computer environment; therefore, this limitation corresponds to a computer-implemented means-plus-function limitation. For a computer-implemented means-plus-function limitation, the structure is the algorithm for performing the function coupled with a computer or microprocessor. In this instance, the instant specification merely reiterates the function of the prediction component without providing the algorithm to achieve this function. The component is mentioned in ¶ 11 of the specification, but that paragraph describes it in the same language as claim 30 and does not provide further structure.

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 30-32 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 30 recites limitations for a “processing component” and a “prediction component” that have been interpreted to invoke 35 U.S.C. 112(f)/35 U.S.C. 112, sixth paragraph. However, as discussed in the Claim Interpretation section above, the instant specification does not describe the algorithm associated with the “processing component”/“prediction component” but rather merely restates the function of the components. MPEP § 2181.IV sets forth that mere restatement of function in the specification without description of the means to accomplish the function fails to provide adequate written description under 35 U.S.C. 112(a). Therefore, the “processing component”/“prediction component” does not meet the written description requirement for means-plus-function limitations. Since claims 31 and 32 depend from claim 30 and do not resolve the failure to meet the written description requirement, they are rejected for the same reason as claim 30.
	

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claims 2, 4, 30-32 and 40 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
In claim 2, applicant writes: The computer-implemented method of claim 1, wherein the data representation of the at least one genetic condition cluster is derived from the collection of learned variants and weighted in relation to a set of phenotypic information of patients.  
The terms “is derived” and “weighted” in claim 2 are indefinite. The use of the passive voice makes it unclear whether the derivation and weighting are to occur within the scope of the claimed method or whether this activity is meant to occur before the start of the method, being accomplished by others. Since claims 4 and 40 depend from claim 2 and do not resolve the indefiniteness of claim 2, they are rejected for the same reason as claim 2.
In claim 4, “the phenotypic information of the patient” lacks antecedent basis. Claim 2 only recites “phenotypic information of patients”, which is distinct from “the patient” being evaluated in claim 1. Amending claim 4 to depend from claim 3 could possibly resolve this rejection, since claim 3 recites receiving phenotypic information of the patient.
Claim limitations “a processing component” and “a prediction component” in claim 30 invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The specification only restates the claim in which these limitations are put forward, and does not limit the components in any way. Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA  35 U.S.C. 112, second paragraph. Since claims 31-32 depend from claim 30 and do not resolve the indefiniteness of claim 30, they are rejected for the same reason as claim 30.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 27 and 39 are non-statutory as they recite “a computer readable medium”. The claims as instantly recited read on carrier waves, which are transitory propagating signals and therefore are not proper patentable subject matter because they do not fit within any of the four statutory categories of invention (In re Nuijten, Federal Circuit, 2007). It is noted that the recitation of a "non-transitory computer readable medium" would overcome the rejection with respect to claims 27 and 39 reading on signals. However, the amendment to only "non-transitory computer readable medium" would not overcome the rejection under 35 U.S.C. 101 since the claims would still be directed to a judicial exception without significantly more (see below). 

Claims 1-41 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea/law of nature/natural phenomenon without significantly more.
In accordance with MPEP § 2106, claims found to recite statutory subject matter (Step 1: YES) are then analyzed to determine if the claims recite any concepts that equate to an abstract idea, law of nature or natural phenomenon (Step 2A, Prong 1). While it is noted that claims 27 and 39 do not recite a statutory category of invention (see above), the analysis of these claims under the remaining steps of the subject matter eligibility analysis is continued in the interest of compact prosecution. In the instant application, the claims recite the following limitations that equate to an abstract idea/law of nature/natural phenomenon:
Claims 1 and 27-29 recite determining at least one probability for the variant in relation to pathogenic metrics based on a collection of learned variants wherein the pathogenic metrics comprise a data representation of at least one genetic condition cluster for determining the at least one probability for the variant.
Determining a probability of a variant’s pathogenicity based on pathogenic metrics is a mathematical concept because calculating a probability is a simple mathematical calculation that can be performed with pen and paper. 
Claim 2 recites that the data representation of the at least one genetic condition cluster is derived from the collection of learned variants and weighted in relation to a set of phenotypic information of patients.  
Deriving and weighting a dataset is a mathematical concept because it involves organizing information by representing mathematical relationships. A human with a pen and paper could write down information from a list of variants and weight it in relation to other information. It is also a mental process because deriving and weighting a dataset is a series of arithmetic calculations and decision steps that could occur within the human mind.
Claim 3 recites determining a contribution associated with each of the at least one genetic condition cluster based on the phenotypic information of the patient; and adjusting the at least one probability for the variant based on the contribution determined in accordance with the data representation of the at least one genetic condition cluster.  
This claim recites determining information by calculating over the points of a cluster of data, and adjusting a probability based on that calculation. A human with a pen and paper could determine the overall contribution of a data cluster by performing calculations over its points, and then adjust the probability. This makes it a mathematical relationship. 
Claim 4 recites assessing an availability of the phenotypic information of the patient; and determining, based on the availability, whether to adjust the at least one genetic condition cluster for outputting the combined representation.  
This claim generically recites evaluating and making decisions about data, which can be practically performed in the human mind, making it a mental process.
Claim 5 recites portioning each of the at least one genetic condition cluster using one or more regression models, wherein the one or more regression models predict the contribution to each of the at least one genetic condition cluster given the phenotypic information of the patient.
Portioning a cluster of data via regression is a mathematical concept because regression is an organization of information that can be practically performed by a human with a pen and paper, as is portioning a data cluster based on its results.
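The kind of calculation characterized above can be sketched as follows. This is a minimal illustration of ordinary least-squares regression on a single variable, and it is not taken from the instant specification; the function name `fit_line`, the phenotype scores, and the cluster contributions are hypothetical values assumed for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: a phenotype score per patient and the observed
# contribution of one genetic condition cluster for that patient.
phenotype_scores = [0.0, 1.0, 2.0, 3.0]
cluster_contributions = [0.1, 0.3, 0.5, 0.7]

slope, intercept = fit_line(phenotype_scores, cluster_contributions)

# Predicted contribution for a new, hypothetical phenotype score.
predicted = slope * 4.0 + intercept
```

Each step is a sum, a product, or a division over a short list of numbers, which is the sense in which the calculation is described as one that could be carried out with a pen and paper.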
Claim 6 recites identifying at least one proximal variant from the collection of learned variants in relation to the variant; identifying a nearest variant based on the set of side information; and applying the nearest variant as the variant when determining the at least one probability for the variant in relation to the pathogenic metrics.
This generically recites the evaluation of data, which can be practically performed in the human mind, making it a mental process.
Claim 7 recites that the nearest variant is identified by applying similarity metrics associated with the at least one proximal variant based on the set of side information; and/or wherein the similarity metrics are weighted in relation to the set of side information.  
This generically recites the evaluation of data, which can be practically performed in the human mind, making it a mental process.
Claim 8 recites that when the similarity metrics identify at least one other variant from the collection of learned variants to have an equivalent similarity score, the at least one probability for the variant is determined by averaging each of the at least one proximal variant.  
Determining an average in cases where a collection of data is being compared to itself and the two points of data are equal could be performed by a human being with a pen and paper, making it a mathematical calculation. 
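The averaging step discussed for claim 8 can be sketched as below. This is an illustrative assumption about how a tie in similarity scores might be handled, not the applicant's implementation; the function name `probability_for_variant`, the variant labels, and all scores are hypothetical.

```python
from statistics import mean


def probability_for_variant(similarities, probabilities):
    """Use the most similar learned variant's probability.

    When several learned variants share the top similarity score,
    return the average of their probabilities instead.
    """
    best = max(similarities.values())
    tied = [name for name, score in similarities.items() if score == best]
    return mean(probabilities[name] for name in tied)


# Hypothetical similarity scores of learned variants to the query variant,
# and hypothetical pathogenicity probabilities for those learned variants.
similarities = {"var_a": 0.9, "var_b": 0.9, "var_c": 0.4}
probabilities = {"var_a": 0.75, "var_b": 0.25, "var_c": 0.1}

# var_a and var_b tie for nearest, so their probabilities are averaged.
result = probability_for_variant(similarities, probabilities)
```

With no tie, the list `tied` holds a single variant and the function simply returns that variant's probability.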
Claim 9 recites determining a data representation for the annotated data of at least one patient, wherein the data representation is derived using one or more generative models; and generating the at least one genetic condition cluster based on the data representation.  
The scope of “generative model” includes statistical probability models which could be practically performed by a human being with a pen and paper. Therefore, determining a data representation and generating a genetic condition cluster are mathematical calculations.
Claim 12 recites adjusting a set of weights associated with the at least one genetic condition cluster based on the set of phenotypic information, wherein the set of weights corresponds to a contribution of the at least one genetic condition cluster to the set of phenotypic information; and configuring one or more regression models based on the adjusted set of weights to determine the contribution in relation to the pathogenic metrics.  
Adjusting a set of numbers based on another number, and adjusting the parameters of a regression model based on the first adjustment is a mathematical calculation because it could be carried out by a human being with a pen and paper. 
Claim 14 recites that the set of side information is applied, when the variant is not included in the collection of variants, to identify a nearest variant from the collection of variants used for determining the at least one probability of the variant; and/or wherein the at least one probability of the variant is determined using a supervised learning framework provided the set of side information.  
The scope of “supervised learning framework” includes statistical probability models that can be performed by a human being with a pen and paper. Identifying a nearest variant could be performed by a human being looking and pointing at a representation of a list of variants having structure information.  Therefore it is a mathematical calculation.
Claim 15 recites that the variant is included in the collection of variants for updating the at least one genetic condition cluster by applying annotation associated with the nearest variant.
This claim generically recites annotating a list of data, which could be practically performed by a human writing on the list, making it a mental process. 
Claim 16 recites determining an optimal set of the at least one genetic condition cluster based on the annotated data; and applying the optimal set of the at least one genetic condition cluster during prediction to determine the at least one probability of a variant in relation to the pathogenic metrics.  
This claim generically recites making an informed choice based on data, which could be practically performed by a human looking at the data and making a choice, which is a mental process.
Claim 17 recites that the optimal set of the at least one genetic condition cluster is configured to be updated iteratively with new annotated data.  
This claim generically recites iterative updating of a data cluster, which is a mental process because a human being with the data cluster on paper could annotate it with a pen.
Claim 18 recites using the set of side information corresponding to each of a subset of the collection of learned variants to train a supervised learning framework; and assessing the pathogenicity of the unknown variant based on the trained supervised learning framework.
The scope of “supervised learning framework” includes statistical probability models that are mathematical functions, such as linear regression, SVM, etc. Assessing the pathogenicity based on the framework could be included in this algorithm. Therefore, this is a mathematical concept.
Claim 19 recites comparing the set of side information corresponding to each of a subset of the collection of learned variants, wherein the set of side information corresponding to each subsets of the collection of learned variants is compared in relation to similarity scores associated with the subsets of the collection of learned variants.
Comparing subsets in relation to their similarity scores is a mathematical concept because it describes a summation operation that could be represented as an equation. It is a mental process because the calculations could be practically performed by a human using a pen and paper. 
Claim 20 recites assessing the pathogenicity of the unknown variant in relation to the pathogenicity of a nearest variant further comprising: determining at least one probability for the nearest variant in relation to pathogenic metrics based on a collection of learned variants, wherein the pathogenic metrics comprise a data representation of at least one genetic condition cluster for computing the at least one probability for the nearest variant; and generating a combined representation of the at least one probability, 
Determining a probability and generating a combined representation of the probability is a mathematical concept because it can be performed by a human being using a pen and paper.
Claim 21 recites generating the combined representation by averaging the at least one probability for each variant of a subset of the collection of learned variants, in response to the subset of the collection of learned variants comprise two or more variants with equivalent similarity score such that the nearest variant cannot be determined; and/or generating the combined representation using the supervised learning framework based on at least one probability for each variant of a subset of the collection of learned variants given the set of side information, wherein the supervised learning framework comprises one or more supervised prediction models.
The scope of “supervised learning framework”, including the generation based on data herein, includes statistical probability models that can be performed by a human being with a pen and paper. Therefore, this is a mathematical concept. 
Claim 23 recites that the one or more generative models are configured to decompose the data presentation of annotated data in relation to the pathogenic metrics.
The scope of “generative model”, including the decomposition configuration in relation to data, includes statistical probability models that can be performed by a human being with a pen and paper. Therefore, this is a mathematical concept. 
Claim 24 recites that the one or more generative models comprise at least one formulation based on a matrix factorization algorithm.
The human mind is capable of performing matrix factorization. Thus, it is a mental process. Because the claim recites a matrix factorization algorithm, it also recites a mathematical relationship, since the limitation verbally describes a mathematical function.
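As an illustration only, a small matrix factorization can be worked through by hand or sketched as below. Claim 24 does not specify which factorization is used; LU factorization of a 2x2 matrix is assumed here purely for brevity, and the function names and matrix values are hypothetical.

```python
def lu_2x2(a):
    """Doolittle LU factorization of a 2x2 matrix without pivoting (assumes a[0][0] != 0)."""
    (a11, a12), (a21, a22) = a
    l21 = a21 / a11  # multiplier that eliminates the lower-left entry
    lower = [[1.0, 0.0], [l21, 1.0]]
    upper = [[a11, a12], [0.0, a22 - l21 * a12]]
    return lower, upper

def matmul_2x2(x, y):
    """Multiply two 2x2 matrices, used here to check that lower @ upper reproduces the input."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

# Hypothetical input matrix.
matrix = [[4.0, 3.0], [6.0, 3.0]]
lower, upper = lu_2x2(matrix)
reconstructed = matmul_2x2(lower, upper)
```

Each step is ordinary arithmetic (one division, one multiplication, one subtraction), and multiplying the two factors back together recovers the original matrix.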
Claim 25 recites the computer-implemented method of claim 1, wherein the pathogenic metrics comprises at least one classification indicative of a degree of pathogenicity.
The classification (A.K.A. regression) indicating pathogenicity is the result of a regression in relation to pathogenic metrics, making it a mathematical concept because it involves organizing information by representing mathematical relationships.
Claim 26 recites the computer-implemented method of claim 25, wherein each of the at least one classification is associated with a different optimal set of the at least one genetic condition cluster.  
The classification (A.K.A. regression) being associated with an optimal set involves organizing information by representing mathematical relationships, because an association is a mathematical relationship between the classification and the cluster, thus making it a mathematical concept.
Claim 30 sets forth an apparatus configured to determine whether the variant is within a collection of learned variants; to generate at least one probability for the variant in relation to pathogenic metrics, wherein the pathogenic metrics comprise a data representation of at least one genetic condition cluster for determining the at least one probability for the variant; the at least one probability is normalized.
Determining whether a variant is in a collection, generating a probability, and normalizing that probability are mathematical concepts because the human mind is capable of performing these generically recited calculations using a pen and paper.
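As an illustration only, the membership check and normalization discussed for claim 30 can be sketched as follows. The data layout, function name, cluster labels, and scores are hypothetical and are not drawn from the application; dividing by the sum is one assumed way of normalizing among several.

```python
def normalized_probabilities(variant, learned_scores):
    """Return per-cluster probabilities for a learned variant, scaled to sum to 1.

    Returns None when the variant is not in the (hypothetical) collection of learned variants.
    """
    if variant not in learned_scores:
        return None
    raw = learned_scores[variant]
    total = sum(raw.values())
    # Divide each raw score by the total so the values sum to 1.
    return {cluster: score / total for cluster, score in raw.items()}

# Hypothetical collection of learned variants with raw per-cluster scores.
learned_scores = {"varA": {"cluster1": 3.0, "cluster2": 1.0}}
probs = normalized_probabilities("varA", learned_scores)
missing = normalized_probabilities("varZ", learned_scores)
```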
Claim 31 recites that the side information is used to identify, in relation to the variant, a nearest variant that is applied as the variant to generate the at least one probability.
Identifying a nearest variant via side information and using it to generate a probability is a mental process because the human mind is capable of performing it using a list of variants and the side information by looking and pointing.
Claim 32 recites that the phenotypic information is applied to adjust the at least one probability for the variant in relation to the at least one genetic condition cluster. 
Using phenotypic information to adjust a probability is a mathematical relationship because a human being with access to all the data could make the adjustment using calculations with pen and paper.
Claims 33, 39, and 41 recite assessing the pathogenicity of the unknown gene variant by using a supervised learning framework based on the set of side information; and determining the probability distribution of pathogenicity based on the assessment.
The scope of “supervised learning framework” includes statistical probability models that can be performed by a human being with a pen and paper. Determining the probability distribution could be added as a final step of such an algorithm. Therefore, this is a mathematical concept. 
Claim 34 recites computing a probability of the unknown variant associated with a set of pathogenic metrics given the set of side information.
This is a generic recitation of calculating a probability based on data associations, which can be performed by a human with a pen and paper, making it both a mental process and a mathematical concept. 
Claim 35 recites determining at least one probability for the unknown variant in relation to pathogenic metrics based on a collection of learned variants; and generating a combined representation of the at least one probability, wherein the combined representation is outputted with respect to the pathogenic metrics.  
This is a generic recitation of calculating a probability based on data associations, and calculating a representation of that probability, which can be performed by a human with a pen and paper, making it a mathematical concept. 
Claim 36 recites that the supervised learning framework comprises one or more prediction models.
The scope of “prediction models” includes statistical probability models that can be performed by a human being with a pen and paper. This makes it a mathematical concept. 
Claim 37 recites that the supervised learning framework comprises a non-parametric classifier.  
The human mind is capable of computing the steps of a non-parametric classifier, so it is a mental process.  A non-parametric classifier is also a type of mathematical calculation, making it a mathematical relationship.
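As an illustration only, a k-nearest-neighbor classifier is one well-known non-parametric classifier, and its steps can be sketched as below. Claim 37 does not name a specific classifier; the choice of k-nearest neighbors, the function name, the two-dimensional feature vectors, and the labels are all assumptions made for this example.

```python
from collections import Counter

def knn_classify(query, examples, k=3):
    """Label a query point by majority vote among its k nearest labeled examples.

    Each example is a (feature_tuple, label) pair; squared Euclidean distance is used.
    """
    by_distance = sorted(
        examples,
        key=lambda example: sum((q - x) ** 2 for q, x in zip(query, example[0])),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled examples in a two-dimensional feature space.
examples = [
    ((0.0, 0.0), "benign"),
    ((0.1, 0.2), "benign"),
    ((1.0, 1.0), "pathogenic"),
    ((0.9, 1.1), "pathogenic"),
    ((1.2, 0.8), "pathogenic"),
]
label = knn_classify((1.0, 0.9), examples)
```

The classifier stores no fitted parameters; it only computes distances, sorts them, and counts votes.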
Although claims 1-41 recite performing some aspects of the analysis with apparatuses, mediums, and systems, there are no additional limitations indicating that these apparatuses, mediums, and systems require anything other than carrying out the recited mental process or mathematical concept in a generic computer environment. Merely reciting that a mental process is being performed in a generic computer environment does not preclude the steps from being performed practically in the human mind or with pen and paper as claimed. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental processes” grouping of abstract ideas. As such, claims 1-41 recite an abstract idea (Step 2A, Prong 1: YES).
Claims found to recite a judicial exception under Step 2A, Prong 1 are then further analyzed to determine if the claims as a whole integrate the recited judicial exception into a practical application or not (Step 2A, Prong 2). This judicial exception is not integrated into a practical application because the claims do not recite an additional element that reflects an improvement to technology or applies or uses the recited judicial exception to effect a particular treatment for a condition. Rather, the instant claims recite additional elements that amount to mere instructions to implement the abstract idea in a generic computing environment or mere instructions to apply the recited judicial exception via a generic treatment. Specifically, the claims recite the following additional elements:
Claim 1 recites a computer-implemented method comprising receiving a variant; and outputting a combined representation of the at least one probability of the variant for the patient.  
Claim 3 recites receiving phenotypic information of the patient
Claim 6 recites receiving a set of side information
Claim 9 recites receiving annotated data of at least one patient associated with a collection of variants, wherein the annotated data comprise interpretation information with associated observations corresponding to the pathogenic metrics.
Claim 10 recites the computer-implemented method of claim 9, wherein the annotated data further comprises at least one of a set of phenotypic information of patients and/or a set of side information.
Claim 18 recites receiving the unknown variant;
Claim 20 recites wherein the combined representation is outputted with respect to the pathogenic metrics
Claims 22 and 40 recite the phenotypic information comprises phenotypic ontology associated with one or more diseases;
Claim 27 recites A computer-readable medium comprising computer-readable code or instructions stored thereon, which when executed on a processor, causes the processor to implement the computer-implemented method according to claim 1.
Claim 28 recites A system comprising at least one circuitry that is configured to execute the computer-implemented method according to claim 1.
Claim 29 recites an apparatus comprising a processor, a memory and a communication interface, the processor connected to the memory and communication interface, wherein the apparatus is adapted or configured to implement the computer-implemented method according to claim 1.
Claim 30 recites an input component configured to receive the variant; processing component; a prediction component, in response to a determination that the variant is present in the collection of the learned variant; and a display component
Claim 32 recites further limitations on the type of data received by the input component. 
Claim 33 recites receiving the unknown variant of a patient
Claim 39 recites A computer-readable medium comprising computer-readable code or instructions stored thereon, which when executed on a processor, causes the processor to implement the computer-implemented method of claim 33.
Claim 41 recites an apparatus comprising a processor, a memory and a communication interface, the processor connected to the memory and communication interface, wherein the apparatus is adapted or configured to implement the computer- implemented method according to claim 33.
There are no limitations that indicate that the apparatuses, mediums, and systems of claims 27-30, 39, and 41 require anything other than generic computing systems. As such, these limitations equate to mere instructions to implement the abstract idea on a generic computer, which the courts have stated does not render an abstract idea eligible. Alice Corp., 573 U.S. at 223, 110 USPQ2d at 1983. See also 573 U.S. at 224, 110 USPQ2d at 1984.
The additional elements of claims 1, 3, 6, 9, 10, 18, 20, 22, 32, 33, and 40 do not add a meaningful limitation to the abstract idea because they amount to mere data gathering/output steps that would be required for the claimed mental processes. These limitations serve to gather data that is used as input/output for the abstract idea and there is no indication that the abstract idea has any impact on those data gathering steps. The courts have indicated that mere data gathering/outputting activity is insignificant extra-solution activity that does not provide a practical application (see MPEP 2106.05(g)).
As such, claims 1-41 are directed to an abstract idea (Step 2A, Prong 2: NO).
Claims found to be directed to a judicial exception are then further evaluated to determine if the claims recite an inventive concept that provides significantly more than the judicial exception itself (Step 2B). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims recite additional elements that equate to mere instructions to apply the recited exception in a generic way or in a generic computing environment. The instant claims recite the following additional elements:
Claim 1 recites receiving a variant; and outputting a combined representation of the at least one probability of the variant for the patient.  
Claim 3 recites receiving phenotypic information of the patient
Claim 6 recites receiving a set of side information
Claim 9 recites receiving annotated data of at least one patient associated with a collection of variants, wherein the annotated data comprise interpretation information with associated observations corresponding to the pathogenic metrics.
Claim 10 recites the computer-implemented method of claim 9, wherein the annotated data further comprises at least one of a set of phenotypic information of patients and/or a set of side information.
Claim 18 recites receiving the unknown variant;
Claim 20 recites wherein the combined representation is outputted with respect to the pathogenic metrics
Claims 22 and 40 recite the phenotypic information comprises phenotypic ontology associated with one or more diseases;
Claim 27 recites A computer-readable medium comprising computer-readable code or instructions stored thereon, which when executed on a processor, causes the processor to implement the computer-implemented method according to claim 1.
Claim 28 recites A system comprising at least one circuitry that is configured to execute the computer-implemented method according to claim 1.
Claim 29 recites an apparatus comprising a processor, a memory and a communication interface, the processor connected to the memory and communication interface, wherein the apparatus is adapted or configured to implement the computer-implemented method according to claim 1.
Claim 30 recites an input component configured to receive the variant; processing component; a prediction component, in response to a determination that the variant is present in the collection of the learned variant; and a display component
Claim 32 recites further limitations on the type of data received by the input component. 
Claim 33 recites receiving the unknown variant of a patient
Claim 39 recites A computer-readable medium comprising computer-readable code or instructions stored thereon, which when executed on a processor, causes the processor to implement the computer-implemented method of claim 33.
Claim 41 recites an apparatus comprising a processor, a memory and a communication interface, the processor connected to the memory and communication interface, wherein the apparatus is adapted or configured to implement the computer- implemented method according to claim 33.
As discussed above, there are no additional limitations to indicate that the claimed apparatuses, mediums, and systems in claims 27-30, 39 and 41 require anything other than generic computer components in order to carry out the recited abstract idea in the claims. Claims that amount to nothing more than an instruction to apply the abstract idea using a generic computer do not render an abstract idea eligible. Alice Corp., 573 U.S. at 223, 110 USPQ2d at 1983. See also 573 U.S. at 224, 110 USPQ2d at 1984. MPEP 2106.05(f) discloses that mere instructions to apply the judicial exception cannot provide an inventive concept to the claims. 
Performing clinical tests on individuals to obtain input for an equation has been identified as a well-understood, routine, conventional activity that amounts to insignificant extra-solution activity. In re Grams, 888 F.2d 835, 839-40, 12 USPQ2d 1824, 1827-28 (Fed. Cir. 1989). Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information) has also been identified as a well-understood, routine, conventional activity that amounts to insignificant extra-solution activity. The other additional elements of claims 3, 6, 9, 22, and 40, which claim the receipt of phenotypic information, side information, and interpretation information, would be an example of such a clinical test/internet data gathering, and therefore would qualify as well-understood, routine, conventional activity, and thus insignificant extra-solution activity.
The other additional elements of claims 1 and 18 place limitations upon the extraction of read data from the biological cell (“receiving a variant”). Analyzing DNA to provide sequence information or detect allelic variants has been found to be a conventional laboratory technique that amounts to insignificant extra-solution activity, Genetic Techs. Ltd., 818 F.3d at 1377, 118 USPQ2d at 1546 (see MPEP 2106.05(g)). Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information) has also been identified as a well-understood, routine, conventional activity that amounts to insignificant extra-solution activity. The sequence information (“variant”) is a required input for the claimed interaction analyses, so its extraction is insignificant extra-solution activity.
The additional elements do not comprise an inventive concept when considered individually or as an ordered combination that transforms the claimed judicial exception into a patent-eligible application of the judicial exception. Therefore, the claims do not amount to significantly more than the judicial exception itself (Step 2B: NO). As such, claims 1-41 are not patent eligible.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-5, 9-23, and 25-41 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Araya et al. (US 20180365372, IDS reference) (hereafter “Araya”) as evidenced by Ding (ICML, 2005).
Regarding claim 1, the following limitations are recited:
A computer-implemented method
“A computer implemented method” (Araya claim 1)
receiving a variant;
“Receiving molecular variants” (Araya claim 1)
determining at least one probability for the variant in relation to pathogenic metrics based on a collection of learned variants, wherein the pathogenic metrics comprise a data representation of at least one genetic condition cluster for determining the at least one probability for the variant
“In some embodiments, the molecular scores from a plurality of single-cells, cellular compartments, subcellular compartments, or synthetic compartments, harboring the same molecular variants 906 (e.g., V1, V2, and v3) may be accessed with a variant sampling layer 908 and analyzed in a variant scoring layer 910 to derive (e.g., directly measure or model) summary statistics relating to the tendency (e.g., mean, median, mode), dispersion (e.g., variance, standard deviation), shape (e.g., skewness, kurtosis), probability (e.g., quantiles), range (e.g., confidence interval, minimum, maximum), error (e.g., standard error), or covariation (e.g., covariance) of molecular scores associated with individual molecular variants. In some embodiments, as illustrated in FIG. 9, summary statistics relating to the tendency, dispersion, shape, range, or error of molecular scores may be used to create a database of (e.g., quality controlled) molecular signals 912 associated with individual molecular variants 906. In some embodiments, molecular measurements, molecular processes, molecular features, and molecular scores” (Araya spec ¶ 57)
 outputting a combined representation of the at least one probability of the variant for the patient.
“determining functional scores or functional classifications for the molecular variants based on statistical learning, wherein the statistical learning associates the molecular signals, the phenotype signals, or the population signals of molecular variants with phenotypic impacts of the molecular variants;” (Araya claim 1)
“determining the phenotypic impacts of the molecular variants based on the functional scores, the functional classifications, the evidence scores, or the evidence classifications.” (Araya claim 1). “In some embodiments, the present disclosure describes systems and methods that may continuously evaluate, validate, and optimize (e.g., select, remove, or modify) diverse evidence datasets on the basis of the above described evaluation metrics, and distribute the best-performing (e.g., independent) evidence datasets to client systems via an Application Program Interface (API) for use in variant interpretation and prioritization practices determining the phenotypic impact (e.g., pathogenicity, functionality, or relative effect) of molecular variants identified within a biological sample or record thereof of a subject.” (Araya spec ¶ 119)
The molecular scores of Araya can include probabilities, which are determined from a set of variants. The combined representation or “phenotypic impacts” in Araya is derived from the functional scores which are derived from probabilities in the molecular scores. The phenotypic impacts are outputted to clients via an API. 
Therefore claim 1 is taught by Araya.
Regarding claim 2:
the data representation of the at least one genetic condition cluster is derived from the collection of learned variants 
In some embodiments, the present disclosure describes systems and methods for the identification of SMRs/SMNs by measuring and scoring the phenotype-associated mutation density (e.g., number of observed phenotype-associated variants per residue) within spatially-proximal residues of functional elements (e.g., protein-coding genes) through the application of spatial clustering techniques across a plurality of spatial distance metrics, where the phenotype-associated variants may be defined on the basis of the functional scores and functional classifications herein described. As would be appreciated by a person of ordinary skill in the art, these methods may allow the determination of clusters of residues in which variants with specifically-defined phenotypic impacts occur. (Araya spec ¶ 116)
The genetic condition clusters (phenotype-associated mutation density score) are derived from the functional scores and functional classifications of claim 1, which are found based on statistical learning over variants (i.e. “learned variants”).
and weighted in relation to a set of phenotypic information of patients, 
“Determining functional scores or functional classifications for the molecular variants based on statistical learning, wherein the statistical learning associates the molecular signals, the phenotype signals, or the population signals of molecular variants with phenotypic impacts of the molecular variants” (Araya claim 1)
The phenotype-associated mutation density scores are based on the phenotypic information/signals that informed the functional scores.
Therefore, claim 2 is taught by Araya.
Regarding claim 3:
receiving phenotypic information of the patient
“The phenotypic impacts of the molecular variants are derived based on clinical databases, phenotype databases” (Araya claim 6)
determining a contribution associated with each of the at least one genetic condition cluster based on the phenotypic information of the patient
“Determining functional scores or functional classifications for the molecular variants based on statistical learning, wherein the statistical learning associates the molecular signals, the phenotype signals, or the population signals of molecular variants with phenotypic impacts of the molecular variants” (Araya claim 1)
 adjusting the at least one probability for the variant based on the contribution determined in accordance with the data representation of the at least one genetic condition cluster
“Deriving evidence scores or evidence classifications of the molecular variants based on the functional scores or functional classifications” (Araya claim 1)
A functional score is equivalent to the contribution in this case, and the evidence score is a probability, so claim 3 is taught by Araya.
Regarding claim 4:
assessing an availability of the phenotypic information of the patient; and determining, based on the availability, whether to adjust the at least one genetic condition cluster for outputting the combined representation.
In some embodiments, the classification (or regression) may relate to (e.g., likely) disease-causing (e.g., pathogenic) and neutral (e.g., benign) variants for disorders with genetic components, or predictions of the severity thereof, on the basis of the molecular variants identified within a biological sample or record thereof of a subject (Araya spec ¶ 48)
Since regressions adjust their outcome based on the presence or absence of the term, claim 4 is taught by Araya.
Regarding claim 5:
The computer-implemented method of claim 3, wherein the determining a contribution associated with each of the at least one genetic condition cluster based on the phenotypic information of the patient, further comprising: portioning each of the at least one genetic condition cluster using one or more regression models, wherein the one or more regression models predict the contribution to each of the at least one genetic condition cluster given the phenotypic information of the patient.
In some embodiments, the classification (or regression) may relate to (e.g., likely) disease-causing (e.g., pathogenic) and neutral (e.g., benign) variants for disorders with genetic components, or predictions of the severity thereof, on the basis of the molecular variants identified within a biological sample or record thereof of a subject (Araya spec ¶ 48)
The regression is portioning and predicting contributions to the cluster, so claim 5 is taught by Araya.
Regarding claim 9: 
receiving annotated data of at least one patient associated with a collection of variants, wherein the annotated data comprise interpretation information with associated observations corresponding to the pathogenic metrics
“Receiving molecular variants associated with one or more functional elements within a model system, wherein the model system comprises single-cells…” (Araya claim 1)
“Interpretation information” does not have a standard meaning in the art. Applying the broadest reasonable interpretation, the interpretation information can be analogized to the “annotation features” of Araya. Fig. 13 of Araya shows phenotypic impacts (observations corresponding to the pathogenic metrics) being associated with annotation features.
determining a data representation for the annotated data of at least one patient, wherein the data representation is derived using one or more generative models
“In some embodiments, as illustrated in FIG. 8, the molecular measurements, molecular processes, molecular features, or (e.g., lower-order) molecular scores 806 from single-cells, cellular compartments, subcellular compartments, or synthetic compartments harboring the same molecular variants 802 may be fed through a series of artificial neuron layers (e.g., convolutional or perceptron layers) in an Artificial Neural Network 804 (ANN) to derive increasingly complex (e.g., higher-order) molecular scores 806, and generate autoencoders with learned features” (Araya spec ¶ 56)
 generating the at least one genetic condition cluster based on the data representation.
“In some embodiments, and as illustrated in FIG. 18, the identification of SMRs/SMNs may apply a Training/Validation Layer 1804 to identify spatial clustering among phenotypically-related or functionally-related molecular variants” (Araya spec ¶ 111)
An ANN is a type of generative model. Therefore claim 9 is taught by Araya. 
Regarding claim 10:
the annotated data further comprises a set of phenotypic information of patients and/or a set of side information
“The present disclosure provides system, apparatus, device, method and/or computer program product embodiments for the classification (or regression) of likely phenotypic impacts in a subject on the basis of one or more molecular signals, phenotype signals, or population signals measured in in vivo or in vitro functional model systems” (Araya spec ¶ 33)
Regarding claim 11:
the set of phenotypic information is associated with the interpretation information in relation to the at least one patient; and wherein the set of side information is associated with the interpretation information in relation to the collection of variants
See Araya Fig. 13
“Interpretation information” does not have a standard meaning in the art. Applying the broadest reasonable interpretation, the interpretation information can be analogized to the “annotation features” of Araya. Fig. 13 of Araya shows phenotypic impacts (information) being associated with annotation features. Therefore, claim 11 is taught by Araya.
Regarding claim 12:
adjusting a set of weights associated with the at least one genetic condition cluster based on the set of phenotypic information, wherein the set of weights corresponds to a contribution of the at least one genetic condition cluster to the set of phenotypic information; and configuring one or more regression models based on the adjusted set of weights to determine the contribution in relation to the pathogenic metrics
“Determining functional scores or functional classifications for the molecular variants based on statistical learning, wherein the statistical learning associates the molecular signals, the phenotype signals, or the population signals of molecular variants with phenotypic impacts of the molecular variants” (Araya claim 1)
Classification is another word for regression in Araya (spec ¶ 48). The regression associating phenotype signals with phenotypic impacts is equivalent to configuring the regression model based on adjusted weights associated with the genetic condition cluster to determine the contribution to the phenotypic information. Therefore claim 12 is taught by Araya.
Regarding claim 13:
the set of side information comprises a data representation of indicators associated with the collection of variants
“independent or disjoint estimates of molecular signals or phenotype signals can be used to create a database of (quality-controlled) molecular or phenotype signals associated with individual molecular variants.” (Araya spec ¶ 67)
The phenotypic signals are transformed values of the side information, and they are associated with multiple variants in Araya, so claim 13 is taught by Araya.
Regarding claim 14:
the set of side information is applied, when the variant is not included in the collection of variants, to identify a nearest variant from the collection of variants used for determining the at least one probability of the variant; and/or wherein the at least one probability of the variant is determined using a supervised learning framework provided the set of side information
“SMR/SMN-detection techniques 1805 can comprise a series of steps including but not limited to: (1.1) projection 1810 of phenotype-associated molecular variants 1806 in functional, sequence, structural, or (co)evolutionary dimensions (or combinations thereof), (1.2) application of spatial clustering techniques 1812 (e.g., DBSCAN) to detect clusters of spatially-proximal phenotype-associated variants” (Araya spec ¶ 113)
“In some embodiments, as illustrated in FIG. 8, the molecular measurements, molecular processes, molecular features, or (e.g., lower-order) molecular scores 806 from single-cells, cellular compartments, subcellular compartments, or synthetic compartments harboring the same molecular variants 802 may be fed through a series of artificial neuron layers (e.g., convolutional or perceptron layers) in an Artificial Neural Network 804 (ANN) to derive increasingly complex (e.g., higher-order) molecular scores 806, and generate autoencoders with learned features” (Araya spec ¶ 56)
the variant is included in the collection of variants for updating the least one genetic condition cluster by applying annotation associated with the nearest variant
“SMR/SMN-detection techniques 1805 can comprise a series of steps including but not limited to: (1.1) projection 1810 of phenotype-associated molecular variants 1806 in functional, sequence, structural, or (co)evolutionary dimensions (or combinations thereof),” (Araya spec ¶ 113)
Spatial clustering must identify nearest variants of a given variant and determine a probability, and Araya has provisions for determining these probabilities through supervised learning (ANN). Therefore claim 14 is taught by Araya.
Regarding claim 15:
The computer-implemented method of claim 14, wherein the variant is included in the collection of variants for updating the least one genetic condition cluster by applying annotation associated with the nearest variant.  
“SMR/SMN-detection techniques 1805 can comprise a series of steps including but not limited to: (1.1) projection 1810 of phenotype-associated molecular variants 1806 in functional, sequence, structural, or (co)evolutionary dimensions (or combinations thereof),” (Araya spec ¶ 113)
Applying (projecting) annotation (functional/structural) information is taught by Araya.
Regarding claims 16-17 the following limitations are put forth:
determining an optimal set of the at least one genetic condition cluster based on the annotated data; and applying the optimal set of the at least one genetic condition cluster during prediction to determine the at least one probability of a variant in relation to the pathogenic metrics
the optimal set of the at least one genetic condition cluster is configured to be updated iteratively with new annotated data
Claim 121 of Araya states “A computer implemented method for scoring phenotypic impacts of molecular variants, comprising: evaluating an evidence dataset based on an accuracy of the evidence dataset; validating the evidence dataset based on the accuracy of the evidence dataset; optimizing the evidence dataset based on the accuracy of the evidence dataset; and determining the phenotypic impacts of the molecular variants based on the evaluating, validating, and optimizing of the evidence dataset.”
Evaluating, validating, optimizing, and determining phenotypic impacts reads on determining the optimal set and applying the optimal set to determine a probability. Therefore, claims 16-17 are taught by Araya.
Regarding claim 18, the following limitations are recited and taught by Araya: 
A computer-implemented method for assessing pathogenicity of an unknown variant for a patient using a set of side information
“A computer implemented method for scoring phenotypic impacts of molecular variants… determining the phenotypic impacts of the molecular variants based on the evaluating, validating, and optimizing of the evidence dataset.” (Araya claim 121)
receiving the unknown variant, wherein the unknown variant is not identified in the collection of learned variants
“receiving molecular variants associated with one or more functional elements within a model system, wherein the model system comprises single-cells, cellular compartments, subcellular compartments, or synthetic compartments” (Araya claim 1)
“selecting a second set of genotypes with unknown, putative, or known phenotypic impacts using a sampling model;” (Araya claim 94)
using the set of side information corresponding to each of a subset of the collection of learned variants to train a supervised learning framework; and assessing the pathogenicity of the unknown variant based on the trained supervised learning framework.
“FIGS. 1A-1C illustrate integrated functional assay and computational Deep Mutational Learning (DML) processes and systems for determining the phenotypic impact of molecular variants, as well as example (e.g., intermediate) data generated from the application of processes and systems in two genes of the RAS/MAPK family of disorders, according to some embodiments.” (Araya spec ¶ 10, see figs. 1a-1c) (see also Figs. 10, 13)
“In some embodiments, a training/validation layer 710 generates and quality-controls Phenotype Models (mP) that can predict the phenotypic impact 706 of individual single-cells 702.” (Araya spec ¶ 61)
Figs. 1A, 10 and 13 show variants and side information (phenotypic scores) being used as input for a supervised learning framework. Therefore, claim 18 is taught by Araya.
Regarding claim 19:
comparing the set of side information corresponding to each of a subset of the collection of learned variants, wherein the set of side information corresponding to each subset of the collection of learned variants is compared in relation to similarity scores associated with the subsets of the collection of learned variants
“In some embodiments, as illustrated in FIG. 8, the molecular measurements, molecular processes, molecular features, or (e.g., lower-order) molecular scores from single-cells, cellular compartments, subcellular compartments, or synthetic compartments harboring the same molecular variants may be fed through a series of artificial neuron layers (e.g., convolutional or perceptron layers) in an Artificial Neural Network (ANN) to derive increasingly complex (e.g., higher-order) molecular scores, and generate autoencoders with learned features.” (Araya spec ¶ 56)
The claimed comparison is inherently carried out in an ANN run. Therefore, claim 19 is taught by Araya.
Regarding claim 20:
assessing the pathogenicity of the unknown variant in relation to the pathogenicity of a nearest variant further comprising: determining at least one probability for the nearest variant in relation to pathogenic metrics based on a collection of learned variants, wherein the pathogenic metrics comprise a data representation of at least one genetic condition cluster for computing the at least one probability for the nearest variant; and generating a combined representation of the at least one probability, wherein the combined representation is outputted with respect to the pathogenic metrics
“In some embodiments, as illustrated in FIG. 8, the molecular measurements, molecular processes, molecular features, or (e.g., lower-order) molecular scores from single-cells, cellular compartments, subcellular compartments, or synthetic compartments harboring the same molecular variants may be fed through a series of artificial neuron layers (e.g., convolutional or perceptron layers) in an Artificial Neural Network (ANN) to derive increasingly complex (e.g., higher-order) molecular scores, and generate autoencoders with learned features.” (Araya spec ¶ 56)
“the present disclosure describes systems and methods for determining the phenotypic impact (e.g., pathogenicity, functionality, or relative effect) of molecular variants through a series of modeling layers” (Araya spec ¶ 98)
The claimed assessment is an inherent part of the described ANN run. Therefore, claim 20 is taught by Araya.
Regarding claim 21:
generating the combined representation by averaging the at least one probability for each variant of a subset of the collection of learned variants, in response to the subset of the collection of learned variants comprise two or more variants with equivalent similarity score such that the nearest variant cannot be determined; and/or generating the combined representation using the supervised learning framework based on at least one probability for each variant of a subset of the collection of learned variants given the set of side information, wherein the supervised learning framework comprises one or more supervised prediction models
“In some embodiments, as illustrated in FIG. 8, the molecular measurements, molecular processes, molecular features, or (e.g., lower-order) molecular scores from single-cells, cellular compartments, subcellular compartments, or synthetic compartments harboring the same molecular variants may be fed through a series of artificial neuron layers (e.g., convolutional or perceptron layers) in an Artificial Neural Network (ANN) to derive increasingly complex (e.g., higher-order) molecular scores, and generate autoencoders with learned features.” (Araya spec ¶ 56)
The generated combined representation is an inherent part of the described ANN run. Therefore, claim 21 is taught by Araya.
	Regarding claim 22:
the phenotypic information comprises phenotypic ontology associated with one or more diseases
Araya (spec ¶ 78): In some other embodiments, the present disclosure provides systems and methods for deriving functional scores and functional classifications via statistical (e.g., machine) learning to generate a Functional Model (mF) that associates molecular, phenotype, or population signals (e.g., features)—derived from one or more molecular measurements, molecular processes, molecular features, and/or molecular scores—with phenotypic impacts (e.g., labels) of molecular variants computed directly from distinct molecular, phenotype, or population signals, via regression and classification techniques. In some embodiments, this approach may permit, for example, deriving functional scores and functional classifications that predict the relative mutation burden, mutation rate, or mutation signatures of samples from subjects harboring specific molecular variants. In some embodiments, functional scores or functional classifications from such assays may permit informing on the lifetime risk of developing cancer in test subjects.
A phenotypic ontology, when the broadest reasonable interpretation is applied, is a set of information and labels defining the ground truth, or nature of being, of phenotypes and genes. Therefore, it is taught by Araya.
Regarding claim 23:
the one or more generative models are configured to decompose the data presentation of annotated data in relation to the pathogenic metrics
“FIGS. 1A-1C illustrate integrated functional assay and computational Deep Mutational Learning (DML) processes and systems for determining the phenotypic impact of molecular variants” (Araya spec ¶ 10, see figs. 1a-1c)
Figs. 1A-1C show the decomposition of the data presentation. Therefore, claim 23 is taught by Araya.
Regarding claim 24:
The computer-implemented of claim 9, wherein the one or more generative models comprise at least one formulation based on a matrix factorization algorithm.
“These molecular states (e.g., sub-populations) or phenotype scores may be associated with, but not limited to, subpopulations of cells defined by… (c) unsupervised or supervised machine learning methods, including… Principal Component Analysis (PCA)” (Araya spec ¶ 65)
PCA is partly based on a matrix factorization algorithm. Ding (ICML, 2005) states: “From matrix perspective, PCA/SVD are matrix factorization (approximations by lower rank matrices with clear meaning).” Therefore, claim 24 is taught by Araya.
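For reference, the matrix-factorization reading of PCA can be written out as a standard linear-algebra identity. The notation is assumed here for illustration and does not appear in Araya or in the instant claims: X is a column-centered data matrix with n samples, U Σ Vᵀ is its thin singular value decomposition, and k is the number of retained components.

```latex
X = U \Sigma V^{\top}, \qquad
C = \frac{1}{n-1} X^{\top} X = V \, \frac{\Sigma^{2}}{n-1} \, V^{\top}, \qquad
X_k = U_k \Sigma_k V_k^{\top}
```

Under these assumptions the columns of V are the principal directions, XV = UΣ gives the principal component scores, and X_k is the best rank-k approximation of X in the least-squares sense, which is the sense in which PCA/SVD acts as a low-rank matrix factorization.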
Regarding claim 25:
the pathogenic metrics comprises at least one classification indicative of a degree of pathogenicity
“In some embodiments, the classification (or regression) may relate to (e.g., likely) disease-causing (e.g., pathogenic) and neutral (e.g., benign) variants for disorders with genetic components” (Araya spec ¶ 48)
The pathogenic metrics are the outcome of the regression, and classify the variants based on pathogenicity. Therefore, claim 25 is taught by Araya.
Regarding claim 26:
The computer-implemented method of claim 25, wherein each of the at least one classification is associated with a different optimal set of the at least one genetic condition cluster
This limitation states that each of the regressions associates the clusters with a different level of pathogenicity.
“the present disclosure describes systems and methods for determining the phenotypic impact (e.g., pathogenicity, functionality, or relative effect) of molecular variants through a series of modeling layers” (Araya spec ¶ 98)
“generating an optimal functional model by applying statistical learning techniques that associate molecular signals, phenotype signals, or population signals from the optimal dimensionality reduction model with phenotypic impacts for the first set of genotypes using the optimal molecular reads and the optimal molecular measurements;” (Araya claim 120)
The regression to different clusters based on pathogenicity is provided for by Araya, as is the optimization of the set. Therefore, claim 26 is taught by Araya.
Regarding claim 27:
A computer-readable medium comprising computer-readable code or instructions stored thereon, which when executed on a processor, causes the processor to implement the computer-implemented method according to claim 1.
Claim 137 of Araya asserts “a computer-readable device having instructions stored thereon that, when executed by at least one computing device, causes the at least one computing device to perform operations comprising…” 
Claim 137 goes on to list steps equivalent to those of claim 1 of the instant application (also equivalent to claim 1 of Araya). Therefore, claim 27 is taught by Araya.
Regarding claim 28:
A system comprising at least one circuitry that is configured to execute the computer-implemented method according to claim 1.
Claim 136 of Araya sets forth “A system, comprising: a memory; and at least one processor coupled to the memory and configured to…” 
Claim 136 continues to list a number of steps to execute a method equivalent to that of claim 1 of the instant application (also equivalent to claim 1 of Araya). Therefore, claim 28 is taught by Araya.
Regarding claim 29:
An apparatus comprising a processor, a memory and a communication interface, the processor connected to the memory and communication interface, wherein the apparatus is adapted or configured to implement the computer-implemented method according to claim 1.
Figure 19 of Araya shows a communications interface connected to the memory and processor.
Claim 136 of Araya sets forth “A system, comprising: a memory; and at least one processor coupled to the memory and configured to…” 
Claim 136 continues to list a number of steps to execute a method equivalent to that of claim 1 of the instant application (also equivalent to claim 1 of Araya). Therefore, claim 29 is taught by Araya.
Regarding claim 30: 
an apparatus for determining pathogenicity of a variant for a patient, the apparatus comprising: an input component configured to receive the variant; a processing component configured to determine whether the variant is within a collection of learned variants; a prediction component, in response to a determination that the variant is present in the collection of the learned variant, configured to generate at least one probability for the variant in relation to pathogenic metrics, wherein the pathogenic metrics comprise a data representation of at least one genetic condition cluster for determining the at least one probability for the variant 
Claim 136 of Araya sets forth “A system, comprising: a memory; and at least one processor coupled to the memory and configured to…” 
Claim 136 continues to list a number of steps to execute a method equivalent to that of claim 1 of the instant application (also equivalent to claim 1 of Araya), to which the components recited in the above limitation are also equivalent.
 a display component configured to display the at least one probability for the variant with respect to the pathogenic metrics,
 “Computer system 1900 also includes user input/output device(s) 1903, such as monitors”. (Araya spec ¶ 132)
wherein the at least one probability is normalized
“FIGS. 3A and 3B illustrates data obtained with a logistic regression (LR) classifier trained for binary classification of cells harboring disease-associated molecular variants and cells harboring wildtype MAP2K2, on the basis of higher-order molecular scores computed as the top 100 principal components from (e.g., scaled and or normalized) lower-order molecular scores.” (Araya spec ¶ 89)
each of the at least one classification is associated with a different optimal set of the at least one genetic condition cluster
“In some embodiments, the classification (or regression) may relate to (e.g., likely) disease-causing (e.g., pathogenic) and neutral (e.g., benign) variants for disorders with genetic components” (Araya spec ¶ 48)
Therefore, claim 30 is taught by Araya.
Regarding claim 31:
the prediction component, in response to a determination that the variant is absent in the collection of the learned variant, configured to receive a set of side information, wherein the side information is used to identify, in relation to the variant, a nearest variant that is applied as the variant to generate the at least one probability
“SMR/SMN-detection techniques 1805 can comprise a series of steps including but not limited to: (1.1) projection 1810 of phenotype-associated molecular variants 1806 in functional, sequence, structural, or (co)evolutionary dimensions (or combinations thereof), (1.2) application of spatial clustering techniques 1812 (e.g., DBSCAN) to detect clusters of spatially-proximal phenotype-associated variants” (Araya spec ¶ 113)
“Side information” has no standard meaning in the art and is not defined by the specification, so under the broadest reasonable interpretation it can be any kind of information beyond the letters of the genome. Spatial distance metrics are therefore a type of side information, and Araya’s methods of finding spatially-proximal residues based on spatial distance metrics read on the identification of nearest variants based on side information.
Regarding claim 32:
the input component is configured to receive phenotypic information associated with the patient, wherein the phenotypic information is applied to adjust the at least one probability for the variant in relation to the at least one genetic condition cluster
“A system, comprising: a memory; and at least one processor coupled to the memory and configured to… determine molecular scores or phenotype scores of the single-cells, the cellular compartments, the subcellular compartments, or the synthetic compartments… determine the phenotypic impacts of the molecular variants based on the functional scores, the functional classifications, the evidence scores, or the evidence classifications.” (Araya claim 136)
The system described in Araya must receive the phenotypic information in order to determine the phenotype scores and phenotypic impacts, so claim 32 is taught by Araya.
Regarding claim 33:
A computer-implemented method for determining a probability distribution of pathogenicity for an unknown gene variant using a set of side information
Claim 121 of Araya states “A computer implemented method for scoring phenotypic impacts of molecular variants, comprising: evaluating an evidence dataset based on an accuracy of the evidence dataset; validating the evidence dataset based on the accuracy of the evidence dataset; optimizing the evidence dataset based on the accuracy of the evidence dataset; and determining the phenotypic impacts of the molecular variants based on the evaluating, validating, and optimizing of the evidence dataset.”
receiving the unknown variant of a patient, wherein the unknown variant is not identified in or is new to the collection of learned variants associated with a plurality of patients; 
“selecting a third set of genotypes with unknown, putative, or known phenotypic impacts using a sampling model; generating a functional model by applying statistical learning techniques that associates molecular signals, phenotype signals, or population signals of the first set of genotypes with putative or known phenotypic impacts;” (Araya claim 94)
assessing the pathogenicity of the unknown gene variant by using a supervised learning framework based on the set of side information;
“FIGS. 1A-1C illustrate integrated functional assay and computational Deep Mutational Learning (DML) processes and systems for determining the phenotypic impact of molecular variants” (Araya spec ¶ 10, see figs. 1a-1c) (see also figs. 10, 13)
The “phenotypic impact” of Araya includes pathogenicity (Araya spec ¶ 48: “the present disclosure describes systems and methods for determining the phenotypic impact (e.g., pathogenicity, functionality, or relative effect) of molecular variants”).
 and determining the probability distribution of pathogenicity based on the assessment.
“generating predicted phenotypic impacts of the third set of genotypes by applying the inference model to make predictions based on non-assayed features of the third set of genotypes.” (Araya claim 94)
A series of predicted phenotypic impacts is equivalent to a probability distribution of pathogenicity. Figs. 1A, 10, and 13 show side information (phenotype scores) being used to train an ANN (supervised learning) to assess pathogenicity of variants. Therefore claim 33 is taught by Araya.
Regarding claim 34:
computing a probability of the unknown variant associated with a set of pathogenic metrics given the set of side information
Claim 94 of Araya states “selecting a third set of genotypes with unknown, putative, or known phenotypic impacts using a sampling model… generating a functional model by applying statistical learning techniques that associates molecular signals, phenotype signals, or population signals of the first set of genotypes with putative or known phenotypic impacts… generating predicted phenotypic impacts of the third set of genotypes by applying the inference model to make predictions based on non-assayed features of the third set of genotypes.”
In Araya, a probability is computed (predicted) for the “unknown variant” with respect to its pathogenic metrics (phenotypic impacts) by applying the phenotype signals (transformed side information). Therefore, claim 34 is taught by Araya.
Regarding claim 35:
determining at least one probability for the unknown variant in relation to pathogenic metrics based on a collection of learned variants; and generating a combined representation of the at least one probability, wherein the combined representation is outputted with respect to the pathogenic metrics
Claim 121 of Araya states “A computer implemented method for scoring phenotypic impacts of molecular variants, comprising: evaluating an evidence dataset based on an accuracy of the evidence dataset; validating the evidence dataset based on the accuracy of the evidence dataset; optimizing the evidence dataset based on the accuracy of the evidence dataset; and determining the phenotypic impacts of the molecular variants based on the evaluating, validating, and optimizing of the evidence dataset.”
The score of claim 121 is a determined probability, so claim 35 is taught by Araya.
	Regarding claim 36: 
the supervised learning framework comprises one or more prediction models
“The present disclosure provides system, apparatus, device, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof that improve cost-efficiency in the classification of molecular variants through (i) the directed deployment of DML processes and systems with lower-cost prediction models (see FIG. 16)” (Araya spec ¶ 45)
The learning process of Araya is stated here to include prediction models, so claim 36 is taught by Araya.
	Regarding claim 37:
the supervised learning framework comprises a non-parametric classifier 
“synthetic compartments harboring the same molecular variants 802 may be fed through a series of artificial neuron layers (e.g., convolutional or perceptron layers) in an Artificial Neural Network 804 (ANN) to derive increasingly complex (e.g., higher-order) molecular scores 806, and generate autoencoders with learned features.” (Araya spec ¶ 56)
An ANN is a non-parametric classifier. Therefore, claim 37 is taught by Araya.
	Regarding claim 38:
the set of side information is associated with the unknown gene variant
 “In some embodiments, the annotation features 1306 and 1320 may encompass a plurality of independent (e.g., non-assayed) features (e.g., evolutionary, population, functional (e.g., annotation-based), structural, dynamical, and physicochemical features associated with variants, genomic coordinates” (Araya spec ¶ 76)
Population/functional features are types of side information. Therefore, claim 38 is taught by Araya.
	Regarding claim 39:
computer readable code or instructions which causes the processor to implement the computer implemented method of claim 33
“Computer system 1900 also includes a main or primary memory 1908, such as random access memory (RAM). Main memory 1908 may include one or more levels of cache. Main memory 1908 has stored therein control logic (e.g., computer software) and/or data.” (Araya spec ¶ 133)
Claim 136 of Araya sets forth “A system, comprising: a memory; and at least one processor coupled to the memory and configured to…” 
Claim 136 continues to list a number of steps to execute a method equivalent to that of claim 33 of the instant application (also equivalent to claim 1 of Araya). The processor is receiving control logic from memory (see also Araya Fig. 19) about the configuration to perform the claimed steps. 
Therefore, claim 39 is taught by Araya.
	Regarding claim 40:
the phenotypic information comprising phenotypic ontology associated with one or more diseases
(Araya spec ¶ 78): “In some other embodiments, the present disclosure provides systems and methods for deriving functional scores and functional classifications via statistical (e.g., machine) learning to generate a Functional Model (mF) that associates molecular, phenotype, or population signals (e.g., features)—derived from one or more molecular measurements, molecular processes, molecular features, and/or molecular scores—with phenotypic impacts (e.g., labels) of molecular variants computed directly from distinct molecular, phenotype, or population signals, via regression and classification techniques. In some embodiments, this approach may permit, for example, deriving functional scores and functional classifications that predict the relative mutation burden, mutation rate, or mutation signatures of samples from subjects harboring specific molecular variants. In some embodiments, functional scores or functional classifications from such assays may permit informing on the lifetime risk of developing cancer in test subjects.”
A phenotypic ontology is simply information and labels about the “ground truth” of phenotypes and genes, and is therefore read on by the Araya specification.
	Regarding claim 41, an apparatus comprising a processor, a memory, and a communication interface, the processor connected to the memory and communication interface, wherein the apparatus is adapted or configured to implement the computer implemented method according to claim 33, is taught by Araya (Fig. 5, spec ¶ 23). Such an apparatus is shown in its entirety in Fig. 5, and Araya states that it implements the claimed invention.
 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over Araya, as applied to claim 9 above, in view of Hwang et al. (“Co-clustering phenome-genome for phenotype classification and disease gene discovery.” Nucleic acids research vol. 40,19 (2012), e146, 1-16) (hereafter “Hwang”).
Regarding claim 24, Araya teaches the one or more generative models (spec ¶ 56).
Regarding claim 24, Araya is silent as to the generative models having a formulation based on a matrix factorization algorithm. 
Regarding claim 24, Hwang teaches a non-negative matrix tri-factorization framework for co-clustering phenotypes and genes based on sparse phenotype-gene association matrices (summary, Fig. 1).
Regarding claim 24, an invention would have been prima facie obvious to one of ordinary skill in the art at the time of the effective filing date of the invention if some suggestion in the prior art would have led that person to combine the prior art teachings to arrive at the claimed invention. There is a suggestion in the text of Hwang (summary; see Fig. 1) to use matrix factorization for co-clustering phenotypes and genes in situations with sparse phenotype-gene association matrices. There would be a reasonable expectation of success for this combination to a person of ordinary skill in the art, as the suggested modification is explained in the text of Hwang. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time to modify the method of Araya with the suggested modification of Hwang, in order to optimize the phenotype-gene co-clustering in cases of data sparsity.

Claims 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Araya as applied to claim 1 above, as evidenced by Jacquez et al. (The Handbook of Geographic Information Science, Chapter 22, Blackwell Publishing, 395-416).
Regarding claim 6:  
The computer-implemented method of claim 1, wherein the variant is not included in the collection of learned variants
Araya teaches: “selecting a second set of genotypes with unknown, putative, or known phenotypic impacts using a sampling model;” (Araya claim 94)
identifying at least one proximal variant from the collection of learned variants in relation to the variant; 
Araya teaches: “The hotspot scores or hotspot classifications of the molecular variants are derived from Significantly Mutated Regions and Networks (SMRs/SMNs) computed applying spatial clustering techniques to detect regions and networks of residues with high densities of molecular variants” (Araya claim 14)
receiving a set of side information corresponding to each of the at least one proximal variant, wherein the set of side information comprises one or more indicators;
Araya teaches: “The present disclosure provides system, apparatus, device, method and/or computer program product embodiments for the classification (or regression) of likely phenotypic impacts in a subject on the basis of one or more molecular signals, phenotype signals, or population signals measured in in vivo or in vitro functional model systems” (Araya spec ¶ 33)
identifying a nearest variant based on the set of side information; and applying the nearest variant as the variant when determining the at least one probability for the variant in relation to the pathogenic metrics.  
Araya is silent as to this.
There is the following suggestion in Araya: “In some embodiments, the phenotypic impacts of known molecular variants, high-confidence predicted molecular variants, and functionally-modeled molecular variants can be leveraged by an Inference Model (mI) 1609 that models the relationship between phenotypic impacts and a plurality of dependent (e.g., assayed) features (e.g., molecular, phenotype, or population signals) or independent (e.g., non-assay) features (e.g., evolutionary, population, functional (e.g., annotation-based), structural, dynamical, and physicochemical features associated with variants, genomic coordinates, transcript (e.g., RNA) coordinates, translated (e.g., protein) coordinates, amino acids, and various others, as would be appreciated by a person of ordinary skill in the art) to yield an augmented sequence-function of functional scores 1610. As would be appreciate by a person of ordinary skill in the art, such Inference Model (mI) 1609 may permit estimating the phenotypic impacts of molecular variants with or without the explicit use of molecular, phenotype, or population signals.” (Araya spec ¶ 102)
Regarding claim 7:
the nearest variant is identified by applying similarity metrics associated with the at least one proximal variant based on the set of side information; and/or wherein the similarity metrics are weighted in relation to the set of side information.
“SMR/SMN-detection techniques 1805 can comprise a series of steps including but not limited to: (1.1) projection 1810 of phenotype-associated molecular variants 1806 in functional, sequence, structural, or (co)evolutionary dimensions (or combinations thereof), (1.2) application of spatial clustering techniques 1812 (e.g., DBSCAN) to detect clusters of spatially-proximal phenotype-associated variants” (Araya spec ¶ 113)
Spatial clustering will inherently identify proximal variants via similarity metrics, and nearest variants will impact the probability of a given variant in relation to pathogenic metrics, as evidenced by Jacquez et al. Jacquez et al. give an example of a type of spatial clustering statistic that determines probabilities based on averages of nearby data points: “For example, Moran’s global autocorrelation statistic… the term in the summation is the average within those areas immediately adjacent to the i th area.” (Jacquez pg 7 ¶ 1) Therefore, claim 7 is taught by Araya.
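For context, the global Moran's I statistic referenced in the Jacquez passage can be computed as in the following minimal sketch. The textbook form of the statistic is used with a binary adjacency weight matrix, which is an assumption made here for simplicity (the quoted passage describes neighbor averages, which corresponds to row-standardized weights). The example values and the four-area chain layout are hypothetical and are not drawn from Jacquez or Araya.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / w_total) * (cross / denom)

# Hypothetical layout: four areas in a row, each adjacent to its immediate neighbors.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], chain_weights)        # similar neighbors: about 0.333
alternating = morans_i([1, -1, 1, -1], chain_weights)  # dissimilar neighbors: about -1
```

A positive value indicates that adjacent areas tend to carry similar values and a negative value indicates that they tend to differ; the statistic measures spatial association and does not by itself assign a probability to any single point.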
Regarding claim 8:
when the similarity metrics identify at least one other variant from the collection of learned variants to have an equivalent similarity score, the at least one probability for the variant is determined by averaging each of the at least one proximal variant
“SMR/SMN-detection techniques 1805 can comprise a series of steps including but not limited to: (1.1) projection 1810 of phenotype-associated molecular variants 1806 in functional, sequence, structural, or (co)evolutionary dimensions (or combinations thereof), (1.2) application of spatial clustering techniques 1812 (e.g., DBSCAN) to detect clusters of spatially-proximal phenotype-associated variants” (Araya spec ¶ 113)
Spatial clustering will inherently determine variant probabilities based on averages of proximal variants. 
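For illustration only, the averaging step recited in claim 8 can be sketched as a nearest-neighbor lookup in which learned variants at an equal distance from the query variant are averaged. This is a sketch under stated assumptions, not the applicant's claimed implementation and not a technique disclosed in Araya: the Euclidean distance stands in for the unspecified similarity metric, and the function name, the feature vectors, and the probabilities are all hypothetical.

```python
import math


def nearest_variant_probability(query_features, learned_variants):
    """Return the probability of the nearest learned variant, averaging ties.

    query_features:   numeric feature vector (stand-in for side information)
    learned_variants: list of (feature_vector, probability) pairs
    """
    if not learned_variants:
        raise ValueError("at least one learned variant is required")

    # Assumption: Euclidean distance stands in for the similarity metric.
    distances = [math.dist(query_features, features)
                 for features, _ in learned_variants]
    best = min(distances)

    # Collect every learned variant whose distance ties with the best one.
    tied = [probability
            for distance, (_, probability) in zip(distances, learned_variants)
            if math.isclose(distance, best, abs_tol=1e-12)]

    # A single nearest variant returns its own probability; ties are averaged.
    return sum(tied) / len(tied)


# Hypothetical learned variants: (side-information features, probability)
learned = [((0.0, 0.0), 0.2), ((2.0, 0.0), 0.8), ((5.0, 5.0), 0.5)]
tie_result = nearest_variant_probability((1.0, 0.0), learned)     # two at distance 1
single_result = nearest_variant_probability((0.0, 0.0), learned)  # exact match
```

Here the first query is equidistant from two learned variants, so their probabilities are averaged, while the second query has a single nearest variant.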
Regarding claims 6-8, an invention would have been prima facie obvious to one of ordinary skill in the art at the time of the effective filing date of the invention if some suggestion in the prior art would have led that person to combine the prior art teachings to arrive at the claimed invention. There is a teaching to use spatial clustering to identify nearest variants in the text of Araya. There is a suggestion to use an inference model incorporating side information and known phenotypic impacts to determine phenotypic impacts of unknown variants. One version of this inference model could be identifying the nearest variant based on the side information and applying it as “the variant” in determining the probability as in the steps of claim 1. There would be a reasonable expectation of success in employing this technique for a person of ordinary skill in the art, as the embodiment is a simple mathematical technique that could be integrated into the calculated method steps. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time to modify the method of Araya with the suggestions of Araya, in order to improve the model in cases where there are many unknown variants.

Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Gracelyn M. Hill, whose telephone number is (571)272-9871. The examiner can normally be reached Monday-Friday 8:30-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Olivia M. Wise, can be reached at (571) 272-2249. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/G.M.H./Examiner, Art Unit 1685       

/OLIVIA M. WISE/Supervisory Patent Examiner, Art Unit 1685