Last updated: May 29, 2026
Application No. 16/460,588
MACHINE LEARNING VARIANT SOURCE ASSIGNMENT

Non-Final OA §101 §103 §112
Filed
Jul 02, 2019
Priority
Jul 05, 2018 — provisional 62/694,375
Examiner
LIU, GUOZHEN
Art Unit
1686
Tech Center
1600 — Biotechnology & Organic Chemistry
Assignee
Grail, Inc.
OA Round
5 (Non-Final)
Interview Optional

+24.7% interview lift. An interview has already been conducted in this application's prosecution history. This examiner has a 49% grant rate with a +24.7% interview lift. Because an interview has already been tried, the recommendation is a written response with narrowed claims based on precedent claim evolution patterns.
Based on 96 resolved cases, 2023–2026
Examiner Intelligence

LIU, GUOZHEN
Career Allowance Rate
49% of resolved cases granted (47 granted / 96 resolved)
-11.0% vs TC avg
Interview Lift
+24.7% (resolved cases with interview vs. without)
Avg Prosecution
4y 3m (typical timeline)
31 currently pending
Total Applications
135 across all art units (career history)
Statute-Specific Performance

§101: 33.2% (-6.8% vs TC avg)
§103: 51.5% (+11.5% vs TC avg)
§102: 3.0% (-37.0% vs TC avg)
§112: 2.4% (-37.6% vs TC avg)
TC avg is a Tech Center average estimate. Based on career data from 96 resolved cases.
Office Action

§101 §103 §112
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 8/15/2025 has been entered.
 

Election/Restrictions
Newly submitted claim 29 is directed to an invention that is independent or distinct from the invention originally claimed for the following reasons: 
New claim 29 is directed to a patentably separate and distinct invention. New claim 29 is a different method, requiring different method steps, to achieve a different result: a trained source classification model. Claim 29 is classified into G06N 5/022 (for knowledge engineering and knowledge acquisition) and G16B 40/20 (for supervised data analysis), while claims 21-22 are classified into G06N 20/00 (for machine learning). New claim 29 requires steps (such as feature engineering and feature ranking by classification power, cross-validation, model fine-tuning, etc.) and elements (claim 29 needs to extract and test many features) not required to carry out the elected invention of claims 21-22 (which is the application of a delivered classification model). Each method requires separate search and consideration of the steps and elements therein. Examining claim 29 would therefore impose a serious search and examination burden.

Since applicant has received an action on the merits for the originally presented invention, this invention has been constructively elected by original presentation for prosecution on the merits. Accordingly, claim 29 is withdrawn from consideration as being directed to a non-elected invention. See 37 CFR 1.142(b) and MPEP § 821.03.
To preserve a right to petition, the reply to this action must distinctly and specifically point out supposed errors in the restriction requirement. Otherwise, the election shall be treated as a final election without traverse. Traversal must be timely. Failure to timely traverse the requirement will result in the loss of right to petition under 37 CFR 1.144. If claims are subsequently added, applicant must indicate which of the subsequently added claims are readable upon the elected invention.
Should applicant traverse on the ground that the inventions are not patentably distinct, applicant should submit evidence or identify such evidence now of record showing the inventions to be obvious variants or clearly admit on the record that this is the case. In either instance, if the examiner finds one of the inventions unpatentable over the prior art, the evidence or admission may be used in a rejection under 35 U.S.C. 103 or pre-AIA  35 U.S.C. 103(a) of the other invention.


Status of Claims

Claims 1, 8, 10, 12, 15, 19-20 and 23-24 are cancelled.
Claim 29 is added and restricted.
Claims 2-7, 9, 11, 13-14, 16-18, 21-22 and 25-29 are pending.
Claims 2-7, 9, 11, 13-14, 16-18, 21-22 and 25-28 are examined on the merits.


Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c) is acknowledged. Priority to US application 62/694,375 filed 7/5/2018 is acknowledged.


Claim Interpretation
In claims 21-22, the recited “classifier” in “inputting the values of the covariates into a source assignment classifier to determine a source for the sequence variant being one of a plurality of possible sources, wherein the source assignment classifier includes a tree-based machine learning model that takes the covariates as input and has been trained” is interpreted as a product-by-process element, i.e., the recited “classifier” is limited according to any structure clearly required by the recited product-by-process limitation of having been “trained.” The recited “been trained” is not itself claimed and is limiting only to the extent that the structure of the “classifier” is clearly required to be limited. Regarding product-by-process limitations within a claim, MPEP 2113 pertains, as does, for example, Biogen MA, Inc. v. EMD Serono, Inc. (Fed. Cir. 9-28-2020, precedential).

Claim Objections
Claim 28 is objected to because of the following informalities: “hematopoeisis” should read as “hematopoiesis”.  Appropriate correction is required.


Claim Rejections - 35 USC § 112 –Second Paragraph
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2-7, 9, 11, 13-14, 16-18, 21-22 and 25-28 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 21-22 both end with inputting data and do not identify any source. There are no positive active method steps that require the trained ML model to act upon the input data, classify the result, and identify the source of the variant.
Claims 21-22 both provide a previously trained ML model within the source classifier. This classifier is trained outside the boundaries of the positive active steps of claim 21. The trained ML classifier is not trained with sufficient data to carry out the invention, for at least the following reasons:
The recitation of 10,000 tumor samples does not describe the types of tumors, the stages or grades of tumors, or any other relevant label.
No reference data is used for sources that are not the 10,000 tumors.
The description of the previously trained ML model does not set forth how the model acts on the data to achieve the identification and classification of a SPECIFIC source of a specific variant.
The trained ML model as described in claim 21/11 does not have the capability to determine whether a variant comes from any or all of the various sources as claimed in the dependent claims, as it was only trained on tumor data.
There is no explanation as to how any specific source is identified using the ML model. The model is PART of a source classifier; however, claims 21-22 do not set forth a classifier or a step of classification.
	Therefore:
The metes and bounds of claims 21-22 are indefinite with respect to the specific identification of Clonal Hematopoiesis (CH, claim 28). Claims 21 and 22 do not comprise any samples with labels clearly linked to CH, nor is it clear how to identify the presence of CH based merely on 2 pieces of information from the variant in the sample, and the previously trained ML model has no description of how it was trained or what the training data comprised. A person of ordinary skill in the art would not be apprised of how to take the data at hand in claims 21-22 and arrive at a determination that a variant’s source is CH.
The metes and bounds of claims 21-22 are indefinite with respect to the specific identification of blood source (Claim 6). Claims 21 and 22 do not comprise any samples with labels clearly linked to blood source, nor is it clear how to identify the presence of blood source based merely on 2 pieces of information from the variant in the sample, and the previously trained ML model has no description of how it was trained or what the training data comprised. A person of ordinary skill in the art would not be apprised of how to take the data at hand in claims 21-22 and arrive at a determination that a variant’s source is blood.
The metes and bounds of claims 21-22 are indefinite with respect to the specific identification of germline source (Claim 6). Claims 21 and 22 do not comprise any samples with labels clearly linked to germline source, nor is it clear how to identify the presence of germline source based merely on 2 pieces of information from the variant in the sample, and the previously trained ML model has no description of how it was trained or what the training data comprised. A person of ordinary skill in the art would not be apprised of how to take the data at hand in claims 21-22 and arrive at a determination that a variant’s source is germline.
The metes and bounds of claims 21-22 are indefinite with respect to the specific identification of “another source” (Claim 6). Claims 21 and 22 do not comprise any samples with labels clearly linked to another source, nor is it clear how to identify the presence of another source based merely on 2 pieces of information from the variant in the sample, and the previously trained ML model has no description of how it was trained or what the training data comprised. A person of ordinary skill in the art would not be apprised of how to take the data at hand in claims 21-22 and arrive at a determination that a variant’s source is “another” (on the other hand, “unknown source” is defined in claim 7, which does not rely on input data labeling).

 
Claim Rejections - 35 USC § 101
The instant rejection is maintained from the previous Office action. Modification is necessitated by Applicant’s amendments filed 8/15/2025. 
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 2-7, 9, 11, 13-14, 16-18, 21 and 25-28 are rejected under 35 USC 101 because the claimed inventions are directed to non-statutory subject matter. 
Step 1: Process, Machine, Manufacture or Composition  
Claims 2-7, 9, 11, 13-14, 16-18, 21 and 25-28 are directed to a 101 process, here a "method," for determining a source of a sequence variant, with functional steps like “receiving,” “creating,” “generating,” and “inputting.”
Claim 22 is directed to another 101 category, a machine or manufacture, here a “non-transitory computer-readable storage medium,” with a well-known structure.
Step 2A Prong One: Identification of an Abstract Idea
Mental processes recited in the claims include:
creating pileup reads by stitching two or more overlapping sequence reads selected from the plurality of sequence reads (claims 21-22: Under a BRI, this step can be performed in the human mind with the aid of pen and paper. Therefore, this step equates to an abstract idea of mental processes);

inputting the values of the covariates into a source assignment classifier to determine a source for the sequence variant being one of a plurality of possible sources (claims 21-22: Under a BRI, this assignment classification can be performed in the human mind with the aid of pen and paper (such as a logistic regression). Therefore, this step equates to an abstract idea of mental processes); and

detecting, based on the source of the sequence variant, presence or absence or stage of a cancer, particular types of the cancer, or presence or absence of clonal hematopoiesis (CH) (claim 28: this step can be performed in the human mind with the aid of pen and paper (like an expert decision). Therefore, this step equates to an abstract idea of mental processes);

Mathematical concepts recited in the claims include:
generating values of covariates from the sequence reads and pileup reads (claims 21-22: the generation of the five listed values all requires mathematical calculations. Therefore, this step equates to an abstract idea of mathematical concepts);
Laws of nature recited in the claims include:
detecting, based on the source of the sequence variant, presence or absence or stage of a cancer, particular types of the cancer, or presence or absence of clonal hematopoiesis (CH) (claim 28: correlating the source of the sequence variant to the presence/absence/stage of a cancer, particular types of the cancer, or presence/absence of clonal hematopoiesis is directed to a law of nature).

The above recited elements “creating,” “generating,” and “inputting” recite the steps of generating covariates (aka “feature engineering” in machine learning, claims 21-22) and inputting covariates into the machine learning model (claims 21-22). Such steps require data observation, manipulation and decision making. The data manipulation is most evident in the steps that describe how the covariates are generated (claims 21-22 and claims 2-5, 7, 13). Hence the claims recite many elements that are typical data observation and decision making processes that might require mathematical calculations.
The steps do not clearly require more than instructions for a user to manually manipulate data using mental processes and mathematical concepts. The cited elements, in their simplest embodiments, can be performed by a human with the aid of pen and paper. They match the criteria for the abstract ideas of mental processes and mathematical concepts. Claim 28 implicitly relies on the correlation between the source of the sequence variant and presence or absence or stage of a cancer, particular types of the cancer, or presence or absence of clonal hematopoiesis (CH). Claim 28 thus recites a law of nature. The claims must therefore be examined further to determine whether the claims integrate the above-identified abstract ideas into a practical application (MPEP 2106.04(d)).
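As a rough illustration of the kind of arithmetic this passage refers to, the sketch below computes a few per-variant covariates from read counts. The office action mentions sequencing depth, allele frequency, and a count of reference reads, but it does not reproduce the full covariate list of claims 21-22, so the `AlleleCounts` container, the `covariates` function, the dictionary keys, and the example counts are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Read counts at one variant position in one sample type (hypothetical container)."""
    ref_reads: int  # reads supporting the reference allele
    alt_reads: int  # reads supporting the variant allele

    @property
    def depth(self) -> int:
        # Depth is taken here as the total number of reads at the position.
        return self.ref_reads + self.alt_reads

    @property
    def allele_frequency(self) -> float:
        # Fraction of reads carrying the variant; 0.0 when there is no coverage.
        return self.alt_reads / self.depth if self.depth else 0.0


def covariates(cfdna: AlleleCounts, gdna: AlleleCounts) -> dict:
    """Return illustrative covariate values for one variant (keys are assumptions)."""
    return {
        "cfdna_depth": cfdna.depth,
        "gdna_depth": gdna.depth,
        "cfdna_af": cfdna.allele_frequency,
        "gdna_af": gdna.allele_frequency,
        "gdna_ref_reads": gdna.ref_reads,
    }


# Made-up counts: 20 of 1000 cfDNA reads and 1 of 500 gDNA reads carry the variant.
example = covariates(AlleleCounts(ref_reads=980, alt_reads=20),
                     AlleleCounts(ref_reads=499, alt_reads=1))
```

Each value is a count, a sum, or a single division, which is the sense in which the passage characterizes covariate generation as mathematical calculation.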
Step 2A Prong Two: Consideration of Practical Application
The claims culminate in determining a source for the sequence variant, which is data and information. This judicial exception is not integrated into a practical application because the claims do not meet any of the following criteria:
An improvement in the functioning of a computer, or an improvement to other technology or technical field, as discussed in MPEP §§ 2106.04(d)(1) and 2106.05(a);
Applying or using a judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition, as discussed in MPEP § 2106.04(d)(2);
Implementing a judicial exception with, or using a judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim, as discussed in MPEP § 2106.05(b);
Effecting a transformation or reduction of a particular article to a different state or thing, as discussed in MPEP § 2106.05(c); and
Applying or using the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception, as discussed in MPEP § 2106.05(e).
Step 2B: Consideration of Additional Elements and Significantly More 
The claimed method also recites "additional elements" that are not limitations drawn to an abstract idea. The recited additional elements are drawn to:
receiving a plurality of sequence reads obtained from sequencing cell free DNA (cfDNA) and genomic DNA (gDNA) from a biological sample (claims 21-22);
A non-transitory computer-readable storage medium (claim 22); and
inputting the values of the covariates into a source assignment classifier to determine a source for the sequence variant being one of a plurality of possible sources (claims 21-22).

Among these additional elements, the first group is necessary for sample/data collection, the second group (the last two elements) is associated with a generic computer, and the third group concerns a field of application.
Downloading data and re-analyzing data from a different angle is well‐understood, routine, and conventional. The data itself is described and explored by Zehir ("Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients." Nature medicine 23.6 (2017): 703-713. Cited on the 3/31/2023 IDS). Zehir provides “between January 2014 and May 2016, we obtained 12,670 tumors from 11,369 individuals for prospective MSK-IMPACT sequencing. DNA isolated from tumor tissue and, in 98% of cases, matched normal peripheral blood was subjected to hybridization capture and deep-coverage NGS to detect somatic mutations, small insertions and deletions, CNAs and chromosomal rearrangements, all of which were manually reviewed and reported to patients and physicians in the electronic medical record (Fig. 1)” (page 704, col 1, 4th para).

Inputting data with at least one covariate regarding gDNA and at least one covariate regarding cfDNA to a tree-based classifier is described and explored by Kang ("CancerLocator: non-invasive cancer diagnosis and tissue-of-origin prediction using methylation profiles of cell-free DNA." Genome biology 18.1 (2017): 1-12. Previously cited). Kang provides “We show that having a mixture of plasma cfDNAs can completely defeat standard machine learning methods for cancer type predictions when the proportion of tumor-derived DNA is lower than 50%. In contrast, CancerLocator successfully overcomes this obstacle. The poor performance of the standard methods is largely caused by their treatment of the samples in each tumor class as independent and identically distributed, following some class-specific distribution, while in our model the samples from the same class can still be very different due to different ctDNA percentages in the blood. In addition, our results show that our method is robust to CNA events, possibly because the genome-wide features outweigh the local aberrations” (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7), which suggests covariates (1)-(4) in claims 21-22.
Kang teaches using Random Forest (RF), a machine learning algorithm to predict cancer tissue origin (Table 1, pg. 7). The Random Forest classifier reads on the tree-based machine learning model.
Additionally, Hu ("Identifying circulating tumor DNA mutation profiles in metastatic breast cancer patients with multiline resistance." EBioMedicine 32 (2018): 111-118. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”) explored mutations in circulating tumor DNA, which reads on cfDNA mutations. Ainscough (“Knowledge Driven Approaches and Machine Learning Improve the Identification of Clinically Relevant Somatic Mutations in Cancer Genomics”. Washington University in St. Louis, 2017. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”) explored Random Forest classification (Fig. 4.2, page 90; Fig. 4.5, page 95; Fig. 4.7, page 99) and used cell-free DNA multiple alignments (page 157, 1st para) and genomic DNA (page 5, 2nd para) as analytic inputs.
Using a generic computer in data analysis and modeling is considered well‐understood, routine, and conventional, as this happens every day in the pertinent industry.

Hence, the recited additional elements do not constitute an inventive concept.
Therefore, the claims are not eligible under 35 USC 101.

Response to Applicant’s Arguments:
In the remarks filed 15 August 2025, Applicant argued that the claims are 101 eligible at Step 2A, Prong One and at Step 2B (page 11, 2nd para).
More specifically, Applicant argues (page 11, penultimate para through page 12, 1st para) that “‘the tree-based machine learning model comprises the covariates and corresponding coefficients trained based at least on training sequence variants obtained from cancerous tumors of at least 10,000 subjects’ recited in claim 21 cannot be characterized as reciting any judicial exceptions.” Applicant’s argument is not persuasive. The tree-based machine learning model is interpreted here as a product-by-process element; the limitations describing how the ML model was trained all occur outside the bounds of the claim and cannot provide 1) integration or 2) an inventive concept. These are not positive active method steps occurring in a step-wise fashion within the bounds of claim 21. There is no tangible action associated with the tree-based machine learning model.

Applicant further argues (page 12, 2nd para) that the tree-based machine learning model does not recite organized human activity, nor does it recite a mental process (page 12, 3rd para). Applicant’s arguments are not persuasive. As discussed above, the tree-based machine learning model is interpreted here as a product-by-process element; the limitations describing how the ML model was trained all occur outside the bounds of the claim and cannot provide 1) integration or 2) an inventive concept. These are not positive active method steps occurring in a step-wise fashion within the bounds of claim 21. There is no tangible action associated with the tree-based machine learning model.

Applicant argues that the claims are integrated into a practical application at Step 2A, Prong One (page 12, last para through page 14, 2nd para) due to a technical improvement. The argument is not persuasive. The generic computer remains unchanged after running the tree-based machine learning classifier. These claims are not limited to a computer, so Applicant cannot rely on the argument that they provide an improvement to a computer itself. Claim 21 does not set forth any computer elements. Even if a computer runs the tree-based classifier under a BRI, there is no change in or improvement to the functioning of the computer. The alleged improvement to the data labelling is also not persuasive. Steps of labelling the data (associating the tissue origin with the variants) are all classified as abstract ideas of mental processes.

Applicant argues that the claims are 101 eligible at Step 2B (page 14, 3rd-5th paras). The argument is not persuasive. As discussed above in the 101 rejection, the additional elements identified in the instant claims are classified into two groups: either 1) insignificant extra-solution activity for sample/data gathering (MPEP §2106.05(g)), or 2) providing a generic computing environment for conducting the abstract idea. Regarding the first group, MPEP 2106.05(d).II lists several activities related to DNA sequence analysis as well‐understood, routine, and conventional, or as insignificant extra-solution activity. As to the generic computer, it is one of the most routinely used pieces of equipment in the pertinent industry, and is well-known.
Hence, the 101 rejection is maintained for all the claims.

Claim Rejections - 35 USC § 103
The instant rejection is carried forward from a previous Office action (11/27/2023). Modification is necessitated by Applicant’s amendments filed 8/15/2025, and by the Product-by-Process interpretation applied to claims 21-22.
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 21-22, 2, 4, 9, 11, 13 and 24-25 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen ("Detection and localization of surgically resectable cancers with a multi-analyte blood test." Science 359.6378 (2018): 926-930. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”), in view of Kang ("CancerLocator: non-invasive cancer diagnosis and tissue-of-origin prediction using methylation profiles of cell-free DNA." Genome biology 18.1 (2017): 1-12. Previously cited), Hu ("Identifying circulating tumor DNA mutation profiles in metastatic breast cancer patients with multiline resistance." EBioMedicine 32 (2018): 111-118. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”), Ainscough (“Knowledge Driven Approaches and Machine Learning Improve the Identification of Clinically Relevant Somatic Mutations in Cancer Genomics”. Washington University in St. Louis, 2017. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”), and Zehir ("Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients." Nature medicine 23.6 (2017): 703-713. Cited on the 3/31/2023 IDS)

Claims 21-22 are drawn to receiving variants from biological sequences of samples, extracting features (here “covariates”) from the variants, and classifying the variants. The recited “the source assignment classifier includes a tree-based machine learning model that takes the covariates as input and has been trained” is interpreted as a product-by-process element, i.e., the recited “classifier” is limited according to any structure clearly required by the recited product-by-process limitation of having been “trained.” The recited “been trained” is not itself claimed and is limiting only to the extent that the structure of the “classifier” is clearly required to be limited.

Regarding claims 21-22, Cohen teaches receiving ctDNA from 153 patients for variant identification (3rd para, pg. 6); ctDNA reads on the cfDNA in the claim limitation. Cohen does not teach receiving genomic DNA. Hu teaches receiving cfDNA as well as genomic DNA (2nd para, col 2, pg. 112).
Hu provides “The BWA (version 0.7.12-r1039) tool aligned clean reads to the reference human genome (hg19), and Picard (version 1.98) marked PCR duplicates. Realignment and recalibration was performed using GATK (version 3.4–46-gbc02625). Single nucleotide variants (SNV) were called using MuTect (version 1.1.4) and NChot, a software developed in-house to review hotspot variants [28]. Small insertions and deletions (Indels) were called using GATK” (3rd para, col 2, pg. 112), which teaches multiple alignment with pileups and overlapping sequence reads.
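To make "stitching two or more overlapping sequence reads" concrete, here is a minimal sketch that merges two reads on an exact suffix/prefix overlap. It is not the method of the claims, and it is not what BWA, Picard, or GATK do internally; production pipelines work from alignment coordinates and base qualities. The function name `stitch`, the `min_overlap` parameter, and the example sequences are hypothetical.

```python
from typing import Optional


def stitch(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two reads if a suffix of `left` exactly matches a prefix of `right`.

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases exists. Toy exact-match logic, for illustration only.
    """
    shortest = min(len(left), len(right))
    # Try the longest candidate overlap first so the merge is as compact as possible.
    for size in range(shortest, max(min_overlap, 1) - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# The last four bases of the first read ("TGCA") equal the first four of the second.
stitched = stitch("ACGTTGCA", "TGCAGGTC")
unrelated = stitch("AAAAAA", "CCCCCC")
```

Here `stitched` is `"ACGTTGCAGGTC"` and `unrelated` is `None`, since the second pair shares no overlap.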
Hu provides “Significant copy number variation was expressed as the ratio of adjusted depth between ctDNA and control gDNA.” (last para, lines 9-11, col 2, pg. 112), which teaches (3) an allele frequency of the sequence variants and suggests (2) measuring the sequence depth of both cfDNA and gDNA.
 	Ainscough teaches a feature representing variant frequency that may be used in a filtering step (“VAF”, pg. 10), reading on the variant frequency (covariate number (3)). 
Kang provides “We show that having a mixture of plasma cfDNAs can completely defeat standard machine learning methods for cancer type predictions when the proportion of tumor-derived DNA is lower than 50%. In contrast, CancerLocator successfully overcomes this obstacle. The poor performance of the standard methods is largely caused by their treatment of the samples in each tumor class as independent and identically distributed, following some class-specific distribution, while in our model the samples from the same class can still be very different due to different ctDNA percentages in the blood. In addition, our results show that our method is robust to CNA events, possibly because the genome-wide features outweigh the local aberrations” (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7), which suggests covariates (1)-(4) in claim 21.
Kang teaches using Random Forest (RF), a machine learning algorithm, to predict cancer tissue origin (Table 1, pg. 7). The Random Forest classifier reads on the tree-based machine learning model because a Random Forest classifier is an ensemble of decision trees. Dynamically adding covariates, dynamically splitting tree nodes, and iteratively tuning coefficients are all internal parts of RF model training. However, in the instant claim, these processes happened before the product (the trained machine learning classification model) is used for classifying sample variants, and so are not necessarily limiting.
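The distinction drawn here between training a tree-based model and using an already trained one can be sketched in a few lines. The example below is a hedged illustration, not Kang's Random Forest and not the claimed classifier: it hard-codes three tiny "trained" trees as nested dictionaries and applies them by majority vote. The tree structure, feature names, thresholds, and source labels are all invented for the example.

```python
from collections import Counter

# Each tree is a nested dict: internal nodes test one covariate against a threshold,
# leaves carry a source label. All thresholds and labels are made up for illustration.
TREES = [
    {"feature": "gdna_af", "threshold": 0.3,
     "low": {"feature": "cfdna_af", "threshold": 0.01,
             "low": {"label": "blood"}, "high": {"label": "tumor"}},
     "high": {"label": "germline"}},
    {"feature": "af_ratio", "threshold": 0.5,
     "low": {"label": "tumor"}, "high": {"label": "blood"}},
    {"feature": "gdna_af", "threshold": 0.4,
     "low": {"label": "tumor"}, "high": {"label": "germline"}},
]


def predict_tree(node: dict, covariates: dict) -> str:
    # Walk from the root to a leaf, taking the "low" branch when value < threshold.
    while "label" not in node:
        branch = "low" if covariates[node["feature"]] < node["threshold"] else "high"
        node = node[branch]
    return node["label"]


def predict_forest(trees: list, covariates: dict) -> str:
    # Majority vote across trees; a tie goes to the label that was voted first.
    votes = Counter(predict_tree(tree, covariates) for tree in trees)
    return votes.most_common(1)[0][0]


source = predict_forest(TREES, {"gdna_af": 0.001, "cfdna_af": 0.02, "af_ratio": 0.05})
```

At inference time nothing is learned: the fixed thresholds are simply compared against the input covariates, which is why the sketch contains no training step at all.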
Zehir provides “between January 2014 and May 2016, we obtained 12,670 tumors from 11,369 individuals for prospective MSK-IMPACT sequencing. DNA isolated from tumor tissue and, in 98% of cases, matched normal peripheral blood was subjected to hybridization capture and deep-coverage NGS to detect somatic mutations, small insertions and deletions, CNAs and chromosomal rearrangements, all of which were manually reviewed and reported to patients and physicians in the electronic medical record (Fig. 1)” (page 704, col 1, 4th para), which teaches tissue biopsy samples from cancerous tumors of at least 10,000 subjects as well as matching gDNA samples.
 
Claim 22 is the storage-medium version of the method of claim 21.
Regarding claim 22, Cohen teaches computer programs in the form of supervised machine learning (pg. 6), which necessarily run on a generalized computer, to perform the methods described in claims 21-22. 
Regarding claim 2, Cohen teaches computing a numerical score for the source of the variant (Table S8). 

Regarding claim 4, Cohen teaches computing numerical scores for the possible sources of the variant (Table S8). Cohen teaches sequencing ctDNA, a type of cfDNA, for a supervised machine learning method to predict underlying cancer type in a patient (p. 6).

Regarding claims 9 and 11, Ainscough teaches a feature representing the count of reference reads (p. 22), reading on the limitation of the count of reference reads of claims 9 and 11. 

Claim 13 is drawn to an extracted feature that indicates the ratio of a variant allele's frequency in gDNA to a variant allele's frequency in cfDNA. Hu teaches comparing copy number variation expressed as the ratio of adjusted depth between ctDNA and control gDNA in a variant filtering method (p. 112). 
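For illustration only, the kind of feature recited in claim 13 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the read-count inputs, and the small pseudocount are hypothetical and are not taken from the claims, from Hu, or from the specification.

```python
def variant_allele_frequency(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads at a site that support the variant allele."""
    depth = alt_reads + ref_reads
    # Assumption: a site with no coverage is reported as frequency 0.0.
    return alt_reads / depth if depth else 0.0


def gdna_to_cfdna_vaf_ratio(gdna_alt: int, gdna_ref: int,
                            cfdna_alt: int, cfdna_ref: int,
                            pseudocount: float = 1e-6) -> float:
    """Ratio of the variant allele frequency in gDNA to that in cfDNA.

    The pseudocount is an illustrative assumption that avoids division
    by zero when the variant is absent from the cfDNA reads.
    """
    gdna_vaf = variant_allele_frequency(gdna_alt, gdna_ref)
    cfdna_vaf = variant_allele_frequency(cfdna_alt, cfdna_ref)
    return gdna_vaf / (cfdna_vaf + pseudocount)
```

Under this sketch, a ratio near 1 means the variant is about equally frequent in both sample types, while a ratio near 0 means the variant is largely absent from the gDNA.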

Regarding claim 24, Kang teaches using Random Forest, a machine learning algorithm to predict cancer tissue origin (Table 1, pg. 7). 
Regarding claim 25, Kang provides “We show that having a mixture of plasma cfDNAs can completely defeat standard machine learning methods for cancer type predictions when the proportion of tumor-derived DNA is lower than 50%. In contrast, CancerLocator successfully overcomes this obstacle. The poor performance of the standard methods is largely caused by their treatment of the samples in each tumor class as independent and identically distributed, following some class-specific distribution, while in our model the samples from the same class can still be very different due to different ctDNA percentages in the blood. In addition, our results show that our method is robust to CNA events, possibly because the genome-wide features outweigh the local aberrations” (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7), which suggests covariates (1)-(4) in claim 21.

It would have been prima facie obvious to modify the variant identification pipeline of Cohen, which uses supervised machine learning (3rd para, pg. 6), with Kang’s teachings of the random forest machine learning algorithm (p. 6) because the random forest model is a popular multi-class classification method (Kang: last para, col 1, pg 5), and Kang’s random forest algorithm (p. 6) is also a supervised machine learning algorithm.
It would have been prima facie obvious to combine the features of allele frequency of the sequence variants (last para, lines 9-11, col 2, pg. 112) and the ratio of adjusted depth between ctDNA and control gDNA in a variant filtering method (p. 112), as taught by Hu, into the combined Cohen and Kang supervised machine learning method;
It would have been prima facie obvious to combine the features of variant frequency (“VAF”, pg. 10) and the count of reference reads (p. 22), as taught by Ainscough, into the combined Cohen and Kang supervised machine learning method, because Cohen teaches that additional features could be used to increase their machine learning method's sensitivity (p. 7) and Kang suggests four additional features might affect the machine prediction power (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7).
It would have been prima facie obvious to incorporate the variants detected in over 10,000 samples (Zehir: page 704, col 1, 4th para), as taught by Zehir, into the combined Cohen and Kang supervised machine learning method, because more samples benefit model training.
One would reasonably expect success because the features are all related to sequence variants, and Kang’s model and Cohen’s model are both about variant source prediction.

Claims 3 and 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen, Kang, Hu, Zehir and Ainscough as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above, and further in view of Markou (“Novelty detection: a review-part 2: neural network based approaches”, Signal Processing, Volume 83, Issue 12, 2003, 2499-2521, ISSN 0165-1684, doi.org/10.1016/j.sigpro.2003.07.019. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”) and Konnick (“Germline, hematopoietic, mosaic, and somatic variation: interplay between inherited and acquired genetic alterations in disease assessment”. Genome Med. 2016 Oct 5; 8(1):100. doi: 10.1186/s13073-016-0350-8. PMID: 27716394; PMCID: PMC5050638. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”).

Cohen in view of Kang, Hu, Zehir and Ainscough teaches a method of receiving variants from a biological sample, extracting features from the variants, classifying the variants, and predicting the variant source using a machine learning method as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above.
Cohen in view of Kang, Hu, Zehir and Ainscough does not explicitly teach determining confidence values associated with a numerical classification score as in claims 3 and 5.
However, Markou teaches assigning error bars to the classification output for individual data points (p. 2501), reading on the limitation of confidence values in claims 3 and 5. Markou teaches testing inputs against threshold values and then categorizing the inputs, including categorizing inputs as unknown conditions (p. 2507), reading on the limitation of the unknown source variant label of claim 6.
Konnick teaches a tumor source (i.e., neoplasia), blood source (i.e., hematopoietic clones), germ line source, and another source (i.e., somatic), reading on the limitation of at least five variant source labels, including a tumor source, a germline source, a blood source, and another source of claim 6. 

Markou teaches categorizing inputs in an unknown class using a threshold (p. 2501), reading on the limitation of assigning an unknown label determined by a threshold of claim 7. 
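For illustration only, threshold-based assignment of an unknown label can be sketched as below. This is a hedged sketch, not Markou's procedure and not the claimed method: the function name, the label strings, and the default threshold of 0.6 are hypothetical assumptions.

```python
from typing import Dict


def assign_source_label(scores: Dict[str, float],
                        threshold: float = 0.6) -> str:
    """Return the best-scoring source label, or "unknown" if it is weak.

    `scores` maps each candidate source label to a numerical score.
    The threshold value is an illustrative assumption.
    """
    best_label = max(scores, key=scores.get)
    # If even the best score falls below the threshold, decline to
    # assign a known source and report the variant as unknown.
    if scores[best_label] < threshold:
        return "unknown"
    return best_label
```

A variant scored `{"tumor": 0.9, "germline": 0.05, "blood": 0.05}` would be labeled `tumor` under this sketch, while one scored `{"tumor": 0.4, "germline": 0.35, "blood": 0.25}` would be labeled `unknown`.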

It would have been prima facie obvious to modify the teachings of the combined Cohen, Kang, Hu, Zehir and Ainscough by assigning the confidence values of Markou to numerical classification scores and by including the additional variant source labels presented in Konnick because a network "trained to discriminate between a number of classes coming from a set of distributions, will be completely confused when confronted with data coming from an entirely new distribution. It is necessary for most applications for the system to output along with the classification of a data input, a measure of confidence of this decision" (Markou, p. 2501) and Cohen states that their method, "to actually establish the clinical utility ... and to demonstrate that it can save lives", needs additional source variant labels (p. 7). One would reasonably expect success as Markou’s and Konnick’s methods are effective ways to fine-tune machine learning models.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen in view of Kang, Hu, Zehir and Ainscough as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above, and further in view of Fulop (“Computational methods for annotation analysis of genetic variations”. MS thesis. Itä-Suomen yliopisto, 2015. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”).
Claim 14 is drawn to an extracted feature that indicates a category of an allele change from one allele to another allele. 
Cohen in view of Kang, Hu, Zehir and Ainscough teaches a method of receiving variants from a biological sample, extracting features from the variants and classifying the variants as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above.
Cohen in view of Kang, Hu, Zehir and Ainscough does not explicitly teach a feature indicating a category of an allele change from one allele to another allele.
However, Fulop teaches annotating variations that indicate allele change (p. 20).

It would have been prima facie obvious to modify the teachings of the combined Cohen, Kang, Hu, Zehir and Ainscough to incorporate the feature of Fulop because Cohen teaches that additional features could be used to increase their machine learning method's sensitivity (p. 7) and Kang suggests class-specific distribution features might affect the machine prediction power (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7). One would reasonably expect success because Fulop’s feature, like the other features presented by Kang, Hu and Ainscough, is related to sequence variants, and the combined machine learning model of Cohen, Kang, Hu, Zehir and Ainscough is about variant source prediction.
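For illustration only, a feature indicating the category of an allele change (claim 14) might be computed as in the sketch below. It assumes single-nucleotide substitutions and the common convention of collapsing each change onto its pyrimidine reference base, giving six categories (C>A, C>G, C>T, T>A, T>C, T>G). That convention and the function name are assumptions and are not drawn from Fulop or from the claims.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def allele_change_category(ref: str, alt: str) -> str:
    """Categorize a single-base change, e.g. G>A is reported as C>T."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError("expected two different single bases from A, C, G, T")
    # Assumption: purine references (A, G) are mapped to the opposite
    # strand so that every category starts with a pyrimidine (C or T).
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"
```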

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen in view of Kang, Hu, Zehir and Ainscough as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above, and further in view of Caria (“Orchid: a novel management, annotation and machine learning framework for analyzing cancer mutations”, Bioinformatics, Volume 34, Issue 6, 15 March 2018, Pages 936-942. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”).

Claim 26 is drawn to an extracted feature that indicates a trinucleotide context for a variant. Cohen in view of Ainscough teaches a method of receiving variants from a biological sample, extracting features from the variants, classifying the variants, and displaying the results as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above.
Cohen in view of Kang, Hu, Zehir and Ainscough does not explicitly teach a feature indicating a variant's trinucleotide context.
However, Caria teaches features that include trinucleotide contexts (p. 938).
Kang provides “We show that having a mixture of plasma cfDNAs can completely defeat standard machine learning methods for cancer type predictions when the proportion of tumor-derived DNA is lower than 50%. In contrast, CancerLocator successfully overcomes this obstacle. The poor performance of the standard methods is largely caused by their treatment of the samples in each tumor class as independent and identically distributed, following some class-specific distribution, while in our model the samples from the same class can still be very different due to different ctDNA percentages in the blood. In addition, our results show that our method is robust to CNA events, possibly because the genome-wide features outweigh the local aberrations” (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7), which suggests covariates (1)-(4) in claim 21.

It would have been prima facie obvious to modify the teachings of the combined Cohen, Kang, Hu, Zehir and Ainscough to incorporate the feature of Caria because Cohen teaches that additional features could be used to increase their machine learning method's sensitivity (Cohen: p. 7) and Kang suggests class-specific distribution features might affect the machine prediction power (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7). One would reasonably expect success because Caria’s feature, like the other features presented by Kang, Hu, Zehir and Ainscough, is related to sequence variants, and the combined machine learning model of Cohen, Kang, Hu, Zehir and Ainscough is about variant source prediction.
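For illustration only, the trinucleotide context of a variant (claim 26) is commonly understood as the reference base at the variant position together with its two immediate neighbors. The sketch below assumes a 0-based position into an in-memory reference string; the function name and input format are hypothetical and are not taken from Caria.

```python
def trinucleotide_context(reference: str, position: int) -> str:
    """Return the bases before, at, and after a 0-based position."""
    # Assumption: positions at either end of the sequence have no
    # complete three-base context and are rejected.
    if position < 1 or position > len(reference) - 2:
        raise ValueError("position has no complete trinucleotide context")
    return reference[position - 1:position + 2].upper()
```

For the reference string `ACGTAC`, position 2 (the base `G`) has the context `CGT`.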

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen in view of Kang, Hu, Zehir and Ainscough as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above, and further in view of He (“Characterization and machine learning prediction of allele-specific DNA methylation”, Genomics, Volume 106, Issue 6, 2015, Pages 331-339, ISSN 0888-7543. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”). 
Claim 16 is drawn to an extracted feature that indicates a variant overlaps a segmental duplication. 
Cohen in view of Kang, Hu, Zehir and Ainscough teaches a method of receiving variants from a biological sample, extracting features from the variants and classifying the variants as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above.
Cohen in view of Kang, Hu, Zehir and Ainscough does not explicitly teach a feature indicating if a position of one of the variants overlaps a segmental duplication. 
However, He teaches a feature indicating if a variant is or is not in segmental duplication (p. 335). 
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of the combined Cohen, Kang, Hu, Zehir and Ainscough to incorporate the feature of He because Cohen teaches that additional features could be used to increase their machine learning method's sensitivity (Cohen: p. 7) and Kang suggests class-specific distribution features might affect the machine prediction power (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7). One would reasonably expect success because He’s feature, like the other features presented by Kang, Hu, Zehir and Ainscough, is related to sequence variants, and the combined machine learning model of Cohen, Kang, Hu, Zehir and Ainscough is about variant source prediction.
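For illustration only, a feature indicating whether a variant position overlaps a segmental duplication (claim 16) reduces to an interval membership test. The sketch below assumes half-open, 0-based intervals on a single chromosome, as in BED-style annotation; the function name and the interval list are hypothetical and do not come from He.

```python
from typing import Iterable, Tuple


def overlaps_any(position: int,
                 intervals: Iterable[Tuple[int, int]]) -> bool:
    """True if position lies in any half-open interval [start, end)."""
    # Assumption: all intervals are on the same chromosome as the variant.
    return any(start <= position < end for start, end in intervals)
```

With duplicated regions `[(100, 200), (500, 650)]`, position 150 yields `True` and position 200 yields `False`, since the end coordinate is exclusive in this sketch.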

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen in view of Kang, Hu, Zehir and Ainscough as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25  above, and further in view of Acuna-Hidalgo (“Ultra-sensitive Sequencing Identifies High Prevalence of Clonal Hematopoiesis-Associated Mutations throughout Adult Life”, The American Journal of Human Genetics, Volume 101, Issue 1, 2017, Pages 50-64. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”). 
Claim 17 is drawn to an extracted feature that indicates a variant overlaps a known clonal hematopoiesis gene. 
Cohen in view of Kang, Hu, Zehir and Ainscough teaches a method of receiving variants from a biological sample, extracting features from the variants and classifying the variants as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above. 
Cohen in view of Kang, Hu, Zehir and Ainscough does not explicitly teach a feature indicating if a gene associated with a variant overlaps a known clonal hematopoiesis gene. 
However, Acuna-Hidalgo teaches a feature indicating overlap of variants with previously identified clonal hematopoiesis-driver mutations (p. 51).
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of the combined Cohen, Kang, Hu, Zehir and Ainscough to incorporate the feature of Acuna-Hidalgo because Cohen teaches that additional features could be used to increase their machine learning method's sensitivity (Cohen: p. 7) and Kang suggests class-specific distribution features might affect the machine prediction power (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7). One would reasonably expect success because Acuna-Hidalgo’s feature, like the other features presented by Kang, Hu, Zehir and Ainscough, is related to sequence variants, and the combined machine learning model of Cohen, Kang, Hu, Zehir and Ainscough is about variant source prediction.

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen in view of Kang, Hu, Zehir and Ainscough as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above,  and further in view of Paicu (“miRNA detection and analysis from high-throughput small RNA sequencing data”. Diss. University of East Anglia, 2016. Cited on the 12/1/2022 PTO-892 form “Notice of References Cited”). 
Claim 18 is drawn to an extracted feature that indicates if a variant position is overlapped by a threshold number of mapping locations. 
Cohen in view of Kang, Hu, Zehir and Ainscough teaches a method of receiving variants from a biological sample, extracting features from the variants and classifying the variants as applied to claims 21-22, 2, 4, 9, 11, 13 and 24-25 above. 
Cohen in view of Kang, Hu, Zehir and Ainscough does not explicitly teach a feature indicating if a variant position is overlapped by a threshold number of mapping locations. 
However, Paicu teaches filtering variants based on number of mapping locations (p. 68). 
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of the combined Cohen, Kang, Hu, Zehir and Ainscough to incorporate the feature of Paicu because Cohen teaches that additional features could be used to increase their machine learning method's sensitivity (Cohen: p. 7) and Kang suggests class-specific distribution features might affect the machine prediction power (last para last line, col 2, pg. 6 through 1st para, col 1, pg. 7). One would reasonably expect success because Paicu’s feature, like the other features presented by Kang, Hu, Zehir and Ainscough, is related to sequence variants, and the combined machine learning model of Cohen, Kang, Hu, Zehir and Ainscough is about variant source prediction.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GUOZHEN LIU whose telephone number is (571)272-0224. The examiner can normally be reached Monday-Friday 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Larry D Riggs can be reached at (571) 270-3062. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/GL/
Patent Examiner
Art Unit 1686



/MARY K ZEMAN/Primary Examiner, Art Unit 1686
Prosecution Timeline

Feb 18, 2025: Final Rejection mailed (§101, §103, §112)
Jul 01, 2025: Examiner Interview Summary
Aug 15, 2025: Request for Continued Examination
Aug 20, 2025: Response after Non-Final Action
Dec 03, 2025: Non-Final Rejection mailed (§101, §103, §112)
Apr 23, 2026: Interview Requested
May 04, 2026: Response Filed
May 05, 2026: Examiner Interview Summary
Precedent Cases

Applications granted by this same examiner with similar technology

18/705,594 (Patent 12626824): METHOD AND SYSTEM OF MOLECULAR TYPING AND SUBTYING CLASSIFIER FOR IMMUNE-RELATED DISEASES. Granted May 12, 2026 (2y 0m to grant).
16/244,966 (Patent 12590326): METHODS FOR FRAGMENTOME PROFILING OF CELL-FREE NUCLEIC ACIDS. Granted Mar 31, 2026 (7y 2m to grant).
17/275,994 (Patent 12535399): CELL SORTING DEVICE AND METHOD. Granted Jan 27, 2026 (4y 10m to grant).
16/337,846 (Patent 12499971): SYSTEMATIC SCREENING AND MAPPING OF REGULATORY ELEMENTS IN NON-CODING GENOMIC REGIONS, METHODS, COMPOSITIONS, AND APPLICATIONS THEREOF. Granted Dec 16, 2025 (6y 8m to grant).
18/665,197 (Patent 12393660): DNA Access Control Systems. Granted Aug 19, 2025 (1y 3m to grant).
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 49%
With Interview (+24.7%): 74%
Median Time to Grant: 4y 3m (~0m remaining)
PTA Risk: High
Based on 96 resolved cases by this examiner. Grant probability derived from career allowance rate.