DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Applicant’s amendments and arguments, filed 8/18/2025, have been entered and carefully considered, but are not persuasive.
Claims 1-3 and 5-7 are pending and under examination.
Claim Interpretation
The claims in this application are given their broadest reasonable interpretation (BRI) using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-3 and 5-7 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea (mental processes, mathematical concepts, or certain methods of organizing human activity) or a law of nature, without significantly more.
Applicant is directed to MPEP 2106 and the Federal Register notice (89 Fed. Reg. 58128-58138, No. 137, July 17, 2024) for the most current and complete guidelines for the analysis of patent-eligible subject matter. The current MPEP is the primary source for the USPTO’s patent eligibility guidance.
With respect to step (1): YES. The claims are drawn to statutory categories: processes.
With respect to step (2A)(1): YES. The claims recite an abstract idea, law of nature and/or natural phenomenon. The claims recite an abstract idea of enriching a pool of DNA samples, using a target enrichment technology, sequencing the enriched samples, and analyzing the raw sequence data to identify copy number variations. The analysis includes cleaning the data, alignment to a reference, and an iterative process for estimating copy numbers for each amplicon, including the application of data to a hidden Markov model. The iteration ceases when a condition is met. (See MPEP 2106.07(a)). The claims also embrace the natural law describing the naturally occurring correlations between naturally occurring genetic variations in a genome and a phenotype. (MPEP 2106.04). The claims explicitly recite elements that, individually and in combination, constitute one or more judicial exceptions (JE).
Mathematical concepts, mental processes, or elements in addition (EIA) in the claim(s) include:
1. (Currently Amended) A method for detecting copy-number values (CNV), wherein the detection of CNVs is integrated into a single, targeted high-throughput sequencing experiment, the method comprising the steps of:
(EIA- preamble, setting forth a method, and the goal of the method: detecting copy number variations.)
(A) enriching a pool of DNA samples with a target enrichment technology, each enriched DNA sample being associated with a library of pooled fragments from a set of amplicons/regions;
(EIA- data gathering, performing a known laboratory process: routine pooled targeted enrichment. P2 specification provides known targeting protocols and processes.)
(B) sequencing each amplicon/region with a high-throughput sequencer to generate raw sequencing data; and
(EIA- data gathering, performing routine multiplex high-throughput sequencing to generate raw data; p2 cites known prior-art sequencers and multiplex high-throughput sequencing. See also Fig 1, spec p5: raw files are in FASTQ format and are output to the genomic analyzer.)
(C) analyzing the raw sequencing data with a genomic data analyzer to determine the CNVs for each sample and each amplicon/region, comprising:
(i) cleaning the sequencing data to remove low quality bases and adapter sequences, and aligning the cleaned reads to a reference genome;
(Mental process in a computing environment, or using a computer as a tool, of observing low quality data and/or adapter sequences, and annotating or deleting data (p5 “genomic data analyzer 120” annotates/combines data); mental process of alignment to a known reference sequence, which is a comparison of text strings. P6 “alignment module 121” provides BAM/SAM files as output.)
(ii) generating from the aligned cleaned reads, with a data processing unit, a coverage count for each sample and for each amplicon/region; and
(Mathematical concept of counting the number of cleaned, aligned reads for each region/bin/amplicon; specification p7, EQ1 “the coverage count… may be approximately factorized into sample/plex-dependent and amplicon/region-dependent contributions…”; p9, EQ2 “in order to remove the sample/plex bias from the raw dataset… the coverage data may then be divided by the calculated mean…”)
(iii) repeating, with a data processing unit, over a plurality of iterations, the following steps to estimate the copy number values for each sample and for each amplicon/region:
(a) normalizing, with a data processing unit, the coverage count associated with each sample based on a prior estimate of the copy number values for each sample
wherein, if a first iteration of a plurality of iterations, the prior estimate of the copy number values is an initial estimate of the copy number values, and
wherein, if not the first iteration of a plurality of iterations, the prior estimate of the copy number values is the copy number values calculated in the course of the previous iteration;
(Mathematical concept of normalization/estimation, using an iterative process; p8 of the specification, Fig 4, element 400.)
(b) selecting, automatically, with a data processing unit, for each sample, a set of reference samples as the samples with the closest normalized coverage count to the normalized coverage count of said sample, the number of reference samples in each subset of reference samples being a function of the total number of samples and being smaller than the total number of samples,
wherein selecting a set of reference samples comprises calculating, for each sample, a distance between the coverage counts normalized both within each sample and within each amplicon/region and selecting a set of samples with coverage counts having the shortest distances as the reference samples;
(Mathematical concept of taking normalized values, making distance calculations, and sorting them based on the smallest distance; p9-11, “CNV detection module 123”, EQ3-4)
(c) estimating, using a Hidden Markov Model (HMM), the copy-number values in said sample as a function of at least the coverage counts in said sample and of at least the coverage counts in the selected set of reference samples for said sample and utilizing the estimate of the copy number values calculated over previous iterations; and
(Mathematical concept of applying data values to an (untrained) HMM, a mathematical model. P8-9, p13-14, EQ9, or prior art method (Ivakhno 2010))
(d) stopping the iteration and outputting the inferred copy-number values if the estimates of the copy-number values converge over iterations, if the estimates of the copy number values reaches a cycle over multiple iterations, or if the number of iterations reaches a pre-defined limit.
(Mathematical concept of stopping the HMM iteration when a mathematical condition is reached. P9, element 430, p14-15)
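For illustration only, the iterative estimation identified above in step (C)(iii) can be sketched in Python. The function names, the per-sample mean scaling, and the simple ratio-to-reference update standing in for the claimed HMM of step (iii)(c) are assumptions of this sketch, not the claimed or disclosed method:

```python
import numpy as np

def select_references(norm_cov, n_refs):
    # Step (iii)(b): for each sample, choose the n_refs other samples whose
    # normalized coverage profiles lie at the shortest Euclidean distance.
    refs = []
    for i in range(norm_cov.shape[0]):
        dist = np.linalg.norm(norm_cov - norm_cov[i], axis=1)
        dist[i] = np.inf  # a sample is never its own reference
        refs.append(np.argsort(dist)[:n_refs])
    return refs

def estimate_cnv(coverage, n_refs, max_iter=50, tol=1e-9):
    # Iterative loop of step (C)(iii); a ratio-to-reference update stands in
    # for the claimed HMM of step (iii)(c) (an assumption of this sketch).
    coverage = np.asarray(coverage, dtype=float)
    cn = np.full_like(coverage, 2.0)  # initial estimate: diploid everywhere
    for _ in range(max_iter):
        # (a) normalize each sample's counts using the prior CN estimate,
        # so that CNV regions do not bias the per-sample scaling factor
        scale = (coverage / (cn / 2.0)).mean(axis=1, keepdims=True)
        norm = coverage / scale
        # (b) select the nearest reference samples for each sample
        refs = select_references(norm, n_refs)
        # (c) estimate copy number as twice the ratio of the sample profile
        # to the mean profile of its reference samples (HMM placeholder)
        new_cn = np.array([2.0 * norm[i] / norm[r].mean(axis=0)
                           for i, r in enumerate(refs)])
        # (d) stop when the estimates converge over iterations
        if np.max(np.abs(new_cn - cn)) < tol:
            return new_cn
        cn = new_cn
    return cn  # iteration limit reached

# Four pooled samples over four amplicons; sample 0 carries a duplication of
# amplicon 1, and the samples differ in overall sequencing depth.
raw = [[100.0, 200.0, 100.0, 100.0],
       [ 50.0,  50.0,  50.0,  50.0],
       [200.0, 200.0, 200.0, 200.0],
       [100.0, 100.0, 100.0, 100.0]]
cn = estimate_cnv(raw, n_refs=2)
```

In this toy run the estimate stabilizes after two passes; the specification's HMM (EQ9, or the prior art Ivakhno 2010 method) would replace the simple ratio update in step (c).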
2. (Original) The method of claim 1, wherein the number of reference samples NR in each set of reference samples is given by NR= [0.25*N]+2, where N is the total number of samples.
(Mathematical concept modification, setting forth an equation for selecting the number of reference samples.)
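Reading [·] as the integer part (an assumption of this sketch; the specification may define the bracket differently), the claim 2 formula can be computed as:

```python
import math

def n_references(n_samples):
    # Claim 2: NR = [0.25 * N] + 2, with [x] read here as the integer part.
    return math.floor(0.25 * n_samples) + 2
```

For example, this gives 4 reference samples for an 8-sample batch and 6 for a 16-sample batch, always fewer than the total number of samples for N > 2, consistent with the independent claim.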
3. (Currently Amended) The method of any one of the preceding claims, further comprising: for each sample and for each amplicon/region, estimating the likelihood for each possible copy-number value.
(Mathematical concept modification, adding a likelihood calculation step.)
5. (Currently Amended) The method of claim 1, further comprising: excluding possible copy number values for which the confidence level is below a minimum threshold.
(Mental process in a computing environment, or using a computer as a tool, of observing data that does not meet a threshold and excluding it from analysis. Alternatively, a mathematical concept of comparing a set of data values to a threshold value and removing those below the threshold.)
6. (Currently Amended) The method of claim 1, wherein the estimate of the copy-number values is calculated using information on the single nucleotide polymorphism (SNP) fractions and coverage, and wherein a percentage of the SNP fractions is indicative of a duplication.
(Mathematical concept modification, defining data to be used in the estimation step.)
7. (Currently Amended) The method of claim 1, further comprising: applying a principal-component filter to the coverage count generated for each sample and for each amplicon/region.
(Mathematical concept of applying PCA to the coverage count data; a mathematical algorithm.)
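As an illustration of the kind of principal-component filtering identified here (a generic sketch; the claim does not specify how many components are removed or how the filter is implemented), one common approach removes the leading principal components of the samples-by-amplicons coverage matrix:

```python
import numpy as np

def pca_filter(coverage, n_components=1):
    # Remove the leading n_components principal components from a
    # samples x amplicons coverage matrix, stripping dominant systematic
    # (e.g. batch) effects while keeping the residual signal.
    X = np.asarray(coverage, dtype=float)
    mean = X.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    s[:n_components] = 0.0  # zero out the dominant singular values
    return (U * s) @ Vt + mean

# A rank-1 coverage matrix: every sample is a scaled copy of one profile,
# so removing the first principal component leaves only the column means.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0],
              [4.0, 8.0, 12.0]])
filtered = pca_filter(X, n_components=1)
```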
With respect to step (2A)(2): NO. The claims were examined further to determine whether they integrate any JE into a practical application (MPEP 2106.04(d)). The claimed additional elements are analyzed alone and in combination to determine whether the JE is integrated into a practical application (MPEP 2106.05(a)-(c), (e), (f), and (h)).
Claim(s) 1 recite(s) the additional non-abstract element(s) of data gathering, or a description of the data gathered.
Data gathering steps are not an abstract idea; they are extra-solution activity, as they collect the data needed to carry out the JE. The data gathering does not impose any meaningful limitation on the JE, or on how the JE is performed. The additional limitation (data gathering) must have more than a nominal or insignificant relationship to the identified judicial exception. (MPEP 2106.04/.05, citing Intellectual Ventures I LLC v. Symantec Corp., McRO, TLI Communications, OIP Techs., Inc. v. Amazon.com, Inc., Electric Power Group, LLC v. Alstom S.A.).
Claim(s) 1 recite(s) the additional non-abstract element (EIA) of a general-purpose computer system or parts thereof (the high-throughput sequencer, and the genomic data analyzer with a data processing unit).
The EIA do not provide any details of how specific structures of the computer elements are used to implement the JE.
The claims require nothing more than a general-purpose computer, or routine laboratory equipment, to perform the functions that constitute the judicial exceptions.
The computer elements of the claims do not provide improvements to the functioning of the computer itself (as in DDR Holdings, LLC v. Hotels.com LP).
They do not provide improvements to any other technology or technical field (as in Diamond v. Diehr).
Nor do they utilize a particular machine (as in Eibel Process Co. v. Minn. & Ont. Paper Co.).
Hence, these are mere instructions to apply the JE using a computer, and therefore the claims do not integrate that JE into a practical application.
Dependent claim(s) 2-3 and 5-7 recite additional abstract limitations to the JE, reciting additional mathematical concepts or mental processes. Additional abstract limitations cannot provide a practical application of the JE, as they are a part of that JE.
In combination, the limitations of data gathering, for the purpose of carrying out the JE, using a general-purpose computer merely provide extra-solution activity, and fail to integrate the JE into a practical application.
With respect to step 2B: NO. The claims recite a JE and do not integrate that JE into a practical application; they are therefore examined for an inventive concept. The judicial exception alone cannot provide that inventive concept or practical application (MPEP 2106.05). The additional elements were considered individually and in combination to determine whether they provide significantly more than the judicial exception (MPEP 2106.05(I)(A)).
With respect to claim(s) 1: The limitation(s) identified above as non-abstract elements (EIA) related to data gathering do not rise to the level of significantly more than the judicial exception.
With respect to the use of a high-throughput sequencer:
Fromer (of record) provides high-throughput sequencers, and their use to sequence enriched samples.
Amarasinghe (2013; of record) provides next-generation sequencing, including of enriched samples using a high-throughput sequencer.
Illumina (2015) provides MiSeq, a high-throughput sequencer for sequencing multiplex samples that outputs raw sequencing data, and contains onboard computing elements such as processors, memory, storage and display.
With respect to performing targeted enrichment:
Fromer provides targeted enrichment protocols.
Amarasinghe provides targeted enrichment, particularly whole exome sequencing.
Kadalayil (2014; of record) provides known targeted enrichment protocols, on samples to be sequenced by high-throughput sequencing and identification of CNV.
Illumina (2012-2016) provide targeted enrichment protocols for pooled samples, for use in their MiSeq platform sequencer.
With respect to performing sequencing, and outputting raw sequence data:
Fromer, Amarasinghe, Kadalayil, Backenroth (2014; of record), and Jiang (2015), as well as Illumina, all obtain raw sequence data from the performance of high-throughput sequencing processes on pooled enriched samples.
These elements meet the BRI of the identified data gathering limitations. As such, the prior art recognizes that this data gathering element is well-understood, routine, and conventional in the art (as in Alice Corp., CyberSource v. Retail Decisions, Parker v. Flook).
In the specification at p2-5, the steps attributed to data gathering can be carried out using commercially available kit technology such as: the Agilent SureSelect Target Enrichment system, Roche NimbleGen SeqCap EZ, Illumina Nextera® Rapid Capture, Agilent HaloPlex, Multiplicom MASTR, and Illumina TruSeq kits, and Illumina MiSeq® sequencers (p2).
The Illumina MiSeq® is a high throughput sequencer which carries out sequencing of each enriched library generated by a target enrichment protocol (Illumina; 2 published documents from 2015).
One illustration of an Illumina Enrichment protocol is the Illumina Nextera® Rapid Capture protocol (Illumina; 3 Nextera references from 2013-2016).
As these commercially available kits carry out one or more steps of the data gathering limitations for the same purpose, these limitations do not rise to the level of significantly more than the judicial exception.
High throughput sequencing as an overall process is completely routine, conventional and well-understood in the arts of genetics and bioinformatics (see citations above). The sequencers are not changed or modified by carrying out the JE. The high throughput sequencers are not a “particular machine,” as multiple types of HT sequencers are well-known in the prior art and are widely commercially available for the same process of high-throughput sequencing.
Activities such as data gathering do not improve the functioning of a computer, or comprise an improvement to any other technical field.
The limitations do not require or set forth a particular machine, they do not effect a transformation of matter, nor do they provide an unconventional step (citing McRO and Trading Technologies Int’l v. IBG).
Data gathering steps constitute a general link to a technological environment.
Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception is insufficient to provide significantly more (as discussed in Alice Corp.).
With respect to claim(s) 1: the limitations identified above as non-abstract elements (EIA) related to general-purpose computer systems do not rise to the level of significantly more than the judicial exception.
Fromer, Amarasinghe, Backenroth, and Jiang, as well as Illumina (2015), each disclose computer systems or computing elements which meet the BRI of the claimed computer system or computer system elements, comprising input, output/display, a processor, and memory.
As such, the prior art recognizes that these computing elements are well-understood, routine, and conventional in the art.
These elements do not improve the functioning of the computer itself, or comprise an improvement to any other technical field (Trading Technologies Int’l v IBG, TLI Communications).
They do not require or set forth a particular machine (Ultramercial v. Hulu, LLC., Alice Corp. Pty. Ltd v. CLS Bank Int’l).
They do not effect a transformation of matter, nor do they provide an unconventional step.
Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception is insufficient to provide significantly more (as discussed in Alice Corp., CyberSource v. Retail Decisions, Parker v. Flook, Versata Development Group v. SAP America).
Dependent claim(s) 2-3 and 5-7 each recite a limitation requiring additional mathematical concepts or mental processes. Additional abstract limitations cannot provide significantly more than the JE, as they are a part of that JE (MPEP 2106.05).
In combination, the data gathering steps providing the information to be acted upon by the JE, performed on a generic computer or in a generic computing environment, fail to rise to the level of significantly more than that JE. The data gathering steps provide the data for the JE, which is carried out by the general-purpose computers. No non-routine step or element has been clearly identified.
The claims have all been examined to identify the presence of one or more judicial exceptions. Each additional limitation in the claims has been addressed, alone and in combination, to determine whether the additional limitations integrate the judicial exception into a practical application. Each additional limitation in the claims has been addressed, alone and in combination, to determine whether those additional limitations provide an inventive concept which provides significantly more than those exceptions. For these reasons, the claims, when the limitations are considered individually and as a whole, are rejected under 35 USC § 101 as being directed to non-statutory subject matter.
New Citations:
Illumina (2015) Sequencing power for every scale. 16 pages. Discloses the MiSeq®, HiSeq®, and NextSeq® high-throughput sequencers.
Illumina (2015) MiSeq® system specification sheet. 4 pages. Details the MiSeq® system, including the onboard computing aspects (Instrument control computer (internal) p2).
Illumina (2015) Nextera® Rapid Capture Enrichment Protocol Guide. 18 pages, detailing how targeted enrichment is carried out on samples.
Illumina (2016) Nextera® Rapid Capture Enrichment Reference Guide, 60 pages. Revision history illustrates capabilities added between 2013 and 2016. The disclosure includes quantifying enriched targets in the Appendix.
Illumina (2012-2014) Nextera® Rapid Capture Exomes data sheet, 4 pages. Describing the workflow for target enrichment prior to sequencing.
Copies of these disclosures have been provided in parent application 15/777,091.
Applicant’s arguments:
Applicant’s arguments and amendments have been carefully considered, and compared to the guidance provided by MPEP 2106.
Applicant argues the categorization or identification of abstract ideas and/or a natural law in the claims. The Examiner has specifically identified each limitation in the claims and what category of judicial exception is encompassed. The abstract ideas identified in the independent claims are the same as those identified as mathematical correlations, mathematical calculations, and mathematical relationships, or as mental processes (concepts performed in the human mind, including observations, evaluations, judgments, and opinions), in MPEP 2106.04.
The examiner acknowledges Applicant’s arguments which set forth that the claims lead to an improvement in the technology of “automatic CNV detection for use in clinical settings”. According to the guidance set forth in MPEP 2106, this is an improvement to the judicial exception itself, and is not reflected back into a specific technological environment or practically applied process.
An improvement in the judicial exception itself is not an improvement in the technology. For example, in In re Board of Trustees of Leland Stanford Junior University, 989 F.3d 1367, 1370, 1373 (Fed. Cir. 2021) (Stanford I), Applicant argued that the claimed process was an improvement over prior processes because it “yields a greater number of haplotype phase predictions,” but the Court found it was not “an improved technological process” and instead was an improved “mathematical process.” The court explained that such claims were directed to an abstract idea because they describe “mathematically calculating alleles’ haplotype phase,” like the “mathematical algorithms for performing calculations” in prior cases. Notably, the Federal Circuit found that the claims did not reflect an improvement to a technological process, which would have rendered the claims eligible (89 Fed. Reg. 58137, No. 137, July 17, 2024). Here, the improvement is in the calculation of the CNV values, by the steps identified as the JE.
Applicant argues that the improvement to the technology or technological process is that the claimed method as a whole addresses challenges in the field, including correcting for coverage bias, standardization, detection of rare CNVs in a multiplex process, correction for the variable quality of samples and sequence reads, and analysis of a small number of samples per batch. The steps which provide these alleged improvements, however, are the same steps identified as providing the abstract ideas.
Applicant argues that the claimed process allows for improvement in the identification of SNPs, INDELs, and CNVs; however, the claims do not clearly identify or use SNP/INDEL data in the sample or references.
The pooling of samples, the enrichment of certain portions of DNA in the samples, and the sequencing of the samples, even in multiplex, do not provide an improvement in the technology of CNV detection. The pooling, enrichment, and sequencing steps are carried out, unchanged, whether or not the judicial exception is applied (Cleveland Clinic Foundation: using well-known or standard laboratory techniques is not sufficient to show an improvement; MPEP 2106.05(a)). The pooling, enrichment, and sequencing steps, as well as the high-throughput sequencer, were shown to be well-known and standard laboratory techniques (Illumina materials and other cited art).
In the claims, the EIA identified as data gathering steps do not affect how the steps of the abstract idea are performed; they provide the data which is acted upon by the limitations of the JE. These data gathering steps do not apply, rely on, or use the steps identified as making up the JE. Rather, the steps of the JE avail themselves of the data gathered. The data gathering in the claims constitutes insignificant pre-solution activity. See MPEP § 2106.05(g):
MPEP 2106.05(g): “The term ‘extra-solution activity’ can be understood as activities incidental to the primary process or product that are merely a nominal or tangential addition to the claim...”
“An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered information by a series of steps in order to detect whether the transactions were fraudulent.”
See also CyberSource Corp. v. Retail Decisions, Inc., 654 F.3d 1366, 1372 (Fed. Cir. 2011) ("[E]ven if some physical steps are required to obtain information from the database ... such data-gathering steps cannot alone confer patentability.").
The improvement in automated CNV detection (achieved by the judicial exception) does not require a non-conventional interaction with a specific element of a computer as was required in Enfish. The high-throughput sequencer, the genome analyzer, and the data processor as claimed do not have non-routine interactions with the data, nor are any particular data structures employed, nor particular improvements made in how the computer works.
The disputed claims in Enfish were patent-eligible because they were "directed to a specific improvement to the way computers operate, embodied in [a] self-referential table." Enfish, 822 F.3d at 1336. The court found that the "plain focus of the claims" there was on an improvement to computer functionality itself (a self-referential table for a computer database, designed to improve the way a computer carries out its basic functions of storing and retrieving data), not on a task for which a computer is used in its ordinary capacity. Id. at 1335-36. The court noted that the specification identified additional benefits conferred by the self-referential table (e.g., increased flexibility, faster search times, and smaller memory requirements), which further supported the court's conclusion that the claims were directed to an improvement of an existing technology. Id. at 1337 (citation omitted).
The improvement in the automated CNV detection (carried out by the judicial exception) does not improve the functionality of the computer itself as in Finjan, Visual Memory, or SRI Int’l. In Finjan, claims to virus scanning were found to be an improvement in computer technology. In Visual Memory, claims to an enhanced computer memory system were found to be directed to an improvement in computer capabilities. In SRI Int'l, claims to detecting suspicious activity by using network monitors and analyzing network packets were found to be an improvement in computer network technology.
The improvement in the automated detection of CNVs (carried out by the JE) does not provide an improvement in computer animation, nor does it use rules to automate a subjective human task to create a sequence of synchronized, animated characters, as in McRO. In McRO, it was not the mere presence of unconventional rules that led to patent eligibility. In McRO, "[t]he claimed improvement was to how the physical display operated (to produce better quality images)." SAP Am. v. InvestPic, LLC, 898 F.3d 1161, 1167 (Fed. Cir. 2018). The claims in McRO recited a step of applying the data sets generated using the specific claimed rules to a sequence of animated characters to produce lip synchronization and facial expression control of those animated characters. McRO, 837 F.3d at 1308. Thus, the claims were directed to an improvement in computer animation and used rules to automate a subjective task of humans to create a sequence of synchronized, animated characters. Id. at 1314-15.
In the claims at issue here, there is no such application of specifically claimed rules to produce an improved technological result.
The listed claims set forth the elements in addition (EIA) to the JE of a high-throughput sequencer, a “genome analyzer,” and a “data processor.” The high-throughput sequencer has no particular claimed qualities beyond the ability to sequence samples and provide raw sequence read data. The high-throughput sequencer acts completely as it was designed, even when considering multiplex or pooled samples (Illumina MiSeq® publications). Carrying out the JE, whether onboard the MiSeq equipment or on a networked computer, does not make an improvement to the sequencer itself, nor change any functionality of either element. With respect to the “genome analyzer” and “data processor,” these computing limitations are recited at such a high level of generality that they can be met by a general-purpose computer system and are not considered a particular machine or manufacture integral to the claim (MPEP 2106.05(b)). The genome analyzer and data processor are tools which implement the steps and calculations of the claims. Routine computer elements acting upon the data in a manner consistent with and according to their design are not considered sufficient to provide eligibility (see, for example, MPEP 2106.04(d), discussing Gottschalk v. Benson, which “held that simply implementing a mathematical principle on a physical machine, namely a computer, was not a patentable application of that principle”).
The processor is recited generically as simply a "processor" and the activity performed by that generic processor is normal computer functionality. The requirement of using a computer processor is not sufficient to establish integration of the JE into a practical application. One of the "examples in which a judicial exception has not been integrated into a practical application" is when "[a]n additional element ... merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea." Guidance 84 Fed. Reg. at 55 (emphasis added); FairWarning, 839 F.3d at 1096 ("[T]he use of generic computer elements like a microprocessor or user interface do not alone transform an otherwise abstract idea into patent-eligible subject matter.").
That the claimed system may result in faster and more accurate identifications in large data sets does not take the claim out of the realm of the abstract.
"[R]elying on a computer to perform routine tasks more quickly or more accurately is insufficient to render a claim patent eligible." OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363 (Fed. Cir. 2015); see also Intellectual Ventures I LLC v. Erie Indemnity Co., 711 F. App'x 1012, 1017 (Fed. Cir. 2017) (unpublished) ("Though the claims purport to accelerate the process of finding errant files and to reduce error, we have held that speed and accuracy increases stemming from the ordinary capabilities of a general-purpose computer 'do[] not materially alter the patent eligibility of the claimed subject matter.'"); see also Bancorp Servs., L.L.C. v. Sun Life Assurance Co. of Can. (U.S.), 687 F.3d 1266, 1278 (Fed. Cir. 2012) ("[T]he fact that the required calculations could be performed more efficiently via a computer does not materially alter the patent eligibility of the claimed subject matter.").
Claims that recite performing information analysis (e.g., automatically identifying CNVs present in pooled sample data), as well as the collection and manipulation of information related to such analysis, have been determined by our reviewing court to be an abstract concept that is not patent eligible. See SAP, 898 F.3d at 1165, 1167, 1168 (claims reciting "[a] method for providing statistical analysis" (id. at 1165) were determined to be "directed to an abstract idea" (id. at 1168)); see also Content Extraction & Transmission LLC v. Wells Fargo Bank, Nat'l Ass'n, 776 F.3d 1343, 1345, 1347 (Fed. Cir. 2014) (finding the "claims generally recite ... extracting data ... [and] recognizing specific information from the extracted data" and that the "claims are drawn to the basic concept of data recognition").
"As many cases make clear, even if a process of collecting and analyzing information is limited to particular content or a particular source, that limitation does not make the collection and analysis other than abstract." SAP, 898 F.3d at 1168 (internal quotation marks omitted).
Further, with respect to the arguments regarding the alleged improvement, it is unclear that the independent claims recite all the necessary and sufficient steps required to achieve that improvement. MPEP 2106.05(a): “An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome. McRO, 837 F.3d at 1314-15, 120 USPQ2d at 1102-03; DDR Holdings, 773 F.3d at 1259, 113 USPQ2d at 1107.”
The MPEP sets forth that “if the examiner concludes the disclosed invention does not improve technology, the burden shifts to applicant to provide persuasive arguments supported by any necessary evidence to demonstrate that one of ordinary skill in the art would understand that the disclosed invention improves technology. Any such evidence submitted under 37 CFR 1.132 must establish what the specification would convey to one of ordinary skill in the art and cannot be used to supplement the specification.” Applicant’s arguments cannot take the place of evidence.
New Grounds of Rejection
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3 and 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Plagnol in view of Agilent, Chen, and Fromer.
Plagnol, V. et al. (2012) A robust model for read count data in exome sequencing experiments and implications for copy number variant calling. Bioinformatics, vol. 28, no. 21, pp. 2747-2754, with supplementary information. (herein: Plagnol)
Agilent Technologies (2012) SureSelect RNA Target Enrichment for Illumina Paired-End Sequencing Protocol, version 2.2.1, Feb. 2012. 76 pages. (herein: Agilent)
Chen, C. et al. (2014) Software for pre-processing Illumina next-generation short read sequences. Source Code for Biology and Medicine, vol. 9:8, 11 pages. (herein: Chen)
Fromer, M. (24 April 2014) Using XHMM software to detect copy number variation in whole exome sequencing data. Current Protocols in Human Genetics, vol. 81: 7.23.1-7.23.21. (herein: Fromer)
The claims have been amended to recite steps of obtaining the sample, enriching the sample using target enrichment, sequencing using a high-throughput sequencer to generate raw sequencing data, cleaning the data, aligning the cleaned reads to a reference, generating coverage counts, then carrying out the statistical analysis. The analysis normalizes the coverage counts based on prior estimates of the copy number for each region in an iterative process, automatically selecting a set of references that have the closest normalized coverage counts to the samples, estimating the copy number values iteratively, and stopping the iteration when certain conditions are met.
Plagnol provides the ExomeDepth program package to identify and quantify copy number values/variations in enriched DNA samples. Plagnol notes a deficiency in prior-art algorithms for this purpose in the abstract:
“A typical read depth strategy consists of using another sample (or a combination of samples) as a reference to control for the variability at the capture and sequencing steps. However, technical variability between samples complicates the analysis and can create spurious CNV calls. Here, we introduce ExomeDepth, a new CNV calling algorithm designed to control for this technical variability. ExomeDepth uses a robust model for the read count data and uses this model to build an optimized reference set in order to maximize the power to detect CNVs.”
With respect to claim 1 and step A, Plagnol enriches a pool of DNA samples with a target enrichment technology, as set forth in section 3.1. Plagnol obtained DNA samples from patients, which were pooled into two batches, and carried out exome target enrichment using Agilent SureSelect kits. Plagnol does not specifically describe the process as "being associated with a library of pooled fragments from a set of amplicons/regions," but that is what the SureSelect technology provides.
Agilent provides the full protocol for the SureSelect target enrichment technology, for the purpose of creating libraries of polynucleotides to be sequenced by high-throughput sequencers. Figure 1 of Agilent shows the overall workflow for SureSelect targeted enrichment. Section 4 of Agilent discusses how the resulting library of amplified fragments is obtained, assessed, and pooled for the purpose of multiplex sequencing in a high-throughput sequencer. Once the pooled library is obtained, Agilent directs the user to proceed with the appropriate Illumina protocols (p. 64).
With respect to claim 1 and step B, Plagnol sequences each amplicon/region with a high-throughput sequencer to obtain high-throughput sequencing data (sequence reads, each with its own associated metadata) in section 3: "Samples in both datasets have been sequenced using Illumina HiSeq with 94bp reads." Sequence read data includes the polynucleotide sequence read, sequence read quality scores, base quality scores, sequence read length, etc. A summary of the data obtained is present in Plagnol supplemental table 1.
With respect to claim 1 and step C (1), Plagnol provides some details as to quality control analysis of the sequence read data, indicating the use of various databases, programs, and bioinformatic tools in section 3.3. These include analysis of GC content, extracting read count information, selecting “consistent” paired-end reads, and the use of Phred scaled mapping quality values greater than a threshold. Plagnol does not specifically teach “cleaning the raw sequencing data to remove low quality bases and adapter sequences.”
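The mapping-quality filtering described above (retaining reads whose Phred-scaled mapping quality exceeds a threshold) can be sketched as follows; the read records and threshold value are illustrative and are not taken from ExomeDepth:

```python
# Illustrative sketch of Phred-scaled mapping-quality filtering, in the
# spirit of Plagnol section 3.3. The read records and the threshold value
# below are examples, not the paper's actual data or code.

def filter_by_mapq(reads, mapq_threshold=20):
    """Keep only reads whose Phred-scaled mapping quality exceeds the threshold."""
    return [r for r in reads if r["mapq"] > mapq_threshold]

reads = [
    {"name": "read1", "mapq": 60},
    {"name": "read2", "mapq": 0},   # ambiguous mapping: excluded
    {"name": "read3", "mapq": 37},
]
kept = filter_by_mapq(reads, mapq_threshold=20)
```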
Chen provides software for the purpose of analyzing raw sequencing data from high-throughput sequencing experiments, such as those provided by Illumina, including analysis of base quality scores, trimming reads, and removal of adapter sequences from sequence reads. This is necessary in the analysis of raw sequence data, as stated by Chen:
“next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of their downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data.” (abstract)
Chen provides their tool, ngsShoRT, and compares it to other well-known tools also used to analyze and clean raw sequence read data from high throughput experiments, including CutAdapt, NGS QC toolkit, and Trimmomatic. Chen points out that for genomic structural variant detection, correct mapping to the reference genome depends heavily on read quality. “Therefor cleaning up raw sequencing reads can improve the accuracy and performance of alignment tools.” (Background, p2).
The program of Chen takes the output from the high throughput sequencer, in a variety of formats:
“ngsShoRT takes Single-Read (SR), Paired-End (PE), and Mate-Pair (MP) FastQ or Illumina’s native QSEQ format sequence files as input (compressed files are supported also) and runs them through a set of independent preprocessing algorithms including adapter/primer sequence removal, homopolymer sequence removal, Illumina QSEQ specific methods, reads with “N” bases removal/splitting, quality score based trimming, and 5' or 3'-end bases trimming. Outputs include a set of SR or PE/MP reads in FastQ format and a detailed summary statistics report. Using ngsShoRT to pre-process short read sequences is usually an iterative process: raw reads are trimmed by one of the ngsShoRT methods and the output can be used as an input to another ngsShoRT method. The end product is a trimmed data set ready for incorporation into various downstream assembly and data analysis pipelines.”
Chen provides the type of pre-processing performed by their software, in the “Design principles” section, beginning at page 2.
“There are several types of potential errors in NGS reads: un-called “N” bases, sequencing artifacts (usually platform specific PCR primers, linkers and adaptors) and low quality bases. Errors are more likely to occur at the 3′-ends of a read from Illumina sequencing technology [2]. Many de novo genome assembly projects [15-21] included pre-processing steps for removing reads with uncalled “N” bases, and detecting and removing of adaptor sequences using exact string matching algorithms to search for user-specified adaptor sequences. However, exact string matching may fail to detect all adapter sequences because of sequencing errors. A customizable approximate matching algorithm is more desirable in this case.”
Improvements provided by Chen include keeping paired end reads together, even if one half fails a quality test, limiting trimming of reads to be greater than a user selected K-mer length, backwards compatibility with prior formats of sequencing data, and issues of scalability. Each algorithm of Chen is then discussed in detail, as to what it performs, and how it acts on raw sequence data.
With respect to claim 1 and the items of cleaning to remove low-quality bases, nperc(p), ncutoff(n), and nsplit(l) each act to identify, filter, and trim low-quality base calls ("N") in raw sequence read data.
“We developed this method to remove “N” bases from a read without having to filter out the entire read and lose its information. nperc, ncutoff and nsplit are important for removing “N” bases from a read because they are usually associated with low quality score bases, and DBG assemblers either discard reads with such bases [33] or simply convert them to an arbitrarily chosen nucleotide such as “A” [22].” (p5)
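The "N"-base handling described in the quoted passage can be sketched as follows; the parameter names follow Chen's description, but the implementation is an illustration, not the published ngsShoRT code:

```python
# Illustrative sketch of "N"-base handling in the spirit of ngsShoRT's
# nperc and nsplit methods (Chen). Parameter names follow the paper's
# description; the implementation is a stand-in, not the published code.

def nperc_filter(seq, p=10.0):
    """Discard (return None for) a read whose percentage of 'N' bases exceeds p."""
    frac = 100.0 * seq.count("N") / len(seq)
    return None if frac > p else seq

def nsplit(seq, l=20):
    """Split a read at 'N' bases and keep only fragments of at least length l."""
    return [frag for frag in seq.split("N") if len(frag) >= l]
```

This keeps the information in a partially good read rather than discarding it wholesale, which is the rationale Chen gives for these methods.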
Low-quality base trimming based on quality scores is provided in Chen by LQR(lqs,p), Mott(ml), and TERA(avg).
“LQR [lqs, p] trims a “low quality” read using a low quality score (lqs) cutoff for individual bases, and a percentage cutoff (p) to limit the number of such bases in a read. LQR filters out a read with over p% of bases whose quality score is under lqs. It is similar to the algorithms used in other pre-processing tools [16,26,34].
Mott [ml] is a quality window extraction algorithm that trims from both the 5'- and 3'-ends of a read. Starting at the 3'-end of a read, it counts the running sum of (ml - Perror) values, RSMLP, for each base (Perror of a base = 10^(-qualityscore/10)) in the read, and extracts the string from the first base with RSMLP>0 to the base with the highest RSMLP." (p5)
“Unlike 3end, TERA trims the 3'-end of a read differently depending on its bases’ quality scores. Starting at the 3'-end, the running average quality score (RAQS) of each base is calculated until it exceeds a cutoff, avg, at the base X. All bases 3' to X are then discarded. A good read with high quality (above avg) bases at its 3'-end would have fewer bases trimmed by TERA, while a low quality read might have more bases trimmed.” (p6)
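The quality-score-based 3'-end trimming described in the quoted passages can be sketched as follows, in the spirit of the TERA method; this is an illustrative reimplementation from Chen's description, not the ngsShoRT source:

```python
# Illustrative sketch of TERA-style 3'-end trimming (Chen): scan from the
# 3' end computing a running average quality score; once the average
# exceeds the cutoff `avg` at base X, discard all bases 3' of X.
# Reimplemented from the paper's description, not the ngsShoRT source.

def tera_trim(seq, quals, avg=20.0):
    total = 0.0
    # walk from the 3' end toward the 5' end
    for i, q in enumerate(reversed(quals), start=1):
        total += q
        if total / i > avg:
            cut = len(seq) - i + 1   # keep base X itself, drop bases 3' of it
            return seq[:cut], quals[:cut]
    return "", []  # whole read falls below the cutoff

trimmed = tera_trim("ACGTACG", [30, 30, 30, 30, 30, 30, 5], avg=20.0)
```

Consistent with Chen's description, a read with high-quality 3'-end bases loses few bases, while a low-quality read is trimmed more aggressively.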
With respect to claim 1 and removal of adapter sequences, Chen provides 5adpt. 5adpt takes a list of provided adapter sequences and removes them from the raw sequence read using approximate matching.
“5adpt [mp, list, approx_match_modifiers, search_depth, action] detects (at a match percentage mp and up to a depth of search_depth) 5'-adaptor/primer sequences loaded from a list (which can be built-in Illumina primer library, and/or user-provided sequences) and removes them from a read. 5adpt allows users to do approximate matching using the Levenshtein edit distance implementation in CPAN’s String::Approx module [29]. This module allows approximate matching using a simple percentage cutoff or detailed modifiers (number of allowed insertions, substitutions, and deletions). This feature, accessible through the approx_match_modifiers option, allows 5adpt to be modified to fit platform specific features and error profiles.” (p4)
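The approximate-matching adapter removal described in the quoted passage can be sketched as follows; ngsShoRT itself uses CPAN's String::Approx, so the edit-distance routine and the adapter sequence below are illustrative stand-ins:

```python
# Illustrative sketch of 5'-adapter removal by approximate matching, in
# the spirit of Chen's 5adpt method. ngsShoRT relies on CPAN's
# String::Approx; the Levenshtein routine here is a stand-in.

def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def remove_5prime_adapter(read, adapter, mp=90.0):
    """Strip the adapter from the 5' end if it matches at >= mp percent identity."""
    prefix = read[:len(adapter)]
    max_edits = len(adapter) * (100.0 - mp) / 100.0
    if edit_distance(prefix, adapter) <= max_edits:
        return read[len(adapter):]
    return read
```

Approximate rather than exact matching tolerates sequencing errors inside the adapter, which is the advantage Chen cites over exact string matching.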
In the comparisons performed by Chen, their pre-processing algorithms had a high level of performance:
“In the three categories of algorithms we compared, ngsShoRT algorithms removed more low quality bases, improved the overall quality scores of the trimmed data, and required less time and RAM to run.” (p7)
Following the raw-data quality processing, Plagnol aligned the cleaned sequence reads to a human genome reference sequence dataset, hg19, using Novoalign software. (Section 3.2)
With respect to claim 1 step C (2), Plagnol extracts a coverage count from the data, for each amplicon. ExomeDepth extracts and provides the coverage count from sequence read data, as it identifies the number of sequence reads which align to each identified amplicon. “Generally speaking, read depth-based approaches for CNV calling compare the number of reads mapping to a chromosome window with its expectation under a statistical model. Deviations from this expectation are indicative of CNV calls.” (Introduction).
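The extraction of a per-amplicon coverage count can be sketched as follows; the region coordinates and read positions are illustrative, and ExomeDepth performs this counting internally from aligned (BAM) input:

```python
# Illustrative sketch of per-amplicon coverage counting: the number of
# aligned reads whose start position falls within each target region.
# Regions and read positions below are examples; ExomeDepth derives these
# counts internally from aligned (BAM) input.

def coverage_counts(read_starts, regions):
    """regions: list of (name, start, end); counts reads starting in each region."""
    counts = {name: 0 for name, _, _ in regions}
    for pos in read_starts:
        for name, start, end in regions:
            if start <= pos < end:
                counts[name] += 1
                break
    return counts
```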
With respect to claim 1, step C (3) (a), Plagnol provides, repeatedly over a set of iterations, steps for estimating the copy number values for each sample, for each amplicon, as follows.
Plagnol provides a) normalizing the coverage count associated with each sample at page 2748, section 2.1: "An overview of a normalized measure of read depth [matching fragments per million reads and per kilobase, FPKM, (Mortazavi et al., 2008), Fig. 1A] showed extensive exon–exon variability."
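The FPKM normalization cited above (Mortazavi et al., 2008) can be sketched as follows; the input values are illustrative:

```python
# Illustrative sketch of FPKM normalization (Mortazavi et al., 2008):
# matching fragments per kilobase of target region, per million mapped
# reads. The input values are examples only.

def fpkm(count, region_len_bp, total_reads):
    """Normalize a raw coverage count by region length (kb) and library size (millions)."""
    return count / ((region_len_bp / 1e3) * (total_reads / 1e6))
```

Normalizing by both region length and library size makes coverage comparable across exons and across samples, which is what permits the cross-sample correlation analysis that follows.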
Plagnol then b) automatically selects a set of reference samples with the closest normalized coverage, with the number of reference samples being less than all samples, in section 2.1.
“However, a comparison between pairs of exome datasets (Fig. 1A) demonstrates the high level of correlations of the normalized read count data across samples (squared FPKM correlation coefficients 0.98–0.988 among 15 exomes in Dataset 1 and 0.72–0.987 for the 9 exomes in Dataset 2). It is therefore possible to use one exome or combine several exomes to construct a reference set to base the CNV inference on.”
A set of reference samples is automatically selected with the closest normalized coverage count by calculating distance values (FPKM squared correlation coefficient). Plagnol calculates the Rs statistic as the ratio between the typical distances between the exome sample and the first reference sample (section 2.1). Next, in an iterative process, for each exome sample, the reference samples are compared to the exome sample, ranked, and added to the reference set by lowest distance (highest correlation) to the exome sample.
“At each iteration we fit our robust model and compute the expected value of the posterior probability in favour of a single-exon heterozygous deletion call. This process of adding samples to the aggregate reference stops once the posterior probability stops to increase. This optimization is essentially a trade-off between limiting the variance (by increasing the size of the reference set) and increasing the bias (by adding exome samples to the reference in spite of being less correlated). In Dataset 1 (Fig. 2A) we found that the optimum size of the reference set was ~10. In several instances adding further samples in the reference set actually decreased the power.” P2749.
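The greedy reference-set construction described in the quoted passage can be sketched as follows; ExomeDepth's actual stopping objective is the expected posterior probability of a single-exon heterozygous deletion call, for which the simple correlation objective below is a stand-in:

```python
# Illustrative sketch of Plagnol's greedy reference-set construction:
# candidates are ranked by correlation with the test sample and added one
# at a time while an objective improves. ExomeDepth's real objective is
# the expected posterior probability of a single-exon heterozygous
# deletion call; the correlation objective here is a simplified stand-in.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def build_reference_set(sample, candidates):
    # rank candidate references by correlation with the sample (highest first)
    ranked = sorted(candidates, key=lambda c: pearson(sample, c), reverse=True)
    chosen, best = [], float("-inf")
    aggregate = [0.0] * len(sample)
    for cand in ranked:
        trial = [a + c for a, c in zip(aggregate, cand)]
        score = pearson(sample, trial)   # stand-in objective
        if score <= best:
            break                        # stop once the objective stops increasing
        chosen.append(cand)
        aggregate, best = trial, score
    return chosen
```

As in Plagnol, the loop trades off variance (a larger reference set) against bias (adding less-correlated exomes), stopping when adding further samples no longer helps.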
Plagnol discloses c) estimating the copy number values as a function of at least the coverage counts in the samples and references. Plagnol normalizes the coverage count associated with each amplicon/region/exon based on the normalized coverage count of the current sample and the normalized count of the set of reference samples, as set forth in section 3.6, pp. 2752-3. The estimate of the initial coverage count for each exon is based on the coverage counts of the sample and the selected set of reference samples.
Plagnol does not utilize a Hidden Markov Model to make the estimations.
Fromer, in the same field of bioinformatics and the identification of copy number variations in a sample, uses a hidden Markov model (HMM) to estimate copy number values as a function of the coverage.
“The XHMM (eXome-Hidden Markov Model) software was designed to recover information on CNVs from targeted exome sequence data (Fromer et al., 2012), and allows researchers to more comprehensively understand the association between genetic copy number and disease. The key steps in running XHMM are depth of coverage calculations, data normalization, CNV calling, and statistical genotyping (Fig. 7.23.2). The calling and genotyping stages provide extensive quality metrics that are geared toward a range of analyses that require varying degrees of filtering of putative signal from noise. This paper provides detailed instructions for running XHMM and gives examples and instructions for analyses that are possible using the CNV calls and results from XHMM.” (Introduction.)
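The HMM-based CNV calling described in the quoted passage can be sketched as follows with a minimal three-state (deletion/diploid/duplication) Viterbi decode over per-target coverage z-scores; the emission means, transition probabilities, and z-score inputs are illustrative, not XHMM's fitted parameters:

```python
# Illustrative three-state HMM for CNV calling in the spirit of XHMM
# (deletion / diploid / duplication), decoding per-target coverage
# z-scores with the Viterbi algorithm. All parameters below are
# illustrative stand-ins, not XHMM's fitted values.
import math

STATES = ["DEL", "DIP", "DUP"]
MEANS = {"DEL": -3.0, "DIP": 0.0, "DUP": 3.0}   # expected z-score per state
TRANS = {s: {t: (0.9 if s == t else 0.05) for t in STATES} for s in STATES}
START = {"DEL": 0.01, "DIP": 0.98, "DUP": 0.01}

def log_emit(state, z):
    # standard-normal log density centred on the state's mean
    return -0.5 * (z - MEANS[state]) ** 2 - 0.5 * math.log(2 * math.pi)

def viterbi(zscores):
    """Most likely copy-number state path for a sequence of coverage z-scores."""
    v = [{s: math.log(START[s]) + log_emit(s, zscores[0]) for s in STATES}]
    back = []
    for z in zscores[1:]:
        row, ptr = {}, {}
        for s in STATES:
            prev, score = max(((p, v[-1][p] + math.log(TRANS[p][s])) for p in STATES),
                              key=lambda t: t[1])
            row[s] = score + log_emit(s, z)
            ptr[s] = prev
        v.append(row)
        back.append(ptr)
    # backtrack from the best final state
    path = [max(STATES, key=lambda s: v[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

A run of strongly negative z-scores is decoded as a deletion segment, because the per-target emission advantage outweighs the transition penalty for entering and leaving the DEL state.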
XHMM of Fromer works on cleaned data which has been pre-processed, including removal of adapter sequences and low-quality bases.
“To calculate the depth of sequencing coverage, you need to start with the exome s