DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Status
Claims 1-32 are currently pending and under examination herein.
Claims 1-32 are rejected.
Claim 1 is objected to.
Priority
The instant application claims the benefit of U.S. Provisional Patent Application No. 63/216,419, filed 29 June 2021. At this point in the examination, the effective filing date of the claims is 29 June 2021.
Information Disclosure Statement
The Information Disclosure Statements (IDS) filed 28 July 2022, 23 November 2022, and 24 April 2025 are in compliance with the provisions of 37 CFR 1.97 and have therefore been considered.
Drawings
The drawings, submitted 1 July 2022, require correction because several nucleic acid sequences do not meet the proper disclosure requirements, as detailed below.
Nucleotide and/or Amino Acid Sequence Disclosures
REQUIREMENTS FOR PATENT APPLICATIONS CONTAINING NUCLEOTIDE AND/OR AMINO ACID SEQUENCE DISCLOSURES
Items 1) and 2) provide general guidance related to requirements for sequence disclosures.
1) 37 CFR 1.821(c) requires that patent applications which contain disclosures of nucleotide and/or amino acid sequences that fall within the definitions of 37 CFR 1.821(a) must contain a "Sequence Listing," as a separate part of the disclosure, which presents the nucleotide and/or amino acid sequences and associated information using the symbols and format in accordance with the requirements of 37 CFR 1.821 - 1.825. This "Sequence Listing" part of the disclosure may be submitted:
a) In accordance with 37 CFR 1.821(c)(1) via the USPTO patent electronic filing system (see Section I.1 of the Legal Framework for Patent Electronic System (https://www.uspto.gov/PatentLegalFramework), hereinafter "Legal Framework") as an ASCII text file, together with an incorporation-by-reference of the material in the ASCII text file in a separate paragraph of the specification as required by 37 CFR 1.823(b)(1) identifying:
i) the name of the ASCII text file;
ii) the date of creation; and
iii) the size of the ASCII text file in bytes;
b) In accordance with 37 CFR 1.821(c)(1) on read-only optical disc(s) as permitted by 37 CFR 1.52(e)(1)(ii), labeled according to 37 CFR 1.52(e)(5), with an incorporation-by-reference of the material in the ASCII text file according to 37 CFR 1.52(e)(8) and 37 CFR 1.823(b)(1) in a separate paragraph of the specification identifying:
i) the name of the ASCII text file;
ii) the date of creation; and
iii) the size of the ASCII text file in bytes;
c) In accordance with 37 CFR 1.821(c)(2) via the USPTO patent electronic filing system as a PDF file (not recommended); or
d) In accordance with 37 CFR 1.821(c)(3) on physical sheets of paper (not recommended).
2) When a “Sequence Listing” has been submitted as a PDF file as in 1(c) above (37 CFR 1.821(c)(2)) or on physical sheets of paper as in 1(d) above (37 CFR 1.821(c)(3)), 37 CFR 1.821(e)(1) requires a computer readable form (CRF) of the “Sequence Listing” in accordance with the requirements of 37 CFR 1.824.
If the "Sequence Listing" required by 37 CFR 1.821(c) is filed via the USPTO patent electronic filing system as a PDF, then 37 CFR 1.821(e)(1)(ii) or 1.821(e)(2)(ii) requires submission of a statement that the "Sequence Listing" content of the PDF copy and the CRF copy (the ASCII text file copy) are identical.
If the "Sequence Listing" required by 37 CFR 1.821(c) is filed on paper or read-only optical disc, then 37 CFR 1.821(e)(1)(ii) or 1.821(e)(2)(ii) requires submission of a statement that the "Sequence Listing" content of the paper or read-only optical disc copy and the CRF are identical.
Specific deficiencies and the required response to this Office Action are as follows:
Specific deficiency - This application contains sequence disclosures in accordance with the definitions for nucleotide and/or amino acid sequences set forth in 37 CFR 1.821(a)(1) and (a)(2). However, this application fails to comply with the requirements of 37 CFR 1.821 - 1.825.
The sequence disclosures are located in the drawings (07/01/2022): Figures 20E, 20F, and 20G.
Required response – Applicant must provide:
A "Sequence Listing" part of the disclosure, as described above in item 1); as well as An amendment specifically directing entry of the "Sequence Listing" part of the disclosure into the application in accordance with 1.825(b)(2);
A statement that the "Sequence Listing" includes no new matter in accordance with 1.825(b)(5); and
A statement that indicates support for the amendment in the application, as filed, as required by 37 CFR 1.825(b)(4).
If the "Sequence Listing" part of the disclosure is submitted according to item 1) a) or b) above, Applicant must also provide:
A substitute specification in compliance with 37 CFR 1.52, 1.121(b)(3) and 1.125 inserting the required incorporation-by-reference paragraph, consisting of:
A copy of the previously-submitted specification, with deletions shown with strikethrough or brackets and insertions shown with underlining (marked-up version);
A copy of the amended specification without markings (clean version); and
A statement that the substitute specification contains no new matter;
If the "Sequence Listing" part of the disclosure is submitted according to item 1) b), c), or d) above, Applicant must also provide:
A replacement CRF in accordance with 37 CFR 1.825(b)(6); and
A statement according to item 2) a) or b) above.
Claim Objections
Claim 1 is objected to because of the following informality: the phrase "with analyte" should read "with an analyte". Appropriate correction is required.
Claim Interpretation
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The interpretation of the following claims is provided to notify Applicant of the interpretation of the claim language and to aid in the understanding of the rejections set forth herein.
An element in claim 21 recites "mapping the first predicted base call to the first known oligo base sequence," and an element in claim 23 recites "refraining from mapping the third predicted base call to any of the first or second known oligo base sequences," each performed only if a certain condition is met. These elements are considered contingent limitations and therefore do not limit the claimed invention, because the condition being met is not required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 1 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. Claim 1 recites "iteratively initially training a base caller"; however, "initially" refers to a starting point, and "iteratively" refers to repeating a process, so after an iteration the base caller is no longer "initially" trained. The rejection may be overcome, for example, by amending the claim to remove "iteratively" and recite "initially training a base caller". Because dependent claims 2-28 incorporate the indefinite limitation of claim 1 and do not include further limitations that correct the issue, they are likewise rejected under 35 U.S.C. 112(b).
Claims 3, 5, and 10 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. Claims 3, 5, and 10 recite "the known single-oligo base sequence". It is unclear whether this known single-oligo base sequence is a separate, newly recited single-oligo base sequence or the same single-oligo base sequence recited in claim 1. For examination purposes, the single-oligo base sequence in claims 3, 5, and 10 is interpreted to be the same as the single-oligo base sequence recited in claim 1. Furthermore, the limitation "the known single-oligo base sequence" lacks sufficient antecedent basis in the claims. It is therefore suggested to amend claim 1 to recite "a known single-oligo base sequence".
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 4, 6-8, 28, 30 and 32 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite: (a) mathematical concepts (e.g., mathematical relationships, formulas or equations, mathematical calculations); and (b) mental processes, i.e., concepts performed in the human mind (e.g., observation, evaluation, judgment, opinion).
Subject matter eligibility evaluation in accordance with MPEP 2106:
Eligibility Step 1: Claims 1-28 are directed to a method (process) of progressively training a base caller, and claims 29-32 are directed to computer-implemented methods (processes). Therefore, these claims are encompassed by the categories of statutory subject matter and thus satisfy the subject matter eligibility requirement under Step 1.
[Step 1: YES]
Eligibility Step 2A: First it is determined in Prong One whether a claim recites a judicial exception, and if so, then it is determined in Prong Two whether the recited judicial exception is integrated into a practical application of that exception.
Eligibility Step 2A Prong One: In determining whether a claim is directed to a judicial exception, examination is performed that analyzes whether the claim recites a judicial exception, i.e., whether a law of nature, natural phenomenon, or abstract idea is set forth or described in the claim.
Dependent claim 4 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
using a back propagation path of a neural network configuration loaded in the base caller, updating weights and/or biases of the neural network configuration, based on the plurality of error signals (training using a backpropagation algorithm, and updating weights, encompass performing mathematical calculations).
Dependent claim 6 further recites:
repeating the second iteration of the initial training of the base caller with analyte comprising the single-oligo base sequence for a plurality of instances, until a convergence condition is satisfied (convergence is a mathematical concept).
Dependent claim 7 further recites:
the convergence condition is satisfied when between two consecutive repetitions of the second iteration of the initial training of the base caller, a decrease in the plurality of further error signals is less than a threshold (convergence is a mathematical concept).
Dependent claim 8 further recites:
the convergence condition is satisfied when the second iteration of the initial training of the base caller is repeated for at least a threshold number of instances (convergence is a mathematical concept).
Dependent claim 28 further recites:
with progression of the iterations during the iteratively further training, monotonically increasing a number of unique oligo base sequences in the analyte comprising the multi-oligo base sequences (increasing the number is a mathematical concept).
Dependent claim 30 further recites:
iterating the using, the labelling, and the training until a convergence is satisfied (convergence is a mathematical concept).
Independent claim 32 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
iterating the using, the culling, the labelling, and the training until a convergence is satisfied (convergence is a mathematical concept).
Therefore, claims 4, 6-8, 28, 30 and 32 recite an abstract idea.
[Step 2A Prong One: YES]
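For illustration only, the convergence conditions recited in claims 7 and 8 reduce to numeric comparisons of the kind shown in the following sketch; the function name, threshold, and repetition cap are hypothetical and are not drawn from the application.

    # Illustrative sketch only; names and values are hypothetical, not Applicant's code.
    THRESHOLD = 1e-4      # assumed bound on the decrease in error (cf. claim 7)
    MAX_INSTANCES = 50    # assumed cap on repetitions (cf. claim 8)

    def converged(prev_error: float, curr_error: float, instance: int) -> bool:
        """True when the error decrease falls below a threshold (claim 7)
        or the training has been repeated a threshold number of times (claim 8)."""
        return (prev_error - curr_error) < THRESHOLD or instance >= MAX_INSTANCES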
Eligibility Step 2A Prong Two: In determining whether a claim is directed to a judicial exception, further examination is performed that analyzes whether the claim recites additional elements that, when examined as a whole, integrate the judicial exception(s) into a practical application (MPEP 2106.04(d)). A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception. The claimed additional elements are analyzed to determine if the abstract idea is integrated into a practical application (MPEP 2106.04(d)(I); MPEP 2106.05(a)-(h)). If the claim contains no additional elements beyond the abstract idea, the claim fails to integrate the abstract idea into a practical application (MPEP 2106.04(d)(III)).
The judicial exceptions identified in Eligibility Step 2A Prong One are not integrated into a practical application because of the reasons noted below.
The additional elements in claims 4, 6-8, 28, 30 and 32 include:
a computer
The additional element of using a computer to perform calculations is an insignificant extra-solution activity that is part of the data gathering process used in the recited judicial exceptions (see MPEP 2106.05(g)).
When all limitations in claims 4, 6-8, 28, 30 and 32 have been considered as a whole, the claims are deemed not to recite any additional elements that would integrate a judicial exception into a practical application, and therefore claims 4, 6-8, 28, 30 and 32 are directed to an abstract idea (MPEP 2106.04(d)).
[Step 2A Prong Two: NO]
Eligibility Step 2B: Because the claims recite an abstract idea and do not integrate that abstract idea into a practical application, the claims are examined for an inventive concept. The judicial exception alone cannot provide that inventive concept or practical application (MPEP 2106.05). Identifying whether the additional elements beyond the abstract idea amount to such an inventive concept requires considering the additional elements individually and in combination to determine whether they amount to significantly more than the judicial exception (MPEP 2106.05A i-vi).
The claims do not include any additional elements that are sufficient to amount to significantly more than the judicial exception(s) for the reasons noted below.
The additional elements recited in claims 4, 6-8, 28, 30 and 32 are identified above and are carried over from Step 2A Prong Two, along with their conclusions, for analysis at Step 2B. Any additional element or combination of elements that was considered to be insignificant extra-solution activity at Step 2A Prong Two was re-evaluated at Step 2B, because a finding upon re-evaluation that an element is unconventional or otherwise more than well-understood, routine, conventional activity in the field would indicate that the element is no longer insignificant. Here, no additional element or combination of elements is other than what is well-understood, routine, conventional activity in the field; the claims simply append well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception, per MPEP 2106.05(d).
The additional element of using a computer to perform repetitive calculations is conventional. The courts have recognized that performing repetitive calculations using a computer is a well-understood, routine, and conventional function when claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. See Flook, 437 U.S. at 594, 198 USPQ at 199 (recomputing or readjusting alarm limit values); Bancorp Services v. Sun Life, 687 F.3d 1266, 1278, 103 USPQ2d 1425, 1433 (Fed. Cir. 2012) (“The computer required by some of Bancorp’s claims is employed only for its most basic function, the performance of repetitive calculations, and as such does not impose meaningful limits on the scope of those claims.”).
Thus, claims 4, 6-8, 28, 30 and 32 are deemed to not contribute an inventive concept, i.e., amount to significantly more than the judicial exception(s) (MPEP 2106.05(II)).
[Step 2B: NO]
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 5-28, 30 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Kermani et al. (US 10068053 B2) in view of Brownlee (Brownlee, Jason. “How to Control Neural Network Model Capacity with Nodes and Layers.” Machinelearningmastery, 25 Aug. 2020, machinelearningmastery.com/how-to-control-neural-network-model-capacity-with-nodes-and-layers).
Claim 1 is drawn to a computer-implemented method of progressively training a base caller by first training the base caller with a short nucleic acid sequence and generating labelled training data using the trained base caller; training the base caller with more than one short nucleic acid sequence and generating labelled training data using the further trained base caller; and then repeatedly training the base caller by repeating these steps while, during at least one repetition, making the neural network configuration loaded within the base caller more complex, wherein labelled training data generated during a repetition is used to train the base caller during an immediately subsequent iteration. In some embodiments: during an iteration, training the base caller with analyte having more than one oligo base sequence and increasing, within the analyte, the number of oligo base sequences (claim 2); during an iteration of training the base caller, adding the single-oligo base sequence into one or more clusters of a flow cell, generating a plurality of signals corresponding to the clusters, each sequence signal representative of the base sequences loaded in a corresponding cluster, predicting corresponding base calls for the oligo base sequence to generate predicted base calls, comparing the predicted base calls to the actual known sequence, treating the differences between the predictions and the known sequence as error signals, and training the base caller during the first iteration based on the error signals (claim 3); during the second iteration, predicting, based on each of the sequence signals, corresponding bases for the known single-oligo base sequence to generate predicted base calls, generating further error signals corresponding to the sequence signals by comparing the corresponding base calls to the bases of the known single-oligo sequence, and training the base caller during the second iteration based on the plurality of further error signals (claim 5); repeating the second iteration of the training of the base caller with analyte comprising the single-oligo base sequence for more than one instance, until a convergence condition is achieved (claim 6); convergence is achieved when the decrease in the error signals between iterations falls below a threshold (claim 7); the convergence condition is achieved when one of the repetitions of the initial training of the base caller is repeated at least a threshold number of times (claim 8); the sequence signals corresponding to the clusters, generated during one iteration of training the base caller, are reused for the next iteration of training the base caller (claim 9); for a first predicted base call, comparing a first base of the first predicted base call with a first base of the known single-oligo sequence, and comparing a second base of the first predicted base call with a second base of the known single-oligo sequence, to generate a corresponding first error signal (claim 10); training the base caller for a first number of repetitions with analyte comprising two unique oligo base sequences, and training the base caller for a second number of repetitions with analyte comprising three known oligo base sequences, the first occurring before the second (claim 11); further training the base caller for a first number of iterations with analyte comprising two oligo base sequences, such that for a first subset of the first number of iterations a second neural network configuration is loaded within the base caller, and for a second subset of the first number of iterations occurring after the first subset, a third neural network configuration is loaded within the base caller, wherein the first, second, and third neural network configurations are different from each other (claim 12); the second neural network configuration is more complex than the previous one (claim 13); the second neural network configuration has more layers than the previous one (claim 14); the second neural network configuration has more weights than the previous one (claim 15); the second neural network configuration has a greater number of parameters than the previous one (claim 16); the third neural network configuration has more layers than the previous one (claim 17); the third neural network configuration has more weights than the previous one (claim 18); the third neural network configuration has a greater number of parameters than the previous one (claim 19); training the base caller using two known oligo sequences by adding a first known sequence and a second known sequence to clusters of a flow cell and predicting, for each cluster of the first and second clusters, corresponding base calls, such that predicted base calls are generated, wherein predictions that match a known sequence are mapped to that sequence while predictions that do not correspond to either of the two known sequences are not mapped, error signals are generated by comparing the mapped known sequences to the predictions, and the errors are used to further train the base caller (claim 20); comparing the bases to both oligo sequences, and, if the predicted sequence has at least a threshold similarity to one known sequence and less than that threshold to the other, mapping the predicted base call to the sequence that is more similar (claim 21); when the predicted base call is not mapped to any of the sequences, comparing each predicted sequence to both known base sequences, and, if the predicted base sequence has less than a threshold similarity to both known base sequences, not mapping the predicted base call to either known sequence (claim 22); when the predicted base call is not mapped, comparing the predicted sequences to the known base sequences, and, if the similarity threshold is surpassed for both, not mapping the predicted base call to either known base sequence (claim 23); the further trained base caller being used to generate labelled training data, wherein, following further training of the base caller during an iteration, it is used again to predict base calls for the clusters, the new predictions are mapped again so that confident predictions are assigned to one of the known sequences while uncertain predictions are not mapped, and the mapped predictions are then used as training data, with the known base sequences forming ground truth data labels for their corresponding predicted base calls (claim 24); the labelled training data generated during the one iteration of the first number of iterations is used to train the base caller during an immediately subsequent iteration of the N1 iterations (claim 25); the neural network configuration of the base caller is the same during the one iteration of the N1 iterations and the immediately subsequent iteration of the N1 iterations (claim 26); a neural network configuration of the base caller during the immediately subsequent iteration of the N1 iterations is different from, and more complex than, a neural network configuration of the base caller during the one iteration of the N1 iterations (claim 27); with progression of the iterations, increasing the number of oligo base sequences in the analyte comprising the multi-oligo base sequences (claim 28); repeating the using, the labelling, and the training until a convergence is satisfied (claim 30); and repeating the using, the discarding, the labelling, and the training until a convergence is achieved (claim 32).
With respect to the limitation of a computer-implemented method of progressively training a base caller, Kermani et al. teach a “basecaller for DNA sequencing using machine learning” [title] that can be trained progressively, “the model can be improved by learning from mistakes, e.g., by iteratively improving the model on new training data” [col.8, para.2, lns.19-21], performed on a computer 30 [col.5, para.5, ln.1].
With respect to the limitation of iteratively initially training a base caller with analyte comprising a single-oligo base sequence, and generating labelled training data using the initially trained base caller, Kermani et al. teach that initially “the initial base calls can be used to create initial sequences of nucleic acids” [col.2, para.2, lns.3-4], iteratively training a base caller [col.8, para.2, lns.19-21] using short nucleic acid sequences, “fragments shorter than 1,000 bases can be referred to as short” [col.3, para.3, lns.20-21]. Kermani et al. further teach generating labelled training data, “An assumed sequence corresponds to the sequence that is believed to be accurate” [col.4, para.4, lns.1-2], meaning the assumed sequence corresponds to the generated label, which is used to train the model, and using the initially trained base caller, “The determination of the sequence may use an initial basecaller” [col.9, para.5, lns.35-36].
With respect to the limitation of training the base caller with analyte comprising multi-oligo base sequences, and generating labelled training data using the trained base caller, Kermani et al. teach training the base caller with more than one short nucleic acid sequence, “one or more training samples are obtained. The training samples include nucleic acids that are to be sequenced. The training samples can be nucleic acids from an organism or artificially created nucleic acids, or a mixture of both” [col.9, para.3, lns.1-5], where the nucleic acids can be short [col.3, para.3, lns.20-21]. Kermani et al. further teach generating labelled training data using the trained base caller, “assumed sequences for nucleic acids of one or more training samples are used to train a machine-learning model” [col.10, para.3, lns.1-3]; the “assumed sequences” correspond to the labelled training data and are generated using the trained base caller, as the reference states that they were “determined using a previous iteration of the model”, and the base caller is trained on more than one short nucleic acid, which can be “artificially created nucleic acids, or a mixture of both”.
With respect to the limitations of iteratively further training the base caller by repeating step (i) and wherein labelled training data generated during an iteration is used to train the base caller during an immediate subsequent iteration, Kermani et al. teach “the model can be improved by learning from mistakes, e.g., by iteratively improving the model on new training data” [Fig.3; col.8, para.2. lns 19-21].
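To illustrate the iterative scheme described above (train, predict, relabel, retrain), the following minimal sketch is provided; all names are hypothetical, and the sketch is not code from Kermani et al. or from the application.

    # Illustrative sketch of iterative retraining on self-generated labels;
    # base_caller is a hypothetical object with train() and predict() methods.
    def progressive_training(base_caller, signals, known_sequence, iterations=3):
        # Initial labels come from the known (e.g., artificial) sequence.
        labelled = [(s, known_sequence) for s in signals]
        for _ in range(iterations):
            base_caller.train(labelled)                     # train on current labels
            predictions = [base_caller.predict(s) for s in signals]
            # Predictions believed accurate become the labels for the next
            # iteration, mirroring the "assumed sequences" of Kermani et al.
            labelled = list(zip(signals, predictions))
        return base_caller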
With respect to claim 2, Kermani et al. teach “one or more training samples are obtained. The training samples include nucleic acids that are to be sequenced. The training samples can be nucleic acids from an organism or artificially created nucleic acids, or a mixture of both” [col.9, para.3, lns 1-5].
With respect to claims 6-8 and claim 30, Kermani et al. teach iteratively training a base caller [Fig.3; col.8, para.2, lns.19-21] and achieving efficient convergence [col.29, para.2, lns.3-8]. As the training is iterative, it is assumed that it will be repeated at least once until it reaches a certain point (threshold) and that the trained base caller is then used. When the errors decrease below the threshold between iterations, the model has learned and convergence can be achieved, which is characteristic of progressive learning.
With respect to claim 11 and claim 28, Kermani et al. teach iteratively training the base caller, “e.g., by iteratively improving the model on new training data” [col.8, para.2, lns.19-21], with one or more training sequences, “one or more training samples are obtained. The training samples include nucleic acids that are to be sequenced. The training samples can be nucleic acids from an organism or artificially created nucleic acids, or a mixture of both” [col.9, para.3, lns.1-5], using multiple oligo sequences [col.9, para.3, lns.1-5]. As the training is iterative and the model is progressively learning, it is assumed that the number of oligo sequences will increase between one iteration and the next.
With respect to claim 3, claim 5, claims 9-10, and claim 20, Kermani et al. teach iteratively training the base caller [col.8, para.2, lns.19-21], basecalling techniques using clusters of a flow cell [col.5, para.2, lns.1-4], and “mapping” [col.11, para.6, ln.15]. The authors also explain that the base caller is trained by comparing predicted base calls to known or assumed sequences, “For example, one training sample might be chosen because the sequences are artificially made and therefore known ahead of time” [col.15, para.2, lns.10-12], with the predicted bases being compared to the known sequences so that the model output can become “the same or nearly the same as assumed sequences when the measured intensity values are input to the model” [col.12, para.5, lns.63-64], and also explain how errors are used to improve the model [col.12, para.1, lns.5-12].
With respect to claim 21, Kermani et al. teach mapping when a prediction matches one known base sequence more than the other “The number of mismatches per arm can be used as a criteria as to whether or not to accept an alignment to a particular location” [col.20, para.3, lns.37-39] and “For example, a threshold for the number of mismatches that are allowed for alignment can be relatively low, such as only 2-4 mismatches” [col.19, para.4, lns.37-40].
With respect to claim 22, Kermani et al. teach refraining from mapping predictions that do not meet a certain similarity threshold, “Filter 335 can remove assumed sequences that are not very reliable. For example, the initial sequence may map to the reference sequence, but with too many mismatches” [col.12, para.4, lns.35-37] and “if the expected error is above the threshold, then that complexity can be deemed insufficient, and the initial sequence can be discarded” [col.22, para.4, lns.40-43].
With respect to claim 23, Kermani et al. teach refraining from mapping a predicted base call when it aligns comparably well to both known base sequences, “Also, if two locations in the reference sequence align with the same number of mismatches (but in different positions of the initial sequence), then that initial sequence can be discarded” [col.19, para.4, lns.40-43], meaning that when a sequence aligns well to more than one reference, it is discarded.
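For illustration of the threshold-based mapping discussed for claims 21-23 above, a minimal sketch follows; the Hamming-style mismatch count and the threshold value are assumptions for illustration, not Applicant’s or Kermani et al.’s implementation.

    # Illustrative sketch only; the mismatch metric and threshold are assumed.
    def mismatches(a: str, b: str) -> int:
        """Count positionwise mismatches between two equal-length sequences."""
        return sum(x != y for x, y in zip(a, b))

    def map_call(predicted: str, oligo1: str, oligo2: str, max_mismatch: int = 3):
        m1 = mismatches(predicted, oligo1)
        m2 = mismatches(predicted, oligo2)
        if m1 <= max_mismatch and m2 > max_mismatch:
            return oligo1   # cf. claim 21: map to the sufficiently similar sequence
        if m2 <= max_mismatch and m1 > max_mismatch:
            return oligo2
        # cf. claim 22 (too dissimilar to both) and claim 23 (similar to both):
        # refrain from mapping in either case.
        return None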
With respect to claims 24-25, Kermani et al. teach running the predictions again after training and using the mapped predictions as training data, with the known base sequences forming ground truth data labels for their corresponding predicted base calls, “the training nucleic acids can include artificial sequences, for which the sequences are known” [col.15, para.4, lns.29-30], providing the ground truth labels. The authors further explain that “the resulting model can be used as the initial basecaller for a next round of training” [col.29, para.5, lns.52-54], with the outputs being the same as the assumed sequences [col.12, para.5, lns.63-64], meaning that re-predicted base calls are coupled with known sequences as labelled training data. As the model is progressively trained, it is assumed that the labelled training data generated during one iteration will be used to train the base caller during the immediately subsequent iteration.
With respect to claim 32, Kermani et al. teach discarding (culling), “Discarding can be accomplished in a hard or a soft way” [col.14, para.1, lns.12-13], providing the ground truth labels, and training until a convergence is satisfied [col.29, para.2, lns.3-8]; as the model is iteratively improving, further iteration is assumed until a convergence is achieved.
Kermani et al. partially teach combining multiple (intermediate) models, thereby increasing complexity [0171], and modifying model parameters [col.29, para.2, lns.3-8]; however, the authors do not explicitly teach the limitation of, during at least one iteration, increasing a complexity of the neural network configuration loaded within the base caller.
Brownlee does teach the concept of increasing the complexity of a neural network, “The capacity of a deep learning neural network model controls the scope of the types of mapping functions that it is able to learn” [para.1, lns.1-2] and “Increasing the number of layers provides a short-cut to increasing the capacity of the model with fewer resources, and modern techniques allow learning algorithms to successfully train deep models” [para.5, 3rd bullet point].
With respect to claims 12-19 and 26-27, Brownlee teaches configuring model layers, nodes, and an optimal set of weights and parameters to increase model complexity: “We can control whether a model is more likely to overfit or underfit by altering its capacity”, “A model with more layers and more hidden units per layer has higher representational capacity — it is capable of representing more complicated functions”, and “Repeat the experiment of increasing layers on a problem that requires the increased capacity provided by increased depth in order to perform well” [pgs. 1-17].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the configuration of the neural network loaded within the base caller of Kermani et al. by increasing the capacity of the neural network and modifying the parameters, weights, and layers as taught by Brownlee, because Brownlee shows that “A model with less capacity may not be able to sufficiently learn the training dataset. A model with more capacity can model more different types of functions and may be able to learn a function to sufficiently map inputs to outputs in the training dataset” and “increasing the depth increases the capacity of the model. Training deep models, e.g. those with many hidden layers, can be computationally more efficient than training a single layer network with a vast number of nodes” [Controlling Neural Network Model Capacity]. A person of ordinary skill in the art would therefore have been motivated to change the configuration of the neural network loaded within the base caller by increasing its capacity to avoid issues with the model learning the training dataset. One would have had a reasonable expectation of success in making the combination because the model can learn the training dataset better and can be more efficient.
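As an illustration of the capacity rationale drawn from Brownlee, the following sketch shows that loading configurations with more layers monotonically increases the weight and parameter counts; the layer widths and input/output sizes are arbitrary assumed values, not taken from the references.

    # Illustrative sketch only; layer widths are arbitrary assumed values.
    def parameter_count(layer_widths, n_inputs=4, n_outputs=4):
        """Weights plus biases of a fully connected network."""
        total, fan_in = 0, n_inputs
        for width in list(layer_widths) + [n_outputs]:
            total += fan_in * width + width   # weights + biases for this layer
            fan_in = width
        return total

    # Progressively more complex configurations (more layers, then more width).
    for config in ([64], [64, 64], [128, 128, 64]):
        print(config, parameter_count(config))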
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Kermani et al. (US 10068053 B2) and Brownlee (Brownlee, Jason. “How to Control Neural Network Model Capacity with Nodes and Layers.” Machinelearningmastery, 25 Aug. 2020, machinelearningmastery.com/how-to-control-neural-network-model-capacity-with-nodes-and-layers), as applied to claim 1 above, and further in view of Rumelhart et al. (“Learning Representations by Back-Propagating Errors.” Nature, vol. 323, no. 6088, Oct. 1986, pp. 533-536).
Claim 4 is drawn to using a back propagation path of a neural network configuration loaded in the base caller, updating weights and/or biases of the neural network configuration, based on the plurality of error signals.
Brownlee teaches updating weights of the neural network configuration, and Kermani et al. teach using propagation in a neural network loaded in the base caller [col.34, para.4, lns.25-26].
Kermani et al. and Brownlee do not explicitly teach using “back propagation” of a neural network configuration based on the error signals.
Rumelhart et al. teach “back-propagation for networks of neurone-like units. The procedure repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector” [col.1, para.1, lns.2-5, pg. 533].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the configuration of the neural network loaded within the base caller of Kermani et al. by increasing the capacity of the neural network and modifying the parameters, weights, and layers as taught by Brownlee, and to further add the back-propagation of Rumelhart et al., because Rumelhart et al. show that their procedure “repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector” [col.1, para.1, lns.2-5, pg.533] and that “The ability to create useful new features distinguishes back-propagation from earlier, simpler methods such as the perceptron-convergence procedure” [col.1, para.1, lns.9-11, pg.533], meaning that backpropagation aims to minimize the cost function by adjusting the network’s weights and biases. A person of ordinary skill in the art would therefore have been motivated to change the configuration of the neural network loaded within the base caller by using backpropagation because it allows the neural network to learn from errors, iteratively improving by adjusting weights. One would have had a reasonable expectation of success in making the combination because the model can improve over time, minimizing error and therefore increasing accuracy.
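For illustration of the Rumelhart et al. procedure of repeatedly adjusting weights to reduce the difference between actual and desired outputs, a minimal gradient-descent sketch for a single linear unit follows; the data, learning rate, and iteration count are arbitrary assumed values, not drawn from the references.

    # Illustrative sketch only; data and hyperparameters are arbitrary assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 3))            # hypothetical input signals
    y = x @ np.array([1.0, -2.0, 0.5])     # hypothetical desired outputs

    w, b, lr = np.zeros(3), 0.0, 0.1
    for _ in range(200):
        pred = x @ w + b
        err = pred - y                     # error signal (actual minus desired)
        w -= lr * (x.T @ err) / len(x)     # adjust weights against the error gradient
        b -= lr * err.mean()               # adjust bias likewise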
Claims 29 and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Kermani et al. (US 10068053 B2) and Brownlee (Brownlee, Jason. “How to Control Neural Network Model Capacity with Nodes and Layers.” Machinelearningmastery, 25 Aug. 2020, machinelearningmastery.com/how-to-control-neural-network-model-capacity-with-nodes-and-layers), as applied to claim 1 above, and further in view of Massingham and Goldman (“All Your Base: A Fast and Accurate Probabilistic Approach to Base Calling.” Genome Biology, vol. 13, no. 2, 2012, p. R13).
Claim 29 is drawn to using a base caller to predict base call sequences for unknown analytes sequenced to have a known sequence of an oligo; labeling each of the unknown analytes with a ground truth sequence that matches the known sequence; and training the base caller using the labelled unknown analytes. Claim 31 is drawn to using the base caller to predict base call sequences for unknown analytes which were sequenced to have known sequences of two or more oligos; discarding unknown analytes based on classification of base call sequences to the known sequences; based on the classification, labelling respective subsets of the discarded analytes with respective ground truth sequences that match the known sequences; and training the base caller using the labelled respective subsets of the discarded unknown analytes.
Kermani et al. teach that “the training nucleic acids can include artificial sequences, for which the sequences are known” [col.15, para.4, lns.29-30], providing the ground truth labels, and further teach generating labelled training data, “An assumed sequence corresponds to the sequence that is believed to be accurate” [col.4, para.4, lns.1-2], meaning the assumed sequence corresponds to the generated label, which is used to train the model, and using the initially trained base caller, “The determination of the sequence may use an initial basecaller” [col.9, para.5, lns.35-36]. Kermani et al. further teach discarding (culling), “Discarding can be accomplished in a hard or a soft way” [col.14, para.1, lns.12-13], providing the ground truth labels, and training until a convergence is satisfied [col.29, para.2, lns.3-8].
Kermani et al. do not teach using sequences for unknown analytes.
Massingham and Goldman teach using unknown sequences, “so calls can be made where a reference sequence is unknown” [col.1, para.3, lns.18-19, pg.11].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the training method for the base caller of Kermani et al. by increasing the capacity of the neural network and modifying the parameters, weights, and layers as taught by Brownlee, and to further add the use of unknown sequences of Massingham and Goldman, because Massingham and Goldman show that their “AYB” method “is more accurate than other methods of base calling” [col.1, para.3, lns.1-2, pg.11] and that “In addition to its speed and accuracy, AYB has two other desirable properties” [col.1, para.3, lns.16-17, pg.11]. A person of ordinary skill in the art would therefore have been motivated to use unknown sequences because doing so allows the neural network to learn from a wider number of samples, improving accuracy. One would have had a reasonable expectation of success in making the combination because all references are in the field of training base callers and because the resulting model can be faster and more accurate.
Double Patenting
Claims 1, 6, 7, 28, 12, 14, 15, 16, 27, 29 and 31 are provisionally rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-2, 16, 27, 3, 18, 19, 20, 21, 17, 31 and 31, respectively, of co-pending application No. 17/830,316, in view of Brownlee (Brownlee, Jason. “How to Control Neural Network Model Capacity with Nodes and Layers.” Machinelearningmastery, 25 Aug. 2020, machinelearningmastery.com/how-to-control-neural-network-model-capacity-with-nodes-and-layers).
The instant claim 1 is drawn to a computer-implemented method of progressively training a base caller by first training the base caller with analyte comprising a single-oligo base sequence and generating labelled training data using the trained base caller; training the base caller with analyte comprising multi-oligo base sequences and generating labelled training data using the further trained base caller; and then repeatedly training the base caller by repeating these steps while, during at least one repetition, making the neural network configuration loaded within the base caller more complex, wherein labelled training data generated during a repetition is used to train the base caller during an immediately subsequent iteration. In some embodiments: repeating the second iteration of the initial training of the base caller with analyte comprising the single-oligo base sequence for a plurality of instances, until a convergence condition is satisfied (claim 6); convergence is achieved when the decrease in the error signals between iterations falls below a threshold (claim 7); with progression of the iterations during the iteratively further training, monotonically increasing a number of unique oligo base sequences in the analyte comprising the multi-oligo base sequences (claim 28); further training the base caller for a first number of iterations with analyte comprising two oligo base sequences, such that for a first subset of the first number of iterations a second neural network configuration is loaded within the base caller, and for a second subset of the first number of iterations occurring after the first subset, a third neural network configuration is loaded within the base caller, wherein the first, second, and third neural network configurations are different from each other (claim 12); the second neural network configuration has a greater number of layers than the first neural network configuration (claim 14); the second neural network configuration has more weights than the previous one (claim 15); the second neural network configuration has a greater number of parameters than the previous one (claim 16); a neural network configuration of the base caller during the immediately subsequent iteration of the N1 iterations is different from, and more complex than, a neural network configuration of the base caller during the one iteration of the N1 iterations (claim 27); using a base caller to predict base call sequences for unknown analytes sequenced to have a known sequence of an oligo, labeling each of the unknown analytes with a ground truth sequence that matches the known sequence, and training the base caller using the labelled unknown analytes (claim 29); and using the base caller to predict base call sequences for unknown analytes which were sequenced to have known sequences of two or more oligos, discarding unknown analytes based on classification of base call sequences to the known sequences, labelling, based on the classification, respective subsets of the discarded analytes with respective ground truth sequences that match the known sequences, and training the base caller using the labelled respective subsets of the discarded unknown analytes (claim 31).
Claim 1 of co-pending application No. 17/830,316 is drawn to a computer implemented method of progressively training a base caller by repeatedly first training a base caller with analyte comprising organism base sequences and generating labelled training data using the trained base caller and then repeatedly training the base caller by repeating steps while during a repetition, making the neural network configuration that is loaded within the base caller, more complex, wherein labelled training data generated during a repetition is used to train the base caller during an immediate subsequent iteration.
Claim 2 of co-pending application No. 17/830,316 is drawn to training the base caller with analyte comprising one or more oligo base sequences, and generating labelled training data using the initially trained base caller.
Claim 3 of co-pending application No. 17/830,316 is drawn to the N1 iterations being performed prior to the N2 iterations, and wherein the second organism base sequence has a higher number of bases than the first organism base sequence.
Claim 16 of co-pending application No. 17/830,316 is drawn to the neural network configuration of the base caller is reused for multiple iterations, until a convergence condition is satisfied.
Claim 17 of co-pending application No. 17/830,316 is drawn to a neural network configuration of the base caller during the first iteration of the N1 iterations being different from, and more complex than, a neural network configuration of the base caller during the second iteration of the N1 iterations.
Claim 18 of co-pending application No. 17/830,316 is drawn to, for a first subset of the N1 iterations, further training the base caller with a first neural network configuration loaded in the base caller; and, for a second subset of the N1 iterations, further training the base caller with a second neural network configuration loaded in the base caller, the second neural network configuration different from the first neural network configuration.
Claim 19 of co-pending application No. 17/830,316 is drawn to the second neural network configuration has a greater number of layers than the first neural network configuration.
Claim 20 of co-pending application No. 17/830,316 is drawn to the second neural network configuration has a greater number of layers than the first neural network configuration.
Claim 21 of co-pending application No. 17/830,316 is drawn to the second neural network configuration has a greater number of weights than the first neural network configuration.
Claim 27 of co-pending application No. 17/830,316 is drawn to the convergence condition being satisfied when, between two consecutive iterations of the N1 iterations, a decrease in an error signal generated is less than a threshold.
Claim 31 of co-pending application No. 17/830,316 is drawn to a single-oligo training stage that (i) uses the base caller to predict single-oligo base call sequences for a population of single-oligo unknown analytes (i.e., unknown target sequences) sequenced to have a known sequence of an oligo, (ii) labels each single-oligo unknown analyte in the population of single-oligo unknown analytes with a single-oligo ground truth sequence that matches the known sequence, and (iii) trains the base caller using the labelled population of single-oligo unknown analytes; continuing with one or more multi-oligo training stages that (i) use the base caller to predict multi-oligo base call sequences for a population of multi-oligo unknown analytes sequenced to have two or more known sequences of two or more oligos, (ii) cull multi-oligo unknown analytes from the population of multi-oligo unknown analytes based on classification of multi-oligo base call sequences of the culled multi-oligo unknown analytes to the known sequences, (iii) based on the classification, label respective subsets of the culled multi-oligo unknown analytes with respective multi-oligo ground truth sequences that respectively match the known sequences, and (iv) further train the base caller using the labelled respective subsets of the culled multi-oligo unknown analytes.
In view of the combined teachings of claims 1-3, 16-21, 27 and 31 of co-pending application No. 17/830,316, it would have been obvious to add a third neural network configuration when training the base caller.
The difference between the computer-implemented method of progressively training a base caller of claims 1-3, 16-21, 27 and 31 of co-pending application No. 17/830,316 above and the computer-implemented method of progressively training a base caller recited in instant claims 1, 6, 7, 28, 12, 14, 15, 16, 27, 29 and 31 is that the co-pending claims do not explicitly teach:
a third neural network configuration when training the base caller.
With respect to the limitation of “a third neural network configuration is loaded within the base caller, wherein the first, second, and third neural network configurations are different from each other” (claim 12), Brownlee teaches configuring model layers, nodes, and an optimal set of weights and parameters to increase model complexity: “We can control whether a model is more likely to overfit or underfit by altering its capacity”, “A model with more layers and more hidden units per layer has higher representational capacity — it is capable of representing more complicated functions”, and “Repeat the experiment of increasing layers on a problem that requires the increased capacity provided by increased depth in order to perform well” [pgs. 1-17].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the configuration of the neural network loaded within the base caller by increasing the capacity of the neural network, adding a third neural network configuration, and modifying the parameters, weights, and layers as taught by Brownlee, because Brownlee shows that “A model with less capacity may not be able to sufficiently learn the training dataset. A model with more capacity can model more different types of functions and may be able to learn a function to sufficiently map inputs to outputs in the training dataset” and “increasing the depth increases the capacity of the model. Training deep models, e.g. those with many hidden layers, can be computationally more efficient than training a single layer network with a vast number of nodes” [Controlling Neural Network Model Capacity]. A person of ordinary skill in the art would therefore have been motivated to change the configuration of the neural network loaded within the base caller by increasing its capacity to avoid issues with the model learning the training dataset. One would have had a reasonable expectation of success in making the combination because the model can learn the training dataset better and can be more efficient.
Conclusion
No claims are allowed.
Inquiries
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDRIELE EICHNER whose telephone number is (571)272-9956. The examiner can normally be reached M-F, 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek can be reached at (571) 272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/A.S.E./Examiner, Art Unit 1687
/Karlheinz R. Skowronek/Supervisory Patent Examiner, Art Unit 1687