Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on has been entered.
DETAILED ACTION
This action is in response to amendment and/or remarks filed on 04/14/2026. In the current amendments, claims 1, 2, 4, 6, and 10-13 have been amended. Claim 8 has been canceled.
The arguments made against the 35 U.S.C. 112(b) rejections were found to be persuasive. The arguments made against the 35 U.S.C. 101 rejections were found to be persuasive. The arguments made against the 35 U.S.C. 103 rejections were not found to be persuasive.
The Examiner cites particular sections in the references as applied to the claims
below for the convenience of the applicant(s). Although the specified citations are
representative of the teachings in the art and are applied to the specific limitations within
the individual claim, other passages and figures may apply as well. It is respectfully
requested that, in preparing responses, the applicant(s) fully consider the references in
their entirety as potentially teaching all or part of the claimed invention, as well as the
context of the passage as taught by the prior art or disclosed by the Examiner.
Response to Arguments
Applicant’s arguments and remarks filed 04/14/2026 have been fully considered but were not all found to be persuasive.
35 U.S.C. 103
Applicant asserts:
“The cited references fail to teach or suggest several key elements of the presently claimed invention, particularly the specific fragmentation modeling and peak determination framework recited in the amended independent claims.
Tsou does not teach the claimed fragmentation modeling framework.
As amended, independent claim 1 recites that the determination unit:
• generates a plurality of fragmentation cases for a peptide to be confirmed,
• calculates fragmentation probabilities of peptide product ions in fragmentation units of four amino acids,
• predicts fragment sequences corresponding to C-terminal and N-terminal directions based on fragmentation positions, and
• determines a spectral peak profile having the highest probability among the plurality of generated fragmentation cases.
Tsou does not disclose or suggest such a fragmentation case generation or probability-based peak determination process. Instead, Tsou focuses on training machine learning models to predict peptide fragmentation spectra using deep learning techniques applied to peptide sequence data. Tsou's approach relies on training predictive models using large spectral datasets but does not involve enumerating multiple fragmentation cases for a peptide and determining the most probable peak profile among them.
Thus, the fundamental approach of the present invention differs significantly from the predictive modeling described in Tsou.
The cited references do not teach fragmentation probability modeling in four-amino-acid units. …”
Examiner’s response:
The Examiner agrees that Tsou does not explicitly teach calculating fragmentation probabilities of peptide product ions in fragmentation units of four amino acids (Tsou uses ranges of amino acids as features, as shown in para [0140], and these are not limited to four), or predicting fragment sequences corresponding to C-terminal and N-terminal directions based on fragmentation positions. The Examiner notes, however, that Tsou teaches predicting complete mass spectra data of peptides using scores to indicate predictions, which are similar to probabilities, as shown in para [0232].
Applicant asserts:
“The claimed multi-model architecture is not taught or suggested. The claims further require that the machine learning unit include multiple learning models, specifically:
a first learning model based on amino acid sequence information,
a second learning model using charge, mass, peptide length, and proline presence information, and
a third learning model using fragmentation information corresponding to two or more unit peptides.
The cited references, including Tsou, generally describe a single deep learning model used for predicting peptide spectra. They do not disclose or suggest a system architecture employing multiple distinct learning models that process different peptide characteristics and fragmentation information as recited in the claims.”
Examiner’s response:
The Examiner respectfully disagrees. The Examiner notes that Tsou does teach a potential ensemble of classifiers, which are listed as machine learning models, as shown in para [0180], as well as a potential ensemble of deep learning architectures, which may include a recurrent neural network (RNN) and a convolutional neural network (CNN), as shown in para [0181].
Applicant asserts:
“The combination proposed by the Examiner is based on hindsight. The Examiner's rejection appears to combine multiple references related to machine learning and peptide spectrum prediction in order to reconstruct the claimed invention.
However, the cited references address different approaches to peptide spectrum prediction and do not provide any teaching or motivation to combine their disclosures in the specific manner required to arrive at the claimed system. In particular, the cited references do not suggest modifying the deep learning prediction approaches to include:
• explicit fragmentation case generation,
• fragmentation probability calculations based on four-amino-acid units, and
• determination of a spectral peak profile having the highest probability among generated fragmentation cases.
The proposed combination therefore appears to rely on impermissible hindsight reconstruction using Applicant's disclosure as a roadmap.”
Examiner’s response:
The Examiner respectfully disagrees. In regard to the argument that the combination of cited references relies on impermissible hindsight, MPEP 2141.01(a) states: “In order for a reference to be proper for use in an obviousness rejection under 35 U.S.C. 103, the reference must be analogous art to the claimed invention. In re Bigio, 381 F.3d 1320, 1325, 72 USPQ2d 1209, 1212 (Fed. Cir. 2004). A reference is analogous art to the claimed invention if: (1) the reference is from the same field of endeavor as the claimed invention (even if it addresses a different problem); or (2) the reference is reasonably pertinent to the problem faced by the inventor (even if it is not in the same field of endeavor as the claimed invention). Note that "same field of endeavor" and "reasonably pertinent" are two separate tests for establishing analogous art; it is not necessary for a reference to fulfill both tests in order to qualify as analogous art.” In the instant case, the Tsou, Zhou, Tiwary, and Shan references are analogous because they are within the same field of endeavor, namely mass spectrometry spectrum prediction, even if they address different problems.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-9 are rejected under 35 U.S.C. 103 as being unpatentable over Tsou et al. (US20210041454A1, hereinafter referred to as Tsou) in view of Zhou et al. (“pDeep: Predicting MS/MS Spectra of Peptides with Deep Learning”, hereinafter referred to as Zhou) in further view of Tiwary et al. (“High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis”, hereinafter referred to as Tiwary) and further in view of Shan et al. (US20190147983A1, hereinafter referred to as Shan).
Regarding claim 1 (currently amended):
Tsou teaches a system for predicting a spectral profile of a product ion of a peptide (see para [0077]: “In an aspect, pDeep covers a deep learning-based method to predict the intensity distribution of product ions of a peptide.”)
in order to distinguish… of different peptides having similar retention times (RT) and mass-to-charge (m/z) ratios during a multiple reaction monitoring (MRM) process, the system (see para [0144]: “ Further, the use of retention time data allows for further discrimination of different peptides, for example highly similar peptides, e.g., peptides that could not be distinguished using standard techniques.”. Also see para [0137]: “Mass spectrometry (MS) is a tool capable of identifying naturally (in vivo) presented HLA peptides in human cell lines, tumor tissues and bodily fluids, such as plasma. Mass spectrometry (MS) generates a signal (spectrum) of values (m/z, intensity) related to the presence of a biomolecule with a certain mass-to-charge ratio (m/z), and abundance (intensity) in the sample.”. Also see para [0244]: “The mass spectrometer may be operated in various modes of operation including a mass spectrometry (“MS”) mode of operation, a tandem mass spectrometry (“MS/MS”) mode of operation, a mode of operation in which parent or precursor ions are alternatively fragmented or reacted so as to produce fragment or product ions, and not fragmented or reacted or fragmented or reacted to a lesser degree, a Multiple Reaction Monitoring (“MRM”) mode of operation,”.) comprising:
a data acquisition unit including a peptide information acquisition unit and a spectrum recognition unit, the data acquisition unit acquiring characteristic information of a plurality of learning peptides and spectral data corresponding to the plurality of learning peptides (see claim 25: “receiving, at the at least one processor, test data comprising peptide spectrum data;”. Also see para [0178]: “The training database is a computer-implemented store of data reflecting a plurality of peptide spectrum data for a plurality of peptides association with a classification with respect to antigen characterization of each respective peptide. The peptide spectrum data may comprise experimental peptide spectrum data, predicted peptide spectrum data, or a combination thereof.”)
a machine learning unit including a plurality of predetermined learning models configured to apply predetermined weights (see para [0076]: “In an aspect, the deep learning algorithm described herein is selected from pDeep (Zhou et al., Anal. Chem. 89, 12690-12697 (2017)), DeepMass (Tiwary et al., Nature Methods, 16:519-525 (2019)), and PROSIT (Gessulat et al., Nature Methods, 16:509-518 (2019), the disclosures of each of which are herein incorporated by reference in their entirety).”. Also see para [0101]: “FIG. 14 shows a Dot Product comparison between Immatics-pDeep (HCD) (an embodiment of the system and method described herein) against Prosit pretrained model (HCD 25) and Prosit pretrain model (HCD 27) The comparison was done only for HCD model because the Prosit model is limited to HCD.”)
the machine learning unit extracting a plurality of characteristic information of the plurality of learning peptides (see para [0007]: “In an aspect, the disclosure relates to methods for identifying one or more peptides including: (a) analyzing one or more tissue samples by mass spectrometry (MS), (b) acquiring experimental mass spectra from one or more peptides, for example antigenic peptides, bound to HLA-subtypes in the one or more tissue samples, (c) generating a peptide spectrum match (PSM) of the one or more peptides by comparing the acquired mass spectra to peptide theoretical spectra, (d) producing a matched spectral library or database of peptides, for example antigenic peptides, based on steps (a), (b), and (c), (e) inputting the spectral library or database of peptides into an algorithm, for example a deep learning algorithm, to produce a spectral library of predicted peptide fragmentation spectra.” [Examiner’s note: emphasis added. Tsou discloses peptide information including tissue samples. In the instant case, a learning peptide is recited as any material, biological fluid, tissue, or cell obtained from or derived from an individual for learning, as evidenced by the specification at [0016].]),
performing learning using the plurality of characteristic information and a spectrum corresponding to the plurality of learning peptides as respective input values of the plurality of learning models (see para [0213]: “The training data may comprise mass spectrum data, retention time data, or combinations thereof.”),
acquiring peptide analysis learning data output from the plurality of learning models (see para [0077]: “The BiLSTM-based pDeep can take the whole peptide as input, convert the different cleavage sites into feature vectors of different time-steps, and output the corresponding intensity of each peak.”. Also see para [0214]: “Preferred classification systems use classifiers such as, but not limited to, support vector machines (SVM), AdaBoost, penalized logistic regression, naive Bayes classifiers, classification trees, k-nearest neighbor classifiers, Deep Learning classifiers, neural nets, random forests, Fully Convolutional Networks (FCN), Convolutional Neural Networks (CNN), and/or an ensemble thereof. Deep Learning classifiers are a more preferred classification system. The classification system outputs a classification of the peptide based on the test data, e.g., peptide spectra data, retention time data, combinations thereof.”),
wherein the plurality of learning models include a first learning model configured to perform learning using amino acid sequence type information of the learning peptides as an input value (see para [0154]: “By providing for greater resolution, methods described herein are capable of identifying peptides that were previously difficult to identify peptide sequences with a greater specificity and accuracy. For example, potentially useful epitopes that would be destroyed in a trypsin digest. This results in less false positive identifications. In an aspect, methods described herein are capable of identifying peptides with the same amino acid sequence but in a different configuration. In an aspect, amino acid positions are adjacent or further down the amino acid chain.”. Also see para [0181]: “For example, the deep learning architecture may be multilayer perceptron neural network (MLPNN), backpropagation, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Generative Adversarial Network (GAN), Restricted Boltzmann Machine (RBM), pDeep, DeepMass, PROSIT, Deep Belief Network (DBN), or an ensemble thereof.”)
a second learning model configured to perform learning using charges, mass, length of unit peptides, and presence or absence of proline as input values (see para [0254]: “In this example, inputs to the fragmentation models are, peptide sequences, precursor charge, and NCE. Peptide sequences are represented as discrete integer vectors of length 30, with each non-zero integer mapping to one amino acid and padded with zeros for sequences shorter than 30 amino acids.”. Also see para [0020]: “In an aspect, after inputting the library or database of peptides into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra, the method includes matching the predicted peptide fragmentation spectra generated from the algorithm to the corresponding peptide spectrum match (PSM). Mass spectrometry (MS) generates a signal (spectrum) of values (m/z, intensity) related to the presence of a biomolecule with a certain mass-to-charge ratio (m/z), and abundance (intensity) in the sample.”. Also see para [0253]: “Fragmentation Models Model Architecture In an aspect, the peptide encoder includes three layers: (1) a bi-directional recurrent neural network (BDN) with gated recurrent memory units (GRU), (2) a recurrent GRU layer, and (3) an attention layer all with dropout. ”), and
a third learning model configured to perform learning using fragmentation information corresponding to two or more unit peptides as an input value (see para [0232]: “The prediction model, which determines the predicted peptide spectrum as generated by methods described herein, can be compared to the experimental spectra to determine the accuracy of the prediction model. As set forth in FIG. 12, technical variation corresponds to the correlation between replicate experimental confirmation of peptide spectrum. As technical variation is based on actual experiments, such as confirmation of a peptide by mass spectrometry, and not on predicted confirmation of structure, the score is closer to 1 (highest) and theoretically should be higher than any predicted model. The closer a prediction model has a score closer to 1, the more accurate the prediction of peptide fragmentation pattern”)
and a peak prediction unit including a determination unit, the peak prediction unit (see para [0232]: “Technical variation herein refers to peptide spectra similarities between two replicate spectra from a same peptide, which represents the upper bound of peptide fragmentation prediction performance for any algorithm. For example, the technical variation is a given peptide A is determined by comparing the distinct experimental peptide spectra. See, for example, FIG. 7. The prediction model, which determines the predicted peptide spectrum as generated by methods described herein, can be compared to the experimental spectra to determine the accuracy of the prediction model.”. Also see para [0163]: “As an example, shown in FIG. 4, a deep learning prediction model can successfully predict spectrum peak intensities, i.e. peptide fragmentation pattern of a peptide.”)
predicting a spectral profile of spectral data corresponding to a peptide to be confirmed using the peptide analysis learning data when characteristic information of the peptide to be confirmed obtained from a biological sample is acquired (see para [0032]: “obtaining at least one tumor tissue sample and corresponding healthy tissue sample from an individual,
(b) identifying one or more antigenic peptides bound to HLA-subtypes in a tumor tissue sample by mass spectrometry (MS), to produce one or more experimental peptide fragmentation spectra of the one or more antigenic peptides;
(c) comparing experimental peptide fragmentation spectra to those found in public and/or non-public databases;
(d) estimation of false discovery rate (FDR);
(e) generation of a peptide spectrum match (PSM);
(f) inputting the data generated by the experimental mass spectrometry methodology into a deep learning algorithm to train a peptide fragmentation prediction model;
(g) developing predicted peptide spectrum;”.
Also see para [0163]: “As an example, shown in FIG. 4, a deep learning prediction model can successfully predict spectrum peak intensities, i.e. peptide fragmentation pattern of a peptide.”)
wherein the determination unit generates a plurality of fragmentation cases for a peptide to be confirmed (see para [0251]: “A total of 7825 runs was acquired in profile mode covering most samples with five replicate injections making use of different mass analyzers in low- (TOP3, ion trap acquiring top 3 precursors) and high-resolution mode (TOP5, Orbitrap® acquiring top 5 precursors, R=7500), as well as different fragmentations using collision-induced dissociation (CID)”. Also see para [0144]: “The method described herein uses unconventional methods to better identify potential HLA peptides for use in T-cell therapies. Instead of performing a trypsin digest to produce a library of random peptides that are run through a mass spectroscopy, random, computer generated peptide fragments are generated from the parent protein and this library of fragments is processed by a classifier to identify antigenic peptides.”)
…
determines a spectral peak profile having the highest probability among the plurality of generated fragmentation cases (see para [0233]: “The signal peaks in an MS/MS spectrum indicate the presence of a peptide fragment ion with a specific mass. The intensity of a signal peak is dependent on a number of factors: the abundance of the peptide in the sample, the efficiency of the cleavage that generated the fragment, the proteotypicity of the fragment ion and other factors related to the peptide and the machine that generated the MS2 spectrum”).
Tsou does not explicitly teach to distinguish overlapping peaks of different peptides.
Zhou, however, teaches to distinguish overlapping peaks of different peptides (see fig. within abstract on page 1 that shows the overall architecture of pDeep, a deep neural network-based model for the spectrum prediction of peptides. Also see page 12691: “With the accurate spectrum prediction, pDeep shows its potential in distinguishing extremely similar peptides with isobaric amino acids or isobaric amino acid combinations by considering the intensity information. For example, pDeep can distinguish GG from N and AG from Q in peptides with >0.93 accuracy. Furthermore, pDeep can distinguish I from L with ∼0.67 accuracy in HCD spectra and ∼0.76 accuracy in EThcD spectra”. Also see page 12691: “In this work, we developed pDeep, a deep learning-based method to predict the intensity distribution of product ions of a peptide”. Also see page 12696 section ‘Methods’: “The BiLSTM-based pDeep takes the whole peptide as input, converts the different cleavage sites into feature vectors of different time-steps, and outputs the corresponding intensity of each peak, as shown in Figure 8”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Tsou and Zhou before him or her, to modify the system of claim 1 to include attributes of distinguishing overlapping peaks of different peptides in order to distinguish between different peptides, especially those of similar spectral properties (see Zhou at page 12691: “With the accurate spectrum prediction, pDeep shows its potential in distinguishing extremely similar peptides with isobaric amino acids or isobaric amino acid combinations by considering the intensity information”).
Neither Tsou nor Zhou explicitly teaches including whether proline is contained in each peptide for input into the plurality of machine learning models or predicting fragment sequences corresponding to C-terminal and N-terminal directions based on fragmentation positions.
Tiwary, however, analogously teaches including whether proline is contained in each peptide for input into the plurality of machine learning models (see page 9 section ‘DeepMass: Prism model interpretation’: “To determine the influence of specific amino acids on peptide fragmentation, we repeated the analysis on a per-residue basis (Supplementary Fig. 12).” … “Similarly, among negative attribution profiles, we identified two trends. First, branched-chain amino acids and proline had an influence relatively concentrated at the cleavage site, and second, they had a smoother distribution of influence downstream; asparagine was a notable outlier, with its strongest influence on peaks just upstream of it.”) and
predicting fragment sequences corresponding to C-terminal and N-terminal directions based on fragmentation positions (see pg. 521 fig. 3: “Fig. 3 | Sliding-window-based regression model for prediction of fragment intensities. A symmetrical sliding window is placed around the target peptide bond for which the b- and y-ion intensity should be predicted (red boxes). Amino acids in the window are translated into 0/1 variables by one-hot encoding. Additional features, including the amino acids at the N and C termini, the distances of the bond to the termini and the peptide length, are added to the feature space. This process is repeated for each position in the input peptide sequences. A fully connected two-hidden layer neural network is then trained and outputs the logarithmic b- and y-ion intensities.”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Tsou, Zhou, and Tiwary before him or her, to modify the system of claim 1 to include attributes of including whether proline is contained in each peptide for input into the plurality of machine learning models and predicting fragment sequences corresponding to C-terminal and N-terminal directions based on fragmentation positions in order to determine the influence of specific amino acids on peptide fragmentation (see page 9 section ‘DeepMass: Prism model interpretation’: “To determine the influence of specific amino acids on peptide fragmentation, we repeated the analysis on a per-residue basis (Supplementary Fig. 12).” … “Similarly, among negative attribution profiles, we identified two trends. First, branched-chain amino acids and proline had an influence relatively concentrated at the cleavage site, and second, they had a smoother distribution of influence downstream; asparagine was a notable outlier, with its strongest influence on peaks just upstream of it.”)
Neither Tsou, Zhou, nor Tiwary teaches calculating fragmentation probabilities of peptide product ions in fragmentation units of four amino acids.
Shan, however, analogously teaches calculating fragmentation probabilities of peptide product ions in fragmentation units of four amino acids (see para [0129]: “In some embodiments, the ion-CNN is configured to learn features (the peaks) of fragment ions in a spectrum and summarizes the overall information. The input data to the ion-CNN is a prefix, i.e., a sequence including the “start” symbol and the amino acids that have been predicted up to the current iteration. The output is a probability distribution over 20 amino acid residues, their modifications, and three special symbols “start”, “end”, and “padding”. In one embodiment, three modifications are considered: fixed modification carbamidomethylation (C), and variable modifications Oxidation (M) and Deamidation (NQ), hence, a total of 26 symbols are used for prediction. For example, where the fourth amino acid is considered, the prefix consists of four symbols “start”, “P”, “E”, “P”. Symbol “T” is predicted as the next amino acid (4th amino acid in this example) by sampling or by selecting the highest probability from the model output probability distribution.”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Tsou, Zhou, Tiwary, and Shan before him or her, to modify the system of claim 1 to include attributes of calculating fragmentation probabilities of peptide product ions in fragmentation units of four amino acids in order to learn features, such as peaks of fragment ions in a spectrum (see Shan at para [0129]: “In some embodiments, the ion-CNN is configured to learn features (the peaks) of fragment ions in a spectrum and summarizes the overall information. The input data to the ion-CNN is a prefix, i.e., a sequence including the “start” symbol and the amino acids that have been predicted up to the current iteration”).
Regarding claim 2 (currently amended):
Tsou in view of Zhou in further view of Tiwary and further in view of Shan teaches the system of claim 1.
Tsou further teaches wherein the first learning model performs learning using amino acid sequence type information included in the learning peptide as an input value (see para [0041]: “Inputting an HLA peptide sequence database into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra corresponding to the input HLA peptide sequences, (f) matching the peptide fragmentation spectra with the library of predicted peptide fragmentation spectra, (g) identifying the sequences of the one or more antigenic peptides, when the peptide fragmentation spectra match the predicted peptide fragmentation spectra that correspond to the HLA peptide sequences.”.)
Regarding claim 3:
Tsou in view of Zhou in further view of Tiwary and further in view of Shan teaches the system of claim 2.
Tsou further teaches wherein the first learning model is implemented as a recurrent neural network (RNN) (see para [0181]: “For example, the deep learning architecture may be multilayer perceptron neural network (MLPNN), backpropagation, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Generative Adversarial Network (GAN), Restricted Boltzmann Machine (RBM), pDeep, DeepMass, PROSIT, Deep Belief Network (DBN), or an ensemble thereof.”)
Regarding claim 4 (currently amended):
Tsou in view of Zhou in further view of Tiwary and further in view of Shan teaches the system of claim 1.
Tsou further teaches wherein the second learning model performs learning using charges, a mass, and a length of a unit peptide (see para [0254]: “In this example, inputs to the fragmentation models are, peptide sequences, precursor charge, and NCE. Peptide sequences are represented as discrete integer vectors of length 30, with each non-zero integer mapping to one amino acid and padded with zeros for sequences shorter than 30 amino acids.”. Also see para [0020]: “In an aspect, after inputting the library or database of peptides into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra, the method includes matching the predicted peptide fragmentation spectra generated from the algorithm to the corresponding peptide spectrum match (PSM). Mass spectrometry (MS) generates a signal (spectrum) of values (m/z, intensity) related to the presence of a biomolecule with a certain mass-to-charge ratio (m/z), and abundance (intensity) in the sample.”)
Regarding claim 5:
Tsou in view of Zhou in further view of Tiwary and further in view of Shan teaches the system of claim 4.
Tsou does not teach wherein the second learning model is implemented as at least one fully connected layer.
Shan, however, analogously teaches wherein the second learning model is implemented as at least one fully connected layer (see para [0079]: “In embodiments of the system comprising a CNN, the CNN comprises a plurality of layers. In some embodiments, the CNN comprises at least one convolutional layer and at least one fully connected layer.”.)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Tsou, Zhou, Tiwary and Shan before him or her, to modify the system of claim 5 to include attributes of having a learning model that is implemented as at least one fully connected layer in order to increase the accuracy of the system (see Shan at para [0080]: “The inventors have found that adding a second convolutional layer to the first convolutional layer, as well as adding a second fully connected layer to the first connected layer, both significantly increased the accuracy of the system. Adding further convolutional layers or fully connected layers beyond the first two in both cases may yield greater accuracy but these increases in accuracy were not significant.”.).
Regarding claim 6:
Tsou in view of Zhou in further view of Tiwary and further in view of Shan teaches the system of claim 1.
Tsou further teaches wherein the third learning model performs learning using fragmentation information corresponding to two or more unit peptides as an input value (see para [0020]: “In an aspect, after inputting the library or database of peptides into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra, the method includes matching the predicted peptide fragmentation spectra generated from the algorithm to the corresponding peptide spectrum match (PSM).”)
Regarding claim 7:
Tsou in view of Zhou in further view of Tiwary and further in view of Shan teaches the system of claim 6.
Tsou further teaches wherein the third learning model is implemented as a convolution neural network (CNN) (see para [0181]: “For example, the deep learning architecture may be multilayer perceptron neural network (MLPNN), backpropagation, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Generative Adversarial Network (GAN), Restricted Boltzmann Machine (RBM), pDeep, DeepMass, PROSIT, Deep Belief Network (DBN), or an ensemble thereof.”)
Regarding claim 9:
Tsou in view of Zhou in further view of Tiwary and further in view of Shan teaches the system of claim 1.
Tsou further teaches wherein the machine learning unit acquires the peptide analysis learning data by giving a predetermined weight to each of the plurality of learning models (see para [0189]: “AdaBoost provides a way to classify each of n subjects into two or more categories based on one k-dimensional vector (called a k-tuple) of measurements per subject. AdaBoost takes a series of “weak” classifiers that have poor, though better than random, predictive performance and combines them to create a superior classifier. The weak classifiers that AdaBoost uses are classification and regression trees (CARTs). CARTs recursively partition the dataspace into regions in which all new observations that lie within that region are assigned a certain category label. AdaBoost builds a series of CARTs based on weighted versions of the dataset whose weights depend on the performance of the classifier at the previous iteration. ”.)
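As a minimal sketch of the weighted combination that the quoted passage attributes to AdaBoost, the block below combines the outputs of several weak classifiers using one weight per classifier. The vote-weight formula is the standard one for discrete AdaBoost; the function names and the example error rates are hypothetical and do not appear in Tsou.

```python
import math
from typing import Sequence


def vote_weight(error_rate: float) -> float:
    """Standard discrete-AdaBoost weight: larger for more accurate weak classifiers."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def weighted_vote(predictions: Sequence[int], error_rates: Sequence[float]) -> int:
    """Combine weak-classifier outputs (+1 or -1) into one label by weighted sum."""
    score = sum(vote_weight(e) * p for p, e in zip(predictions, error_rates))
    return 1 if score >= 0 else -1
```

In this sketch a single accurate classifier (error rate 0.1) can outvote two weaker ones (error rate 0.4 each), which is the sense in which each model contributes according to its assigned weight.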
Claims 10-12 are rejected under 35 U.S.C 103 as being unpatentable over Tsou et al. (US20210041454A1 hereinafter referred to as Tsou) in view of Zhou et al. (“pDeep: Predicting MS/MS Spectra of Peptides with Deep Learning” hereinafter referred to as Zhou) in further view of Tiwary et al. (“High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis” hereinafter referred to as Tiwary).
Regarding claim 10:
Tsou teaches a system for predicting a spectral profile of a product ion of a peptide (see para [0077]: “In an aspect, pDeep covers a deep learning-based method to predict the intensity distribution of product ions of a peptide.”)
in order to distinguish… of different peptides having similar retention times (RT) and mass-to-charge (m/z) ratios during a multiple reaction monitoring (MRM) process, the system (see para [0144]: “ Further, the use of retention time data allows for further discrimination of different peptides, for example highly similar peptides, e.g., peptides that could not be distinguished using standard techniques.”. Also see para [0137]: “Mass spectrometry (MS) is a tool capable of identifying naturally (in vivo) presented HLA peptides in human cell lines, tumor tissues and bodily fluids, such as plasma. Mass spectrometry (MS) generates a signal (spectrum) of values (m/z, intensity) related to the presence of a biomolecule with a certain mass-to-charge ratio (m/z), and abundance (intensity) in the sample.”. Also see para [0244]: “The mass spectrometer may be operated in various modes of operation including a mass spectrometry (“MS”) mode of operation, a tandem mass spectrometry (“MS/MS”) mode of operation, a mode of operation in which parent or precursor ions are alternatively fragmented or reacted so as to produce fragment or product ions, and not fragmented or reacted or fragmented or reacted to a lesser degree, a Multiple Reaction Monitoring (“MRM”) mode of operation,”.) comprising:
a data acquisition unit including a peptide information acquisition unit and a spectrum recognition unit, the data acquisition unit acquiring characteristic information of a plurality of learning peptides and spectral data corresponding to the plurality of learning peptides (see claim 25: “receiving, at the at least one processor, test data comprising peptide spectrum data;”. Also see para [0178]: “The training database is a computer-implemented store of data reflecting a plurality of peptide spectrum data for a plurality of peptides association with a classification with respect to antigen characterization of each respective peptide. The peptide spectrum data may comprise experimental peptide spectrum data, predicted peptide spectrum data, or a combination thereof.”)
a machine learning unit including a plurality of predetermined learning models configured to apply predetermined weights (see para [0076]: “In an aspect, the deep learning algorithm described herein is selected from pDeep (Zhou et al., Anal. Chem. 89, 12690-12697 (2017)), DeepMass (Tiwary et al., Nature Methods, 16:519-525 (2019)), and PROSIT (Gessulat et al., Nature Methods, 16:509-518 (2019), the disclosures of each of which are herein incorporated by reference in their entirety).”. Also see para [0101]: “FIG. 14 shows a Dot Product comparison between Immatics-pDeep (HCD) (an embodiment of the system and method described herein) against Prosit pretrained model (HCD 25) and Prosit pretrain model (HCD 27) The comparison was done only for HCD model because the Prosit model is limited to HCD.”)
extracting a plurality of characteristic information of the plurality of learning peptides (see para [0007]: “In an aspect, the disclosure relates to methods for identifying one or more peptides including: (a) analyzing one or more tissue samples by mass spectrometry (MS), (b) acquiring experimental mass spectra from one or more peptides, for example antigenic peptides, bound to HLA-subtypes in the one or more tissue samples, (c) generating a peptide spectrum match (PSM) of the one or more peptides by comparing the acquired mass spectra to peptide theoretical spectra, (d) producing a matched spectral library or database of peptides, for example antigenic peptides, based on steps (a), (b), and (c), (e) inputting the spectral library or database of peptides into an algorithm, for example a deep learning algorithm, to produce a spectral library of predicted peptide fragmentation spectra.” [Examiner’s note: emphasis added. Tsou discloses peptide information including tissue samples. In the instant case, a learning peptide is recited as any material, biological fluid, tissue, or cell obtained from or derived from an individual for learning, as evidenced by the specification at [0016].]),
performing learning using the plurality of characteristic information and a spectrum corresponding to the plurality of learning peptides as respective input values of the plurality of learning models, and acquiring peptide analysis learning data output from the plurality of learning models (see para [0213]: “The training data may comprise mass spectrum data, retention time data, or combinations thereof.”. Also see para [0077]: “The BiLSTM-based pDeep can take the whole peptide as input, convert the different cleavage sites into feature vectors of different time-steps, and output the corresponding intensity of each peak.”. Also see para [0214]: “Preferred classification systems use classifiers such as, but not limited to, support vector machines (SVM), AdaBoost, penalized logistic regression, naive Bayes classifiers, classification trees, k-nearest neighbor classifiers, Deep Learning classifiers, neural nets, random forests, Fully Convolutional Networks (FCN), Convolutional Neural Networks (CNN), and/or an ensemble thereof. Deep Learning classifiers are a more preferred classification system. The classification system outputs a classification of the peptide based on the test data, e.g., peptide spectra data, retention time data, combinations thereof.”), wherein the plurality of learning models include
a first learning model performing learning using amino acid sequence type information of the learning peptides as an input value (see para [0154]: “By providing for greater resolution, methods described herein are capable of identifying peptides that were previously difficult to identify peptide sequences with a greater specificity and accuracy. For example, potentially useful epitopes that would be destroyed in a trypsin digest. This results in less false positive identifications. In an aspect, methods described herein are capable of identifying peptides with the same amino acid sequence but in a different configuration. In an aspect, amino acid positions are adjacent or further down the amino acid chain.”. Also see para [0181]: “For example, the deep learning architecture may be multilayer perceptron neural network (MLPNN), backpropagation, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Generative Adversarial Network (GAN), Restricted Boltzmann Machine (RBM), pDeep, DeepMass, PROSIT, Deep Belief Network (DBN), or an ensemble thereof.”)
a second learning model performing learning using charges, mass, length of unit peptides, and presence or absence of proline as input values (see para [0254]: “In this example, inputs to the fragmentation models are, peptide sequences, precursor charge, and NCE. Peptide sequences are represented as discrete integer vectors of length 30, with each non-zero integer mapping to one amino acid and padded with zeros for sequences shorter than 30 amino acids.”. Also see para [0020]: “In an aspect, after inputting the library or database of peptides into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra, the method includes matching the predicted peptide fragmentation spectra generated from the algorithm to the corresponding peptide spectrum match (PSM). Mass spectrometry (MS) generates a signal (spectrum) of values (m/z, intensity) related to the presence of a biomolecule with a certain mass-to-charge ratio (m/z), and abundance (intensity) in the sample.”. Also see para [0253]: “Fragmentation Models Model Architecture In an aspect, the peptide encoder includes three layers: (1) a bi-directional recurrent neural network (BDN) with gated recurrent memory units (GRU), (2) a recurrent GRU layer, and (3) an attention layer all with dropout. ”)
a third learning model configured to perform learning using fragmentation information corresponding to two or more unit peptides as an input value (see para [0232]: “The prediction model, which determines the predicted peptide spectrum as generated by methods described herein, can be compared to the experimental spectra to determine the accuracy of the prediction model. As set forth in FIG. 12, technical variation corresponds to the correlation between replicate experimental confirmation of peptide spectrum. As technical variation is based on actual experiments, such as confirmation of a peptide by mass spectrometry, and not on predicted confirmation of structure, the score is closer to 1 (highest) and theoretically should be higher than any predicted model. The closer a prediction model has a score closer to 1, the more accurate the prediction of peptide fragmentation pattern”)
Tsou does not explicitly teach to distinguish overlapping peaks of different peptides.
Zhou, however, teaches to distinguish overlapping peaks of different peptides (see fig. within abstract on page 1 that shows the overall architecture of pDeep, a deep neural network-based model for the spectrum prediction of peptides. Also see page 12691: “With the accurate spectrum prediction, pDeep shows its potential in distinguishing extremely similar peptides with isobaric amino acids or isobaric amino acid combinations by considering the intensity information. For example, pDeep can distinguish GG from N and AG from Q in peptides with >0.93 accuracy. Furthermore, pDeep can distinguish I from L with ∼0.67 accuracy in HCD spectra and ∼0.76 accuracy in EThcD spectra”. Also see page 12691: “In this work, we developed pDeep, a deep learning-based method to predict the intensity distribution of product ions of a peptide”. Also see page 12696 section ‘Methods’: “The BiLSTM-based pDeep takes the whole peptide as input, converts the different cleavage sites into feature vectors of different time-steps, and outputs the corresponding intensity of each peak, as shown in Figure 8”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Tsou and Zhou before him or her, to modify the system of claim 10 to include attributes of distinguishing overlapping peaks of different peptides in order to distinguish between different peptides, especially those of similar spectral properties (see Zhou at page 12691: “With the accurate spectrum prediction, pDeep shows its potential in distinguishing extremely similar peptides with isobaric amino acids or isobaric amino acid combinations by considering the intensity information”).
Tsou does not explicitly teach including whether proline is contained in each peptide for input into the plurality of machine learning models.
Tiwary, however, teaches, in analogous art, including whether proline is contained in each peptide for input into the plurality of machine learning models (see page 9 section ‘DeepMass: Prism model interpretation’: “To determine the influence of specific amino acids on peptide fragmentation, we repeated the analysis on a per-residue basis (Supplementary Fig. 12).” … “Similarly, among negative attribution profiles, we identified two trends. First, branched-chain amino acids and proline had an influence relatively concentrated at the cleavage site, and second, they had a smoother distribution of influence downstream; asparagine was a notable outlier, with its strongest influence on peaks just upstream of it.”) and
wherein the machine learning unit predicts fragment sequences of a plurality of peptide product ions corresponding to each of a C direction and an N direction based on a position where fragmentation of the unit peptide starts (see pg. 521 fig. 3: “Fig. 3 | Sliding-window-based regression model for prediction of fragment intensities. A symmetrical sliding window is placed around the target peptide bond for which the b- and y-ion intensity should be predicted (red boxes). Amino acids in the window are translated into 0/1 variables by one-hot encoding. Additional features, including the amino acids at the N and C termini, the distances of the bond to the termini and the peptide length, are added to the feature space. This process is repeated for each position in the input peptide sequences. A fully connected two-hidden layer neural network is then trained and outputs the logarithmic b- and y-ion intensities.”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Tsou, Zhou, and Tiwary before him or her, to modify the system of claim 10 to include attributes of including whether proline is contained in each peptide for input into the plurality of machine learning models and wherein the machine learning unit predicts fragment sequences of a plurality of peptide product ions corresponding to each of a C direction and an N direction based on a position where fragmentation of the unit peptide starts in order to determine the influence of specific amino acids on peptide fragmentation (see page 9 section ‘DeepMass: Prism model interpretation’: “To determine the influence of specific amino acids on peptide fragmentation, we repeated the analysis on a per-residue basis (Supplementary Fig. 12).” … “Similarly, among negative attribution profiles, we identified two trends. First, branched-chain amino acids and proline had an influence relatively concentrated at the cleavage site, and second, they had a smoother distribution of influence downstream; asparagine was a notable outlier, with its strongest influence on peaks just upstream of it.”).
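For illustration only, the sliding-window feature construction quoted from Tiwary at Fig. 3 can be sketched as below: a symmetric window around one peptide bond is one-hot encoded, the distances to both termini and the peptide length are appended, and the N-terminal (b-type) and C-terminal (y-type) fragment sequences at that bond are returned. The window half-width, the residue ordering, and the names `one_hot` and `bond_features` are assumptions for this sketch; the terminal-residue features mentioned in the figure caption are omitted for brevity.

```python
from typing import List, Optional, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # ordering is an assumption


def one_hot(residue: Optional[str]) -> List[int]:
    """One-hot encode a residue; positions outside the peptide encode as all zeros."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def bond_features(peptide: str, bond: int, half_width: int = 2) -> Tuple[List[int], str, str]:
    """Features for the bond between peptide[bond - 1] and peptide[bond]."""
    n = len(peptide)
    if not 1 <= bond <= n - 1:
        raise ValueError("bond index must lie strictly inside the peptide")
    window: List[int] = []
    # half_width residues on each side of the target bond
    for pos in range(bond - half_width, bond + half_width):
        window.extend(one_hot(peptide[pos] if 0 <= pos < n else None))
    extras = [bond, n - bond, n]  # distance to N terminus, distance to C terminus, length
    b_fragment = peptide[:bond]   # N-terminal side of the cleavage
    y_fragment = peptide[bond:]   # C-terminal side of the cleavage
    return window + extras, b_fragment, y_fragment
```

Repeating the call for every bond index from 1 to the peptide length minus 1 gives one feature vector per cleavage site, which mirrors the per-position processing described in the quoted caption.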
Regarding claim 11 (Currently Amended):
Tsou in view of Zhou in further view of Tiwary teaches the system of claim 10.
Tsou further teaches wherein the first learning model performs learning using amino acid sequence type information included in the learning peptide as an input value (see para [0041]: “Inputting an HLA peptide sequence database into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra corresponding to the input HLA peptide sequences, (f) matching the peptide fragmentation spectra with the library of predicted peptide fragmentation spectra, (g) identifying the sequences of the one or more antigenic peptides, when the peptide fragmentation spectra match the predicted peptide fragmentation spectra that correspond to the HLA peptide sequences.”.)
Regarding claim 12 (Currently Amended):
Tsou in view of Zhou in further view of Tiwary teaches the system of claim 10.
Tsou further teaches wherein the second learning model performs learning using charges, a mass, and a length of a unit peptide (see para [0254]: “In this example, inputs to the fragmentation models are, peptide sequences, precursor charge, and NCE. Peptide sequences are represented as discrete integer vectors of length 30, with each non-zero integer mapping to one amino acid and padded with zeros for sequences shorter than 30 amino acids.”. Also see para [0020]: “In an aspect, after inputting the library or database of peptides into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra, the method includes matching the predicted peptide fragmentation spectra generated from the algorithm to the corresponding peptide spectrum match (PSM). Mass spectrometry (MS) generates a signal (spectrum) of values (m/z, intensity) related to the presence of a biomolecule with a certain mass-to-charge ratio (m/z), and abundance (intensity) in the sample.”).
Claim 13 is rejected under 35 U.S.C 103 as being unpatentable over Tsou et al. (US20210041454A1 hereinafter referred to as Tsou) in view of Zhou et al. (“pDeep: Predicting MS/MS Spectra of Peptides with Deep Learning” hereinafter referred to as Zhou) in further view of Tiwary et al. (“High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis” hereinafter referred to as Tiwary) in further view of Gee et al. (US20200010527A1 hereinafter referred to as Gee).
Regarding claim 13 (Currently Amended):
Tsou in view of Zhou in further view of Tiwary teaches the system of claim 10.
Tsou further teaches wherein the third learning model performs learning using fragmentation information corresponding to the two or more unit peptides as an input value (see [0020]: “In an aspect, after inputting the library or database of peptides into a deep learning algorithm to produce a library of predicted peptide fragmentation spectra, the method includes matching the predicted peptide fragmentation spectra generated from the algorithm to the corresponding peptide spectrum match (PSM)”)
Tsou does not teach using the sliding window algorithm.
Gee, however, teaches, in analogous art, using a sliding window in the context of peptide analysis, spectrometry, and neural networks (see [0192]: “This model was used to score given peptides from the Uniprot database (downloaded Dec. 18, 2015) and patient-specific exomes using peptides isolated from an L-length sliding window converted to one-hot matrices for neural network input.”. Also see [0008]: “Compositions are provided for ligands for a T cell receptor (TCR) of interest in a defined MHC context. The composition may comprise or consist of a defined peptide, or may comprise or consist of a polynucleotide encoding such a peptide. Such peptides may be fragments of naturally occurring antigenic proteins; may be fragments of neoantigenic proteins that are the subject of somatic mutation during tumorigenesis, or may be a synthetically generated mimic of an antigenic protein.”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Tsou, Zhou, Tiwary, and Gee before him or her, to modify the system of claim 13 to include attributes of having a sliding window in order to operate on fixed sized sequences of peptides (see Gee at [0036]: “The sequences of peptides are determined by any convenient methods of high throughput sequencing. Sequences may be analyzed, for example by the methods disclosed in the Examples, using clustering algorithms. Peptides may be analyzed to search human protein (Uniprot) or patient-specific exomes to score peptides of fixed lengths using a sliding window”).
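As a minimal sketch of the L-length sliding window quoted from Gee at [0192], the block below extracts every fixed-length peptide from a longer sequence and converts each peptide to a one-hot matrix suitable as neural network input. The function names and the residue ordering are hypothetical and are not taken from Gee.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # ordering is an assumption


def sliding_peptides(protein: str, length: int) -> List[str]:
    """Return every contiguous peptide of the given length, advancing one residue at a time."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


def one_hot_matrix(peptide: str) -> List[List[int]]:
    """One row per residue, one column per amino acid."""
    return [[1 if aa == ref else 0 for ref in AMINO_ACIDS] for aa in peptide]
```

A sequence of N residues yields N - L + 1 peptides of length L, each of which maps to an L-by-20 matrix, so every input presented to the network has the same fixed size.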
Pertinent Art
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
“Prediction of LC-MS/MS Properties of Peptides from Sequence by Deep Learning” — discloses spectral properties using deep learning using an RNN
“A machine learning approach to explore the spectra intensity pattern of peptides using tandem mass spectrometry data” — discloses peak prediction
“Liquid Chromatography Mass Spectrometry-Based Proteomics: Biological and Technological Aspects” — discloses background info into LC-MS
Conclusion
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Andrew A Bracero whose telephone number is (571)270-0592. The examiner can normally be reached Monday - Friday 9:00 a.m. - 5:00 p.m. ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi, can be reached Monday - Friday 9:00 a.m. - 5:00 p.m. ET. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANDREW BRACERO/Examiner, Art Unit 2126
/DAVID YI/Supervisory Patent Examiner, Art Unit 2126