Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office Action is in response to the application filed on 02/20/2023.
Claims 1-10 are pending.
Drawings
The drawings filed on 02/20/2023 are accepted.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claim 1 recites the language of “training a convolutional neural network by inputting a first mass spectrum matched to an amino acid sequence into the convolutional neural network to produce a trained convolutional neural network;
obtaining from a mass spectrometer a second mass spectrum of a protein sample having an unknown amino acid sequence;
discretizing the second mass spectrum into a weighted vector;
inputting the weighted vector into the trained convolutional neural network; and
determining, by an output of the trained convolutional neural network, a predicted amino acid sequence corresponding to the second mass spectrum.”
Claim 1 recites the limitation of “training a convolutional neural network by inputting a first mass spectrum matched to an amino acid sequence into the convolutional neural network to produce a trained convolutional neural network”, which, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “training” in the context of this claim encompasses the user manually inputting data. Similarly, the limitation of obtaining from a mass spectrometer a second mass spectrum of a protein sample having an unknown amino acid sequence, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “obtaining” in the context of this claim encompasses the user receiving a number or set of information. Similarly, the limitation of discretizing the second mass spectrum into a weighted vector, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “discretizing” in the context of this claim encompasses the user manually analyzing information. Similarly, the limitation of inputting the weighted vector into the trained convolutional neural network, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “inputting the weighted vector” in the context of this claim encompasses the user manually entering and comparing information.
Similarly, the limitation of determining, by an output of the trained convolutional neural network, a predicted amino acid sequence corresponding to the second mass spectrum, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “determining” in the context of this claim encompasses the user manually determining and predicting a result. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim recites only one additional element – using one or more storage devices to perform the training, obtaining, discretizing, inputting, and determining steps. The neural network in those steps is recited at a high level of generality (i.e., as a generic system performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a processor to perform the training, obtaining, discretizing, inputting, and determining steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
Claim 2 is dependent on independent claim 1 and includes all the limitations of claim 1. Claim 2 recites “prior to inputting the first mass spectrum into the convolutional neural network, discretizing the first mass spectrum into a first weighted vector, wherein the first weighted vector corresponds to a peak height in segments of the first mass spectrum”. The claim language provides only further inputting and discretizing of data, which is directed towards the abstract idea and does not amount to significantly more. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Claim 3 is dependent on independent claim 1 and includes all the limitations of claim 1. Claim 3 recites “the weighted vector corresponds to a peak height in segments of the second mass spectrum”. The claim language provides only further comparison of information, which is directed towards the abstract idea and does not amount to significantly more. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Claim 4 contains essentially the same subject matter as claim 1 and is therefore rejected under the same rationale.
Claims 5-10 are dependent on independent claim 4 and include all the limitations of claim 4. These claims recite “determining...discretizing...one dimensional... prior to inputting the mass...”. The claim language provides only further determining and inputting of information, which is directed towards the abstract idea and does not amount to significantly more. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Accordingly, claims 1-10 are not patent eligible.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-10 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-12 of U.S. Patent No. 11,587,644. Although the conflicting claims are not identical, they are not patentably distinct from each other because the claims of Patent No. 11,587,644 contain every element of the claims of the instant application and, as such, anticipate the claims of instant application 18/111875. (See table below).
Instant Application claim 1
Patent No. 11,587,644 claim 1
A method of identifying features in mass spectral data, comprising:
training a convolutional neural network by inputting a first mass spectrum matched to an amino acid sequence into the convolutional neural network to produce a trained convolutional neural network;
obtaining from a mass spectrometer a second mass spectrum of a protein sample having an unknown amino acid sequence;
discretizing the second mass spectrum into a weighted vector;
inputting the weighted vector into the trained convolutional neural network; and
determining, by an output of the trained convolutional neural network, a predicted amino acid sequence corresponding to the second mass spectrum.
A method of identifying features in mass spectral data, comprising: identifying a first mass spectrum matched to an amino acid sequence; pooling the amino acid sequence from the first mass spectrum into a plurality of groups of sequential amino acids; classifying each amino acid as aliphatic or aromatic and assigning a first feature to each amino acid based on a classification as aliphatic or aromatic; classifying each amino acid as hydrophobic or hydrophilic and assigning a second feature to each amino acid based on a classification as hydrophobic or hydrophilic; classifying each amino acid as positively charged or negatively charged and assigning a third feature to each amino acid based on a classification as positively charged or negatively charged; producing a subsequence for each of the groups of sequential amino acids based on the first feature, the second feature, and the third feature for each amino acid; training a convolutional neural network by inputting into the convolutional neural network the subsequence for each of the groups of sequential amino acids to produce a trained convolutional neural network; obtaining from a mass spectrometer a second mass spectrum of a protein sample having an unknown amino acid sequence; inputting the second mass spectrum into the trained convolutional neural network; identifying, by the trained convolutional neural network, a presence or absence of each subsequence in the second mass spectrum of the protein sample; and determining, by an output of the trained convolutional neural network, a predicted amino acid sequence corresponding to the second mass spectrum based on the presence or absence of each subsequence.
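For illustration only, the classification-and-pooling steps recited in claim 1 of the ’644 patent quoted above can be sketched in code. This is not the patented implementation: the one-letter-code property sets, the group size, and the single flag per property pair (e.g., positively charged vs. not, rather than separate positive/negative features) are simplified assumptions made solely to illustrate the claim structure.

```python
# Hypothetical sketch of the ’644 patent claim 1 steps: classify each amino
# acid by three properties, assign a feature per property, and pool the
# sequence into groups of sequential amino acids (subsequences).
# The property sets below are simplified assumptions, not the patent's own.

AROMATIC = set("FWY")          # vs. aliphatic
HYDROPHOBIC = set("AVILMFWC")  # vs. hydrophilic
POSITIVE = set("KRH")          # vs. negatively charged / uncharged

def featurize(sequence, group_size=3):
    """Assign (aromatic, hydrophobic, positive) flags to each residue, then
    pool the flagged residues into subsequences of `group_size` amino acids."""
    features = [(int(aa in AROMATIC),
                 int(aa in HYDROPHOBIC),
                 int(aa in POSITIVE)) for aa in sequence]
    return [features[i:i + group_size]
            for i in range(0, len(features), group_size)]

# A six-residue toy sequence pools into two groups of three residues each.
subsequences = featurize("FKAVGW")
```

Each pooled subsequence would then serve as a training input to the convolutional neural network in the manner the patent claim recites.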
Claims 1-4 of Patent No. 11,587,644 satisfy all the elements of claims 2-3 of the instant application and, as such, anticipate those claims.
Instant Application claim 4
Patent No. 11,587,644 claim 5
A method of identifying features in mass spectral data, comprising:
training a convolutional neural network by inputting a mass spectra from a known protein sample and a corresponding known amino acid sequence into the convolutional neural network to produce a trained convolutional neural network;
obtaining a mass spectra of an unknown protein sample;
inputting the mass spectra of the unknown protein sample into the trained convolutional neural network; and
determining, by a first output of the trained convolutional neural network, a presence or absence of an amino acid in the unknown protein sample.
A method of identifying features in mass spectral data, comprising: training a convolutional neural network by inputting a mass spectra from a known protein sample and a corresponding known amino acid sequence into the convolutional neural network to produce a trained convolutional neural network; obtaining a mass spectra of an unknown protein sample; inputting the mass spectra of the unknown protein sample into the trained convolutional neural network; determining, by a first output of the trained convolutional neural network, a presence or absence of an amino acid in the unknown protein sample; determining, by a second output of the trained convolutional neural network, a length of a peptide sequence of the unknown protein sample; and determining, by a third output of the trained convolutional neural network, a frequency of the amino acid in the peptide sequence of the unknown protein sample.
Claims 5-9 of Patent No. 11,587,644 satisfy all the elements of claims 5-10 of the instant application and, as such, anticipate those claims.
Examiner Notes
Examiner cites particular columns, paragraphs, figures and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Tran et al. (“De novo peptide sequencing by deep learning”) in view of Torng et al. (“3D deep convolutional neural networks for amino acid environment similarity analysis”).
As per claim 1, Tran discloses:
A method of identifying features in mass spectral data (Tran, e.g., page 8251, [par. 2], “...DeepNovo integrates CNNs and LSTM networks to learn features of tandem mass spectra, fragment ions, and sequence patterns for predicting peptides...” (mass spectral data)), comprising:
training a convolutional neural network by inputting a first mass spectrum matched to an amino acid sequence into the convolutional neural network to produce a trained convolutional neural network (Tran, e.g., fig. 1, in association with the text description, page 8248, [001], “...The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data...”) (the examiner asserts that Tran discloses training a CNN by inputting spectral data (which includes a first mass spectrum) into the CNN to produce a trained CNN);
obtaining from a mass spectrometer a second mass spectrum of a protein sample having an unknown amino acid sequence (Tran, e.g., page 8248, “...We evaluated the performance of DeepNovo compared with current state of the art de novo peptide sequencing tools...For performance evaluation, we used two sets of data, low resolution and high resolution, from previous publications. The low-resolution set includes seven datasets (41–47) (Table S1). The first five datasets were acquired from the Thermo Scientific LTQ Orbitrap with the collision-induced dissociation (CID) technique. The other two were acquired from the Thermo Scientific Orbitrap Fusion with the higher-energy collisional dissociation (HCD) technique. The high-resolution set includes nine datasets acquired from the Thermo Scientific Q-Exactive with the HCD technique (48–56) (Table S2)”, disclosing the obtaining of mass spectrum data from different types of mass spectrometers (including the Thermo Scientific LTQ Orbitrap, Thermo Scientific Orbitrap Fusion, and Thermo Scientific Q-Exactive) to test the performance of the DeepNovo model; and page 8249, [002], “...The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” and pg. 8249, second full paragraph, “To measure the accuracy of de novo sequencing results, we compared the real peptide sequence and the de novo peptide sequence of each spectrum...”, teaching that the input data are peptide samples having sequences that need to be predicted “de novo,” thus rendering the input amino acid sequences unknown to the model; and further page 8249, [002], “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides”, disclosing that peptides are part of a protein sequence (protein sample));
discretizing the second mass spectrum into a weighted vector; inputting the weighted vector into the trained convolutional neural network (Tran, e.g., fig. 1, in association with the text description, page 8248, [004], “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network”, disclosing discretizing the input spectrum (second mass spectrum) into intensity vectors (corresponding to weighted vectors), which are inputted into the trained CNN; and further see page 8247, [004], “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)”, disclosing that intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering the intensity vectors to correspond to weighted vectors); and
determining, by an output of the trained convolutional neural network, a predicted amino acid sequence corresponding to the second mass spectrum (Tran, e.g., fig. 1, in association with the text description, page 8248, [0002], “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration”, disclosing use of the output from the trained CNN to produce a predicted amino acid sequence corresponding to the input mass spectrum; and page 8249, [0002], “...Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides...”, disclosing that peptides are part of a protein sequence (protein sample)).
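For illustration only, the discretization step described in Tran at page 8248, [004] (a spectrum discretized into an intensity vector) can be sketched in code. This is neither the claimed invention nor Tran's DeepNovo implementation: the bin size, m/z range, and (m/z, intensity) peak format are hypothetical choices made solely to illustrate the mapping.

```python
# Hypothetical sketch of discretizing a mass spectrum into an intensity
# ("weighted") vector. Bin size and m/z range are illustrative assumptions.

def discretize_spectrum(peaks, max_mz=2000.0, bin_size=1.0):
    """Map (m/z, intensity) peaks onto a fixed-length intensity vector.

    Each element records the tallest peak falling in its m/z segment, so the
    vector corresponds to a peak height in segments of the mass spectrum.
    """
    n_bins = int(max_mz / bin_size)
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        idx = int(mz / bin_size)
        if 0 <= idx < n_bins:
            vector[idx] = max(vector[idx], intensity)
    return vector

# Three peaks land in three distinct segments of the 2000-element vector.
vec = discretize_spectrum([(175.1, 0.8), (276.2, 1.0), (389.7, 0.5)])
```

Such a fixed-length vector is the kind of input that could then be fed to a trained convolutional neural network, as the claim and the reference both describe.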
To make the record clearer regarding the language of “amino acid sequence corresponding to mass spectrum” (although, as stated above, Tran functionally discloses the feature of “amino acid sequence corresponding to mass spectrum” (Tran, e.g., page 8248, [0002])), the examiner further relies on Torng.
Torng, in an analogous art, discloses “amino acid sequence corresponding to mass spectrum” (Torng, Fig. 6, in association with the text description, “3DCNN-Training. Amino acid groupings discovered by our 3DCNN generally agree with known amino acid similarities. Six clusters were discovered by our network. The first cluster includes phenylalanine, tryptophan, and tyrosine. These are the three amino acids known to be hydrophobic and aromatic. The second and third clusters comprises valine, isoleucine and leucine, methionine respectively, which are all non-polar and aliphatic. The polar amino acids form the fourth cluster. Amino acids with known distinct properties, glycine and cysteine do not form local blocks with the other amino acids...”) (teaching an amino acid sequence corresponding to a mass spectrum). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Torng and Tran in order to study how the 20 amino acids interact with their neighboring microenvironment, to train the network to predict the amino acids most compatible with a specific location within a protein structure, and to achieve superior performance over models using conventional features (Torng, e.g., page 3, [0001]).
As per claim 2, the combination of Torng and Tran discloses:
The method of claim 1, further comprising, prior to inputting the first mass spectrum into the convolutional neural network (Tran, e.g., fig. 1, in association with the text description, page 8248, [001], “...The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data...”) (the examiner asserts that Tran discloses training a CNN by inputting spectral data (which includes a first mass spectrum) into the CNN to produce a trained CNN), discretizing the first mass spectrum into a first weighted vector, wherein the first weighted vector corresponds to a peak height in segments of the first mass spectrum (Tran, e.g., fig. 1, in association with the text description, page 8248, [004], “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network”, disclosing discretizing the input spectrum (first mass spectrum) into intensity vectors (corresponding to weighted vectors), which are inputted into the trained CNN; and further see page 8247, [004], “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)”, disclosing that intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering the intensity vectors to correspond to weighted vectors).
As per claim 3, the combination of Torng and Tran discloses:
The method of claim 1, wherein the weighted vector corresponds to a peak height in segments of the second mass spectrum (Tran, e.g., page 8248, [0002], “...DeepNovo integrates CNNs and LSTM networks to learn features of tandem mass spectra, fragment ions, and sequence patterns for predicting peptides”, disclosing learning features of mass spectral data).
Claim 4 contains essentially the same subject matter as claim 1 and is therefore rejected under the same rationale.
As per claim 5, the combination of Torng and Tran discloses:
The method of claim 4, further comprising determining, by a second output of the trained convolutional neural network, a length of a peptide sequence of the unknown protein sample (Tran, e.g., page 8251, [0001], “...we present a key downstream application of DeepNovo for complete de novo sequencing of mAbs. We trained the DeepNovo model with an in-house antibody database and used it to perform de novo peptide sequencing on two antibody datasets, the WIgG1 light and heavy chains of mouse (21). Note that the two testing datasets were not included in the training database. De novo peptides from DeepNovo were then used by the assembler ALPS (21) to automatically reconstruct the complete sequences of the antibodies (Figs. S5 and S6). For the light chain (length of 219 aa), we were able to reconstruct a single full-length contig that covered 100% of the target with 99.5% accuracy (218/219). For the heavy chain (length of 441 aa), we obtained three contigs together covering 97.5% of the target (430/441) with 97.2% accuracy (418/430)...”, disclosing training the DeepNovo model (including the convolutional neural network) to determine an output indicating the full-length predicted/sequenced peptide, wherein the testing data are not included in the training database, thus rendering the testing sample “unknown”; and further see page 8249, [002], “...Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides...”, disclosing that peptides are part of a protein sequence (protein sample)).
As per claim 6, the combination of Torng and Tran discloses:
The method of claim 4, further comprising determining, by a third output of the trained convolutional neural network, a frequency of the amino acid in the peptide sequence of the unknown protein sample (Tran, e.g., page 8251, [0001], “...we present a key downstream application of DeepNovo for complete de novo sequencing of mAbs. We trained the DeepNovo model with an in-house antibody database and used it to perform de novo peptide sequencing on two antibody datasets, the WIgG1 light and heavy chains of mouse (21). Note that the two testing datasets were not included in the training database. De novo peptides from DeepNovo were then used by the assembler ALPS (21) to automatically reconstruct the complete sequences of the antibodies (Figs. S5 and S6). For the light chain (length of 219 aa), we were able to reconstruct a single full-length contig that covered 100% of the target with 99.5% accuracy (218/219). For the heavy chain (length of 441 aa), we obtained three contigs together covering 97.5% of the target (430/441) with 97.2% accuracy (418/430)...”, disclosing training the DeepNovo model (including the convolutional neural network) to determine an output indicating the full-length predicted/sequenced peptide, wherein the testing data are not included in the training database, thus rendering the testing sample “unknown”; and further see page 8249, [002], “...Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides...”; and further see Torng, Fig. 6, in association with the text description, “3DCNN-Training. Amino acid groupings discovered by our 3DCNN generally agree with known amino acid similarities. Six clusters were discovered by our network. The first cluster includes phenylalanine, tryptophan, and tyrosine. These are the three amino acids known to be hydrophobic and aromatic. The second and third clusters comprises valine, isoleucine and leucine, methionine respectively, which are all non-polar and aliphatic. The polar amino acids form the fourth cluster. Amino acids with known distinct properties, glycine and cysteine do not form local blocks with the other amino acids...”, teaching an amino acid sequence corresponding to a mass spectrum).
As per claim 7, the combination of Torng and Tran discloses:
The method of claim 4, further comprising discretizing the mass spectra of the unknown protein sample into a one-dimensional vector prior to inputting the mass spectra of the unknown protein sample into the trained convolutional neural network (Tran, e.g., fig. 1, in association with the text description, page 8248, [004], “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network”, disclosing discretizing the input spectrum into intensity vectors (corresponding to one-dimensional weighted vectors), which are inputted into the trained CNN; and further see page 8247, [004], “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)”, disclosing that intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering the intensity vectors to correspond to weighted vectors).
As per claim 8, the combination of Torng and Tran discloses:
The method of claim 7, wherein the one-dimensional vector corresponds to a presence or absence of a peak in each segment of the mass spectra of the unknown protein sample (Tran, e.g., page 8248, [0002], “...DeepNovo integrates CNNs and LSTM networks to learn features of tandem mass spectra, fragment ions, and sequence patterns for predicting peptides”, disclosing learning features of mass spectral data).
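For illustration only, the one-dimensional presence/absence vector recited in claim 8 can be sketched in code. This is neither the claimed invention nor the cited art's implementation: the bin size, m/z range, and noise threshold are hypothetical choices made solely to illustrate the claim language.

```python
# Hypothetical sketch of a 0/1 vector marking the presence or absence of a
# peak in each m/z segment. Bin size, range, and threshold are assumptions.

def presence_vector(peaks, max_mz=2000.0, bin_size=1.0, threshold=0.0):
    """Return a binary vector: 1 if any peak above threshold falls in a
    segment of the spectrum, 0 otherwise."""
    n_bins = int(max_mz / bin_size)
    vector = [0] * n_bins
    for mz, intensity in peaks:
        idx = int(mz / bin_size)
        if 0 <= idx < n_bins and intensity > threshold:
            vector[idx] = 1
    return vector

# Two peaks share one segment (175.1 and 175.4); a third occupies another,
# so only two segments are flagged.
flags = presence_vector([(175.1, 0.8), (175.4, 0.2), (389.7, 0.5)])
```

Unlike the intensity vector of claims 1 and 3, this vector discards peak heights and records only whether each segment contains a peak.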
As per claim 9, the combination of Torng and Tran discloses:
The method of claim 4, further comprising, prior to inputting the mass spectra from the known protein sample into the convolutional neural network (Tran, e.g., fig. 1 and its associated text description, page 8248, [0001], “...The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data...” (the examiner asserts this teaches training a CNN by inputting spectra data (including a first mass spectrum) into the CNN to produce a trained CNN)), discretizing the mass spectra of the known protein sample into a one-dimensional vector, wherein the one-dimensional vector corresponds to a presence or absence of a peak in each segment of the mass spectra of the known protein sample (Tran, e.g., fig. 1 and its associated text description, page 8248, [0004], “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network” (disclosing discretizing the input spectrum (second mass spectrum) into intensity vectors (corresponding to weighted vectors), which are input into the trained CNN); further see page 8247, [0004], “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” (disclosing that the intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering the intensity vectors weighted vectors)).
As per claim 10, the combination of Torng and Tran discloses:
The method of claim 4, further comprising, prior to inputting the mass spectra from the known protein sample into the convolutional neural network, discretizing the mass spectra of the known protein sample into a weighted vector, wherein the weighted vector corresponds to a peak height in each segment of the mass spectra of the known protein sample (Tran, e.g., fig. 1 and its associated text description, page 8248, [0004], “...DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network...” (disclosing discretizing the input spectrum into intensity vectors (corresponding to weighted vectors) prior to inputting the spectra into the trained CNN, wherein the intensity vectors are at least one-dimensional); see page 8247, [0004], “...peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)...” (disclosing that the intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering the intensity vectors weighted vectors; the peak heights correspond to the presence of a peak in a segment); further see page 8248, [0004], “...The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration...” (disclosing training a CNN by inputting annotated spectra data (corresponding to a known amino acid sequence) into the CNN to produce a trained CNN); and also page 8249, [0002], “...Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides]...” (disclosing that peptides are part of a protein sequence (protein sample)).
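The training arrangement mapped above, in which annotated spectra from a known sample are discretized into weighted vectors and paired with their known sequences before being fed to the network, can be sketched as follows. Every name, parameter, and data value here is hypothetical and offered for illustration only, not drawn from Tran:

```python
# Illustrative sketch of assembling (weighted vector, known sequence)
# training pairs from annotated spectra. Segment count, mass range,
# peaks, and sequence labels are all assumed.

def weighted_vector(peaks, n_segments=50, max_mz=500.0):
    """Each segment holds the tallest peak height falling within it."""
    width = max_mz / n_segments
    vec = [0.0] * n_segments
    for mz, height in peaks:
        if 0 <= mz < max_mz:
            i = int(mz / width)
            vec[i] = max(vec[i], height)  # peak height per segment
    return vec

# Hypothetical annotated spectra: (peak list, known amino acid sequence)
annotated = [
    ([(147.1, 900.0), (260.2, 300.0)], "GK"),
    ([(175.1, 640.0)], "R"),
]
training_pairs = [(weighted_vector(p), seq) for p, seq in annotated]
```

Such pairs are what a CNN would be fitted on to produce the trained network recited in the claims; the actual network training step is omitted from this sketch.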
Additional Art Considered
The prior art made of record and not relied upon is considered pertinent to the Applicants’ disclosure.
The following patents and papers are cited to further show the state of the art at the time of Applicants’ invention with respect to identifying and profiling molecules, and more specifically, to convolutional neural network algorithms used to classify and identify features in mass spectral data. A convolutional neural network (CNN) was trained to identify amino acids from an unknown protein sample.
a. Fan et al. (US PGPUB 2017/0329892, hereafter Fan), “Computational Method For Classifying And Predicting Protein Side Chain Conformations,” discloses “classifying and predicting protein side chain conformations utilizing a data driven scoring function and determining representative conformations of the clusters, wherein an average structural difference between a representative conformation of a cluster and conformations in the cluster is below a predetermined threshold”.
Fan also teaches that side chain prediction of amino acids is a fundamental component of many protein modeling applications such as docking, structural prediction, and design ([0084]).
Fan further teaches discretizing data for machine learning algorithms that are used to train a prediction model ([0085]-[0086]).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TUAN A PHAM whose telephone number is (571)270-3173. The examiner can normally be reached M-F 7:45 AM - 6:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi can be reached on 571-272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TUAN A PHAM/Primary Examiner, Art Unit 2163