Use of artificial neural networks for the accurate prediction of peptide liquid chromatography elution times in proteome analyses.


The use of artificial neural networks (ANNs) is described for predicting the reversed-phase liquid chromatography retention times of peptides produced by enzymatic digestion of whole proteomes. To enable accurate comparison of numerous LC/MS data sets, a genetic algorithm was developed to normalize the peptide retention data to a range of 0 to 1, improving peptide elution time reproducibility to approximately 1%. The network developed in this study is based on amino acid residue composition and consists of 20 input nodes, 2 hidden nodes, and 1 output node. A data set of approximately 7000 confidently identified peptides from the microorganism Deinococcus radiodurans was used to train the ANN. The ANN was then used to predict the elution times of another set of 5200 peptides tentatively identified by MS/MS from a different microorganism (Shewanella oneidensis). The model predicted the elution times of peptides with up to 54 amino acid residues (the longest peptide identified after tryptic digestion of S. oneidensis) with an average accuracy of approximately 3%. This predictive capability was then used to distinguish, with high confidence, isobaric peptides otherwise indistinguishable by accurate mass measurements, and to uncover peptide misidentifications. Thus, integrating ANN peptide elution time prediction into proteomic research will increase both the number of protein identifications and their confidence.
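
The 20-2-1 architecture described above can be illustrated with a minimal Python sketch. It shows only the shape of the computation: a peptide is reduced to a 20-element residue count vector (one entry per standard amino acid), passed through 2 hidden nodes, and mapped to a single output in the range 0 to 1, matching the normalized elution time scale. The sigmoid activations, the use of raw counts as inputs, the class name TinyANN, and the random placeholder weights are all assumptions made for illustration; the abstract does not specify them, and this is not the trained model from the study.

```python
import math
import random

# The 20 standard residues, one per input node (ordering is an arbitrary assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(peptide):
    """Return the 20-element residue count vector for a peptide sequence."""
    counts = [0] * len(AMINO_ACIDS)
    for residue in peptide.upper():
        # str.index raises ValueError for non-standard residue codes.
        counts[AMINO_ACIDS.index(residue)] += 1
    return counts


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyANN:
    """Hypothetical 20-2-1 feed-forward network.

    Weights are small random placeholders, NOT trained values.
    """

    def __init__(self, n_in=20, n_hidden=2, seed=0):
        rng = random.Random(seed)
        self.w_hidden = [
            [rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)
        ]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b_out = 0.0

    def predict(self, peptide):
        """Forward pass: composition -> 2 hidden nodes -> 1 output in (0, 1)."""
        x = composition(peptide)
        hidden = [
            sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(self.w_hidden, self.b_hidden)
        ]
        return sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out)


net = TinyANN()
predicted = net.predict("LGEYGFQNALIVR")  # arbitrary example sequence
```

With untrained weights the output is meaningless as a retention time; in practice the weights would be fitted to peptides with known normalized elution times. Because the input in this sketch is composition only, peptides that are permutations of the same residues receive the same prediction.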