Difference between revisions of "Sequence-based predictions HEXA"
(→Secondary Structure Prediction) |
(→Prediction of transmembrane alpha-helices and signal peptides) |
||
(40 intermediate revisions by the same user not shown) | |||
Line 3: | Line 3: | ||
=== Secondary Structure Prediction === |
=== Secondary Structure Prediction === |
||
− | To analyse the secondary structure of our protein we used different methods. In our analysis we used PSIPRED, Jpred3 and DSSP. In the analysis section of this page we want to compare these three methods to see if the methods |
+ | To analyse the secondary structure of our protein, we used three methods: PSIPRED, Jpred3 and DSSP. In the analysis section of this page we compare these methods to see whether they give similar results or differ substantially.
− | |||
− | [[Here http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/secstr_general]] you can find some general information about these methods. |
||
− | |||
− | |||
− | |||
− | |||
− | ''' PSIPRED ''' |
||
− | |||
− | ''Authors:'' David T. Jones<br> |
||
− | ''Year:'' 1999<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pubmed/10493868 Protein secondary structure prediction based on position-specific scoring matrices]]<br> |
||
− | |||
− | ''Description:'' <br> |
||
− | PSIPRED is a secondary structure prediction tool, which uses neural networks. The neural network has a single hidden layer and a |
||
− | feed-forward back-propagation architecture. The method has three main steps. First, it generates a sequence profile: a position-specific scoring matrix from PSI-BLAST, which serves as input for the neural network. Second, it predicts the initial secondary structure: each unit of the output layer represents one of three secondary structure states (helix, strand or coil). Third, it filters the predicted structure by successively filtering the outputs of the main network.
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://bioinf.cs.ucl.ac.uk/psipred/ Webserver]] for our analysis. The input for the webserver is only the sequence in FASTA-format.<br> |
||
− | |||
− | ''Output:''<br> |
||
− | The prediction result is available in several file formats: PDF, PostScript or plain text. The PDF and PostScript outputs are graphical, whereas the text output is simpler.
||
− | <br> |
||
− | |||
− | |||
− | ''' Jpred3 ''' |
||
− | |||
− | ''Authors:'' Cole C, Barber JD, Barton GJ<br> |
||
− | ''Year:'' 2008<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pubmed/18463136 The Jpred 3 secondary structure prediction server]]<br> |
||
− | |||
− | ''Description:'' <br> |
||
− | Jpred3 is a server for secondary structure prediction. It uses the Jnet algorithm, which is based on neural networks. A special feature is that it accepts two kinds of input: a single sequence or a multiple sequence alignment. Furthermore, it delivers different output formats such as HTML, PDF or PostScript.
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://www.compbio.dundee.ac.uk/www-jpred/index.html Webserver]] for our analysis. The input for the webserver was in our case only the sequence in FASTA-format.<br> |
||
− | |||
− | ''Output:''<br> |
||
− | The prediction result is available as several files with the predicted secondary structure. Some outputs are complex and some are simple. The complex ones contain the multiple sequence alignment as well as the predicted secondary structure, whereas the simple ones contain only the secondary structure prediction.
||
− | <br> |
||
− | |||
− | |||
− | ''' DSSP ''' |
||
− | |||
− | ''Authors:'' Kabsch W, Sander C.<br> |
||
− | ''Year:'' 1983<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pubmed?term=Dictionary%20of%20protein%20secondary%20structure%3A%20pattern%20recognition%20of%20hydrogen-bonded%20and%20geometrical%20features Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features.]]<br> |
||
− | |||
− | ''Description:'' <br>
||
− | DSSP is a database of secondary structure assignments for each PDB entry. It is not a prediction tool, but it is often used to measure prediction success by comparing the predicted secondary structure with the DSSP assignment.
||
− | It assigns the secondary structure from given atomic coordinates in PDB format. The assignment is based mainly on hydrogen bonding, because characteristic hydrogen-bond patterns define helices and sheets.
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://swift.cmbi.ru.nl/servers/html/ Webserver]] for our analysis. There are two possible inputs: the PDB-id or the sequence. We used the PDB-id.<br> |
||
− | |||
− | ''Output:''<br> |
||
− | As a result you get a file with the assigned secondary structure, the symmetry and the accessibility of the whole protein. |
||
− | <br> |
||
+ | [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/secstr_general Here]] you can find some general information about these methods. |
||
+ | <br><br> |
||
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
||
---- |
---- |
||
=== Prediction of disordered regions === |
=== Prediction of disordered regions === |
||
+ | After analysing the secondary structure, we also want to look at disordered regions in this protein. For this we used DISOPRED, POODLE in several variants, IUPred and Meta-Disorder. As with the secondary structure prediction methods, we want to compare the different methods and variants to see whether their predictions are similar, and to decide which method seems to be the best one for our purpose.
||
− | ''' DISOPRED ''' |
||
+ | To get more insight into the methods and the theory behind them, we also offer a [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/disorder_general general information page]].
||
− | ''Authors:'' Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT.<br> |
||
− | ''Year:'' 2004<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pubmed/15019783 Prediction and functional analysis of native disorder in proteins from the three kingdoms of life.]]<br> |
||
− | |||
− | ''Description:'' <br> |
||
− | This method is based on a neural network which was trained on high-resolution X-ray structures from the PDB. Disordered regions are regions which appear in the sequence record but are missing from the electron density map. This approach can also fail, because missing electron density can also arise from the crystallization process.
||
− | The method first runs a PSI-BLAST search against a filtered sequence database. Next, a profile for each residue is calculated and classified using the trained neural network. <br>
||
− | |||
− | ''Input:''<br> |
||
− | If you run DISOPRED on the console, you have to define the location of your database. As input, the program needs your sequence in a FASTA-format file.
||
− | <br> |
||
− | |||
− | ''Output:''<br> |
||
− | As a prediction result you get a file with the predicted disordered regions, the precision and the recall. Furthermore, you can get a more detailed output, which shows the sequence, the predictions and how likely the prediction is for each residue.
||
− | <br> |
||
− | |||
− | |||
− | ''' POODLE ''' |
||
− | |||
− | Prediction of order and disorder by machine-learning<br> |
||
− | ''Authors:'' S. Hirose, K. Shimizu, S. Kanai, Y. Kuroda and T. Noguchi<br> |
||
− | ''Year:'' 2007<br> |
||
− | |||
− | ''Description:''<br> |
||
− | POODLE is a machine learning method based on a two-level SVM (Support Vector Machine).
||
− | |||
− | We describe here the POODLE-L algorithm in detail, but all POODLE variants use the same principle. |
||
− | The method was trained on disordered proteins and on proteins with no disordered regions. On the first level, the SVM predicts the probability that a 40-residue sequence segment is disordered. If the algorithm finds such a disordered region, the second level of the SVM uses the output of the first level and predicts, for each amino acid, the probability of being disordered.
||
<br><br> |
<br><br> |
||
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
||
− | |||
− | ''Different variants of POODLE:''<br> |
||
− | |||
− | * POODLE-L |
||
− | ''Description:'' Prediction of long disordered regions. <br>
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=17545177&ordinalpos=8&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions.]]<br> |
||
− | |||
− | * POODLE-S |
||
− | ''Description:'' Prediction of short disordered regions. <br>
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=17599940&ordinalpos=7&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum POODLE-S: web application for predicting protein disorder by using physicochemical features and reduced amino acid set of a position-specific scoring matrix.]]<br> |
||
− | |||
− | *POODLE-I |
||
− | ''Description:'' Integrates structural information predictors.<br> |
||
− | ''Source:'' [[http://www.bioinfo.de/isb/2010/10/0015/ POODLE-I: Disordered region prediction by integrating POODLE series and structural information predictors based on a workflow approach]]<br> |
||
− | |||
− | *POODLE-W |
||
− | ''Description:'' Compares different sequences and predicts which sequence is the most disordered one. (is not used in this analysis)<br><br> |
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://mbs.cbrc.jp/poodle/poodle.html POODLE webserver]] for our analysis. We pasted our sequence in FASTA-format into the input window and chose the POODLE variant.
||
− | <br> |
||
− | |||
− | ''Output:'' <br> |
||
− | The result of this method is a file listing each amino acid, the prediction of whether it is ordered or disordered, and the probability of that state. Furthermore, you get a graphical view of the result.
||
− | <br> |
||
− | |||
− | |||
− | ''' IUPred ''' |
||
− | |||
− | ''Authors:'' Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon <br> |
||
− | ''Year:'' 2005<br> |
||
− | ''Source:'' [[http://bioinformatics.oxfordjournals.org/content/21/16/3433.short IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content]] <br> |
||
− | |||
− | ''Description:''<br> |
||
− | IUPred calculates the pairwise energy profile along a sequence. The algorithm then transforms the energy values into a probabilistic score between 0 and 1. A score of 0 means complete order, whereas a score of 1 means complete disorder. The cutoff is 0.5: all residues with a score above 0.5 are predicted as disordered.<br>
||
− | The [[http://iupred.enzim.hu/index.html Webserver]] offers three prediction methods: one focuses on long disordered regions, one focuses on short disordered regions, and the third predicts disordered regions using additional structure information.<br>
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://iupred.enzim.hu/index.html Webserver]] for our analysis. The input for the webserver is only the sequence in FASTA-format.<br> |
||
− | |||
− | ''Output:''<br> |
||
− | As output you get a graphical representation of the prediction and a detailed list of the scores for each amino acid. Unfortunately, it is not possible to download the scores as a text file, so you have to use the picture or copy the data manually.<br>
||
− | |||
− | |||
− | ''' Meta-Disorder ''' |
||
− | |||
− | ''Authors:'' Schlessinger A, Punta M, Yachdav G, Kajan L, Rost B<br>
||
− | ''Year:'' 2009<br> |
||
− | ''Source:'' [[http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0004433 Improved Disorder Prediction by Combination of Orthogonal Approaches.]] <br> |
||
− | |||
− | ''Description:''<br> |
||
− | Meta-Disorder combines different disorder prediction methods, each with a different focus. This avoids a bias towards one prediction method. The combined methods are NORSnet (uses NORS for the prediction of disordered regions), PROFbval (uses the mobility of the residues for prediction), Ucon (uses contacts for prediction) and DISOPRED (see above). Furthermore, Meta-Disorder uses additional features such as solvent accessibility and secondary structure.<br>
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[https://www.predictprotein.org/ Webserver ]] for our prediction. The input for the server is only the amino acid sequence of the protein.<br> |
||
− | |||
− | ''Output:''<br> |
||
− | The user can choose between different output formats. All formats are available for each prediction, and because the format is chosen after the prediction, the user can switch between formats. The visual output was very useful for our purpose, because it gives a clear picture of the prediction. The text and HTML outputs are also very useful, because they contain a detailed list of the different predictions, the scores used and the probabilities.<br>
||
− | |||
---- |
---- |
||
=== Prediction of transmembrane helices and signal peptides === |
=== Prediction of transmembrane helices and signal peptides === |
||
+ | The third major analysis section is the prediction of transmembrane helices and signal peptides. We merged these two predictions into one section because several prediction methods can predict both.
||
− | ''' TMHMM (transmembrane helices hidden Markov model) '''
||
+ | We used several methods: some predict only transmembrane helices, some predict only signal peptides, and some are combined methods.
||
− | ''Authors:'' E. L.L. Sonnhammer, G. von Heijne, and A. Krogh <br> |
||
− | ''Year:'' 1998 <br> |
||
− | ''Source:'' [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.5429&rep=rep1&type=pdf A hidden Markov model for predicting transmembrane helices in protein sequences]] <br> |
||
− | |||
− | ''Description:''<br> |
||
− | TMHMM is a hidden Markov model-based prediction method for transmembrane helices in proteins. The HMM consists of three main locations (core, cap, loop) and seven states (cytoplasmic loop, cytoplasmic cap, helix core, non-cytoplasmic cap, short non-cytoplasmic loop, long non-cytoplasmic loop and globular domain).<br><br>
||
− | |||
− | ''Prediction:'' <br> |
||
− | For a given protein sequence in FASTA-format, this method searches for the best path through the hidden Markov model. There are two output formats, a short one and a long one. The long output format gives additional statistical information (e.g. the expected number of amino acids in transmembrane helices).<br>
||
− | |||
− | ''Input:'' <br> |
||
− | The method only needs the protein sequence in FASTA-format for the prediction.<br> |
||
− | |||
− | |||
− | ''' Phobius and PolyPhobius ''' |
||
− | |||
− | * Phobius:<br> |
||
− | ''Authors:'' Lukas Käll, Anders Krogh and Erik L. L. Sonnhammer<br> |
||
− | ''Year:'' 2004<br> |
||
− | ''Source:'' [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.4706&rep=rep1&type=pdf A Combined Transmembrane Topology and Signal Peptide Prediction Method]] <br> |
||
− | |||
− | * PolyPhobius:<br> |
||
− | ''Authors:'' Lukas Käll, Anders Krogh and Erik Sonnhammer<br> |
||
− | ''Year:'' 2005<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pubmed/15961464 An HMM posterior decoder for sequence feature prediction that includes homology information]] <br> |
||
− | |||
− | ''Description:''<br> |
||
− | Phobius and PolyPhobius are combined methods which predict transmembrane helices and signal peptides. Both methods are based on a hidden Markov model and combine the methods of TMHMM and SignalP. The basis of these methods is the HMM from TMHMM with an additional start state for signal peptides. The difference between Phobius and PolyPhobius is that PolyPhobius also uses homology information for the prediction.<br>
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://phobius.sbc.su.se/ Webserver]] for Phobius and PolyPhobius and so it was only necessary to paste the protein sequence in FASTA-format.<br> |
||
− | |||
− | ''Output:''<br> |
||
− | The server outputs a text file with the predicted position of the signal peptide, the type of the signal peptide and the positions of the transmembrane helices. Furthermore, it outputs a detailed file with the probabilities for each residue to be located in a transmembrane helix or signal peptide. Additionally, the server outputs a picture of the prediction.<br>
||
− | |||
− | |||
− | ''' OCTOPUS and SPOCTOPUS ''' |
||
− | |||
− | * OCTOPUS:<br> |
||
− | ''Authors:'' Håkan Viklund and Arne Elofsson<br> |
||
− | ''Year:'' 2008<br> |
||
− | ''Source:'' [[http://bioinformatics.oxfordjournals.org/content/24/15/1662.full OCTOPUS: Improving topology prediction by two-track ANN-based preference scores and an extended topological grammar.]]<br> |
||
− | |||
− | * SPOCTOPUS:<br> |
||
− | ''Authors:'' Håkan Viklund, Andreas Bernsel, Marcin Skwark and Arne Elofsson<br> |
||
− | ''Year:'' 2008<br> |
||
− | ''Source:'' [[http://bioinformatics.oxfordjournals.org/content/24/24/2928.full SPOCTOPUS: A combined predictor of signal peptides and membrane protein topology. ]]<br><br> |
||
− | |||
− | ''Description:''<br> |
||
− | OCTOPUS is a method based on neural networks and hidden Markov models. To make a prediction, a multiple sequence alignment is first generated by BLAST. Next, the algorithm calculates a PSSM profile and a raw sequence profile, and both profiles are used as input for the neural networks. These neural networks (one for the PSSM profile and one for the raw sequence profile) predict the preference of each residue to be located in a transmembrane helix or not. The outputs of these networks are used as input for a hidden Markov model, which generates the final prediction.<br>
||
− | OCTOPUS only predicts transmembrane helices, whereas SPOCTOPUS can also predict signal peptides. OCTOPUS and SPOCTOPUS are both based on the algorithm described above.<br>
||
− | |||
− | ''Input:'' <br> |
||
− | We used the [[http://octopus.cbr.su.se/index.php Webserver]] for our predictions. The server is very easy to use, because it has only one input field, where you can paste your protein sequence in FASTA-format. Then you choose between OCTOPUS and SPOCTOPUS and the prediction starts.<br>
||
− | |||
− | ''Output:''<br> |
||
− | The [[http://octopus.cbr.su.se/index.php Webserver]] gives three files as output. The first file contains the exact probabilities for each residue to be located inside, outside or in a transmembrane helix (nnprf file). The second file contains the result of the prediction (topo file) and the last file visualises the prediction (png file).<br>
||
− | |||
− | |||
− | ''' TargetP ''' |
||
− | |||
− | ''Authors:'' Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne<br> |
||
− | ''Year:'' 1997<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pubmed/9051728 Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites.]]<br> |
||
− | |||
− | ''Description:''<br> |
||
− | The TargetP method is based on two different neural networks. The input for the first network is the protein sequence in FASTA-format. The first network recognizes the cleavage site. This part of the protein is passed to the second network, which distinguishes between signal peptides and non-signal peptides and also determines which kind of signal peptide it is. <br>
||
− | The user can specify whether to predict a plant or a non-plant signal peptide.<br>
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://www.cbs.dtu.dk/services/TargetP/ TargetP webserver]] for our analysis and pasted our sequence in FASTA-format into the sequence field.<br>
||
− | |||
− | ''Output:''<br> |
||
− | As output you get one file which shows the probability for each signal peptide. You therefore have exact values and can decide for yourself whether the probability is high enough to trust the prediction.<br>
||
− | |||
− | |||
− | ''' SignalP '''
||
− | |||
− | ''Authors:'' Henrik Nielsen and Anders Krogh<br> |
||
− | ''Year:'' 1998<br> |
||
− | ''Source:'' [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.47.4026&rep=rep1&type=pdf Prediction of signal peptides and signal anchors by a hidden Markov model.]] <br> |
||
− | |||
− | ''Description:''<br> |
||
− | The SignalP method is based on two hidden Markov models. The first hidden Markov model has defined states for the different parts of the signal peptide and is used for the signal peptide prediction. The second hidden Markov model is used to distinguish between signal peptides and signal anchors to improve the prediction accuracy. <br>
||
− | SignalP lets the user predict specifically for eukaryotes, Gram-negative or Gram-positive bacteria to get a more precise prediction. <br>
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://www.cbs.dtu.dk/services/SignalP/ Webserver]] for our prediction and therefore it was only necessary to paste the protein sequence in FASTA-format. <br> |
||
− | |||
− | ''Output:''<br> |
||
− | The server gave a detailed output with the probability of each location for each residue. At the end of the file there is a short prediction summary which gives the prediction result, the signal peptide probability and some other statistical measures. <br>
||
+ | To have a closer look at the different methods, we again provide an [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/transmembrane_signal_peptide_general information page]].
||
+ | <br><br> |
||
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
||
---- |
---- |
||
=== Prediction of GO Terms === |
=== Prediction of GO Terms === |
||
+ | The last section is about the analysis of GO Terms. As before, we used several methods and compared them to each other. |
||
− | ''' GOPET (Gene Ontology Term Prediction and Evaluation Tool) '''<br> |
||
+ | Again, we provide a [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_terms_general general information page]] about the GO term methods we used in our analysis.
||
− | ''Authors:'' Vinayagam A, König R, Moormann J, Schubert F, Eils R, Glatting KH, Suhai S <br> |
||
+ | <br><br>
||
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC517617/?tool=pubmed Applying Support Vector Machines for Gene Ontology based gene function prediction.]]<br> |
||
− | |||
− | ''Description:''<br> |
||
− | GOPET is a homology-based GO term prediction method. It tries to assign uncharacterised cDNA sequences to GO molecular function terms. In the first step, the method runs a BLAST search against GO-mapped proteins in a database. The GO terms and attributes found are used as input for a Support Vector Machine, which makes the final classification.<br>
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://genius.embnet.dkfz-heidelberg.de/menu/cgi-bin/w2h-open/w2h.open/w2h.startthis?SIMGO=w2h.welcome&INTRA_CONTINUE=1 Webserver]] for our prediction. It was only necessary to paste our sequences in FASTA-format and to submit the job. <br>
||
− | |||
− | ''Output:''<br> |
||
− | GOPET returns a table with the predicted GOid, the Aspect (Molecular Function Ontology (F), Biological Process Ontology (P) and Cellular Component Ontology (C)), the confidence for the prediction and the GO term itself. |
||
− | <br> |
||
− | |||
− | |||
− | ''' Pfam '''<br> |
||
− | |||
− | ''Authors:'' R.D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J.E. Pollington, O.L. Gavin, P. Gunesekaran, G. Ceric, K. Forslund, L. Holm, E.L. Sonnhammer, S.R. Eddy, A. Bateman <br> |
||
− | ''Year:'' 2010<br> |
||
− | ''Source:'' [[http://nar.oxfordjournals.org/content/38/suppl_1/D211.full The Pfam protein families database]]<br> |
||
− | |||
− | ''Description:'' <br> |
||
− | Pfam is also a homology-based prediction method. The domains are stored as hidden Markov models. The method uses a naive Bayes classifier and classifies the proteins with the aid of the hidden Markov models.<br>
||
− | |||
− | ''Input:''<br> |
||
− | We used the [[http://pfam.sanger.ac.uk/ Webserver]] for our predictions. We chose the option "Sequence search", pasted the protein sequence in FASTA-format and submitted the job. <br>
||
− | |||
− | ''Output:'' <br> |
||
− | The webserver shows a graphical representation of the prediction and also the matches. There are two categories of matches, significant and insignificant Pfam A-family matches. These matches are listed with family name, a short description, the entry type, Clan, some information about the HMM and the E-Value. <br> |
||
− | |||
− | |||
− | ''' ProtFun2.2 '''<br> |
||
− | |||
− | ''Authors:'' L. Juhl Jensen, R. Gupta, N. Blom, D. Devos, J. Tamames, C. Kesmir, H. Nielsen, H. H. Stærfeldt, K. Rapacki, C. Workman, C. A. F. Andersen, S. Knudsen, A. Krogh, A. Valencia and S. Brunak.<br> |
||
− | ''Year:'' 2002<br> |
||
− | ''Source:'' [[http://www.ncbi.nlm.nih.gov/pubmed/12079362 Prediction of human protein function from post-translational modifications and localization features.]]<br> |
||
− | |||
− | ''Description:'' <br> |
||
− | ProtFun2.2 is an ab initio prediction method which tries to assign orphan proteins to functional classes. It integrates relevant features related to the linear amino acid sequence. Furthermore, it queries a large number of other feature prediction servers (PSIPRED, TMHMM and so on). This explains why the prediction with ProtFun is very slow and you have to wait a long time for the result. Technically, the method uses an ensemble of five different neural networks (three-layer feed-forward networks).<br>
||
− | |||
− | ''Input:'' <br> |
||
− | We used the [[http://www.cbs.dtu.dk/services/ProtFun/ Webserver]] for our prediction. The prediction takes a long time and your request is queued, so you have to wait some hours. For the prediction it is only necessary to paste the sequence in FASTA-format into the input field.<br>
||
− | |||
− | ''Output:''<br> |
||
− | As output, you get a list of functional categories, each with a probability and an odds score. The probability shows how likely it is that your protein belongs to this class, but it is influenced by the prior probability of the class. The odds score indicates whether the sequence belongs to this class or not. We decided to use a cutoff of 2. Furthermore, the method predicts whether your protein is an enzyme, together with the probability and odds score that the protein belongs to each enzyme class. The last section of the result file is the prediction for the gene ontology categories, again with probabilities and odds scores.
||
− | <br> |
||
== Secondary Structure prediction == |
== Secondary Structure prediction == |
||
+ | === Results === |
||
+ | The detailed output of the different prediction methods can be found [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Secondary_Structure_Prediction here]].
||
+ | Here we only present a short summary of the output of the different methods. |
||
− | === PSIPRED === |
||
+ | * Predicted Helices |
||
− | PSIPRED delivers many different output file formats. The pictures show the PDF output, which presents the secondary structure graphically. It predicts 14 alpha-helices and 15 beta-sheets. The rest is predicted as coil.
||
+ | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | {| class="centered" |
||
+ | |method |
||
− | | [[Image:Psipred1.png|thumb|center|First part of the PSIPRED output]] |
||
+ | |#helices |
||
− | | [[Image:Psipred2.png|thumb|center|Second part of the PSIPRED output]] |
||
+ | |- |
||
− | | [[Image:Psipred3.png |thumb|center| Legend for the PSIPRED output ]] |
||
+ | |PSIPRED |
||
+ | |14 |
||
+ | |- |
||
+ | |Jpred3 |
||
+ | |14 |
||
+ | |- |
||
+ | |DSSP |
||
+ | |16 |
||
+ | |- |
||
|} |
|} |
||
+ | * Predicted Beta-Sheets |
||
− | === Jpred3 === |
||
+ | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | The following alignment shows the output of Jpred3. We marked the secondary structure elements ourselves. It predicted 14 alpha-helices and 15 beta-sheets.
||
+ | |method |
||
+ | |#sheets |
||
+ | |- |
||
+ | |PSIPRED |
||
+ | |15 |
||
+ | |- |
||
+ | |Jpred3 |
||
+ | |15 |
||
+ | |- |
||
+ | |DSSP |
||
+ | |0 |
||
+ | |- |
||
+ | |} |
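The helix and sheet numbers in the tables above are counts of contiguous segments, not of residues. A minimal sketch of such a count, assuming the prediction or assignment is available as one per-residue string in which `H` marks a helix residue (as in the DSSP webserver output); the function name `count_segments`, the `min_len` parameter and the example string are hypothetical and not part of any of the tools:

```python
from itertools import groupby


def count_segments(ss: str, state: str = "H", min_len: int = 1) -> int:
    """Count maximal runs of `state` in a per-residue secondary structure string."""
    return sum(
        1
        for symbol, run in groupby(ss)          # consecutive identical symbols
        if symbol == state and len(list(run)) >= min_len
    )


# Made-up example string, not taken from the HEXA output.
example = "  TT HHHHHHHH  SSSS  HHH T"
print(count_segments(example, "H"))             # all helix segments
print(count_segments(example, "H", min_len=4))  # ignore very short helices
```

The `min_len` argument is only there because very short helices can inflate the count; whether to filter them is a choice the analysis has to make explicitly.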
||
+ | === Comparison of the different methods === |
||
− | [[File:jpred_pic.png|thumb|center| Jpred output with colored secondary structure elements]] |
||
+ | To determine how successful our secondary structure predictions with PSIPRED and Jpred3 were, we compared them with the secondary structure assignment of DSSP. First of all, DSSP assigns no beta-sheets, whereas both prediction methods predict some beta-sheets. Therefore, the main comparison in this case concerns the alpha-helices.
||
− | === DSSP === |
||
+ | For PSIPRED, the prediction of the alpha-helices was good. In most cases the alpha-helices of DSSP and PSIPRED correspond. Only one helix predicted by PSIPRED is not assigned as a helix by DSSP. Furthermore, three helices assigned by DSSP were not predicted by PSIPRED. Most of the helices that appear in only one of the two outputs are very small.
||
− | We started DSSP on the webserver with the PDB-id. Therefore, we got the secondary structure assignment for the whole protein and not only for the alpha-subunit. The following sequence with the corresponding secondary structure is the output for our sequence (we extracted it from the whole output). It assigned 16 alpha-helices and no beta-sheets.
||
+ | For Jpred3, the prediction of the alpha-helices was sufficiently good. In most cases it agrees with DSSP. Only two helices predicted by Jpred3 are not assigned by DSSP. Conversely, three small helices assigned by DSSP are not predicted by Jpred3. In one special case, DSSP assigns two helices separated by a turn where Jpred3 predicts a single long helix.
||
− | 10 20 30 40 50 |
||
− | | | | | | |
||
− | 1 - 52 LWPWPQNFQTSDQRYVLYPNNFQFQYDVSSAAQPGCSVLDEAFQRYRDLLFG |
||
− | 1 - 52 T TT T TTT TT TT TT HHHHHHHHHHHHHHH |
||
− | 1 - 52 * ** * * |
||
− | 1 - 52 A AAAA AAA A AA A A AA AAA AA A A |
||
− | |||
− | 60 70 80 90 100 110 |
||
− | | | | | | | |
||
− | 53 - 112 TLEKNVLVVSVVTPGCNQLPTLESVENYTLTINDDQCLLLSETVWGALRGLETFSQLVWK |
||
− | 53 - 112 SSSSTT TTT TT SSSSSTTT SSSSSTTHHHHHHHHHHHHHHSSS |
||
− | 53 - 112 * *** |
||
− | 53 - 112 AA AA A AAAA AAA A A A A AAA A A A AA |
||
− | 120 130 140 150 160 170 |
||
− | | | | | | | |
||
− | 113 - 172 SAEGTFFINKTEIEDFPRFPHRGLLLDTSRHYLPLSSILDTLDVMAYNKLNVFHWHLVDD |
||
− | 113 - 172 TT SSS SSSSS T TSSSSSSSTTTT HHHHHHHHHHHHHTT SSSSS T |
||
− | 113 - 172 |
||
− | 113 - 172 AA A A A A A A AAA AA A |
||
− | 180 190 200 210 220 230 |
||
− | | | | | | | |
||
− | 173 - 232 PSFPYESFTFPELMRKGSYNPVTHIYTAQDVKEVIEYARLRGIRVLAEFDTPGHTLSWGP |
||
− | 173 - 232 T TT THHHHHHTT TTTT HHHHHHHHHHHHHTT SSSSS TTT TTTTT |
||
− | 173 - 232 |
||
− | 173 - 232 A AAA A AA A A A A AA AA A A |
||
− | 240 250 260 270 280 290 |
||
− | | | | | | | |
||
− | 233 - 292 GIPGLLTPCYSGSEPSGTFGPVNPSLNNTYEFMSTFFLEVSSVFPDFYLHLGGDEVDFTC |
||
− | 233 - 292 TTTT SSSSSTTTTSSSSSSSS TT HHHHHHHHHHHHHHHHH TTSSS T THH |
||
− | 233 - 292 |
||
− | 233 - 292 AA A AAAAAAAAAA AAA A A A AA A A A AAA |
||
− | 300 310 320 330 340 350 |
||
− | | | | | | | |
||
− | 293 - 352 WKSNPEIQDFMRKKGFGEDFKQLESFYIQTLLDIVSSYGKGYVVWQEVFDNKVKIQPDTI |
||
− | 293 - 352 HHH HHHHHHHHHHT TT THHHHHHHHHHHHHHHHTTT SSSSSHHHHHTT TT S |
||
− | 293 - 352 * **** * |
||
− | 293 - 352 AA AA AAA AAAAAAAA AA A AA A AAAA A A A AAA |
||
− | 360 370 380 390 400 410 |
||
− | | | | | | | |
||
− | 353 - 412 IQVWREDIPVNYMKELELVTKAGFRALLSAPWYLNRISYGPDWKDFYVVEPLAFEGTPEQ |
||
− | 353 - 412 SSS TTTTT HHHHHHHHHHTT SSSS TT TTT TT THHHHHH TT TT HHH |
||
− | 353 - 412 * |
||
− | 353 - 412 AAAAAAAAAAA A AAA A A AA A AA A AA A AAA |
||
− | 420 430 440 450 460 470 |
||
− | | | | | | | |
||
− | 413 - 472 KALVIGGEACMWGEYVDNTNLVPRLWPRAGAVAERLWSNKLTSDLTFAYERLSHFRCELL |
||
− | 413 - 472 HTTSSSSSSSS TTT TTTTHHHHHTTHHHHHHHHHHT TT HHHHHHHHHHHHHHHH |
||
− | 413 - 472 * ** |
||
− | 413 - 472 AAA A A AAA AAAAA AA A AA |
||
− | 480 490 |
||
− | | | |
||
− | 473 - 492 RRGVQAQPLNVGFCEQEFEQ |
||
− | 473 - 492 HTT TTT TT |
||
− | 473 - 492 |
||
− | 473 - 492 A A A AA A AA AA |
||
+ | All in all, the prediction of the helices is good, because the predicted helices mostly correspond to the DSSP assignment. The only negative aspect is that both prediction methods predict many sheets, none of which were assigned by DSSP. |
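The helix-by-helix comparison described above can be sketched in code. The following is a minimal illustration, not the procedure we actually used: it assumes that the prediction and the DSSP assignment are available as two strings of equal length in which `H` marks an alpha-helix residue, and it counts two helices as corresponding as soon as they overlap in at least one residue. The function names and the toy strings are hypothetical.

```python
from itertools import groupby


def helix_segments(ss: str, helix_char: str = "H") -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of consecutive helix residues."""
    segments = []
    position = 1
    for state, run in groupby(ss):
        length = len(list(run))
        if state == helix_char:
            segments.append((position, position + length - 1))
        position += length
    return segments


def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two inclusive ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def compare_helices(predicted: str, assigned: str):
    """Return (helices only in the prediction, helices only in the assignment)."""
    if len(predicted) != len(assigned):
        raise ValueError("both strings must cover the same residues")
    pred = helix_segments(predicted)
    ref = helix_segments(assigned)
    only_predicted = [p for p in pred if not any(overlaps(p, r) for r in ref)]
    only_assigned = [r for r in ref if not any(overlaps(r, p) for p in pred)]
    return only_predicted, only_assigned


# Toy strings (not the real HEXA output): 20 residues each.
predicted = "C" * 2 + "H" * 5 + "C" * 9 + "H" * 4
assigned = " " * 2 + "H" * 6 + " " * 3 + "H" * 3 + " " * 6

only_predicted, only_assigned = compare_helices(predicted, assigned)
print("predicted but not assigned:", only_predicted)
print("assigned but not predicted:", only_assigned)
```

A stricter criterion, for example a minimum overlap length, would change the counts, so the numbers reported in the discussion should not be expected to fall out of this sketch unchanged.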
||
− | === Discussion === |
||
+ | <br><br> |
||
− | |||
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
||
− | To determine how succesful our secondary structure prediction with PSIPRED and Jpred are, we had to compare it with the secondary structure assignment of DSSP. First of all, DSSP assigns no beta-sheets whereas both prediction methods predict some beta-sheets. Therefore the main comparison in this case refers to the alpha-helices. |
||
− | |||
− | For PSIPRED the prediction of the alpha-helices was good. In the most cases the alpha-helices of DSSP und PSIPRED corrspond. There is only one helix which is predicted by PSIPRED which is not assigned as helix by DSSP. Furthermore there are three helices which are allocated as helices by DSSP which were not predicted by PSIPRED. The most of these helices which were presented only in one output are very small ones. |
||
− | |||
− | For Jpred3 the prediction of the alpha-helices was sufficiently good. In the most cases it agrees with DSSP. There are only two helices which are predicted by Jpred and which are not also assigned by DSSP. In contrary there are three small helics which are allocated to an alpha-helices by DSSP but are not predicted by Jpred. There is another special case where DSSP assignes two helices which are separated by a turn and Jpred predicts there only one big helix. |
||
− | |||
− | All in all, the prediction of the helices is probably good because they correspond mostly with the assignmet of DSSP. The only negative aspect is, that both prediction methods predict a lot of sheets which were not assigned by DSSP at all. |
||
== Prediction of disordered regions == |
== Prediction of disordered regions == |
||
− | Before we start with the analysis of the results of the different methods, we checked, if our protein has one or more |
+ | Before analysing the results of the different methods, we checked whether our protein has one or more disordered regions. We searched for our protein in the [[http://www.disprot.org/ DisProt database]] and did not find it, so we assume that our protein does not have any disordered regions. Another way to find out whether a protein has disordered regions is to check whether its [[http://www.uniprot.org/ UniProt]] entry contains a cross-reference to [[http://www.disprot.org DisProt]]. |
+ | === Results === |
||
+ | The detailed results of the different methods can be found [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Prediction_of_Disordered_Regions here]]. |
||
− | === Disopred === |
||
− | Disopred predicts two disordered regions in our protein. The first region is at the beginning of the protein (first two residues) and the second region is at the end (last three regions). This prediction is wrong, because it is normal, that the electrons from the first and the last amino acids lack in the electron density map. So, our protein Hexosamidase A has no disordered regions. |
||
+ | In this section, we only want to give a summary of the output of the different methods. |
||
− | [[Image:disopred_result.png|center|thumb|Result of the Disopred prediction. * shows that this amino acid belongs to a disordered regions, whereas . signs for a non-disordered region.]] |
||
− | |||
− | === POODLE === |
||
− | We decided to test several POODLE variants and to compare the results. |
||
− | |||
− | * POODLE-I |
||
− | |||
− | POODLE-I predicted five disordered regions: |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |method |
||
+ | |number of disordered regions inside the protein |
||
+ | |number of disordered regions at the termini |
||
|- |
|- |
||
+ | |Disopred |
||
− | |start position |
||
+ | |0 |
||
− | |end position |
||
+ | |2 |
||
− | |length |
||
|- |
|- |
||
+ | |POODLE-I |
||
− | |1 |
||
− | | |
+ | |3 |
|2 |
|2 |
||
|- |
|- |
||
+ | |POODLE-L |
||
− | |14 |
||
− | | |
+ | |0 |
− | | |
+ | |0 |
|- |
|- |
||
+ | |POODLE-S (B-factors) |
||
− | |83 |
||
− | |89 |
||
− | |7 |
||
− | |- |
||
− | |105 |
||
− | |109 |
||
− | |5 |
||
− | |- |
||
− | |527 |
||
− | |529 |
||
|3 |
|3 |
||
− | |- |
||
− | |} |
||
− | |||
− | |||
− | * POODLE-L |
||
− | |||
− | POODLE-L found no disordered regions. Therefore, there is no disordered region with a length more than 40aa in our protein. |
||
− | |||
− | |||
− | * POODLE-S (High B-factor residues) |
||
− | This POODLE-S variant searches for high B-factor values in the crystallography, which implies uncertainty in the assignment of the atom positions. |
||
− | |||
− | POODLE-S predicted five disordered regions: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |length |
||
− | |- |
||
− | |0 |
||
− | |2 |
||
|2 |
|2 |
||
|- |
|- |
||
+ | |POODLE-S (missing residues) |
||
− | |13 |
||
− | |19 |
||
− | |7 |
||
− | |- |
||
− | |83 |
||
− | |88 |
||
− | |6 |
||
− | |- |
||
− | |105 |
||
− | |109 |
||
− | |5 |
||
− | |- |
||
− | |526 |
||
− | |529 |
||
|4 |
|4 |
||
+ | |2 |
||
|- |
|- |
||
+ | |IUPred (short) |
||
− | |} |
||
+ | |0 |
||
− | |||
− | |||
− | * POODLE-S (missing residues) |
||
− | |||
− | POODLE-S (missing residues) predicts a disordered region, if there is an amino acid in the sequence record, but not on the electron density map. |
||
− | |||
− | Poodle-S found 6 disordered regions. |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |length |
||
− | |- |
||
− | |17 |
||
− | |18 |
||
|2 |
|2 |
||
|- |
|- |
||
+ | |IUPred (long) |
||
− | |53 |
||
− | | |
+ | |0 |
− | | |
+ | |0 |
|- |
|- |
||
+ | |IUPred (structural information) |
||
− | |78 |
||
− | | |
+ | |0 |
− | | |
+ | |0 |
|- |
|- |
||
+ | |Meta-Disorder |
||
− | |153 |
||
− | | |
+ | |0 |
− | | |
+ | |0 |
− | |- |
||
− | |280 |
||
− | |280 |
||
− | |1 |
||
− | |- |
||
− | |345 |
||
− | |345 |
||
− | |1 |
||
|- |
|- |
||
|} |
|} |
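The two counts in the summary table (regions inside the protein versus regions at the start or end of the chain) can be derived from a per-residue prediction. The sketch below assumes a Disopred-style mask in which `*` marks a disordered residue and `.` an ordered one. The mask is a toy reconstruction of the described Disopred result (first two and last three of 529 residues), not the original output file, and the function names are hypothetical.

```python
import re


def disordered_regions(mask: str, disordered_char: str = "*") -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of disordered residues."""
    pattern = re.escape(disordered_char) + "+"
    # m.start() is 0-based, m.end() is exclusive, so end needs no +1.
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, mask)]


def count_regions(mask: str) -> tuple[int, int]:
    """Return (regions inside the protein, regions touching a terminus)."""
    regions = disordered_regions(mask)
    terminal = [r for r in regions if r[0] == 1 or r[1] == len(mask)]
    internal = [r for r in regions if r not in terminal]
    return len(internal), len(terminal)


def disordered_residue_count(mask: str) -> int:
    """Total number of residues predicted as disordered."""
    return sum(end - start + 1 for start, end in disordered_regions(mask))


# Toy mask: two disordered residues at the start, three at the end.
mask = "*" * 2 + "." * 524 + "*" * 3

print(disordered_regions(mask))
print(count_regions(mask))
print(disordered_residue_count(mask))
```

Because we assume that the protein has no disordered regions, the residue count doubles as the number of wrongly predicted residues for a method.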
||
+ | === Comparison of the different POODLE variants === |
||
+ | POODLE-L does not find any disordered regions. This is the result we expected, because our protein does not possess any disordered regions. |
||
+ | Both POODLE-S variants found several short disordered regions, which are false positives. Interestingly, there seem to be more residues missing from the electron density map than residues with high B-factor values. |
||
− | Graphical Output: |
||
− | {| |
||
− | | [[Image:POODLE_S_B.png|thumb|Prediction of POODLE-S (High B-factor residues)]] |
||
− | | [[Image:POODLE_S_M.png|thumb|Prediction of POODLE-S (missing residues)]] |
||
− | | [[Image:POODLE_I_hexa.png|thumb|center|Prediction of POODLE-I]] |
||
− | | [[Image:POODLE_L.png |thumb|Prediction of POODLE-L]] |
||
− | |} |
||
− | |||
− | |||
− | * Comparison of the different POODLE variants: |
||
− | POODLE-L doesn't find any disordered regions. This is the result we expected, because our protein doesn't posses any disordered regions. |
||
− | |||
− | Both POODLE-S variants found several short disordered regions, which is a false positive result. Interesstingly, there seems to be more missing electrons in the electron density map, than residues with high B-factor value. |
||
POODLE-I found the same result as POODLE-S with high B-factor, which was expected, because POODLE-I combines POODLE-L and POODLE-S (high B-factor). |
POODLE-I found the same result as POODLE-S with high B-factor, which was expected, because POODLE-I combines POODLE-L and POODLE-S (high B-factor). |
||
Line 537: | Line 154: | ||
Therefore, the predictions of short disordered regions are wrong results. Only the prediction of POODLE-L is correct. |
Therefore, the predictions of short disordered regions are wrong results. Only the prediction of POODLE-L is correct. |
||
− | In general, these predictions are used, if nothing is known about the protein. Therefore, normally we |
+ | In general, these predictions are used when nothing is known about the protein, so normally we would not know that the prediction is wrong. For this reason, we take the results at face value and check whether the predicted disordered regions overlap with the functionally important residues, because disordered regions often seem to be functionally important. |
We check this for POODLE-S with missing residues and POODLE-I, because POODLE-S with high B-factor values shows the same result as POODLE-I. |
We check this for POODLE-S with missing residues and POODLE-I, because POODLE-S with high B-factor values shows the same result as POODLE-I. |
||
Line 613: | Line 230: | ||
As you can see in the table above, only one disulfide bond is located in a predicted disordered region; all other functionally important residues are located in ordered regions. This is a further good hint that the predictions are wrong.
As you can see in the table above, only one disulfide bond is located in a predicted disordered region; all other functionally important residues are located in ordered regions. This is a further good hint that the predictions are wrong.
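The overlap check between predicted disordered regions and functionally important residues can be written down in a few lines. This is only a sketch under stated assumptions: the region boundaries and the feature positions below are illustrative placeholders, not the values from our tables, and the function name is hypothetical.

```python
def features_in_disordered_regions(
    features: dict[str, list[int]],
    regions: list[tuple[int, int]],
) -> list[str]:
    """Return the names of features with at least one residue inside a region.

    Positions and region boundaries are 1-based; regions are inclusive.
    """
    hits = []
    for name, positions in features.items():
        if any(start <= pos <= end for pos in positions for start, end in regions):
            hits.append(name)
    return hits


# Placeholder data for illustration only.
predicted_regions = [(1, 2), (83, 89), (105, 109), (527, 529)]
functional_features = {
    "hypothetical disulfide bond": [85, 120],
    "hypothetical active site": [300],
    "hypothetical glycosylation site": [150],
}

print(features_in_disordered_regions(functional_features, predicted_regions))
```

A feature spanning several residues, such as a disulfide bond, is reported here as soon as one of its residues lies in a predicted region. Requiring all residues to lie inside would be an equally defensible choice.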
||
− | |||
− | === IUPred === |
||
− | |||
− | We tested the three different IUPred variants, which are offered by the webserver. <br> |
||
− | <br> |
||
− | |||
− | * IUPred (short) |
||
− | [[Image:iupred_shortt.png|center|thumb|Result of the IUPred prediction, which is focus on short disordered regions.]] |
||
− | |||
− | As you can see in the picture, IUPred which is focus on short disordered regions found only at the beginning and at the end of the protein a disordered region. This may be wrong, because at the beginning and at the end there are often regions without defined secondary structure, but also without function. |
||
− | <br><br> |
||
− | * IUPred (long) |
||
− | Next we take a look to the prediction of the long disordered regions:<br> |
||
− | |||
− | [[Image:iupred_long_right.png|center|thumb|Result of the IUPred prediction, which is focus on long disordered regions.]] |
||
− | |||
− | The picture above shows the result of this prediction. There is no disordered region predicted, not even at the beginning or at the end of the protein. This prediction is quite good, because the HEXA_HUMAN protein does not posses any disordered regions. |
||
− | |||
− | |||
− | *IUPred (with structural information) |
||
− | |||
− | As last, we analysed the prediction of IUPred with the additional usage of structural information. |
||
− | |||
− | [[Image:iupred_stur.png|center|thumb|Result of the IUPred prediction with additional structural information]] |
||
− | |||
− | As before, the method did not find any disordered regions. Therefore, the method predict three times the right result. Only by the method with focus on short disordered regions was a prediction of two disordered regions, but these regions were located at the beginning and at the end of the protein, which is obviously wrong. |
||
− | |||
− | === Meta-Disorder === |
||
− | |||
− | Meta-Disorder did not predict any disordered region in our protein. The different methods of which Meta-Disorder consists predicted some disordered regions, but Meta-Disorder build the consensus over all of these methods, and therefore it did not predict any disordered regions. |
||
− | |||
− | Graphical representation of the result: |
||
− | [[Image:metadisorder.png|center|800px|Result of the Meta-Disorder prediction]] |
||
− | |||
− | |||
− | The result is very good, because HEXA_HUMAN does not have any disordered regions. Therefore, the prediction of Meta-Disorder is right. |
||
=== Comparison of the different methods === |
=== Comparison of the different methods === |
||
Line 682: | Line 263: | ||
<br><br> |
<br><br> |
||
POODLE-L, IUPred(long) and IUPred(structure) predict the disordered regions correctly.
POODLE-L, IUPred(long) and IUPred(structure) predict the disordered regions correctly.
||
− | The |
+ | POODLE-S (B-factor) gave the worst prediction result, predicting 47 residues as disordered, followed by POODLE-S (missing) (24 wrongly predicted residues) and POODLE-I (23 wrongly predicted residues).<br><br> |
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
||
== Prediction of transmembrane alpha-helices and signal peptides == |
== Prediction of transmembrane alpha-helices and signal peptides == |
||
Line 729: | Line 311: | ||
|} |
|} |
||
+ | The detailed output for the different proteins and the different prediction methods can be found here: |
||
− | === TMHMM === |
||
+ | * [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Prediction_of_transmembrane_alpha-helices_and_signal_peptides_HEXA_HUMAN HEXA_HUMAN]] |
||
− | We analysed the six sequences with TMHMM. |
||
+ | * [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Prediction_of_transmembrane_alpha-helices_and_signal_peptides_BACR_HALSA BACR_HALSA]] |
||
− | <br><br> |
||
+ | * [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Prediction_of_transmembrane_alpha-helices_and_signal_peptides_RET4_HUMAN RET4_HUMAN]] |
||
− | *HEXA_HUMAN |
||
+ | * [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Prediction_of_transmembrane_alpha-helices_and_signal_peptides_INSL5_HUMAN INSL5_HUMAN]] |
||
− | <br><br> |
||
+ | * [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Prediction_of_transmembrane_alpha-helices_and_signal_peptides_LAMP1_HUMAN LAMP1_HUMAN]] |
||
− | [[Image:hexa_human_tmhmm.png|thumb|Prediction of TMHMM for the transmembrane helices of HEXA_HUMAN]] |
||
+ | * [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Prediction_of_transmembrane_alpha-helices_and_signal_peptides_A4_HUMAN A4_HUMAN]] |
||
+ | === Results === |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |location |
||
− | |- |
||
− | |1 |
||
− | |529 |
||
− | |outside |
||
− | |- |
||
− | |} |
||
+ | ==== Transmembrane Helices ==== |
||
− | TMHMM predicts no transmembrane helix at all. The whole protein is located at the extracellular space. To evaluate this result, we compared the data from UniProt with our prediction. |
||
− | |||
− | [[Image:hexa_human_tmhmm_vs_real.png|center|thumb|Comparison between real occuring transmembrane helices and the TMHMM result.]] |
||
− | |||
− | As you can see above, the TMHMM prediction result is completly right, expect of the signal peptide, which can't be predicted by TMHMM. |
||
− | <br><br> |
||
− | * BACR_HALSA |
||
− | <br><br> |
||
− | [[Image:bacr_halsa_tmhmm.png|thumb|Prediction of TMHMM for the transmembrane helices of BACR_HALSA]] |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
|- |
|- |
||
+ | | |
||
+ | |colspan="3" | TMHMM |
||
+ | |colspan="3" | Phobius |
||
+ | |colspan="3" | PolyPhobius |
||
+ | |colspan="3" | OCTOPUS |
||
+ | |colspan="3" | SPOCTOPUS |
||
+ | |- |
||
+ | |protein |
||
|start position |
|start position |
||
|end position |
|end position |
||
|location |
|location |
||
− | |- |
||
− | |1 |
||
− | |22 |
||
− | |outside |
||
− | |- |
||
− | |23 |
||
− | |42 |
||
− | |TM Helix |
||
− | |- |
||
− | |43 |
||
− | |54 |
||
− | |inside |
||
− | |- |
||
− | |55 |
||
− | |77 |
||
− | |TM Helix |
||
− | |- |
||
− | |78 |
||
− | |91 |
||
− | |outside |
||
− | |- |
||
− | |92 |
||
− | |114 |
||
− | |TM Helix |
||
− | |- |
||
− | |115 |
||
− | |120 |
||
− | |inside |
||
− | |- |
||
− | |121 |
||
− | |143 |
||
− | |TM Helix |
||
− | |- |
||
− | |144 |
||
− | |147 |
||
− | |outside |
||
− | |- |
||
− | |148 |
||
− | |170 |
||
− | |TM Helix |
||
− | |- |
||
− | |171 |
||
− | |189 |
||
− | |inside |
||
− | |- |
||
− | |190 |
||
− | |212 |
||
− | |TM Helix |
||
− | |- |
||
− | |213 |
||
− | |262 |
||
− | |outside |
||
− | |- |
||
− | |} |
||
− | |||
− | TMHMM predicts six transmembrane helices for BACR_HALSA. We decided to compare the TMHMM prediction with the real occuring transmembrane helices in BACR_HALSA: |
||
− | |||
− | [[Image:h_tmhmm_vs_real.png|center|thumb|Comparison between real occuring transmembrane helices and the TMHMM result.]] |
||
− | Especially at the beginning is the prediction very good. There is almost 100% overlap between predicted and real helices. Only in the end of the protein lacks one transmembrane helix in the TMHMM prediction. Therefore, in real there are 7 transmembrane helices, whereas TMHMM only predicts 6. This is really bad, because it is a different for the function if there are 6 or 7 helices, but in general the prediction of TMHMM was quite good. |
||
− | <br><br> |
||
− | * RET4_HUMAN |
||
− | <br><br> |
||
− | [[Image:ret4_human_tmhmm.png|thumb|Prediction of TMHMM for the transmembrane helices of RET4_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
|start position |
|start position |
||
|end position |
|end position |
||
|location |
|location |
||
− | |- |
||
− | |1 |
||
− | |201 |
||
− | |outside |
||
− | |- |
||
− | |} |
||
− | |||
− | TMHMM predicts no transmembrane helices. The whole protein is loacted at the extracellular space. |
||
− | <br> |
||
− | <br> |
||
− | Comparison with the real structure of the protein: |
||
− | [[Image:r_human_tmhmm_vs_real.png|center|thumb|Comparison between real occuring transmembrane helices and the TMHMM result.]] |
||
− | The TMHMM prediction is completely right. Therefore, you can see TMHMM can also predict, that a protein is not a transmembrane protein. |
||
− | <br><br> |
||
− | * INSL5_HUMAN |
||
− | <br><br> |
||
− | [[Image:insl5_human_tmhmm.png|thumb|Prediction of TMHMM for the transmembrane helices of INSL5_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
|start position |
|start position |
||
|end position |
|end position |
||
|location |
|location |
||
− | |- |
||
− | |1 |
||
− | |135 |
||
− | |outside |
||
− | |- |
||
− | |} |
||
− | |||
− | TMHMM predicts no transmembrane helices. The whole protein is loacted at the extracellular space. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | [[Image:insl5_human_tmhmm_vs_real.png|center|thumb|Comparison between real occuring transmembrane helices and the TMHMM result.]] |
||
− | The TMHMM prediction is again completely right. |
||
− | <br><br> |
||
− | * LAMP1_HUMAN |
||
− | <br><br> |
||
− | [[Image:lamp1_human_tmhmm.png|thumb|Prediction of TMHMM for the transmembrane helices of LAMP1_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
|start position |
|start position |
||
|end position |
|end position |
||
|location |
|location |
||
− | |- |
||
− | |1 |
||
− | |10 |
||
− | |inside |
||
− | |- |
||
− | |11 |
||
− | |33 |
||
− | |TM Helix |
||
− | |- |
||
− | |34 |
||
− | |383 |
||
− | |outside |
||
− | |- |
||
− | |384 |
||
− | |406 |
||
− | |TM Helix |
||
− | |- |
||
− | |407 |
||
− | |417 |
||
− | |inside |
||
− | |- |
||
− | |} |
||
− | |||
− | TMHMM predicts two transmembrane helices, which are divided by a very long loop which is loacted at the extracellular space. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | [[Image:lamp1_human_tmhmm_vs_real.png|center|thumb|Comparison between real occuring transmembrane helices and the TMHMM result.]] |
||
− | |||
− | The prediction of TMHMM is quite good. Only at the beginning of the protein TMHMM predicts one wrong transmembrane helix (which is a signal peptide in real), but the rest of the prediction is correct. |
||
− | <br><br> |
||
− | * A4_HUMAN |
||
− | <br><br> |
||
− | [[Image:a4_human_tmhmm.png|thumb|Prediction of TMHMM for the transmembrane helices of A4_HUMAN]] |
||
− | |||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
|start position |
|start position |
||
|end position |
|end position |
||
|location |
|location |
||
|- |
|- |
||
+ | |rowspan="3" | HEXA HUMAN |
||
|1 |
|1 |
||
− | | |
+ | |529 |
|outside |
|outside |
||
− | |- |
||
− | |701 |
||
− | |723 |
||
− | |TM Helix |
||
− | |- |
||
− | |724 |
||
− | |770 |
||
− | |inside |
||
− | |- |
||
− | |} |
||
− | |||
− | TMHMM predicts one transmembrane helix at the end of the protein. As we already know is A4_HUMAN a single-spanning transmembrane protein and therefore the numbers of transmembrane helices is right predicted. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | [[Image:a4_human_tmhmm_vs_real.png|center|thumb|Comparison between real occuring transmembrane helices and the TMHMM result.]] |
||
− | |||
− | The result of the TMHMM prediction is pretty well. Except of the first residues at the beginning and the exact start position of the transmembrane helix, the prediction is correct. |
||
− | <br><br> |
||
− | === Phobius and PolyPhobius === |
||
− | <br><br> |
||
− | * HEXA_HUMAN |
||
− | <br><br> |
||
− | [[Image:phobius.png|thumb|Prediction of Phobius for the transmembrane helices and signal peptides of HEXA_HUMAN]] |
||
− | [[Image:polyphobius.png|thumb|Prediction of PolyPhobius for the transmembrane helices and signal peptides of HEXA_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''Phobius''' |
||
− | |colspan="3"|'''PolyPhobius''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
− | !colspan="6" | Signal peptide prediction |
||
− | |- |
||
− | |1 |
||
− | |5 |
||
− | |N-Region |
||
− | |1 |
||
− | |5 |
||
− | |N-Region |
||
− | |- |
||
− | |6 |
||
− | |17 |
||
− | |H-Region |
||
− | |6 |
||
− | |15 |
||
− | |H-Region |
||
− | |- |
||
− | |18 |
||
− | |22 |
||
− | |C-Region |
||
− | |16 |
||
− | |19 |
||
− | |C-Region |
||
− | |- |
||
− | !colspan="6" | Summary signal peptide |
||
− | |- |
||
− | |1 |
||
− | |22 |
||
− | |Signal Peptide |
||
− | |1 |
||
− | |19 |
||
− | |Signal Peptide |
||
− | |- |
||
− | !colspan="6" | Transmembrane helices prediction |
||
− | |- |
||
|23 |
|23 |
||
|529 |
|529 |
||
Line 998: | Line 360: | ||
|520 |
|520 |
||
|outside |
|outside |
||
− | |} |
||
− | |||
− | Both methods don't predict a transmembrane helix, which is correct, because HEXA_HUMAN is located at the lysosmal space. |
||
− | We compared the results of Phobius and PolyPhobius with the real protein. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:hexa_phobius_vs_real.png|thumb|Comparison between the prediction of Phobius and the real protein]] |
||
− | | [[Image:hexa_poly_vs_real.png|thumb|Comparison between the prediction of PolyPhobius and the real protein]] |
||
− | |} |
||
− | |||
− | The prediction of Phobius is a little bit better than the PolyPhobius prediction, because Phobius predicts the beginning and the end of the signal peptide totally correct, whereas PolyPhobius cuts two residues of the signal peptide. |
||
− | <br><br> |
||
− | * BACR_HALSA |
||
− | <br><br> |
||
− | [[Image:bacr_halsa_phobius.png|thumb|Prediction of Phobius for the transmembrane helices and signal peptides of BACR_HALSA]] |
||
− | [[Image:bacr_halsa_polyphobius.png|thumb|Prediction of PolyPhobius for the transmembrane helices and signal peptides of BACR_HALSA]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''Phobius''' |
||
− | |colspan="3"|'''PolyPhobius''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
− | !colspan="6" | Signal peptide prediction |
||
− | |- |
||
− | |colspan="6" | No prediction available |
||
− | |- |
||
− | !colspan="6" | Transmembrane helices prediction |
||
− | |- |
||
− | |23 |
||
− | |42 |
||
− | |TM helix |
||
− | |22 |
||
− | |43 |
||
− | |TM helix |
||
− | |- |
||
− | |43 |
||
− | |53 |
||
− | |inside |
||
− | |44 |
||
− | |54 |
||
− | |inside |
||
− | |- |
||
− | |54 |
||
− | |76 |
||
− | |TM helix |
||
− | |55 |
||
− | |77 |
||
− | |TM helix |
||
− | |- |
||
− | |77 |
||
− | |95 |
||
− | |outside |
||
− | |78 |
||
− | |94 |
||
− | |outside |
||
− | |- |
||
− | |96 |
||
− | |114 |
||
− | |TM helix |
||
− | |95 |
||
− | |114 |
||
− | |TM helix |
||
− | |- |
||
− | |115 |
||
− | |120 |
||
− | |inside |
||
− | |115 |
||
− | |120 |
||
− | |inside |
||
− | |- |
||
− | |121 |
||
− | |142 |
||
− | |TM helix |
||
− | |121 |
||
− | |141 |
||
− | |TM helix |
||
− | |- |
||
− | |143 |
||
− | |147 |
||
− | |outside |
||
− | |142 |
||
− | |147 |
||
− | |outside |
||
− | |- |
||
− | |148 |
||
− | |169 |
||
− | |TM helix |
||
− | |148 |
||
− | |166 |
||
− | |TM helix |
||
− | |- |
||
− | |170 |
||
− | |189 |
||
− | |inside |
||
− | |167 |
||
− | |186 |
||
− | |inside |
||
− | |- |
||
− | |190 |
||
− | |212 |
||
− | |TM helix |
||
− | |187 |
||
− | |205 |
||
− | |TM helix |
||
− | |- |
||
− | |213 |
||
− | |217 |
||
− | |outside |
||
− | |206 |
||
− | |215 |
||
− | |outside |
||
− | |- |
||
− | |218 |
||
− | |237 |
||
− | |TM helix |
||
− | |216 |
||
− | |237 |
||
− | |TM helix |
||
− | |- |
||
− | |238 |
||
− | |262 |
||
− | |inside |
||
− | |238 |
||
− | |262 |
||
− | |inside |
||
− | |- |
||
− | |} |
||
− | |||
− | Both methods don't predict a signal peptide, but both recognize, that this protein is a transmembrane protein with seven helices. The predictions only differ at the beginning and the end of the helix positions, but the differences between these two predictions is only about 1 to 3 residues. |
||
− | |||
− | To evaluate the predictions, we compared the predictions with the real occuring transmembrane helices.<br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:bacr_halsa_phobius_vs_real.png|thumb|Comparison between the prediction of Phobius and the real protein]] |
||
− | | [[Image:bacr_halsa_poly_vs_real.png|thumb|Comparison between the prediction of PolyPhobius and the real protein]] |
||
− | |} |
||
− | <br><br> |
||
− | *RET4_HUMAN |
||
− | <br><br> |
||
− | [[Image:ret4_human_phobius.png|thumb|Prediction of Phobius for the transmembrane helices and signal peptides of RET4_HUMAN]] |
||
− | [[Image:ret4_human_polyphobius.png|thumb|Prediction of PolyPhobius for the transmembrane helices and signal peptides of RET4_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''Phobius''' |
||
− | |colspan="3"|'''PolyPhobius''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
− | !colspan="6" | Signal peptide prediction |
||
− | |- |
||
|1 |
|1 |
||
|2 |
|2 |
||
+ | |inside |
||
− | |N-Region |
||
− | |1 |
||
− | |3 |
||
− | |N-Region |
||
− | |- |
||
− | |3 |
||
− | |13 |
||
− | |H-Region |
||
− | |4 |
||
− | |13 |
||
− | |H-Region |
||
− | |- |
||
− | |14 |
||
− | |18 |
||
− | |C-Region |
||
− | |14 |
||
− | |18 |
||
− | |C-Region |
||
− | |- |
||
− | !colspan="6" | Summary signal peptide |
||
− | |- |
||
− | |1 |
||
− | |18 |
||
− | |secretory signal peptide |
||
− | |1 |
||
− | |18 |
||
− | |secretoy signal peptide |
||
− | |- |
||
− | !colspan="6" | Transmembrane helices prediction |
||
− | |- |
||
− | |19 |
||
− | |201 |
||
− | |outside |
||
− | |19 |
||
− | |201 |
||
− | |outside |
||
− | |- |
||
− | |} |
||
− | |||
− | Both methods predict a signal peptide for the secretory pathway. This result is correct. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:ret4_human_phobius_vs_real.png|thumb|Comparison between the prediction of Phobius and the real protein]] |
||
− | | [[Image:ret4_human_poly_vs_real.png|thumb|Comparison between the prediction of PolyPhobius and the real protein]] |
||
− | |} |
||
− | |||
− | Both methods show exactly the same result. |
||
− | <br><br> |
||
− | *INSL5_HUMAN |
||
− | <br><br> |
||
− | [[Image:insl5_human_phobius.png|thumb|Prediction of Phobius for the transmembrane helices and signal peptides of INSL5_HUMAN]] |
||
− | [[Image:insl5_human_polyphobius.png|thumb|Prediction of PolyPhobius for the transmembrane helices and signal peptides of INSL5_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''Phobius''' |
||
− | |colspan="3"|'''PolyPhobius''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
− | !colspan="6" | Signal peptide prediction |
||
− | |- |
||
− | |1 |
||
− | |5 |
||
− | |N-Region |
||
− | |1 |
||
− | |4 |
||
− | |N-Region |
||
− | |- |
||
− | |6 |
||
− | |17 |
||
− | |H-Region |
||
− | |5 |
||
− | |16 |
||
− | |H-Region |
||
− | |- |
||
− | |18 |
||
|22 |
|22 |
||
+ | |529 |
||
− | |C-Region |
||
− | |17 |
||
− | |22 |
||
− | |C-Region |
||
− | |- |
||
− | !colspan="6" | Summary signal peptide |
||
− | |- |
||
− | |1 |
||
− | |22 |
||
− | |Secretory signal peptide |
||
− | |1 |
||
− | |22 |
||
− | |Secretoy signal peptide |
||
− | |- |
||
− | !colspan="6" | Transmembrane helices prediction |
||
− | |- |
||
− | |23 |
||
− | |135 |
||
− | |outside |
||
− | |23 |
||
− | |135 |
||
|outside |
|outside |
||
|- |
|- |
||
+ | |colspan="9" | |
||
− | |} |
||
− | Both methods predict a signale peptide for the secretory pathway and both prediction results are totally equal. |
||
− | |||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:insl5_human_phobius_vs_real.png|thumb|Comparison between the prediction of Phobius and the real protein]] |
||
− | | [[Image:insl5_human_poly_vs_real.png|thumb|Comparison between the prediction of PolyPhobius and the real protein]] |
||
− | |} |
||
− | |||
− | The complete prediction is correct. |
||
− | <br><br> |
||
− | * LAMP1_HUMAN |
||
− | <br><br> |
||
− | [[Image:lamp1_human_phobius.png|thumb|Prediction of Phobius for the transmembrane helices and signal peptides of LAMP1_HUMAN]] |
||
− | [[Image:lamp1_human_polyphobius.png|thumb|Prediction of PolyPhobius for the transmembrane helices and signal peptides of LAMP1_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''Phobius''' |
||
− | |colspan="3"|'''PolyPhobius''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
− | !colspan="6" | Signal peptide prediction |
||
− | |- |
||
− | |1 |
||
− | |10 |
||
− | |N-Region |
||
− | |1 |
||
− | |9 |
||
− | |N-Region |
||
− | |- |
||
− | |11 |
||
− | |22 |
||
− | |H-Region |
||
− | |10 |
||
− | |22 |
||
− | |H-Region |
||
− | |- |
||
− | |23 |
||
− | |28 |
||
− | |C-Region |
||
− | |23 |
||
− | |28 |
||
− | |C-Region |
||
− | |- |
||
− | !colspan="6" | Summary signal peptide |
||
− | |- |
||
− | |1 |
||
− | |28 |
||
− | |secretory signal peptide |
||
− | |1 |
||
− | |28 |
||
− | |secretory signal peptide |
||
− | |- |
||
− | !colspan="6" | Transmembrane helices prediction |
||
− | |- |
||
− | |29 |
||
− | |381 |
||
− | |outside |
||
− | |29 |
||
− | |381 |
||
− | |outside |
||
− | |- |
||
− | |382 |
||
− | |405 |
||
− | |TM helix |
||
− | |382 |
||
− | |405 |
||
− | |TM helix |
||
− | |- |
||
− | |406 |
||
− | |417 |
||
− | |outside |
||
− | |406 |
||
− | |417 |
||
− | |outside |
||
− | |- |
||
− | |} |
||
− | |||
− | The results of both methods are nearly identical. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:lam_human_phobius_vs_real.png|thumb|Comparison between the prediction of Phobius and the real protein]] |
||
− | | [[Image:lam_human_poly_vs_real.png|thumb|Comparison between the prediction of PolyPhobius and the real protein]] |
||
− | |} |
||
− | |||
− | The two predictions agree with each other and, furthermore, with the real protein. |
||
− | <br><br> |
||
− | * A4_HUMAN |
||
− | <br><br> |
||
− | [[Image:a4_human_phobius.png|thumb|Prediction of Phobius for the transmembrane helices and signal peptides of A4_HUMAN]] |
||
− | [[Image:a4_human_polyphobius.png|thumb|Prediction of PolyPhobius for the transmembrane helices and signal peptides of A4_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''Phobius''' |
||
− | |colspan="3"|'''PolyPhobius''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
− | !colspan="6" | Signal peptide prediction |
||
− | |- |
||
− | |1 |
||
− | |1 |
||
− | |N-Region |
||
− | |1 |
||
− | |3 |
||
− | |N-Region |
||
− | |- |
||
− | |2 |
||
− | |12 |
||
− | |H-Region |
||
− | |4 |
||
− | |12 |
||
− | |H-Region |
||
− | |- |
||
− | |13 |
||
− | |17 |
||
− | |C-Region |
||
− | |13 |
||
− | |17 |
||
− | |C-Region |
||
− | |- |
||
− | !colspan="6" | Summary signal peptide |
||
− | |- |
||
− | |1 |
||
− | |17 |
||
− | |secretory signal peptide |
||
− | |1 |
||
− | |17 |
||
− | |secretory signal peptide |
||
− | |- |
||
− | !colspan="6" | Transmembrane helices prediction |
||
− | |- |
||
− | |18 |
||
− | |700 |
||
− | |outside |
||
− | |18 |
||
− | |700 |
||
− | |outside |
||
− | |- |
||
− | |701 |
||
− | |723 |
||
− | |TM helix |
||
− | |701 |
||
− | |723 |
||
− | |TM helix |
||
− | |- |
||
− | |724 |
||
− | |770 |
||
− | |inside |
||
− | |724 |
||
− | |770 |
||
− | |inside |
||
− | |- |
||
− | |} |
||
− | |||
− | The results of both methods are nearly identical.<br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:a4_human_phobius_vs_real.png|thumb|Comparison between the prediction of Phobius and the real protein]] |
||
− | | [[Image:a4_human_poly_vs_real.png|thumb|Comparison between the prediction of PolyPhobius and the real protein]] |
||
− | |} |
||
− | |||
− | The two predictions agree with each other and, furthermore, with the real protein. |
||
− | <br><br> |
||
− | === OCTOPUS and SPOCTOPUS === |
||
− | <br><br> |
||
− | *HEXA_HUMAN |
||
− | <br><br> |
||
− | [[Image:hexa_human_octopus.png|thumb|Prediction of OCTOPUS for the transmembrane helices of HEXA_HUMAN]] |
||
− | [[Image:hexa_human_spoctopus.png|thumb|Prediction of SPOCTOPUS for the transmembrane helices of HEXA_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''OCTOPUS''' |
||
− | |colspan="3"|'''SPOCTOPUS''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
− | |1 |
||
− | |2 |
||
− | |inside |
||
− | |1 |
||
− | |6 |
||
− | |N-terminal of a signal peptide |
||
− | |- |
||
|3 |
|3 |
||
|23 |
|23 |
||
|TM helix |
|TM helix |
||
+ | |colspan="3" | |
||
− | |7 |
||
− | |21 |
||
− | |signal peptide |
||
|- |
|- |
||
+ | |colspan="9" | |
||
|24 |
|24 |
||
|529 |
|529 |
||
|outside |
|outside |
||
+ | |colspan="3" | |
||
+ | |- |
||
+ | |rowspan="15" | BACR HALSA |
||
+ | |1 |
||
|22 |
|22 |
||
− | |529 |
||
|outside |
|outside |
||
− | | |
+ | | |
− | | |
+ | | |
+ | | |
||
− | |||
+ | | |
||
− | The results of these two predictions differ. |
||
+ | | |
||
− | OCTOPUS predicts a transmembrane helix, whereas SPOCTOPUS predicts a signal peptide at the same location. |
||
+ | | |
||
− | <br> |
||
− | To check which method is correct, we compared the predictions with the real protein. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:hexa_human_octopus_vs_real.png|thumb|Comparison between the prediction of OCTOPUS and the real protein]] |
||
− | | [[Image:hexa_human_spoctopus_vs_real.png|thumb|Comparison between the prediction of SPOCTOPUS and the real protein]] |
||
− | |} |
||
− | |||
− | SPOCTOPUS gave us the better result, because it recognizes the signal peptide, whereas OCTOPUS predicts a transmembrane helix instead. |
||
− | <br><br> |
||
− | * BACR_HALSA |
||
− | <br><br> |
||
− | [[Image:bacr_halsa_octopus.png|thumb|Prediction of OCTOPUS for the transmembrane helices of BACR_HALSA]] |
||
− | [[Image:bacr_halsa_spoctopus.png|thumb|Prediction of SPOCTOPUS for the transmembrane helices of BACR_HALSA]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''OCTOPUS''' |
||
− | |colspan="3"|'''SPOCTOPUS''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
|1 |
|1 |
||
|22 |
|22 |
||
Line 1,521: | Line 396: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |23 |
||
+ | |42 |
||
+ | |TM Helix |
||
+ | |23 |
||
+ | |42 |
||
+ | |TM helix |
||
+ | |22 |
||
+ | |43 |
||
+ | |TM helix |
||
|23 |
|23 |
||
|43 |
|43 |
||
Line 1,528: | Line 412: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |43 |
||
+ | |54 |
||
+ | |inside |
||
+ | |43 |
||
+ | |53 |
||
+ | |inside |
||
+ | |44 |
||
+ | |54 |
||
+ | |inside |
||
|44 |
|44 |
||
|54 |
|54 |
||
Line 1,535: | Line 428: | ||
|inside |
|inside |
||
|- |
|- |
||
+ | |55 |
||
+ | |77 |
||
+ | |TM Helix |
||
+ | |54 |
||
+ | |76 |
||
+ | |TM helix |
||
+ | |55 |
||
+ | |77 |
||
+ | |TM helix |
||
|55 |
|55 |
||
|75 |
|75 |
||
Line 1,542: | Line 444: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |78 |
||
+ | |91 |
||
+ | |outside |
||
+ | |77 |
||
+ | |95 |
||
+ | |outside |
||
+ | |78 |
||
+ | |94 |
||
+ | |outside |
||
|76 |
|76 |
||
|95 |
|95 |
||
Line 1,549: | Line 460: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |92 |
||
+ | |114 |
||
+ | |TM Helix |
||
+ | |96 |
||
+ | |114 |
||
+ | |TM helix |
||
+ | |95 |
||
+ | |114 |
||
+ | |TM helix |
||
|96 |
|96 |
||
|116 |
|116 |
||
Line 1,556: | Line 476: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |115 |
||
+ | |120 |
||
+ | |inside |
||
+ | |115 |
||
+ | |120 |
||
+ | |inside |
||
+ | |115 |
||
+ | |120 |
||
+ | |inside |
||
|117 |
|117 |
||
|121 |
|121 |
||
Line 1,563: | Line 492: | ||
|inside |
|inside |
||
|- |
|- |
||
+ | |121 |
||
+ | |143 |
||
+ | |TM Helix |
||
+ | |121 |
||
+ | |142 |
||
+ | |TM helix |
||
+ | |121 |
||
+ | |141 |
||
+ | |TM helix |
||
|122 |
|122 |
||
|142 |
|142 |
||
Line 1,570: | Line 508: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |144 |
||
+ | |147 |
||
+ | |outside |
||
+ | |143 |
||
+ | |147 |
||
+ | |outside |
||
+ | |142 |
||
+ | |147 |
||
+ | |outside |
||
|143 |
|143 |
||
|147 |
|147 |
||
Line 1,577: | Line 524: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |148 |
||
+ | |170 |
||
+ | |TM Helix |
||
+ | |148 |
||
+ | |169 |
||
+ | |TM helix |
||
+ | |148 |
||
+ | |166 |
||
+ | |TM helix |
||
|148 |
|148 |
||
|168 |
|168 |
||
Line 1,584: | Line 540: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |171 |
||
+ | |189 |
||
+ | |inside |
||
+ | |170 |
||
+ | |189 |
||
+ | |inside |
||
+ | |167 |
||
+ | |186 |
||
+ | |inside |
||
|169 |
|169 |
||
|185 |
|185 |
||
Line 1,591: | Line 556: | ||
|inside |
|inside |
||
|- |
|- |
||
+ | |190 |
||
+ | |212 |
||
+ | |TM Helix |
||
+ | |190 |
||
+ | |212 |
||
+ | |TM helix |
||
+ | |187 |
||
+ | |205 |
||
+ | |TM helix |
||
|186 |
|186 |
||
|206 |
|206 |
||
Line 1,598: | Line 572: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |213 |
||
+ | |262 |
||
+ | |outside |
||
+ | |213 |
||
+ | |217 |
||
+ | |outside |
||
+ | |206 |
||
+ | |215 |
||
+ | |outside |
||
|207 |
|207 |
||
|216 |
|216 |
||
Line 1,605: | Line 588: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |colspan="3" | |
||
+ | |218 |
||
+ | |237 |
||
+ | |TM helix |
||
+ | |216 |
||
+ | |237 |
||
+ | |TM helix |
||
|217 |
|217 |
||
|237 |
|237 |
||
Line 1,612: | Line 602: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |colspan="3" | |
||
+ | |238 |
||
+ | |262 |
||
+ | |inside |
||
+ | |238 |
||
+ | |262 |
||
+ | |inside |
||
|238 |
|238 |
||
|262 |
|262 |
||
Line 1,619: | Line 616: | ||
|inside |
|inside |
||
|- |
|- |
||
+ | |rowspan="3" | RET4 HUMAN |
||
− | |} |
||
+ | |colspan="9" | |
||
− | |||
− | Both methods give very similar results, which differ in only a few residues. Both predict all seven transmembrane helices, which is a very good result. |
||
− | <br> |
||
− | <br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:bacr_halsa_octopus_vs_real.png|thumb|Comparison between the prediction of OCTOPUS and the real protein]] |
||
− | | [[Image:bacr_halsa_spoctopus_vs_real.png|thumb|Comparison between the prediction of SPOCTOPUS and the real protein]] |
||
− | |} |
||
− | <br><br> |
||
− | *RET4_HUMAN |
||
− | <br><br> |
||
− | [[Image:ret4_human_octopus.png|thumb|Prediction of OCTOPUS for the transmembrane helices of RET4_HUMAN]] |
||
− | [[Image:ret4_human_spoctopus.png|thumb|Prediction of SPOCTOPUS for the transmembrane helices of RET4_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''OCTOPUS''' |
||
− | |colspan="3"|'''SPOCTOPUS''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
|1 |
|1 |
||
|1 |
|1 |
||
|inside |
|inside |
||
+ | |colspan="3" | |
||
− | |1 |
||
− | |5 |
||
− | |N-terminal of a signal peptide |
||
|- |
|- |
||
+ | |colspan="9" | |
||
|2 |
|2 |
||
|23 |
|23 |
||
|TM helix |
|TM helix |
||
+ | |colspan="3" | |
||
− | |6 |
||
+ | |- |
||
+ | |1 |
||
+ | |201 |
||
+ | |outside |
||
|19 |
|19 |
||
+ | |201 |
||
− | |signal peptide |
||
+ | |outside |
||
− | |- |
||
+ | |19 |
||
+ | |201 |
||
+ | |outside |
||
|24 |
|24 |
||
|201 |
|201 |
||
Line 1,667: | Line 645: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |rowspan="3" | INSL5 HUMAN |
||
− | |} |
||
+ | |colspan="9" | |
||
− | |||
− | |||
− | As with HEXA_HUMAN, OCTOPUS predicts a transmembrane helix, whereas SPOCTOPUS predicts the signal peptide. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:ret4_human_octopus_vs_real.png|thumb|Comparison between the prediction of OCTOPUS and the real protein]] |
||
− | | [[Image:ret4_human_spoctopus_vs_real.png|thumb|Comparison between the prediction of SPOCTOPUS and the real protein]] |
||
− | |} |
||
− | <br><br> |
||
− | * INSL5_HUMAN |
||
− | <br><br> |
||
− | [[Image:insl5_human_octopus.png|thumb|Prediction of OCTOPUS for the transmembrane helices of INSL5_HUMAN]] |
||
− | [[Image:insl5_human_spoctopus.png|thumb|Prediction of SPOCTOPUS for the transmembrane helices of INSL5_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''OCTOPUS''' |
||
− | |colspan="3"|'''SPOCTOPUS''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
|1 |
|1 |
||
|1 |
|1 |
||
|inside |
|inside |
||
+ | |colspan="3" | |
||
− | |1 |
||
− | |5 |
||
− | |N-terminal of a signal peptide |
||
|- |
|- |
||
+ | |colspan="9" | |
||
|2 |
|2 |
||
|32 |
|32 |
||
|TM helix |
|TM helix |
||
+ | |colspan="3" | |
||
− | |6 |
||
+ | |- |
||
+ | |1 |
||
+ | |135 |
||
+ | |outside |
||
|23 |
|23 |
||
+ | |135 |
||
− | |signal peptide |
||
+ | |outside |
||
− | |- |
||
+ | |23 |
||
+ | |135 |
||
+ | |outside |
||
|33 |
|33 |
||
|135 |
|135 |
||
Line 1,715: | Line 674: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |rowspan="5" | LAMP1 HUMAN |
||
− | |} |
||
− | |||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:insl5_human_octopus_vs_real.png|thumb|Comparison between the prediction of OCTOPUS and the real protein]] |
||
− | | [[Image:insl5_human_spoctopus_vs_real.png|thumb|Comparison between the prediction of SPOCTOPUS and the real protein]] |
||
− | |} |
||
− | |||
− | As we have already seen, OCTOPUS predicts a transmembrane helix, whereas SPOCTOPUS predicts this region as a signal peptide, which is correct. |
||
− | <br><br> |
||
− | * LAMP1_HUMAN |
||
− | <br><br> |
||
− | [[Image:lamp1_human_octopus.png|thumb|Prediction of OCTOPUS for the transmembrane helices of LAMP1_HUMAN]] |
||
− | [[Image:lamp1_human_spoctopus.png|thumb|Prediction of SPOCTOPUS for the transmembrane helices of LAMP1_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''OCTOPUS''' |
||
− | |colspan="3"|'''SPOCTOPUS''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
|1 |
|1 |
||
|10 |
|10 |
||
|inside |
|inside |
||
+ | |colspan="6" | |
||
|1 |
|1 |
||
− | | |
+ | |10 |
+ | |inside |
||
− | |N-terminal of a signal peptide |
||
+ | |colspan="3" | |
||
|- |
|- |
||
+ | |11 |
||
+ | |33 |
||
+ | |TM Helix |
||
+ | |colspan="6" | |
||
|11 |
|11 |
||
|31 |
|31 |
||
|TM helix |
|TM helix |
||
+ | |colspan="3" | |
||
− | |12 |
||
− | |29 |
||
− | |signal peptide |
||
|- |
|- |
||
+ | |34 |
||
+ | |383 |
||
+ | |outside |
||
+ | |29 |
||
+ | |381 |
||
+ | |outside |
||
+ | |29 |
||
+ | |381 |
||
+ | |outside |
||
|32 |
|32 |
||
|383 |
|383 |
||
Line 1,763: | Line 709: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |384 |
||
+ | |406 |
||
+ | |TM Helix |
||
+ | |382 |
||
+ | |405 |
||
+ | |TM helix |
||
+ | |382 |
||
+ | |405 |
||
+ | |TM helix |
||
|384 |
|384 |
||
|404 |
|404 |
||
Line 1,770: | Line 725: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |407 |
||
+ | |417 |
||
+ | |inside |
||
+ | |406 |
||
+ | |417 |
||
+ | |outside |
||
+ | |406 |
||
+ | |417 |
||
+ | |outside |
||
|405 |
|405 |
||
|417 |
|417 |
||
Line 1,777: | Line 741: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |rowspan="5" | A4 HUMAN |
||
− | |} |
||
+ | |colspan="9" | |
||
− | |||
− | As with HEXA_HUMAN and RET4_HUMAN, OCTOPUS predicts a transmembrane helix, whereas SPOCTOPUS predicts the signal peptide. |
||
− | <br><br> |
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:lamp1_human_octopus_vs_real.png|thumb|Comparison between the prediction of OCTOPUS and the real protein]] |
||
− | | [[Image:lamp1_human_spoctopus_vs_real.png|thumb|Comparison between the prediction of SPOCTOPUS and the real protein]] |
||
− | |} |
||
− | <br><br> |
||
− | *A4_HUMAN |
||
− | <br><br> |
||
− | [[Image:a4_human_octopus.png|thumb|Prediction of OCTOPUS for the transmembrane helices of A4_HUMAN]] |
||
− | [[Image:a4_human_spoctopus.png|thumb|Prediction of SPOCTOPUS for the transmembrane helices of A4_HUMAN]] |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="3"|'''OCTOPUS''' |
||
− | |colspan="3"|'''SPOCTOPUS''' |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
|1 |
|1 |
||
|5 |
|5 |
||
|outside |
|outside |
||
+ | |colspan="3" | |
||
− | |1 |
||
− | |4 |
||
− | |N-terminal of signal peptide |
||
|- |
|- |
||
+ | |colspan="9" | |
||
|6 |
|6 |
||
|11 |
|11 |
||
|R |
|R |
||
+ | |colspan="3" | |
||
− | |5 |
||
− | |18 |
||
− | |Signal peptide |
||
|- |
|- |
||
+ | |1 |
||
+ | |700 |
||
+ | |outside |
||
+ | |18 |
||
+ | |700 |
||
+ | |outside |
||
+ | |18 |
||
+ | |700 |
||
+ | |outside |
||
|12 |
|12 |
||
|701 |
|701 |
||
Line 1,824: | Line 770: | ||
|outside |
|outside |
||
|- |
|- |
||
+ | |701 |
||
+ | |723 |
||
+ | |TM Helix |
||
+ | |701 |
||
+ | |723 |
||
+ | |TM helix |
||
+ | |701 |
||
+ | |723 |
||
+ | |TM helix |
||
|702 |
|702 |
||
|722 |
|722 |
||
Line 1,831: | Line 786: | ||
|TM helix |
|TM helix |
||
|- |
|- |
||
+ | |724 |
||
+ | |770 |
||
+ | |inside |
||
+ | |724 |
||
+ | |770 |
||
+ | |inside |
||
+ | |724 |
||
+ | |770 |
||
+ | |inside |
||
|723 |
|723 |
||
|770 |
|770 |
||
Line 1,839: | Line 803: | ||
|- |
|- |
||
|} |
|} |
||
− | |||
− | As with HEXA_HUMAN and RET4_HUMAN, OCTOPUS predicts a transmembrane helix, whereas SPOCTOPUS predicts the signal peptide. |
||
<br><br> |
<br><br> |
||
+ | The table above summarizes the results of the different methods that predict transmembrane helices. As the table shows, OCTOPUS often predicts a transmembrane helix where none of the other methods predict one. Phobius, PolyPhobius and SPOCTOPUS always show very similar results, whereas TMHMM and OCTOPUS differ from them.<br><br> |
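How similar two transmembrane helix predictions are can also be expressed as a number. The following is a minimal sketch, not part of the original analysis: it computes the per-residue Jaccard overlap between two sets of predicted segments. The function names <code>residues</code> and <code>overlap</code> are hypothetical; the example segments are the C-terminal TM helix of LAMP1_HUMAN as reported by Phobius (382 to 405) and OCTOPUS (384 to 404).

```python
def residues(segments):
    """Expand inclusive (start, end) segments into a set of residue positions."""
    return {pos for start, end in segments for pos in range(start, end + 1)}


def overlap(a, b):
    """Jaccard overlap of two segment lists: shared residues / all covered residues.

    Hypothetical helper for illustration; returns 1.0 when both lists are empty,
    because two predictions of "no helix" agree completely.
    """
    ra, rb = residues(a), residues(b)
    if not ra and not rb:
        return 1.0
    return len(ra & rb) / len(ra | rb)


# C-terminal TM helix of LAMP1_HUMAN as predicted by the two methods
phobius_tm = [(382, 405)]
octopus_tm = [(384, 404)]

print(overlap(phobius_tm, octopus_tm))  # 21 shared / 24 covered residues = 0.875
```

A value of 1.0 means the two methods mark exactly the same residues, and 0.0 means they share none. An extra helix predicted by only one method, such as the N-terminal helix that OCTOPUS places where the other methods see a signal peptide, lowers the score when it is included in the segment list.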
||
− | Comparison with the real structure of the protein: |
||
− | {| |
||
− | | [[Image:a4_human_octopus_vs_real.png|thumb|Comparison between the prediction of OCTOPUS and the real protein]] |
||
− | | [[Image:a4_human_spoctopus_vs_real.png|thumb|Comparison between the prediction of SPOCTOPUS and the real protein]] |
||
− | |} |
||
− | === |
+ | ==== Signal Peptide ==== |
− | All of our proteins come from human or archaea, so we only used the non-plant option of TargetP. <br> |
||
− | <br><br> |
||
− | * HEXA_HUMAN |
||
− | <br><br> |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | | |
||
+ | |colspan="2" | Phobius |
||
+ | |colspan="2" | PolyPhobius |
||
+ | |colspan="2" | SPOCTOPUS |
||
+ | |colspan="1" | TargetP |
||
+ | |colspan="2" | SignalP |
||
|- |
|- |
||
+ | |protein |
||
− | |Location |
||
− | |Probability |
||
− | |- |
||
− | |mitochondrial targeting SP |
||
− | |0.214 |
||
− | |- |
||
− | |secretory pathway SP |
||
− | |0.877 |
||
− | |- |
||
− | |other |
||
− | |0.009 |
||
− | |- |
||
− | |} |
||
− | |||
− | TargetP predicts a secretory pathway signal peptide for this protein, which is correct. <br><br> |
||
− | <br><br> |
||
− | * BACR_HALSA |
||
− | <br><br> |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |Location |
||
− | |Probability |
||
− | |- |
||
− | |mitochondrial targeting SP |
||
− | |0.019 |
||
− | |- |
||
− | |secretory pathway SP |
||
− | |0.897 |
||
− | |- |
||
− | |other |
||
− | |0.562 |
||
− | |- |
||
− | |} |
||
− | |||
− | TargetP predicts that this protein contains a secretory pathway signal peptide. The probability for this signal peptide is very high, although the result is wrong, because BACR_HALSA is a transmembrane protein. |
||
− | <br><br> |
||
− | * RET4_HUMAN |
||
− | <br><br> |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |Location |
||
− | |Probability |
||
− | |- |
||
− | |mitochondrial targeting SP |
||
− | |0.242 |
||
− | |- |
||
− | |secretory pathway SP |
||
− | |0.928 |
||
− | |- |
||
− | |other |
||
− | |0.020 |
||
− | |- |
||
− | |} |
||
− | |||
− | TargetP predicts a secretory pathway signal peptide for this protein, which is completely correct. |
||
− | <br><br> |
||
− | * INSL5_HUMAN |
||
− | <br><br> |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |Location |
||
− | |Probability |
||
− | |- |
||
− | |mitochondrial targeting SP |
||
− | |0.074 |
||
− | |- |
||
− | |secretory pathway SP |
||
− | |0.899 |
||
− | |- |
||
− | |other |
||
− | |0.037 |
||
− | |- |
||
− | |} |
||
− | |||
− | As before, TargetP predicts a secretory pathway signal peptide, which is again correct. |
||
− | <br><br> |
||
− | * LAMP1_HUMAN |
||
− | <br><br> |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |Location |
||
− | |Probability |
||
− | |- |
||
− | |mitochondrial targeting SP |
||
− | |0.043 |
||
− | |- |
||
− | |secretory pathway SP |
||
− | |0.953 |
||
− | |- |
||
− | |other |
||
− | |0.017 |
||
− | |- |
||
− | |} |
||
− | |||
− | The prediction of the secretory pathway signal peptide is wrong, because LAMP1_HUMAN is a transmembrane protein. |
||
− | <br><br> |
||
− | *A4_HUMAN |
||
− | <br><br> |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |- |
||
− | |Location |
||
− | |Probability |
||
− | |- |
||
− | |mitochondrial targeting SP |
||
− | |0.035 |
||
− | |- |
||
− | |secretory pathway SP |
||
− | |0.937 |
||
− | |- |
||
− | |other |
||
− | |0.084 |
||
− | |- |
||
− | |} |
||
− | |||
− | Because A4_HUMAN is a transmembrane protein, the prediction for the secretory pathway signal peptide is wrong. <br><br> |
||
− | |||
− | === SignalP === |
||
− | |||
− | For our analysis we used both the hidden Markov model based and the neural network based prediction. <br> |
||
− | The prediction with the hidden Markov model uses three different scores: the S-score, which is the score for the signal peptide; the C-score, which is the score for the cleavage site; and the Y-score, which combines the S-score and the C-score and is used to predict the cleavage site because it is more precise than the C-score alone. |
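The way a Y-score can combine the other two scores is sketched below. This is a minimal illustration, assuming the commonly described definition in which the Y-score at a position is the geometric mean of the C-score and the drop in the mean S-score across that position. The window size <code>d</code>, the function name <code>y_scores</code> and the toy score values are assumptions made for illustration and are not taken from the SignalP output.

```python
import math


def y_scores(c, s, d=3):
    """Combine per-residue C- and S-scores into Y-scores (hypothetical helper).

    Assumed definition: Y_i = sqrt(C_i * max(0, mean(S before i) - mean(S from i on))),
    with a window of d residues on each side of position i.
    """
    y = []
    for i in range(len(c)):
        before = s[max(0, i - d):i]
        after = s[i:i + d]
        if not before or not after:
            y.append(0.0)  # no window on one side, so no slope can be estimated
            continue
        delta = sum(before) / len(before) - sum(after) / len(after)
        y.append(math.sqrt(c[i] * max(0.0, delta)))
    return y


# Toy scores: the S-score is high inside the "signal peptide" (indices 0-3) and
# low afterwards; the C-score has a small false peak at index 3 and a real one at 4.
s = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
c = [0.0, 0.1, 0.0, 0.2, 0.8, 0.1, 0.0, 0.0]

y = y_scores(c, s)
best = max(range(len(y)), key=y.__getitem__)
print(best)  # 0-based index of the first residue after the predicted cleavage site
```

Because the Y-score is only large where a high C-score coincides with a steep drop in the S-score, isolated C-score peaks inside or far behind the signal peptide are suppressed.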
||
− | <br><br> |
||
− | * HEXA_HUMAN |
||
− | <br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
|start position |
|start position |
||
|end position |
|end position |
||
|start position |
|start position |
||
|end position |
|end position |
||
− | |prediction |
||
− | |- |
||
− | |1 |
||
− | |22 |
||
− | |22 |
||
− | |23 |
||
− | |signal peptide |
||
− | |- |
||
− | |} |
||
− | |||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |signal peptide |
||
− | |1.000 |
||
− | |0.000 |
||
− | |22 |
||
− | |23 |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:hexa_human_signalp_nn.png|thumb|Result of the SignalP method based on the neural network]] |
||
− | | [[Image:hexa_human_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model]] |
||
− | |} |
||
− | |||
− | Both methods predict the same start and end position of the cleavage site, and both predict a signal peptide, which is correct because HEXA_HUMAN takes part in the secretory pathway. |
||
− | <br><br> |
||
− | *BACR_HALSA |
||
− | <br><br> |
||
− | BACR_HALSA is an archaeal protein. SignalP only offers prediction of eukaryotic or bacterial (gram-positive and gram-negative) signal peptides. Therefore, we decided to use all three available prediction methods and to compare the results with the real signal peptide. |
||
− | <br><br> |
||
− | '''eukaryotes'''<br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
|start position |
|start position |
||
|end position |
|end position |
||
+ | |location |
||
|start position |
|start position |
||
|end position |
|end position |
||
− | |prediction |
||
|- |
|- |
||
+ | |HEXA HUMAN |
||
|1 |
|1 |
||
− | | |
+ | |22 |
− | |38 |
||
− | |39 |
||
− | |signal peptide |
||
− | |- |
||
− | |} |
||
− | |||
− | |||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |signal peptide |
||
− | |0.017 |
||
− | |0.859 |
||
− | |15 |
||
− | |16 |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:bacr_halsa_eu_signalp_nn.png|thumb|Result of the SignalP method based on the neural network for BACR_HALSA with the prediction method for eukaryotes]] |
||
− | | [[Image:bacr_halsa_eu_human_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model for BACR_HALSA with the prediction method for eukaryotes]] |
||
− | |} |
||
− | <br><br> |
||
− | '''gram-negative bacteria'''<br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
− | |- |
||
|1 |
|1 |
||
− | | |
+ | |19 |
− | | |
+ | |7 |
− | | |
+ | |21 |
+ | |secretory pathway |
||
− | |no signal peptide |
||
− | | |
+ | |1 |
− | | |
+ | |22 |
− | |||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |Non-secretory protein |
||
− | |0.000 |
||
− | |0.000 |
||
− | | |
||
− | | |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:bacr_halsa_neg_signalp_nn.png|thumb|Result of the SignalP method based on the neural network for BACR_HALSA with the prediction method for gram-negative bacteria]] |
||
− | | [[Image:bacr_halsa_neg_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model for BACR_HALSA with the prediction method for gram-negative bacteria]] |
||
− | |} |
||
− | <br><br> |
||
− | '''gram-positive bacteria'''<br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
|- |
|- |
||
+ | |BACR HALSA |
||
+ | |colspan="6" | no prediction available |
||
+ | |secretory pathway |
||
|1 |
|1 |
||
− | | |
+ | |38 |
− | |33 |
||
− | |34 |
||
− | |no signal peptide |
||
− | |- |
||
− | |} |
||
− | |||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |Non-secretory protein |
||
− | |0.000 |
||
− | |0.000 |
||
− | | |
||
− | | |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:bacr_halsa_pos_signalp_nn.png|thumb|Result of the SignalP method based on the neural network for BACR_HALSA with the prediction method for gram-positive bacteria]] |
||
− | | [[Image:bacr_halsa_pos_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model for BACR_HALSA with the prediction method for gram-positive bacteria]] |
||
− | |} |
||
− | <br> |
||
− | Only the eukaryotic prediction method predicts a signal peptide, whereas both bacterial methods predict that this protein has no signal peptide. On the other hand, only the eukaryotic prediction method identifies the protein as a signal anchor, which is correct, because BACR_HALSA is a transmembrane protein. Therefore, it seems that the eukaryotic prediction method is better suited for BACR_HALSA. |
||
− | <br><br> |
||
− | *RET4_HUMAN |
||
− | <br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
|- |
|- |
||
+ | |RET4 HUMAN |
||
|1 |
|1 |
||
|18 |
|18 |
||
+ | |1 |
||
|18 |
|18 |
||
+ | |6 |
||
|19 |
|19 |
||
+ | |secretory pathway |
||
− | |signal peptide |
||
− | | |
+ | |1 |
− | |} |
||
− | |||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |signal peptide |
||
− | |1.000 |
||
− | |0.000 |
||
|18 |
|18 |
||
− | |19 |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:ret4_human_signalp_nn.png|thumb|Result of the SignalP method based on the neural network for RET4_HUMAN]] |
||
− | | [[Image:ret4_human_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model for RET4_HUMAN]] |
||
− | |} |
||
− | |||
− | Both methods predict a signal peptide for RET4_HUMAN, which is correct. |
||
− | <br><br> |
||
− | *INSL5_HUMAN |
||
− | <br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
|- |
|- |
||
+ | |INSL5 HUMAN |
||
|1 |
|1 |
||
|22 |
|22 |
||
+ | |1 |
||
|22 |
|22 |
||
+ | |6 |
||
|23 |
|23 |
||
+ | |secretory pathway |
||
− | |signal peptide |
||
− | | |
+ | |1 |
− | |} |
||
− | |||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |signal peptide |
||
− | |0.999 |
||
− | |0.000 |
||
|22 |
|22 |
||
− | |23 |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:insl5_human_signalp_nn.png|thumb|Result of the SignalP method based on the neural network for INSL5_HUMAN]] |
||
− | | [[Image:insl5_human_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model for INSL5_HUMAN]] |
||
− | |} |
||
− | |||
− | Both methods predict a signal peptide for INSL5_HUMAN, which is correct. |
||
− | <br><br> |
||
− | *LAMP1_HUMAN |
||
− | <br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
|- |
|- |
||
+ | |LAMP1 HUMAN |
||
|1 |
|1 |
||
|28 |
|28 |
||
+ | |1 |
||
|28 |
|28 |
||
+ | |12 |
||
|29 |
|29 |
||
+ | |secretory pathway |
||
− | |signal peptide |
||
− | | |
+ | |1 |
− | |} |
||
− | |||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |signal peptide |
||
− | |1.000 |
||
− | |0.000 |
||
|28 |
|28 |
||
− | |29 |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:lamp1_human_signalp_nn.png|thumb|Result of the SignalP method based on the neural network for LAMP1_HUMAN]] |
||
− | | [[Image:lamp1_human_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model for LAMP1_HUMAN]] |
||
− | |} |
||
− | |||
− | Both methods predict a signal peptide for LAMP1_HUMAN, which is not correct, because LAMP1_HUMAN is a transmembrane protein. |
||
− | <br><br> |
||
− | *A4_HUMAN |
||
− | <br><br> |
||
− | '''Result of the neural network''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="2" | Signal peptide |
||
− | |colspan="2" | Cleavage site |
||
− | |colspan="1" | |
||
− | |- |
||
− | |start position |
||
− | |end position |
||
− | |start position |
||
− | |end position |
||
− | |prediction |
||
|- |
|- |
||
+ | |A4 HUMAN |
||
|1 |
|1 |
||
|17 |
|17 |
||
+ | |1 |
||
|17 |
|17 |
||
+ | |5 |
||
|18 |
|18 |
||
+ | |secretory pathway |
||
− | |signal peptide |
||
+ | |1 |
||
+ | |15 |
||
|- |
|- |
||
|} |
|} |
||
+ | <br> |
||
+ | The table above lists the signal peptide predictions produced by the different methods. At first glance, every method that returns a result predicts a signal peptide, although the predicted stop positions differ. Phobius, PolyPhobius and SPOCTOPUS returned no signal peptide prediction for BACR_HALSA. Furthermore, TargetP does not predict the position of the signal peptide; it predicts only the location of the protein.<br><br> |
||
− | '''Result of the hidden Markov model''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |prediction |
||
− | |signal peptide probability |
||
− | |signal anchor probability |
||
− | |cleavage site start |
||
− | |cleavage site end |
||
− | |- |
||
− | |signal peptide |
||
− | |1.000 |
||
− | |0.000 |
||
− | |17 |
||
− | |18 |
||
− | |- |
||
− | |} |
||
− | |||
− | {| |
||
− | | [[Image:a4_human_signalp_nn.png|thumb|Result of the SignalP method based on the neural network for A4_HUMAN]] |
||
− | | [[Image:a4_human_signalp_hmm.png|thumb|Result of the SignalP method based on the hidden Markov model for A4_HUMAN]] |
||
− | |} |
||
− | |||
− | Both methods predict a signal peptide for A4_HUMAN, which is not correct, because A4_HUMAN is a transmembrane protein. |
||
− | <br><br> |
||
− | <br><br> |
||
=== Comparison of the different methods === |
=== Comparison of the different methods === |
||
<br><br> |
<br><br> |
||
Line 2,603: | Line 1,159: | ||
|} |
|} |
||
− | TMHMM is the |
+ | TMHMM is the worst prediction method. This is also visible in the example of BACR_HALSA, where TMHMM is the only method that does not recognize the 7 transmembrane helices. |
SPOCTOPUS and PolyPhobius are the best prediction methods.<br><br> |
SPOCTOPUS and PolyPhobius are the best prediction methods.<br><br> |
||
− | In general the prediction of transmembrane helices works quite well and almost all predictions are very close to the real protein. |
+ | In general, the prediction of transmembrane helices works quite well, and almost all predictions are very close to the real protein. |
<br><br> |
<br><br> |
||
* Comparison of signal peptide prediction |
* Comparison of signal peptide prediction |
||
<br><br> |
<br><br> |
||
− | Now we compared TargetP and SignalP which |
+ | Next we compared TargetP and SignalP, which predict only signal peptides. Furthermore, we compared SPOCTOPUS, Phobius and PolyPhobius. |
TargetP does not predict the start and end position of the signal peptide, instead it predicts only the location of the protein. |
TargetP does not predict the start and end position of the signal peptide, instead it predicts only the location of the protein. |
||
Line 2,808: | Line 1,364: | ||
In contrast, TargetP only predicts the location of the protein, not the start and stop position of the signal peptide. Only Phobius and PolyPhobius predict both.<br> |
In contrast, TargetP only predicts the location of the protein, not the start and stop position of the signal peptide. Only Phobius and PolyPhobius predict both.<br> |
||
Therefore, it is difficult to compare the different methods. Phobius and PolyPhobius are more versatile than the other prediction methods, because they predict both the location and the position. On average they predict the location and the position as well as the other methods do. None of the methods recognized the transmembrane proteins as such; all of them assigned these proteins to the secretory pathway. It is therefore useful to use Phobius or PolyPhobius, because they predict more than the other methods. Furthermore, both methods can also predict transmembrane helices. |
Therefore, it is difficult to compare the different methods. Phobius and PolyPhobius are more versatile than the other prediction methods, because they predict both the location and the position. On average they predict the location and the position as well as the other methods do. None of the methods recognized the transmembrane proteins as such; all of them assigned these proteins to the secretory pathway. It is therefore useful to use Phobius or PolyPhobius, because they predict more than the other methods. Furthermore, both methods can also predict transmembrane helices. |
||
− | The results of Phobius were a |
+ | The results of Phobius were slightly better than the results of PolyPhobius.<br> |
− | We also wanted to mention, that SignalP gave you the possibility to choose between the prediction for eukaryotes, gram-positive bacteria and gram-negative bacteria. In our analyse we also |
+ | We also want to mention that SignalP lets the user choose between prediction models for eukaryotes, gram-positive bacteria and gram-negative bacteria. Our analysis also included BACR_HALSA, which is an archaeal protein. We tested all three models on this protein and all three failed: BACR_HALSA does not possess a signal peptide, but every model predicts one. Only the eukaryotic model recognized a signal anchor for BACR_HALSA, whereas the other two could not give a prediction of the location.<br><br> |
<br><br> |
<br><br> |
||
* Comparison of the combined methods |
* Comparison of the combined methods |
||
<br><br> |
<br><br> |
||
− | The last |
+ | Finally, we compared the combined methods. SPOCTOPUS, Phobius and PolyPhobius can predict transmembrane helices as well as signal peptides. Therefore we combined the two previous comparisons. |
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
Line 2,824: | Line 1,380: | ||
|SPOCTOPUS |
|SPOCTOPUS |
||
|- |
|- |
||
− | |rowspan="3" | |
+ | |rowspan="3" | HEXA_HUMAN |
|#wrong predicted residues (TM) |
|#wrong predicted residues (TM) |
||
|0 |
|0 |
||
Line 2,842: | Line 1,398: | ||
!colspan="5" | |
!colspan="5" | |
||
|- |
|- |
||
− | |rowspan="3" | |
+ | |rowspan="3" | BACR_HALSA |
|#wrong predicted residues (TM) |
|#wrong predicted residues (TM) |
||
|29 |
|29 |
||
Line 2,930: | Line 1,486: | ||
|no prediction |
|no prediction |
||
|- |
|- |
||
− | !colspan="5" | |
+ | !colspan="5" | Average |
|- |
|- |
||
|rowspan="3" | |
|rowspan="3" | |
||
Line 2,950: | Line 1,506: | ||
|} |
|} |
||
− | In general, PolyPhobius gave the best results. Although it predicts the |
+ | In general, PolyPhobius gave the best results. Although it predicts the signal peptide stop position slightly worse than Phobius, its transmembrane prediction is considerably better than that of Phobius. The predictions of SPOCTOPUS are also good, but SPOCTOPUS does not predict the location of the protein.<br> |
Therefore, it seems a good choice to use PolyPhobius, which is on average the best method for transmembrane and signal peptide prediction.<br><br> |
Therefore, it seems a good choice to use PolyPhobius, which is on average the best method for transmembrane and signal peptide prediction.<br><br> |
||
+ | <br><br> |
||
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
||
− | == |
+ | ==== Signal Peptide ==== |
− | |||
− | Before starting our analysis, we checked the GO annotations for the six sequences: |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | | |
||
− | !colspan="2" | HEXA_HUMAN |
||
+ | |colspan="2" | Phobius |
||
+ | |colspan="2" | PolyPhobius |
||
+ | |colspan="2" | SPOCTOPUS |
||
+ | |colspan="1" | TargetP |
||
+ | |colspan="2" | SignalP |
||
|- |
|- |
||
+ | |protein |
||
− | |rowspan="14" | Process |
||
+ | |start position |
||
− | |skeletal system development |
||
+ | |end position |
||
+ | |start position |
||
+ | |end position |
||
+ | |start position |
||
+ | |end position |
||
+ | |location |
||
+ | |start position |
||
+ | |end position |
||
|- |
|- |
||
+ | |HEXA HUMAN |
||
− | |carbohydrate metabolic process |
||
+ | |1 |
||
+ | |22 |
||
+ | |1 |
||
+ | |19 |
||
+ | |7 |
||
+ | |21 |
||
+ | |secretory pathway |
||
+ | |1 |
||
+ | |22 |
||
|- |
|- |
||
+ | |BACR HALSA |
||
− | |ganglioside catabolic process |
||
+ | |colspan="6" | no prediction available |
||
+ | |secretory pathway |
||
+ | |1 |
||
+ | |38 |
||
|- |
|- |
||
+ | |RET4 HUMAN |
||
− | |lysosome organization |
||
+ | |1 |
||
+ | |18 |
||
+ | |1 |
||
+ | |18 |
||
+ | |6 |
||
+ | |19 |
||
+ | |secretory pathway |
||
+ | |1 |
||
+ | |18 |
||
|- |
|- |
||
+ | |INSL5 HUMAN |
||
− | |sensory perception of sound |
||
+ | |1 |
||
+ | |22 |
||
+ | |1 |
||
+ | |22 |
||
+ | |6 |
||
+ | |23 |
||
+ | |secretory pathway |
||
+ | |1 |
||
+ | |22 |
||
|- |
|- |
||
+ | |LAMP1 HUMAN |
||
− | |locomotory behavior |
||
+ | |1 |
||
+ | |28 |
||
+ | |1 |
||
+ | |28 |
||
+ | |12 |
||
+ | |29 |
||
+ | |secretory pathway |
||
+ | |1 |
||
+ | |28 |
||
|- |
|- |
||
+ | |A4 HUMAN |
||
− | |adult walking behavior |
||
− | | |
+ | |1 |
+ | |17 |
||
− | |lipid storage |
||
− | | |
+ | |1 |
+ | |17 |
||
− | |sexual reproduction |
||
− | | |
+ | |5 |
+ | |18 |
||
− | |glycosaminoglycan metabolic process |
||
+ | |secretory pathway |
||
− | |- |
||
+ | |1 |
||
− | |myelination |
||
− | | |
+ | |15 |
− | |cell morphogenesis involved in neuron differentiation |
||
− | |- |
||
− | |neuromuscular process controlling posture |
||
− | |- |
||
− | |neuromuscular process controlling balance |
||
− | |- |
||
− | |rowspan="8" |Function |
||
− | |catalytic activity |
||
− | |- |
||
− | |hydrolase activity, hydrolyzing O-glycosyl compounds |
||
− | |- |
||
− | |beta-N-acetylhexosaminidase activity |
||
− | |- |
||
− | |protein binding |
||
− | |- |
||
− | |hydrolase activity |
||
− | |- |
||
− | |hydrolase activity, acting on glycosyl bonds |
||
− | |- |
||
− | |cation binding |
||
− | |- |
||
− | |protein heterodimerization activity |
||
− | |- |
||
− | |rowspan="2" |Component |
||
− | |lysosome |
||
− | |- |
||
− | |membrane |
||
− | |- |
||
− | !colspan="2" | BACR_HALSA |
||
− | |- |
||
− | |rowspan="6" | Process |
||
− | |transport |
||
− | |- |
||
− | |ion transport |
||
− | |- |
||
− | |phototransduction |
||
− | |- |
||
− | |proton transport |
||
− | |- |
||
− | |protein-chromophore linkage |
||
− | |- |
||
− | |response to stimulus |
||
− | |- |
||
− | |rowspan="3" | Function |
||
− | |receptor activity |
||
− | |- |
||
− | |ion channel activity |
||
− | |- |
||
− | |photoreceptor activity |
||
− | |- |
||
− | |rowspan="3" | Component |
||
− | |plasma membrane |
||
− | |- |
||
− | |membrane |
||
− | |- |
||
− | |integral to membrane |
||
− | |- |
||
− | !colspan="2" | RET4_HUMAN |
||
− | |- |
||
− | |rowspan="30"| Process |
||
− | |eye development |
||
− | |- |
||
− | |gluconeogenesis |
||
− | |- |
||
− | |transport |
||
− | |- |
||
− | |spermatogenesis |
||
− | |- |
||
− | |heart development |
||
− | |- |
||
− | |visual perception |
||
− | |- |
||
− | |male gonad development |
||
− | |- |
||
− | |embryo development |
||
− | |- |
||
− | |maintenance of gastrointestinal epithelium |
||
− | |- |
||
− | |lung development |
||
− | |- |
||
− | |positive regulation of insulin secretion |
||
− | |- |
||
− | |response to retinoic acid |
||
− | |- |
||
− | |response to insulin stimulus |
||
− | |- |
||
− | |retinol transport |
||
− | |- |
||
− | |retinol metabolic process |
||
− | |- |
||
− | |glucose homeostasis |
||
− | |- |
||
− | |response to ethanol |
||
− | |- |
||
− | |embryonic organ morphogenesis |
||
− | |- |
||
− | |embryonic skeletal system development |
||
− | |- |
||
− | |cardiac muscle tissue development |
||
− | |- |
||
− | |female genitalia morphogenesis |
||
− | |- |
||
− | |detection of light stimulus involved in visual perception |
||
− | |- |
||
− | |positive regulation of immunoglobulin secretion |
||
− | |- |
||
− | |retina development in camera-type eye |
||
− | |- |
||
− | |negative regulation of cardiac muscle cell proliferation |
||
− | |- |
||
− | |embryonic retina morphogenesis in camera-type eye |
||
− | |- |
||
− | |uterus development |
||
− | |- |
||
− | |vagina development |
||
− | |- |
||
− | |urinary bladder development |
||
− | |- |
||
− | |heart trabecula formation |
||
− | |- |
||
− | |rowspan="7" | Function |
||
− | |transporter activity |
||
− | |- |
||
− | |binding |
||
− | |- |
||
− | |retinoid binding |
||
− | |- |
||
− | |protein binding |
||
− | |- |
||
− | |retinal binding |
||
− | |- |
||
− | |retinol binding |
||
− | |- |
||
− | |retinol transporter activity |
||
− | |- |
||
− | |rowspan="2" | Component |
||
− | |extracellular region |
||
− | |- |
||
− | |extracellular space |
||
− | |- |
||
− | !colspan="2" | INSL5_HUMAN |
||
− | |- |
||
− | |Process |
||
− | |biological_process |
||
− | |- |
||
− | |Function |
||
− | |hormone activity |
||
− | |- |
||
− | |rowspan="2" | Component |
||
− | |cellular_component |
||
− | |- |
||
− | |extracellular region |
||
− | |- |
||
− | !colspan="2" | LAMP1_HUMAN |
||
− | |- |
||
− | |Process |
||
− | |autophagy |
||
− | |- |
||
− | |rowspan="16" | Component |
||
− | |membrane fraction |
||
− | |- |
||
− | |lysosome |
||
− | |- |
||
− | |lysosomal membrane |
||
− | |- |
||
− | |endosome |
||
− | |- |
||
− | |late endosome |
||
− | |- |
||
− | |multivesicular body |
||
− | |- |
||
− | |plasma membrane |
||
− | |- |
||
− | |integral to plasma membrane |
||
− | |- |
||
− | |external side of plasma membrane |
||
− | |- |
||
− | |cell surface |
||
− | |- |
||
− | |endosome membrane |
||
− | |- |
||
− | |membrane |
||
− | |- |
||
− | |integral to membrane |
||
− | |- |
||
− | |vesicle |
||
− | |- |
||
− | |sarcolemma |
||
− | |- |
||
− | |melanosome |
||
− | |- |
||
− | !colspan="2" | A4_HUMAN |
||
− | |- |
||
− | |rowspan="42" | Process |
||
− | |G2 phase of mitotic cell cycle |
||
− | |- |
||
− | |suckling behavior |
||
− | |- |
||
− | |platelet degranulation |
||
− | |- |
||
− | |mRNA polyadenylation |
||
− | |- |
||
− | |regulation of translation |
||
− | |- |
||
− | |protein phosphorylation |
||
− | |- |
||
− | |cellular copper ion homeostasis |
||
− | |- |
||
− | |endocytosis |
||
− | |- |
||
− | |apoptosis |
||
− | |- |
||
− | |induction of apoptosis |
||
− | |- |
||
− | |cell adhesion |
||
− | |- |
||
− | |regulation of epidermal growth factor receptor activity |
||
− | |- |
||
− | |Notch signaling pathway |
||
− | |- |
||
− | |axonogenesis |
||
− | |- |
||
− | |blood coagulation |
||
− | |- |
||
− | |mating behavior |
||
− | |- |
||
− | |locomotory behavior |
||
− | |- |
||
− | |axon cargo transport |
||
− | |- |
||
− | |cell death |
||
− | |- |
||
− | |adult locomotory behavior |
||
− | |- |
||
− | |visual learning |
||
− | |- |
||
− | |negative regulation of peptidase activity |
||
− | |- |
||
− | |positive regulation of peptidase activity |
||
− | |- |
||
− | |axon midline choice point recognition |
||
− | |- |
||
− | |neuron remodeling |
||
− | |- |
||
− | |dendrite development |
||
− | |- |
||
− | |platelet activation |
||
− | |- |
||
− | |extracellular matrix organization |
||
− | |- |
||
− | |forebrain development |
||
− | |- |
||
− | |neuron projection development |
||
− | |- |
||
− | |ionotropic glutamate receptor signaling pathway |
||
− | |- |
||
− | |regulation of multicellular organism growth |
||
− | |- |
||
− | |innate immune response |
||
− | |- |
||
− | |negative regulation of neuron differentiation |
||
− | |- |
||
− | |positive regulation of mitotic cell cycle |
||
− | |- |
||
− | |positive regulation of transcription from RNA polymerase II promoter |
||
− | |- |
||
− | |collateral sprouting in absence of injury |
||
− | |- |
||
− | |regulation of synapse structure and activity |
||
− | |- |
||
− | |neuromuscular process controlling balance |
||
− | |- |
||
− | |synaptic growth at neuromuscular junction |
||
− | |- |
||
− | |neuron apoptosis |
||
− | |- |
||
− | |smooth endoplasmic reticulum calcium ion homeostasis |
||
− | |- |
||
− | |rowspan="11" | Function |
||
− | |DNA binding |
||
− | |- |
||
− | |serine-type endopeptidase inhibitor activity |
||
− | |- |
||
− | |receptor binding |
||
− | |- |
||
− | |binding |
||
− | |- |
||
− | |protein binding |
||
− | |- |
||
− | |peptidase activator activity |
||
− | |- |
||
− | |peptidase inhibitor activity |
||
− | |- |
||
− | |acetylcholine receptor binding |
||
− | |- |
||
− | |identical protein binding |
||
− | |- |
||
− | |metal ion binding |
||
− | |- |
||
− | |PTB domain binding |
||
− | |- |
||
− | |rowspan="24" | Component |
||
− | |extracellular region |
||
− | |- |
||
− | |membrane fraction |
||
− | |- |
||
− | |cytoplasm |
||
− | |- |
||
− | |Golgi apparatus |
||
− | |- |
||
− | |plasma membrane |
||
− | |- |
||
− | |integral to plasma membrane |
||
− | |- |
||
− | |coated pit |
||
− | |- |
||
− | |cell surface |
||
− | |- |
||
− | |membrane |
||
− | |- |
||
− | |integral to membrane |
||
− | |- |
||
− | |synaptosome |
||
− | |- |
||
− | |axon |
||
− | |- |
||
− | |platelet alpha granule lumen |
||
− | |- |
||
− | |cytoplasmic vesicle |
||
− | |- |
||
− | |neuromuscular junction |
||
− | |- |
||
− | |ciliary rootlet |
||
− | |- |
||
− | |neuron projection |
||
− | |- |
||
− | |dendritic spine |
||
− | |- |
||
− | |dendritic shaft |
||
− | |- |
||
− | |intracellular membrane-bounded organelle |
||
− | |- |
||
− | |apical part of cell |
||
− | |- |
||
− | |synapse |
||
− | |- |
||
− | |perinuclear region of cytoplasm |
||
− | |- |
||
− | |spindle midzone |
||
|- |
|- |
||
|} |
|} |
||
<br> |
<br> |
||
+ | The table above lists the signal peptide predictions produced by the different methods.<br><br> |
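How far the predicted end positions lie apart can be summarised with a few lines of Python. This is only a minimal sketch: the end positions are copied from the table above, while the dictionary layout and the function name `end_position_spread` are assumptions made for illustration and are not part of any of the prediction tools.

```python
# End positions of the predicted signal peptide, copied from the table above.
# None marks "no prediction available"; TargetP is left out because it
# reports only a location, not a position.
end_positions = {
    "HEXA_HUMAN": {"Phobius": 22, "PolyPhobius": 19, "SPOCTOPUS": 21, "SignalP": 22},
    "BACR_HALSA": {"Phobius": None, "PolyPhobius": None, "SPOCTOPUS": None, "SignalP": 38},
    "RET4_HUMAN": {"Phobius": 18, "PolyPhobius": 18, "SPOCTOPUS": 19, "SignalP": 18},
    "INSL5_HUMAN": {"Phobius": 22, "PolyPhobius": 22, "SPOCTOPUS": 23, "SignalP": 22},
    "LAMP1_HUMAN": {"Phobius": 28, "PolyPhobius": 28, "SPOCTOPUS": 29, "SignalP": 28},
    "A4_HUMAN": {"Phobius": 17, "PolyPhobius": 17, "SPOCTOPUS": 18, "SignalP": 15},
}


def end_position_spread(predictions):
    """Return max - min of the available end positions (hypothetical helper)."""
    values = [v for v in predictions.values() if v is not None]
    return max(values) - min(values)


# Spread of the predicted end positions per protein.
spread = {protein: end_position_spread(p) for protein, p in end_positions.items()}
```

A spread of 0 for BACR_HALSA only reflects that a single method returned a position there; it does not indicate agreement between methods.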
||
− | A detailed list of the GO annotation terms of each protein can be found [[https://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Go_annotations_here here]]. |
||
− | ===GOPET=== |
||
+ | === Comparison of the different methods === |
||
− | We tried to predict the GO annotations with GOPET for our six different proteins. |
||
<br><br> |
<br><br> |
||
+ | We decided to split the comparison of the methods, because it is unfair to directly compare a method that cannot predict signal peptides with one that can. We therefore made one comparison for transmembrane helices, one for signal peptides and one for the combination of both. |
||
− | * '''HEXA_HUMAN''' |
||
<br><br> |
<br><br> |
||
+ | * Comparison of transmembrane helix prediction |
||
− | [[Image:hexa_human_gopet.png|center|Result of the GOPET prediction for HEXA_HUMAN]] |
||
+ | <br><br> |
||
− | |||
+ | Here we compared TMHMM, OCTOPUS and the transmembrane predictions of SPOCTOPUS, Phobius and PolyPhobius. In this comparison we skipped the leading residues that form the signal peptide, because all transmembrane-only prediction methods predicted this region as a transmembrane helix, which is wrong. |
||
− | The method only predicts functional GO terms. HEXA_HUMAN has 8 annotated GO functions, and the method also predicts 8 GO function terms. We therefore checked whether all predictions are correct, both the general term and the GO number. |
||
+ | <br> |
||
+ | For this comparison we counted the wrongly predicted transmembrane residues, the wrongly predicted outside residues and the wrongly predicted inside residues. |
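The counting described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure actually used for the tables: it assumes that prediction and reference are available as per-residue strings of equal length with the labels `M` (membrane), `o` (outside) and `i` (inside), and that an error is attributed to the predicted label. The function name `count_wrong_residues` and the `skip` parameter (number of leading signal peptide residues to ignore) are hypothetical.

```python
def count_wrong_residues(predicted, reference, skip=0):
    """Count mispredicted residues, grouped by the predicted label.

    predicted, reference: per-residue topology strings of equal length,
        using 'M' (membrane), 'o' (outside) and 'i' (inside).
    skip: number of leading residues (e.g. the signal peptide) to ignore.
    """
    if len(predicted) != len(reference):
        raise ValueError("prediction and reference must have the same length")
    wrong = {"M": 0, "o": 0, "i": 0}
    for pred, ref in zip(predicted[skip:], reference[skip:]):
        if pred != ref:
            wrong[pred] = wrong.get(pred, 0) + 1
    return wrong


# Toy example (invented strings, not one of the six proteins).
reference = "oooMMMMiii"
predicted = "ooMMMMMioo"
wrong = count_wrong_residues(predicted, reference)
wrong_sum = sum(wrong.values())
```

The sum over the three classes corresponds to the "#wrong sum" row of the table below.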
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |rowspan="2" | |
||
− | |GO term |
||
+ | |rowspan="2" | |
||
− | |confidence |
||
+ | |colspan="5" | methods |
||
− | |prediction term |
||
+ | |rowspan="1" | |
||
− | |prediction GOid |
||
|- |
|- |
||
+ | |TMHMM |
||
− | |hexosaminidase activity |
||
+ | |Phobius |
||
− | |97% |
||
+ | |PolyPhobius |
||
− | |right |
||
+ | |OCTOPUS |
||
− | |wrong |
||
+ | |SPOCTOPUS |
||
+ | |Transmembrane protein |
||
|- |
|- |
||
+ | |rowspan="5" | HEXA_HUMAN |
||
− | |beta-N-acetylhexosaminidase activity |
||
+ | |#wrong transmembrane |
||
− | |96% |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |0 |
||
+ | |0 |
||
+ | |rowspan="5" | no |
||
|- |
|- |
||
+ | |#wrong outside |
||
− | |hydrolase activity |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong inside |
||
− | |hydrolase activity acting on glycosyl bonds |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong sum |
||
− | |hydrolase activity hydrolyzing O-glycosyl compounds |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |%wrong predicted |
||
− | |catalytic activity |
||
− | | |
+ | |0% |
+ | |0% |
||
− | |right |
||
+ | |0% |
||
− | |right |
||
+ | |0% |
||
+ | |0% |
||
+ | |- |
||
+ | !colspan="8" | |
||
|- |
|- |
||
+ | |rowspan="5" | BACR_HALSA |
||
− | |hydrolase activity hydrolyzing N-glycosyl compounds |
||
+ | |#wrong transmembrane |
||
− | |78% |
||
+ | |24 |
||
− | |wrong |
||
+ | |20 |
||
− | |wrong |
||
+ | |12 |
||
+ | |16 |
||
+ | |11 |
||
+ | |rowspan="5" | yes (7 transmembrane helices) |
||
|- |
|- |
||
+ | |#wrong outside |
||
− | |protein heterodimerization activity |
||
− | | |
+ | |46 |
+ | |5 |
||
− | |right |
||
+ | |3 |
||
− | |right |
||
+ | |4 |
||
+ | |6 |
||
|- |
|- |
||
+ | |#wrong inside |
||
− | |} |
||
+ | |4 |
||
− | <br><br> |
||
+ | |4 |
||
− | * '''BACR_HALSA''' |
||
+ | |2 |
||
− | <br><br> |
||
+ | |0 |
||
− | [[Image:bacr_halsa_gopet.png|center|Result of the GOPET prediction for BACR_HALSA]] |
||
+ | |0 |
||
− | |||
− | The method only predicts functional GO terms. BACR_HALSA has 3 annotated GO functions, and the method also predicts 3 GO function terms. We therefore checked whether all predictions are correct. |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |GO term |
||
− | |confidence |
||
− | |prediction term |
||
− | |prediction GOid |
||
|- |
|- |
||
+ | |#wrong sum |
||
− | |ion channel activity |
||
− | | |
+ | |74 |
+ | |29 |
||
− | |right |
||
+ | |17 |
||
− | |right |
||
+ | |20 |
||
+ | |17 |
||
|- |
|- |
||
+ | |%wrong predicted |
||
− | |G-protein coupled photoreceptor activity |
||
− | | |
+ | |29% |
+ | |11% |
||
− | |right |
||
+ | |6% |
||
− | |wrong |
||
+ | |8% |
||
+ | |6% |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |hydrogen ion transmembrane transporter activity |
||
− | |60% |
||
− | |wrong |
||
− | |wrong |
||
|- |
|- |
||
+ | |rowspan="5" | RET4_HUMAN |
||
− | |} |
||
+ | |#wrong transmembrane |
||
− | <br><br> |
||
+ | |0 |
||
− | * '''RET4_HUMAN''' |
||
+ | |0 |
||
− | <br><br> |
||
+ | |0 |
||
− | [[Image:ret4_human_gopet.png|center|Result of the GOPET prediction for RET4_HUMAN]] |
||
+ | |5 |
||
− | |||
+ | |0 |
||
− | The method only predicts functional GO terms. RET4_HUMAN has 7 annotated GO functions, whereas the method predicts 8 GO function terms. We therefore checked whether all predictions are correct. |
||
+ | |rowspan="5" | no |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |GO term |
||
− | |confidence |
||
− | |prediction term |
||
− | |prediction GOid |
||
|- |
|- |
||
+ | |#wrong outside |
||
− | |binding |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong inside |
||
− | |retinoid binding |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong sum |
||
− | |lipid binding |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |wrong |
||
+ | |0 |
||
− | |wrong |
||
+ | |5 |
||
+ | |0 |
||
|- |
|- |
||
+ | |%wrong predicted |
||
− | |retinol binding |
||
− | | |
+ | |0% |
+ | |0% |
||
− | |right |
||
+ | |0% |
||
− | |right |
||
+ | |2% |
||
+ | |0% |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |transporter activity |
||
− | |78% |
||
− | |right |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="5" | INSL5_HUMAN |
||
− | |retinal binding |
||
+ | |#wrong transmembrane |
||
− | |78% |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |10 |
||
+ | |0 |
||
+ | |rowspan="5" | no |
||
|- |
|- |
||
+ | |#wrong outside |
||
− | |lipid transport activity |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |wrong |
||
+ | |0 |
||
− | |wrong |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong inside |
||
− | |high-density lipoprotein particle binding |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |wrong |
||
+ | |0 |
||
− | |wrong |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong sum |
||
− | |} |
||
+ | |0 |
||
− | <br><br> |
||
+ | |0 |
||
− | *''' INSL5_HUMAN''' |
||
+ | |0 |
||
− | <br><br> |
||
+ | |10 |
||
− | [[Image:insl5_human_gopet.png|center|Result of the GOPET prediction for INSL5_HUMAN]] |
||
+ | |0 |
||
− | |||
− | The method only predicts functional GO terms. INSL5_HUMAN has 1 annotated GO function, and the method also predicts 1 GO function term. We therefore checked whether the prediction is correct. |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |GO term |
||
− | |confidence |
||
− | |prediction term |
||
− | |prediction GOid |
||
|- |
|- |
||
+ | |%wrong predicted |
||
− | |hormone activity |
||
− | | |
+ | |0% |
+ | |0% |
||
− | |right |
||
+ | |0% |
||
− | |right |
||
+ | |8% |
||
+ | |0% |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |} |
||
− | <br><br> |
||
− | * '''LAMP1_HUMAN''' |
||
− | <br><br> |
||
− | [[Image:lamp1_human_gopet.png|center|Result of the GOPET prediction for LAMP1_HUMAN]] |
||
− | |||
− | The method only predicts functional GO terms. LAMP1_HUMAN has 0 annotated GO functions, but the method predicts 2 GO function terms. Therefore the predictions are wrong. |
||
− | <br><br> |
||
− | * '''A4_HUMAN''' |
||
− | <br><br> |
||
− | [[Image:a4_human_gopet.png|center|Result of the GOPET prediction for A4_HUMAN]] |
||
− | |||
− | The method only predicts functional GO terms. A4_HUMAN has 11 annotated GO functions, whereas the method predicts 13 GO function terms. We therefore checked whether all predictions are correct. |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |GO term |
||
− | |confidence |
||
− | |prediction term |
||
− | |prediction GOid |
||
|- |
|- |
||
+ | |rowspan="5" | LAMP1_HUMAN |
||
− | |endopeptidase inhibitor activity |
||
+ | |#wrong transmembrane |
||
− | |87% |
||
+ | |5 |
||
− | |right |
||
+ | |3 |
||
− | |wrong |
||
+ | |4 |
||
+ | |3 |
||
+ | |1 |
||
+ | |rowspan="5" | yes (single-spanning) |
||
|- |
|- |
||
+ | |#wrong outside |
||
− | |serine-type endopeptidase inhibitor activity |
||
− | | |
+ | |2 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |1 |
||
+ | |1 |
||
|- |
|- |
||
+ | |#wrong inside |
||
− | |plasmin inhibitor activity |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |wrong |
||
+ | |0 |
||
− | |wrong |
||
+ | |1 |
||
+ | |1 |
||
|- |
|- |
||
+ | |#wrong sum |
||
− | |trypsin inhibitor activity |
||
− | | |
+ | |7 |
+ | |3 |
||
− | |wrong |
||
+ | |4 |
||
− | |wrong |
||
+ | |5 |
||
+ | |3 |
||
|- |
|- |
||
+ | |%wrong predicted |
||
− | |peptidase inhibitor activity |
||
− | | |
+ | |2% |
+ | |0% |
||
− | |right |
||
+ | |1% |
||
− | |right |
||
+ | |1% |
||
+ | |0% |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |binding |
||
− | |79% |
||
− | |right |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="5" | A4_HUMAN |
||
− | |protein binding |
||
+ | |#wrong transmembrane |
||
− | |74% |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
+ | |0 |
||
+ | |0 |
||
+ | |rowspan="5" | yes (single-spanning) |
||
|- |
|- |
||
+ | |#wrong outside |
||
− | |metal ion binding |
||
− | | |
+ | |1 |
+ | |1 |
||
− | |right |
||
+ | |1 |
||
− | |right |
||
+ | |1 |
||
+ | |2 |
||
|- |
|- |
||
+ | |#wrong inside |
||
− | |DNA binding |
||
− | | |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
− | |right |
||
+ | |1 |
||
+ | |1 |
||
|- |
|- |
||
+ | |#wrong sum |
||
− | |heparin binding |
||
− | | |
+ | |1 |
+ | |1 |
||
− | |wrong |
||
+ | |1 |
||
− | |right |
||
+ | |2 |
||
+ | |3 |
||
|- |
|- |
||
+ | |%wrong predicted |
||
− | |zinc ion binding |
||
− | | |
+ | |0% |
+ | |0% |
||
− | |wrong |
||
+ | |0% |
||
− | |wrong |
||
+ | |0% |
||
+ | |0% |
||
|- |
|- |
||
+ | !colspan="8" | Average number of wrong predicted residues |
||
− | |copper ion binding |
||
− | |69% |
||
− | |wrong |
||
− | |wrong |
||
− | |- |
||
− | |iron ion binding |
||
− | |67% |
||
− | |wrong |
||
− | |wrong |
||
|- |
|- |
||
+ | | |
||
+ | | |
||
+ | |13.6 |
||
+ | |5.5 |
||
+ | |3.6 |
||
+ | |7 |
||
+ | |3.8 |
||
+ | | |
||
|} |
|} |
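The averages in the last row of the table can be recomputed from the "#wrong sum" rows. The following Python sketch uses only numbers copied from the table; the helper name `truncated_mean` is an assumption, and the choice to truncate (rather than round) to one decimal is inferred from the fact that the table shows 13.6 for TMHMM, where the exact mean is 82/6.

```python
# "#wrong sum" per method, in the order HEXA_HUMAN, BACR_HALSA, RET4_HUMAN,
# INSL5_HUMAN, LAMP1_HUMAN, A4_HUMAN (values copied from the table above).
wrong_sum = {
    "TMHMM": [0, 74, 0, 0, 7, 1],
    "Phobius": [0, 29, 0, 0, 3, 1],
    "PolyPhobius": [0, 17, 0, 0, 4, 1],
    "OCTOPUS": [0, 20, 5, 10, 5, 2],
    "SPOCTOPUS": [0, 17, 0, 0, 3, 3],
}


def truncated_mean(values, decimals=1):
    """Mean of integer counts, truncated (not rounded) to `decimals` places."""
    scale = 10 ** decimals
    # Integer arithmetic avoids floating-point surprises before truncation.
    return (sum(values) * scale // len(values)) / scale


averages = {method: truncated_mean(v) for method, v in wrong_sum.items()}
```

With rounding instead of truncation, TMHMM, PolyPhobius and SPOCTOPUS would each come out 0.1 higher, which would not change the ranking of the methods.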
||
− | <br><br> |
||
− | <br><br> |
||
− | |||
− | === Pfam === |
||
− | We used the Pfam web server for our analysis and decided to trust only the significant Pfam-A matches. To check whether the predictions are correct, we mapped the Pfam IDs to GO IDs with the help of a mapping website [[http://www.geneontology.org/external2go/pfam2go]]. If no successful mapping was possible, we compared the name of the predicted Pfam family with the names of the GO terms. If the names were similar or equal, we decided to trust the mapping. |
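The mapping step can be sketched in Python. This is a minimal sketch under assumptions: it assumes that each line of the pfam2go file has the form `Pfam:<accession> <name> > GO:<term name> ; GO:<id>` and that comment lines start with `!`. The accession `PF99999` and the family name `Example_family` in the sample are placeholders, not real Pfam entries, and the function name `parse_pfam2go` is hypothetical.

```python
def parse_pfam2go(lines):
    """Parse pfam2go-style lines into {pfam_accession: [(go_id, go_name), ...]}."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and comment lines
        left, _, right = line.partition(" > ")
        accession = left.split()[0].split(":", 1)[1]
        # GO term names may contain commas, so split at the last " ; " only.
        go_name, _, go_id = right.rpartition(" ; ")
        if go_name.startswith("GO:"):
            go_name = go_name[3:]
        mapping.setdefault(accession, []).append((go_id.strip(), go_name))
    return mapping


# Sample lines in the assumed format (placeholder accession and family name).
sample = [
    "! illustrative header line",
    "Pfam:PF99999 Example_family > GO:hydrolase activity, hydrolyzing O-glycosyl compounds ; GO:0004553",
    "Pfam:PF99999 Example_family > GO:carbohydrate metabolic process ; GO:0005975",
]
pfam2go = parse_pfam2go(sample)

# A prediction counts as "right" if a mapped GO id is among the annotated ones.
annotated = {"GO:0004553", "GO:0005975"}
mapped_ids = {go_id for go_id, _ in pfam2go["PF99999"]}
confirmed = mapped_ids & annotated
```

Families without an entry in the mapping would still need the manual name comparison described above.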
||
+ | TMHMM is the worst prediction method. This is also visible in the example of BACR_HALSA, where TMHMM is the only method that does not recognize the 7 transmembrane helices. |
||
+ | SPOCTOPUS and PolyPhobius are the best prediction methods.<br><br> |
||
+ | In general, the prediction of transmembrane helices works quite well, and almost all predictions are very close to the real protein. |
||
<br><br> |
<br><br> |
||
+ | * Comparison of signal peptide prediction |
||
− | * '''HEXA_HUMAN''' |
||
− | |||
− | Graphical representation of the prediction result of Pfam: |
||
− | [[Image:hexa_human_pfam.png|center|Result of the Pfam prediction for HEXA_HUMAN]] |
||
− | |||
− | Pfam found two significant Pfam-A matches: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Family |
||
− | |E-Value |
||
− | |GO id |
||
− | |prediction |
||
− | |- |
||
− | |Glycosyl hydrolase family 20, domain 2 |
||
− | |3.7e-43 |
||
− | |GO:0004553 |
||
− | |right |
||
− | |- |
||
− | |Glycosyl hydrolase family 20, catalytic domain |
||
− | |1.8e-84 |
||
− | |GO:0005975 |
||
− | |right |
||
− | |- |
||
− | |} |
||
<br><br> |
<br><br> |
||
+ | Next we compared TargetP and SignalP, which can only predict signal peptides. We also included SPOCTOPUS, Phobius and PolyPhobius in this comparison. |
||
− | * '''BACR_HALSA''' |
||
+ | TargetP does not predict the start and end position of the signal peptide; instead, it predicts only the location of the protein. |
||
− | <br><br> |
||
− | Graphical representation of the prediction result of Pfam: |
||
− | [[Image:bacr_halsa_pfam.png|center|Result of the Pfam prediction for BACR_HALSA]] |
||
− | Pfam found one significant Pfam-A match: |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |rowspan="2" | |
||
− | |Family |
||
+ | |rowspan="2" | |
||
− | |E-Value |
||
+ | |colspan="6" | methods |
||
− | |GOid |
||
− | |prediction |
||
|- |
|- |
||
+ | |real position |
||
− | |rowspan="3" | Bacteriorhodopsin-like protein |
||
+ | |Phobius |
||
− | |rowspan="3" | 2e-88 |
||
+ | |PolyPhobius |
||
− | |GO:0005216 |
||
+ | |SPOCTOPUS |
||
− | |right |
||
+ | |TargetP |
||
+ | |SignalP |
||
|- |
|- |
||
+ | |rowspan="3" | HEXA_HUMAN |
||
− | |GO:0006811 |
||
+ | |stop position |
||
− | |right |
||
+ | |22 |
||
+ | |22 |
||
+ | |19 |
||
+ | |21 |
||
+ | |no prediction |
||
+ | |22 |
||
|- |
|- |
||
+ | |#wrong residues |
||
− | |GO:0016020 |
||
+ | | |
||
− | |right |
||
+ | |0 |
||
+ | |3 |
||
+ | |3 |
||
+ | |no prediction |
||
+ | |0 |
||
|- |
|- |
||
+ | |location |
||
− | |} |
||
+ | |secretory pathway |
||
− | <br><br> |
||
+ | |secretory pathway |
||
− | * '''RET4_HUMAN''' |
||
+ | |secretory pathway |
||
− | <br><br> |
||
+ | |no prediction |
||
− | Graphical representation of the prediction result of Pfam: |
||
+ | |secretory pathway |
||
− | [[Image:ret4_human_pfam.png|center|Result of the Pfam prediction for RET4_HUMAN]] |
||
+ | |no prediction |
||
− | |||
− | Pfam found one significant Pfam-A match: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Family |
||
− | |E-Value |
||
− | |GOid |
||
− | |prediction |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |Lipocalin/cytosolic fatty-acid binding protein family |
||
− | |1.7e-22 |
||
− | |GO:0005488 |
||
− | |right |
||
− | |} |
||
− | <br><br> |
||
− | * '''INSL5_HUMAN''' |
||
− | <br><br> |
||
− | Graphical representation of the prediction result of Pfam: |
||
− | [[Image:insl5_human_pfam.png|center|Result of the Pfam prediction for INSL5_HUMAN]] |
||
− | |||
− | Pfam found two significant Pfam-A matches: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Family |
||
− | |E-Value |
||
− | |GOid |
||
− | |prediction |
||
|- |
|- |
||
− | |rowspan=" |
+ | |rowspan="3" | BACR_HALSA |
+ | |stop position |
||
− | |rowspan="2" | 6.7e-08 |
||
+ | |not available |
||
− | |GO:0005179 |
||
+ | |no prediction |
||
− | |right |
||
+ | |no prediction |
||
+ | |no prediction |
||
+ | |no prediction |
||
+ | |no consensus prediction |
||
|- |
|- |
||
+ | |#wrong residues |
||
− | |GO:0005576 |
||
+ | |not available |
||
− | |right |
||
+ | |not available |
||
+ | |not available |
||
+ | |not available |
||
+ | |no prediction |
||
+ | |not available |
||
|- |
|- |
||
+ | |location |
||
− | |} |
||
+ | |membrane |
||
− | <br><br> |
||
+ | |not available |
||
− | * '''LAMP1_HUMAN''' |
||
+ | |not available |
||
− | <br><br> |
||
+ | |not available |
||
− | Graphical representation of the prediction result of Pfam: |
||
+ | |secretory pathway |
||
− | [[Image:lamp1_human_pfam2.png|center|Result of the Pfam prediction for LAMP1_HUMAN]] |
||
+ | |non-signal peptide |
||
− | |||
− | Pfam found one significant Pfam-A match: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Family |
||
− | |E-Value |
||
− | |GOid |
||
− | |prediction |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |Lysosome-associated membrane glycoprotein (LAMP) |
||
− | |2.3e-135 |
||
− | |GO:0016020 |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="3" | RET4_HUMAN |
||
− | |} |
||
+ | |stop position |
||
− | <br><br> |
||
+ | |18 |
||
− | *''' A4_HUMAN''' |
||
+ | |18 |
||
− | <br><br> |
||
+ | |18 |
||
− | Graphical representation of the prediction result of Pfam: |
||
+ | |19 |
||
− | [[Image:a4_human_pfam.png|center|Result of the Pfam prediction for A4_HUMAN]] |
||
+ | |no prediction |
||
− | |||
+ | |18 |
||
− | Pfam found six significant Pfam-A matches: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Family |
||
− | |E-Value |
||
− | |GOid |
||
− | |prediction |
||
|- |
|- |
||
+ | |#wrong residues |
||
− | |Amyloid A4 N-terminal heparin-binding |
||
+ | | |
||
− | |4e-42 |
||
+ | |0 |
||
− | |none |
||
+ | |0 |
||
− | |right |
||
+ | |1 |
||
+ | |no prediction |
||
+ | |0 |
||
|- |
|- |
||
+ | |location |
||
− | |Copper-binding of amyloid precursor CuBD |
||
+ | |secretory pathway |
||
− | |2.3e-27 |
||
+ | |secretory pathway |
||
− | |none |
||
+ | |secretory pathway |
||
− | |right |
||
+ | |no prediction |
||
− | |- |
||
+ | |secretory pathway |
||
− | |Kunitz/Bovine pancreatic trypsin inhibitor domain |
||
+ | |no prediction |
||
− | |3e-19 |
||
− | |GO:0004867 |
||
− | |right |
||
|- |
|- |
||
− | |E2 domain of amyloid precursor protein |
||
− | |1.6e-74 |
||
− | |none |
||
− | |right |
||
− | |- |
||
− | |rowspan="2" | Beta-amyloid peptide (beta-APP) |
||
− | |rowspan="2" | 4.3e-28 |
||
− | |GO:0005488 |
||
− | |right |
||
− | |- |
||
− | |GO:0016021 |
||
− | |right |
||
− | |- |
||
− | |Beta-amyloid precursor protein C-terminus |
||
− | |1.1e-29 |
||
− | |none |
||
− | |right |
||
− | |- |
||
− | |} |
||
− | <br><br> |
||
− | <br><br> |
||
+ | !colspan="8" | |
||
− | === ProtFun 2.2 === |
||
− | <br><br> |
||
− | ProtFun 2.2 does not give a clear prediction of whether the protein belongs to a class or not; instead it gives probabilities and odds scores. |
||
− | We decided to set a cutoff at 2, so all classes with an odds score of 2 or higher count as predicted results for us. You can also find a "=>" sign in the result file. This sign marks the result with the highest information content. We also take this line as a result, even if its odds score is lower than 2. If all results have an odds score lower than 2, the line with this sign is our only result.<br> |
||
− | Because the prediction categories are very general, it was not possible to map the GOids. Therefore, we checked the known GO annotations. If there was a hint for a category and the protein was predicted to be in this category, we decided that the prediction is right, otherwise if the known GO annotations and the categories conflict, we count the prediction as wrong. |
||
− | <br><br> |
||
− | * '''HEXA_HUMAN''' |
||
− | <br><br> |
||
− | The ProtFun Server calculated the following prediction result for HEXA_HUMAN: |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="4" | Functional category |
||
|- |
|- |
||
+ | |rowspan="3" | INSL5_HUMAN |
||
− | |Functional category |
||
+ | |stop position |
||
− | |Probability |
||
+ | |22 |
||
− | |Odd score |
||
+ | |22 |
||
− | |Prediction |
||
+ | |22 |
||
+ | |22 |
||
+ | |no prediction |
||
+ | |22 |
||
|- |
|- |
||
+ | |#wrong residues |
||
− | |Amino acid biosynthesis |
||
+ | | |
||
− | |0.161 |
||
+ | |0 |
||
− | |7.331 |
||
+ | |0 |
||
− | |wrong |
||
+ | |0 |
||
+ | |no prediction |
||
+ | |0 |
||
|- |
|- |
||
+ | |location |
||
− | |Biosynthesis of cofactors |
||
+ | |secretory pathway |
||
− | |0.332 |
||
+ | |secretory pathway |
||
− | |4.609 |
||
+ | |secretory pathway |
||
− | |right |
||
+ | |no prediction |
||
+ | |secretory pathway |
||
+ | |no prediction |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |Cell envelope |
||
− | |0.804 => |
||
− | |13.186 => |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="3" | LAMP1_HUMAN |
||
− | |Cellular processes |
||
+ | |stop position |
||
− | |0.110 |
||
+ | |28 |
||
− | |1.506 |
||
+ | |28 |
||
− | |right |
||
+ | |28 |
||
+ | |29 |
||
+ | |no prediction |
||
+ | |28 |
||
|- |
|- |
||
+ | |#wrong residues |
||
− | |Central intermediary metabolism |
||
+ | | |
||
− | |0.432 |
||
+ | |0 |
||
− | |6.856 |
||
+ | |0 |
||
− | |right |
||
+ | |1 |
||
+ | |no prediction |
||
+ | |0 |
||
|- |
|- |
||
+ | |location |
||
− | |Energy metabolism |
||
+ | |transmembrane helix |
||
− | |0.113 |
||
+ | |secretory pathway |
||
− | |1.259 |
||
+ | |secretory pathway |
||
− | |right |
||
+ | |no prediction |
||
+ | |secretory pathway |
||
+ | |no prediction |
||
|- |
|- |
||
+ | !colspan="8" | |
||
− | |Fatty acid metabolism |
||
− | |0.019 |
||
− | |1.427 |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="3" | A4_HUMAN |
||
− | |Purines and Pyrimidines |
||
+ | |stop position |
||
− | |0.519 |
||
+ | |17 |
||
− | |2.136 |
||
+ | |17 |
||
− | |wrong |
||
+ | |17 |
||
+ | |18 |
||
+ | |no prediction |
||
+ | |17 |
||
|- |
|- |
||
+ | |#wrong residues |
||
− | |Regulatory functions |
||
+ | | |
||
− | |0.018 |
||
− | |0 |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |1 |
||
+ | |no prediction |
||
+ | |0 |
||
|- |
|- |
||
+ | |location |
||
− | |Replication and transcription |
||
+ | |transmembrane helix |
||
− | |0.073 |
||
+ | |secretory pathway |
||
− | |0.271 |
||
+ | |secretory pathway |
||
− | |right |
||
+ | |no prediction |
||
+ | |secretory pathway |
||
+ | |secretory pathway |
||
|- |
|- |
||
+ | !colspan="8" | Summary of wrong predictions |
||
− | |Translation |
||
− | |0.040 |
||
− | |0.904 |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="2" | |
||
− | |Transport and binding |
||
+ | |sum of wrong predicted residues |
||
− | |0.685 |
||
+ | | |
||
− | |1.670 |
||
+ | |0 |
||
− | |right |
||
− | | |
+ | |3 |
+ | |2 |
||
− | !colspan="4" | Enzyme/non-enzyme |
||
+ | |no prediction |
||
− | |- |
||
+ | |0 |
||
− | |Enzyme/non-enzyme |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Enzyme |
||
− | |0.792 => |
||
− | |2.764 => |
||
− | |right |
||
− | |- |
||
− | |Nonenzyme |
||
− | |0.208 |
||
− | |0.292 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enzyme class |
||
− | |- |
||
− | |Enzyme class |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Oxidoreductase (EC 1.-.-.-) |
||
− | |0.143 |
||
− | |0.685 |
||
− | |right |
||
− | |- |
||
− | |Transferase (EC 2.-.-.-) |
||
− | |0.201 |
||
− | |0.582 |
||
− | |right |
||
− | |- |
||
− | |Hydrolase (EC 3.-.-.-) |
||
− | |0.329 |
||
− | |1.039 |
||
− | |wrong |
||
− | |- |
||
− | |Lyase (EC 4.-.-.-) |
||
− | |0.054 |
||
− | |1.143 |
||
− | |right |
||
− | |- |
||
− | |Isomerase (EC 5.-.-.-) |
||
− | |0.027 |
||
− | |0.856 |
||
− | |right |
||
− | |- |
||
− | |Ligase (EC 6.-.-.-) |
||
− | |0.085 => |
||
− | |1.661 => |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Gene ontology category |
||
− | |- |
||
− | |Gene ontology category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Signal transducer |
||
− | |0.083 |
||
− | |0.389 |
||
− | |right |
||
− | |- |
||
− | |Receptor |
||
− | |0.105 |
||
− | |0.617 |
||
− | |right |
||
− | |- |
||
− | |Hormone |
||
− | |0.001 |
||
− | |0.206 |
||
− | |right |
||
− | |- |
||
− | |Structural protein |
||
− | |0.010 |
||
− | |0.357 |
||
− | |right |
||
− | |- |
||
− | |Transporter |
||
− | |0.024 |
||
− | |0.222 |
||
− | |right |
||
− | |- |
||
− | |Ion channel |
||
− | |0.018 |
||
− | |0.310 |
||
− | |right |
||
− | |- |
||
− | |Voltage-gated ion channel |
||
− | |0.002 |
||
− | |0.082 |
||
− | |right |
||
− | |- |
||
− | |Cation channel |
||
− | |0.010 |
||
− | |0.218 |
||
− | |right |
||
− | |- |
||
− | |Transcription |
||
− | |0.058 |
||
− | |0.453 |
||
− | |right |
||
− | |- |
||
− | |Transcription regulation |
||
− | |0.026 |
||
− | |0.205 |
||
− | |right |
||
− | |- |
||
− | |Stress response |
||
− | |0.004 |
||
− | |0.500 |
||
− | |right |
||
− | |- |
||
− | |Immune response |
||
− | |0.014 |
||
− | |0.167 |
||
− | |right |
||
− | |- |
||
− | |Growth factor |
||
− | |0.005 |
||
− | |0.372 |
||
− | |right |
||
− | |- |
||
− | |Metal ion transport |
||
− | |0.009 |
||
− | |0.020 |
||
− | |right |
||
|- |
|- |
||
+ | |#right predicted locations / #predicted locations |
||
+ | | |
||
+ | |3/5 |
||
+ | |3/5 |
||
+ | |no prediction |
||
+ | |3/5 |
||
+ | |no prediction |
||
|} |
|} |
||
+ | SPOCTOPUS and SignalP do not predict the location of the protein; they only predict the start and stop position of the signal peptide. Furthermore, SignalP predicts whether the sequence contains a signal peptide or not. |
||
+ | In contrast, TargetP only predicts the location of the protein, not the start and stop position of the signal peptide. Only Phobius and PolyPhobius predict both.<br> |
||
+ | Therefore, it is difficult to compare the different methods. Phobius and PolyPhobius are more informative than the other prediction methods, because they predict both the location and the position. On average they predict the location and the position as well as the other prediction methods. None of the methods recognized the transmembrane proteins; all methods predicted them as proteins of the secretory pathway. Therefore, it is useful to use Phobius or PolyPhobius, because they predict more than the other methods. Furthermore, both methods can also predict transmembrane helices. |
||
+ | The results of Phobius were a little bit better than the results of PolyPhobius.<br> |
||
+ | We also want to mention that SignalP lets you choose between predictions for eukaryotes, gram-positive bacteria and gram-negative bacteria. In our analysis we also analysed BACR_HALSA, which is an archaeal protein. We tested all three SignalP prediction methods for this protein and all three failed. BACR_HALSA does not possess a signal peptide, but every method predicts one. Only the eukaryotic prediction method recognized a signal anchor for BACR_HALSA, whereas the other two methods could not give a prediction of the location.<br><br> |
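The "#wrong residues" rows of the signal peptide table can be recomputed from the stop positions. The following is only a minimal sketch: it assumes that the error count is the absolute difference between the predicted and the annotated stop position, which is our reading of the table and not a documented definition. The names `REAL_STOP`, `PREDICTED_STOP`, `wrong_residues` and `error_table` are hypothetical helpers; the numbers are copied from the rows for RET4_HUMAN, LAMP1_HUMAN and A4_HUMAN.

```python
# Annotated ("real") stop positions of the signal peptide, copied from the table.
REAL_STOP = {"RET4_HUMAN": 18, "LAMP1_HUMAN": 28, "A4_HUMAN": 17}

# Predicted stop positions per method; None stands for "no prediction"
# (TargetP only reports a location, not a stop position).
PREDICTED_STOP = {
    "RET4_HUMAN": {"Phobius": 18, "PolyPhobius": 18, "SPOCTOPUS": 19,
                   "TargetP": None, "SignalP": 18},
    "LAMP1_HUMAN": {"Phobius": 28, "PolyPhobius": 28, "SPOCTOPUS": 29,
                    "TargetP": None, "SignalP": 28},
    "A4_HUMAN": {"Phobius": 17, "PolyPhobius": 17, "SPOCTOPUS": 18,
                 "TargetP": None, "SignalP": 17},
}


def wrong_residues(real_stop, predicted_stop):
    """Assumed error measure: |predicted stop - real stop|, or None if no prediction."""
    if predicted_stop is None:
        return None
    return abs(predicted_stop - real_stop)


def error_table(real, predicted):
    """Return {protein: {method: number of wrong residues or None}}."""
    return {
        protein: {
            method: wrong_residues(real[protein], stop)
            for method, stop in methods.items()
        }
        for protein, methods in predicted.items()
    }


if __name__ == "__main__":
    for protein, errors in error_table(REAL_STOP, PREDICTED_STOP).items():
        print(protein, errors)
```

Under this assumption, SPOCTOPUS is off by one residue for each of the three proteins and the other position-predicting methods are exact, which agrees with the corresponding table rows.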
||
<br><br> |
<br><br> |
||
+ | * Comparison of the combined methods |
||
− | * '''BACR_HALSA''' |
||
<br><br> |
<br><br> |
||
+ | The last thing we wanted to compare was the combined methods. SPOCTOPUS, Phobius and PolyPhobius can predict transmembrane helices as well as signal peptides. Therefore we combined our two previous comparisons. |
||
− | The ProtFun Server calculated the following prediction result for BACR_HALSA: |
||
+ | |||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |rowspan="2" | |
||
− | !colspan="4" | Functional category |
||
+ | |rowspan="2" | |
||
+ | |colspan="3" | methods |
||
|- |
|- |
||
+ | |Phobius |
||
− | |Functional category |
||
+ | |PolyPhobius |
||
− | |Probability |
||
+ | |SPOCTOPUS |
||
− | |Odd score |
||
− | |Prediction |
||
|- |
|- |
||
+ | |rowspan="3" | HEXA_HUMAN |
||
− | |Amino acid biosynthesis |
||
+ | |#wrong predicted residues (TM) |
||
− | |0.033 |
||
+ | |0 |
||
− | |1.495 |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong predicted residues (SP) |
||
− | |Biosynthesis of cofactors |
||
− | |0 |
+ | |0 |
+ | |3 |
||
− | |2.589 |
||
+ | |2 |
||
− | |wrong |
||
|- |
|- |
||
+ | |location |
||
− | |Cell envelope |
||
− | |0.029 |
||
− | |0.483 |
||
|right |
|right |
||
− | |- |
||
− | |Cellular processes |
||
− | |0.051 |
||
− | |0.698 |
||
|right |
|right |
||
+ | |no prediction |
||
|- |
|- |
||
+ | !colspan="5" | |
||
− | |Central intermediary metabolism |
||
− | |0.045 |
||
− | |0.711 |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="3" | BACR_HALSA |
||
− | |Energy metabolism |
||
+ | |#wrong predicted residues (TM) |
||
− | |0.138 |
||
+ | |29 |
||
− | |1.537 |
||
+ | |17 |
||
− | |right |
||
+ | |17 |
||
|- |
|- |
||
+ | |#wrong predicted residues (SP) |
||
− | |Fatty acid metabolism |
||
− | | |
+ | |n.a. |
− | | |
+ | |n.a. |
+ | |n.a. |
||
− | |right |
||
|- |
|- |
||
+ | |location |
||
− | |Purines and Pyrimidines |
||
− | | |
+ | |n.a. |
− | | |
+ | |n.a. |
+ | |no prediction |
||
− | |right |
||
|- |
|- |
||
+ | !colspan="5" | |
||
− | |Regulatory functions |
||
− | |0.013 |
||
− | |0.080 |
||
− | |wrong |
||
|- |
|- |
||
+ | |rowspan="3" | RET4_HUMAN |
||
− | |Replication and transcription |
||
+ | |#wrong predicted residues (TM) |
||
− | |0.019 |
||
− | |0 |
+ | |0 |
+ | |0 |
||
− | |right |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong predicted residues (SP) |
||
− | |Translation |
||
− | |0 |
+ | |0 |
+ | |0 |
||
− | |1.339 |
||
+ | |0 |
||
− | |right |
||
|- |
|- |
||
+ | |location |
||
− | |Transport and binding |
||
− | |0.791 => |
||
− | |1.929 => |
||
|right |
|right |
||
− | |- |
||
− | !colspan="4" | Enzyme/non-enzyme |
||
− | |- |
||
− | |Enzyme/non-enzyme |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Enzyme |
||
− | |0.199 |
||
− | |0.696 |
||
|right |
|right |
||
+ | |no prediction |
||
|- |
|- |
||
+ | !colspan="5" | |
||
− | |Nonenzyme |
||
− | |0.801 => |
||
− | |1.122 => |
||
− | |right |
||
|- |
|- |
||
− | + | |rowspan="3" | INSL5_HUMAN |
|
+ | |#wrong predicted residues (TM) |
||
+ | |0 |
||
+ | |0 |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong predicted residues (SP) |
||
− | |Enzyme class |
||
+ | |0 |
||
− | |Probability |
||
+ | |0 |
||
− | |Odd score |
||
+ | |1 |
||
− | |Prediction |
||
|- |
|- |
||
+ | |location |
||
− | |Oxidoreductase (EC 1.-.-.-) |
||
− | |0.114 |
||
− | |0.549 |
||
|right |
|right |
||
− | |- |
||
− | |Transferase (EC 2.-.-.-) |
||
− | |0.031 |
||
− | |0.091 |
||
|right |
|right |
||
+ | |no prediction |
||
|- |
|- |
||
+ | !colspan="5" | |
||
− | |Hydrolase (EC 3.-.-.-) |
||
− | |0.057 |
||
− | |0.180 |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="3" | LAMP1_HUMAN |
||
− | |Lyase (EC 4.-.-.-) |
||
+ | |#wrong predicted residues (TM) |
||
− | |0.020 |
||
+ | |3 |
||
− | |0.430 |
||
+ | |4 |
||
− | |right |
||
+ | |3 |
||
|- |
|- |
||
+ | |#wrong predicted residues (SP) |
||
− | |Isomerase (EC 5.-.-.-) |
||
− | |0 |
+ | |0 |
− | |0 |
+ | |0 |
+ | |0 |
||
− | |right |
||
|- |
|- |
||
+ | |location |
||
− | |Ligase (EC 6.-.-.-) |
||
− | |0.017 |
||
− | |0.625 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Gene ontology category |
||
− | |- |
||
− | |Gene ontology category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Signal transducer |
||
− | |0.258 |
||
− | |1.205 |
||
|wrong |
|wrong |
||
− | |- |
||
− | |Receptor |
||
− | |0.355 |
||
− | |2.087 |
||
− | |right |
||
− | |- |
||
− | |Hormone |
||
− | |0.001 |
||
− | |0.206 |
||
− | |right |
||
− | |- |
||
− | |Structural protein |
||
− | |0.006 |
||
− | |0.200 |
||
− | |right |
||
− | |- |
||
− | |Transporter |
||
− | |0.440 => |
||
− | |4.036 => |
||
− | |right |
||
− | |- |
||
− | |Ion channel |
||
− | |0.010 |
||
− | |0.169 |
||
|wrong |
|wrong |
||
+ | |no prediction |
||
|- |
|- |
||
+ | !colspan="5" | |
||
− | |Voltage-gated ion channel |
||
− | |0.004 |
||
− | |0.172 |
||
− | |right |
||
|- |
|- |
||
+ | |rowspan="3" | A4_HUMAN |
||
− | |Cation channel |
||
+ | |#wrong predicted residues (TM) |
||
− | |0.078 |
||
+ | |0 |
||
− | |1.689 |
||
+ | |0 |
||
− | |right |
||
+ | |0 |
||
|- |
|- |
||
+ | |#wrong predicted residues (SP) |
||
− | |Transcription |
||
+ | |1 |
||
− | |0.026 |
||
+ | |1 |
||
− | |0.205 |
||
+ | |3 |
||
− | |right |
||
|- |
|- |
||
+ | |location |
||
− | |Transcription regulation |
||
− | |0.028 |
||
− | |0.226 |
||
− | |right |
||
− | |- |
||
− | |Stress response |
||
− | |0.012 |
||
− | |0.139 |
||
− | |right |
||
− | |- |
||
− | |Immune response |
||
− | |0.011 |
||
− | |0.128 |
||
− | |right |
||
− | |- |
||
− | |Growth factor |
||
− | |0.010 |
||
− | |0.727 |
||
− | |right |
||
− | |- |
||
− | |Metal ion transport |
||
− | |0.049 |
||
− | |0.106 |
||
− | |right |
||
− | |- |
||
− | |} |
||
− | <br><br> |
||
− | * '''RET4_HUMAN''' |
||
− | <br><br> |
||
− | The ProtFun Server calculated the following prediction result for RET4_HUMAN: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="4" | Functional category |
||
− | |- |
||
− | |Functional category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Amino acid biosynthesis |
||
− | |0.017 |
||
− | |0.751 |
||
− | |right |
||
− | |- |
||
− | |Biosynthesis of cofactors |
||
− | |0.044 |
||
− | |0.610 |
||
− | |right |
||
− | |- |
||
− | |Cell envelope |
||
− | |0.804 => |
||
− | |13.186 => |
||
− | |right |
||
− | |- |
||
− | |Cellular processes |
||
− | |0.075 |
||
− | |1.021 |
||
|wrong |
|wrong |
||
− | |- |
||
− | |Central intermediary metabolism |
||
− | |0.197 |
||
− | |3.128 |
||
− | |right |
||
− | |- |
||
− | |Energy metabolism |
||
− | |0.043 |
||
− | |0.475 |
||
− | |right |
||
− | |- |
||
− | |Fatty acid metabolism |
||
− | |0.016 |
||
− | |1.265 |
||
− | |right |
||
− | |- |
||
− | |Purines and Pyrimidines |
||
− | |0.275 |
||
− | |1.131 |
||
− | |right |
||
− | |- |
||
− | |Regulatory functions |
||
− | |0.013 |
||
− | |0.080 |
||
− | |right |
||
− | |- |
||
− | |Replication and transcription |
||
− | |0.022 |
||
− | |0.084 |
||
− | |right |
||
− | |- |
||
− | |Translation |
||
− | |0.032 |
||
− | |0.721 |
||
− | |right |
||
− | |- |
||
− | |Transport and binding |
||
− | |0.800 |
||
− | |1.951 |
||
|wrong |
|wrong |
||
+ | |no prediction |
||
|- |
|- |
||
− | !colspan=" |
+ | !colspan="5" | Average |
|- |
|- |
||
+ | |rowspan="3" | |
||
− | |Enzyme/non-enzyme |
||
+ | |avg(#wrong predicted residues (TM)) |
||
− | |Probability |
||
+ | |5.3 |
||
− | |Odd score |
||
+ | |3.5 |
||
− | |Prediction |
||
+ | |3.3 |
||
|- |
|- |
||
+ | |avg(#wrong predicted residues (SP)) |
||
− | |Enzyme |
||
− | |0. |
+ | |0.1 |
− | | |
+ | |0.6 |
+ | |1 |
||
− | |right |
||
|- |
|- |
||
+ | |#location (right predicted) / #location(predicted) |
||
− | |Nonenzyme |
||
+ | |3/5 |
||
− | |0.456 |
||
+ | |3/5 |
||
− | |0.639 |
||
+ | |no prediction |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enzyme class |
||
− | |- |
||
− | |Enzyme class |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Oxidoreductase (EC 1.-.-.-) |
||
− | |0.095 |
||
− | |0.458 |
||
− | |right |
||
− | |- |
||
− | |Transferase (EC 2.-.-.-) |
||
− | |0.038 |
||
− | |0.109 |
||
− | |right |
||
− | |- |
||
− | |Hydrolase (EC 3.-.-.-) |
||
− | |0.235 |
||
− | |0.742 |
||
− | |right |
||
− | |- |
||
− | |Lyase (EC 4.-.-.-) |
||
− | |0.059 => |
||
− | |1.264 => |
||
− | |wrong |
||
− | |- |
||
− | |Isomerase (EC 5.-.-.-) |
||
− | |0.010 |
||
− | |0.321 |
||
− | |right |
||
− | |- |
||
− | |Ligase (EC 6.-.-.-) |
||
− | |0.017 |
||
− | |0.326 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Gene ontology category |
||
− | |- |
||
− | |Gene ontology category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Signal transducer |
||
− | |0.202 |
||
− | |0.942 |
||
− | |right |
||
− | |- |
||
− | |Receptor |
||
− | |0.147 |
||
− | |0.862 |
||
− | |right |
||
− | |- |
||
− | |Hormone |
||
− | |0.004 |
||
− | |0.667 |
||
− | |right |
||
− | |- |
||
− | |Structural protein |
||
− | |0.002 |
||
− | |0.058 |
||
− | |right |
||
− | |- |
||
− | |Transporter |
||
− | |0.025 |
||
− | |0.232 |
||
− | |right |
||
− | |- |
||
− | |Ion channel |
||
− | |0.016 |
||
− | |0.288 |
||
− | |right |
||
− | |- |
||
− | |Voltage-gated ion channel |
||
− | |0.003 |
||
− | |0.148 |
||
− | |right |
||
− | |- |
||
− | |Cation channel |
||
− | |0.010 |
||
− | |0.215 |
||
− | |right |
||
− | |- |
||
− | |Transcription |
||
− | |0.027 |
||
− | |0.207 |
||
− | |right |
||
− | |- |
||
− | |Transcription regulation |
||
− | |0.025 |
||
− | |0.196 |
||
− | |right |
||
− | |- |
||
− | |Stress response |
||
− | |0.161 |
||
− | |1.829 |
||
− | |right |
||
− | |- |
||
− | |Immune response |
||
− | |0.239 => |
||
− | |2.813 => |
||
− | |wrong |
||
− | |- |
||
− | |Growth factor |
||
− | |0.023 |
||
− | |1.617 |
||
− | |right |
||
− | |- |
||
− | |Metal ion transport |
||
− | |0.009 |
||
− | |0.020 |
||
− | |right |
||
|- |
|- |
||
|} |
|} |
||
+ | |||
− | <br><br> |
||
+ | In general, PolyPhobius gave the best results. Although it predicts the signal peptide stop position a little worse than Phobius, its transmembrane prediction is clearly better than that of Phobius. The predictions of SPOCTOPUS are also good, but unfortunately SPOCTOPUS does not predict the location of the protein.<br> |
||
− | *'''INSL5_HUMAN''' |
||
+ | Therefore, PolyPhobius seems a good choice, since it is on average the best method for combined transmembrane and signal peptide prediction.<br><br> |
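The transmembrane averages in the last rows of the combined table can be checked with a few lines of Python. This is a small sketch, not the procedure that was actually used: it assumes that the average is the plain arithmetic mean over all six proteins (in the order HEXA_HUMAN, BACR_HALSA, RET4_HUMAN, INSL5_HUMAN, LAMP1_HUMAN, A4_HUMAN). `TM_ERRORS` and `average` are hypothetical names; the values are the "#wrong predicted residues (TM)" rows of the table.

```python
# Wrongly predicted transmembrane residues per protein, copied from the
# combined table (HEXA, BACR, RET4, INSL5, LAMP1, A4).
TM_ERRORS = {
    "Phobius": [0, 29, 0, 0, 3, 0],
    "PolyPhobius": [0, 17, 0, 0, 4, 0],
    "SPOCTOPUS": [0, 17, 0, 0, 3, 0],
}


def average(values):
    """Arithmetic mean of all entries that are not None; None if nothing is left."""
    counted = [v for v in values if v is not None]
    if not counted:
        return None
    return sum(counted) / len(counted)


if __name__ == "__main__":
    for method, errors in TM_ERRORS.items():
        print(f"{method}: {average(errors):.1f}")
```

With these inputs the means come out as 5.3, 3.5 and 3.3, matching the avg(#wrong predicted residues (TM)) row. Most of the difference between Phobius and the other two methods comes from a single protein, BACR_HALSA.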
||
− | <br><br> |
||
+ | |||
− | The ProtFun Server calculated the following prediction result for INSL5_HUMAN: |
||
+ | == Prediction of GO terms == |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |||
− | !colspan="4" | Functional category |
||
+ | Before starting our analysis, we checked the known GO annotations of the six sequences, which can be found [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_annotation_of_the_proteins here]]: |
||
− | |- |
||
+ | |||
− | |Functional category |
||
+ | A detailed list of the GO annotation terms of each protein can be found [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Go_annotations_here here]]. |
||
− | |Probability |
||
+ | |||
− | |Odd score |
||
+ | === Results === |
||
− | |Prediction |
||
+ | |||
− | |- |
||
+ | We created a separate result page for each protein. Unfortunately, it is not possible to summarize the results briefly, so please have a look at the individual result pages for the detailed output. |
||
− | |Amino acid biosynthesis |
||
+ | |||
− | |0.011 |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_Terms_HEXA_HUMAN HEXA HUMAN]] |
||
− | |0.484 |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_Terms_BACR_HALSA BACR HALSA]] |
||
− | |right |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_Terms_RET4_HUMAN RET4 HUMAN]] |
||
− | |- |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_Terms_INSL5_HUMAN INSL5 HUMAN]] |
||
− | |Biosynthesis of cofactors |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_Terms_LAMP1_HUMAN LAMP1 HUMAN]] |
||
− | |0.040 |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/GO_Terms_A4_HUMAN A4 HUMAN]] |
||
− | |0.558 |
||
− | |right |
||
− | |- |
||
− | |Cell envelope |
||
− | |0.756 => |
||
− | |12.393 => |
||
− | |right |
||
− | |- |
||
− | |Cellular processes |
||
− | |0.033 |
||
− | |0.448 |
||
− | |right |
||
− | |- |
||
− | |Central intermediary metabolism |
||
− | |0.048 |
||
− | |0.755 |
||
− | |right |
||
− | |- |
||
− | |Energy metabolism |
||
− | |0.036 |
||
− | |0.397 |
||
− | |right |
||
− | |- |
||
− | |Fatty acid metabolism |
||
− | |0.016 |
||
− | |1.265 |
||
− | |right |
||
− | |- |
||
− | |Purines and Pyrimidines |
||
− | |0.144 |
||
− | |0.592 |
||
− | |right |
||
− | |- |
||
− | |Regulatory functions |
||
− | |0.014 |
||
− | |0.087 |
||
− | |right |
||
− | |- |
||
− | |Replication and Transcription |
||
− | |0.020 |
||
− | |0.075 |
||
− | |right |
||
− | |- |
||
− | |Translation |
||
− | |0.032 |
||
− | |0.735 |
||
− | |right |
||
− | |- |
||
− | |Transport and binding |
||
− | |0.834 |
||
− | |2.033 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enzyme/non-enzyme |
||
− | |- |
||
− | |Enzyme/non-enzyme |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Enzyme |
||
− | |0.209 |
||
− | |0.729 |
||
− | |right |
||
− | |- |
||
− | |Nonenzyme |
||
− | |0.791 => |
||
− | |1.109 => |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enzyme class |
||
− | |- |
||
− | |Enzyme class |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Oxidoreductase (EC 1.-.-.-) |
||
− | |0.056 |
||
− | |0.268 |
||
− | |right |
||
− | |- |
||
− | |Transferase (EC 2.-.-.-) |
||
− | |0.031 |
||
− | |0.091 |
||
− | |right |
||
− | |- |
||
− | |Hydrolase (EC 3.-.-.-) |
||
− | |0.062 |
||
− | |0.195 |
||
− | |right |
||
− | |- |
||
− | |Lyase (EC 4.-.-.-) |
||
− | |0.020 |
||
− | |0.430 |
||
− | |right |
||
− | |- |
||
− | |Isomerase (EC 5.-.-.-) |
||
− | |0.010 |
||
− | |0.321 |
||
− | |right |
||
− | |- |
||
− | |Ligase (EC 6.-.-.-) |
||
− | |0.017 |
||
− | |0.327 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Gene ontology category |
||
− | |- |
||
− | |Gene ontology category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Signal transducer |
||
− | |0.374 |
||
− | |1.746 |
||
− | |right |
||
− | |- |
||
− | |Receptor |
||
− | |0.128 |
||
− | |0.750 |
||
− | |right |
||
− | |- |
||
− | |Hormone |
||
− | |0.247 => |
||
− | |37.936 => |
||
− | |right |
||
− | |- |
||
− | |Structural protein |
||
− | |0.001 |
||
− | |0.041 |
||
− | |right |
||
− | |- |
||
− | |Transporter |
||
− | |0.025 |
||
− | |0.228 |
||
− | |right |
||
− | |- |
||
− | |Ion channel |
||
− | |0.010 |
||
− | |0.168 |
||
− | |right |
||
− | |- |
||
− | |Voltage-gated ion channel |
||
− | |0.003 |
||
− | |0.131 |
||
− | |right |
||
− | |- |
||
− | |Cation channel |
||
− | |0.010 |
||
− | |0.215 |
||
− | |right |
||
− | |- |
||
− | |Transcription |
||
− | |0.054 |
||
− | |0.425 |
||
− | |right |
||
− | |- |
||
− | |Transcription regulation |
||
− | |0.091 |
||
− | |0.724 |
||
− | |right |
||
− | |- |
||
− | |Stress response |
||
− | |0.099 |
||
− | |1.128 |
||
− | |right |
||
− | |- |
||
− | |Immune response |
||
− | |0.178 |
||
− | |2.090 |
||
− | |wrong |
||
− | |- |
||
− | |Growth factor |
||
− | |0.061 |
||
− | |4.379 |
||
− | |wrong |
||
− | |- |
||
− | |Metal ion transport |
||
− | |0.009 |
||
− | |0.020 |
||
− | |right |
||
− | |- |
||
− | |} |
||
− | <br><br> |
||
− | * '''LAMP1_HUMAN''' |
||
− | <br><br> |
||
− | The ProtFun Server calculated the following prediction result for LAMP1_HUMAN: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="4" | Functional category |
||
− | |- |
||
− | |Functional category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Amino acid biosynthesis |
||
− | |0.011 |
||
− | |0.484 |
||
− | |right |
||
− | |- |
||
− | |Biosynthesis of cofactors |
||
− | |0.053 |
||
− | |0.735 |
||
− | |right |
||
− | |- |
||
− | |Cell envelope |
||
− | |0.804 => |
||
− | |13.186 => |
||
− | |right |
||
− | |- |
||
− | |Cellular processes |
||
− | |0.027 |
||
− | |0.373 |
||
− | |right |
||
− | |- |
||
− | |Central intermediary metabolism |
||
− | |0.138 |
||
− | |2.188 |
||
− | |right |
||
− | |- |
||
− | |Energy metabolism |
||
− | |0.037 |
||
− | |0.411 |
||
− | |right |
||
− | |- |
||
− | |Fatty acid metabolism |
||
− | |0.016 |
||
− | |1.265 |
||
− | |right |
||
− | |- |
||
− | |Purines and Pyrimidines |
||
− | |0.533 |
||
− | |2.195 |
||
− | |wrong |
||
− | |- |
||
− | |Regulatory functions |
||
− | |0.015 |
||
− | |0.090 |
||
− | |right |
||
− | |- |
||
− | |Replication and transcription |
||
− | |0.019 |
||
− | |0.073 |
||
− | |right |
||
− | |- |
||
− | |Translation |
||
− | |0.027 |
||
− | |0.613 |
||
− | |right |
||
− | |- |
||
− | |Transport and binding |
||
− | |0.834 |
||
− | |2.033 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enzyme/non-enzyme |
||
− | |- |
||
− | |Enzyme/non-enzyme |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Enzyme |
||
− | |0.276 |
||
− | |0.965 |
||
− | |right |
||
− | |- |
||
− | |Nonenzyme |
||
− | |0.724 => |
||
− | |1.014 => |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enyzme class |
||
− | |- |
||
− | |Enzyme class |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Oxidoreductase (EC 1.-.-.-) |
||
− | |0.039 |
||
− | |0.187 |
||
− | |right |
||
− | |- |
||
− | |Transferase (EC 2.-.-.-) |
||
− | |0.046 |
||
− | |0.134 |
||
− | |right |
||
− | |- |
||
− | |Hydrolase (EC 3.-.-.-) |
||
− | |0.058 |
||
− | |0.184 |
||
− | |right |
||
− | |- |
||
− | |Lyase (EC 4.-.-.-) |
||
− | |0.020 |
||
− | |0.430 |
||
− | |right |
||
− | |- |
||
− | |Isomerase (EC 5.-.-.-) |
||
− | |0.010 |
||
− | |0.321 |
||
− | |right |
||
− | |- |
||
− | |Ligase (EC 6.-.-.-) |
||
− | |0.017 |
||
− | |0.326 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Gene ontology category |
||
− | |- |
||
− | |Gene ontology category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Signal transducer |
||
− | |0.396 |
||
− | |1.849 |
||
− | |right |
||
− | |- |
||
− | |Receptor |
||
− | |0.282 |
||
− | |1.659 |
||
− | |right |
||
− | |- |
||
− | |Hormone |
||
− | |0.001 |
||
− | |0.206 |
||
− | |right |
||
− | |- |
||
− | |Structural protein |
||
− | |0.011 |
||
− | |0.408 |
||
− | |right |
||
− | |- |
||
− | |Transporter |
||
− | |0.024 |
||
− | |0.222 |
||
− | |right |
||
− | |- |
||
− | |Ion channel |
||
− | |0.008 |
||
− | |0.147 |
||
− | |right |
||
− | |- |
||
− | |Volatge-gated ion channel |
||
− | |0.002 |
||
− | |0.111 |
||
− | |right |
||
− | |- |
||
− | |Cation channel |
||
− | |0.010 |
||
− | |0.215 |
||
− | |right |
||
− | |- |
||
− | |Transcription |
||
− | |0.032 |
||
− | |0.247 |
||
− | |right |
||
− | |- |
||
− | |Transcription regulation |
||
− | |0.018 |
||
− | |0.142 |
||
− | |right |
||
− | |- |
||
− | |Stress response |
||
− | |0.246 |
||
− | |2.795 |
||
− | |right |
||
− | |- |
||
− | |Immune response |
||
− | |0.371 => |
||
− | |4.368 => |
||
− | |right |
||
− | |- |
||
− | |Growth factor |
||
− | |0.013 |
||
− | |0.956 |
||
− | |right |
||
− | |- |
||
− | |Metal ion transport |
||
− | |0.009 |
||
− | |0.020 |
||
− | |right |
||
− | |- |
||
− | |} |
||
− | <br><br> |
||
− | * '''A4_HUMAN''' |
||
− | <br><br> |
||
− | The ProtFun Server calculated following prediction result for A4_HUMAN: |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | !colspan="4" | Functional category |
||
− | |- |
||
− | |Functional category |
||
− | |Probabilty |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Amino acid biosynthesis |
||
− | |0.020 |
||
− | |0.921 |
||
− | |right |
||
− | |- |
||
− | |Biosynthesis of cofactors |
||
− | |0.261 |
||
− | |3.623 |
||
− | |right |
||
− | |- |
||
− | |Cell envelope |
||
− | |0.804 => |
||
− | |13.186 => |
||
− | |right |
||
− | |- |
||
− | |Cellular processes |
||
− | |0.053 |
||
− | |0.070 |
||
− | |right |
||
− | |- |
||
− | |Central intermediary metabolism |
||
− | |0.184 |
||
− | |2.920 |
||
− | |right |
||
− | |- |
||
− | |Engergy metabolism |
||
− | |0.023 |
||
− | |0.259 |
||
− | |right |
||
− | |- |
||
− | |Fatty acid metabolsim |
||
− | |0.016 |
||
− | |1.265 |
||
− | |right |
||
− | |- |
||
− | |Purines and Pyrimidines |
||
− | |0.417 |
||
− | |1.716 |
||
− | |right |
||
− | |- |
||
− | |Regulatory functions |
||
− | |0.013 |
||
− | |0.084 |
||
− | |wrong |
||
− | |- |
||
− | |Replication and transcription |
||
− | |0.029 |
||
− | |0.109 |
||
− | |right |
||
− | |- |
||
− | |Translation |
||
− | |0.027 |
||
− | |0.613 |
||
− | |right |
||
− | |- |
||
− | |Transport and binding |
||
− | |0.827 |
||
− | |2.016 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enyzme/non-enzyme |
||
− | |- |
||
− | |Enzyme/non-enzyme |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Enzyme |
||
− | |0.392 => |
||
− | |1.368 => |
||
− | |right |
||
− | |- |
||
− | |Nonenzyme |
||
− | |0.608 |
||
− | |0.852 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Enyzme class |
||
− | |- |
||
− | |Enzyme class |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Oxidoreductase (EC 1.-.-.-) |
||
− | |0.024 |
||
− | |0.114 |
||
− | |right |
||
− | |- |
||
− | |Transferase (EC 2.-.-.-) |
||
− | |0.208 |
||
− | |0.603 |
||
− | |right |
||
− | |- |
||
− | |Hydrolase (EC 3.-.-.-) |
||
− | |0.190 |
||
− | |0.600 |
||
− | |right |
||
− | |- |
||
− | |Lyase (EC 4.-.-.-) |
||
− | |0.020 |
||
− | |0.430 |
||
− | |right |
||
− | |- |
||
− | |Isomerase (EC 5.-.-.-) |
||
− | |0.010 |
||
− | |0.324 |
||
− | |right |
||
− | |- |
||
− | |Ligase (EC 6.-.-.-) |
||
− | |0.048 |
||
− | |0.946 |
||
− | |right |
||
− | |- |
||
− | !colspan="4" | Gene ontology category |
||
− | |- |
||
− | |Gene ontology category |
||
− | |Probability |
||
− | |Odd score |
||
− | |Prediction |
||
− | |- |
||
− | |Signal transducer |
||
− | |0.126 |
||
− | |0.586 |
||
− | |right |
||
− | |- |
||
− | |Receptor |
||
− | |0.036 |
||
− | |0.211 |
||
− | |right |
||
− | |- |
||
− | |Hormone |
||
− | |0.001 |
||
− | |0.206 |
||
− | |right |
||
− | |- |
||
− | |Structural protein |
||
− | |0.034 => |
||
− | |1.205 => |
||
− | |right |
||
− | |- |
||
− | |Transporter |
||
− | |0.024 |
||
− | |0.222 |
||
− | |right |
||
− | |- |
||
− | |Ion channel |
||
− | |0.009 |
||
− | |0.162 |
||
− | |right |
||
− | |- |
||
− | |Volatge-gated ion channel |
||
− | |0.002 |
||
− | |0.108 |
||
− | |right |
||
− | |- |
||
− | |Cation channel |
||
− | |0.010 |
||
− | |0.215 |
||
− | |right |
||
− | |- |
||
− | |Transcription |
||
− | |0.043 |
||
− | |0.335 |
||
− | |right |
||
− | |- |
||
− | |Transcription regulation |
||
− | |0.018 |
||
− | |0.143 |
||
− | |right |
||
− | |- |
||
− | |Stress response |
||
− | |0.076 |
||
− | |0.862 |
||
− | |right |
||
− | |- |
||
− | |Immune response |
||
− | |0.016 |
||
− | |0.183 |
||
− | |right |
||
− | |- |
||
− | |Growth factor |
||
− | |0.005 |
||
− | |0.372 |
||
− | |right |
||
− | |- |
||
− | |Metal ion transport |
||
− | |0.009 |
||
− | |0.020 |
||
− | |right |
||
− | |- |
||
− | |} |
||
− | <br><br> |
||
<br><br> |
<br><br> |
||
Line 5,015: | Line 2,269: | ||
|- |
|- |
||
|#GO terms |
|#GO terms |
||
− | |colspan="4" | |
+ | |colspan="4" | 25 |
|- |
|- |
||
|true positive (in %) |
|true positive (in %) |
||
Line 5,049: | Line 2,303: | ||
|- |
|- |
||
|#GO terms |
|#GO terms |
||
− | |colspan="4" | |
+ | |colspan="4" | 12 |
|- |
|- |
||
|true positive (in %) |
|true positive (in %) |
||
Line 5,083: | Line 2,337: | ||
|- |
|- |
||
|#GO terms |
|#GO terms |
||
− | |colspan="4" | |
+ | |colspan="4" | 41 |
|- |
|- |
||
|true positive (in %) |
|true positive (in %) |
||
Line 5,117: | Line 2,371: | ||
|- |
|- |
||
|#GO terms |
|#GO terms |
||
− | |colspan="4" | |
+ | |colspan="4" | 4 |
|- |
|- |
||
|true positive (in %) |
|true positive (in %) |
||
Line 5,151: | Line 2,405: | ||
|- |
|- |
||
|#GO terms |
|#GO terms |
||
− | |colspan="4" | |
+ | |colspan="4" | 17 |
|- |
|- |
||
|true positive (in %) |
|true positive (in %) |
||
Line 5,185: | Line 2,439: | ||
|- |
|- |
||
|#GO terms |
|#GO terms |
||
− | |colspan="4" | |
+ | |colspan="4" | 78 |
|- |
|- |
||
|true positive (in %) |
|true positive (in %) |
||
Line 5,200: | Line 2,454: | ||
|} |
|} |
||
− | As you can see in the |
+ | As you can see in the table above, each method only predicts a small subgroup of the real annotated GO terms. In general, GOPET seems to be the best method, because GOPET is the only method which predicts the GO Terms and in sum, it has mostly the best ratio by prediction true positive. Furthermore, it also predicts more GO terms than the other methods.<br> |
It was not possible to calculate the ratio between true positives and annotated GO terms for ProtFun, because this method has defined terms and only predicts the probability, that the protein belongs to these terms. <br> |
It was not possible to calculate the ratio between true positives and annotated GO terms for ProtFun, because this method has defined terms and only predicts the probability, that the protein belongs to these terms. <br> |
||
In general, you can say GO term prediction does not work very well and the prediction results only give hints of the function and localization of the protein.<br><br> |
In general, you can say GO term prediction does not work very well and the prediction results only give hints of the function and localization of the protein.<br><br> |
||
+ | Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs Disease]]<br> |
Latest revision as of 22:27, 30 August 2011
Contents
General Information
Secondary Structure Prediction
To analyse the secondary structure of our protein, we used PSIPRED, Jpred3 and DSSP. In the analysis section of this page we compare these three methods to see whether they give similar results or differ substantially.
[Here] you can find some general information about these methods.
Back to [Tay-Sachs Disease]
Prediction of disordered regions
After analysing the secondary structure, we also want to have a look at disordered regions in this protein. For this we used DISOPRED, POODLE in several variants, IUPred and Meta-Disorder. As with the secondary structure prediction methods, we compare the different methods and variants to see whether their predictions are similar, and we decide which method seems to be the best one for our purpose.
To get more insight into the methods and the theory behind them, we also offer you a [general information page].
Back to [Tay-Sachs Disease]
Prediction of transmembrane helices and signal peptides
The third big analysis section is the prediction of transmembrane helices and signal peptides. We merged both predictions into one section, because several prediction methods can predict both.
We used several methods: some predict only transmembrane helices, some predict only signal peptides, and some are combined methods.
To have a closer look at the different methods we again provide an [information page.]
Back to [Tay-Sachs Disease]
Prediction of GO Terms
The last section is about the analysis of GO Terms. As before, we used several methods and compared them to each other.
Again, we also provide a [general information page] about the GO Term methods we used in our analysis.
Back to [Tay-Sachs Disease]
Secondary Structure prediction
Results
The detailed output of the different prediction methods can be found [here]
Here we only present a short summary of the output of the different methods.
- Predicted Helices
method | #helices |
PSIPRED | 14 |
Jpred3 | 14 |
DSSP | 16 |
- Predicted Beta-Sheets
method | #sheets |
PSIPRED | 15 |
Jpred3 | 15 |
DSSP | 0 |
Comparison of the different methods
To determine how successful our secondary structure predictions with PSIPRED and Jpred3 were, we compared them with the secondary structure assignment of DSSP. First of all, DSSP assigns no beta-sheets, whereas both prediction methods predict some beta-sheets. Therefore, the main comparison in this case refers to the alpha-helices.
For PSIPRED the prediction of the alpha-helices was good. In most cases the alpha-helices of DSSP and PSIPRED correspond. Only one helix predicted by PSIPRED is not assigned as a helix by DSSP. Furthermore, three helices assigned by DSSP were not predicted by PSIPRED. Most of the helices that appear in only one output are very small.
For Jpred3 the prediction of the alpha-helices was sufficiently good. In most cases it agrees with DSSP. Only two helices predicted by Jpred3 are not assigned by DSSP. Conversely, three small helices assigned by DSSP are not predicted by Jpred3. In one further case, DSSP assigns two helices separated by a turn where Jpred3 predicts a single long helix.
All in all, the prediction of the helices is good, because it mostly corresponds to the assignment of DSSP. The only negative aspect is that both prediction methods predict a lot of sheets, none of which were assigned by DSSP.
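The helix-by-helix comparison described above can be sketched in Python. This is only a minimal sketch, assuming each method's output is available as a per-residue string of H (helix), E (strand) and C (coil). The two strings are toy data, not the real HEXA output, and the function names are hypothetical.

```python
from itertools import groupby

def segments(ss, state="H"):
    """Return 0-based inclusive (start, end) pairs for each run of `state`."""
    found, pos = [], 0
    for key, run in groupby(ss):
        length = len(list(run))
        if key == state:
            found.append((pos, pos + length - 1))
        pos += length
    return found

def unmatched(query, reference):
    """Segments of `query` that overlap no segment of `reference`."""
    return [q for q in query
            if not any(q[0] <= r[1] and r[0] <= q[1] for r in reference)]

# Toy data (assumption): a prediction and a DSSP-like assignment of equal length.
predicted = "CC" + "H" * 4 + "C" * 4 + "E" * 4 + "CC" + "H" * 6 + "CC"
assigned = "C" + "H" * 5 + "C" + "H" * 3 + "C" * 14

pred_helices = segments(predicted)
dssp_helices = segments(assigned)

print("predicted only:", unmatched(pred_helices, dssp_helices))
print("assigned only: ", unmatched(dssp_helices, pred_helices))
print("predicted strands:", segments(predicted, "E"))
```

Counting helices by overlap rather than by exact boundaries matches the way the comparison is worded here: a predicted helix counts as found as soon as it shares at least one residue with an assigned helix.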
Back to [Tay-Sachs Disease]
Prediction of disordered regions
Before analysing the results of the different methods, we checked whether our protein has one or more known disordered regions. We searched for our protein in the [DisProt database] and did not find it, so we assume that our protein does not have any disordered regions. Another way to find out whether a protein has disordered regions is to check whether its [UniProt] entry links to [DisProt].
Results
The detailed results of the different methods can be found [here]
In this section, we only want to give a summary of the output of the different methods.
method | #disordered regions in the protein | #disordered regions on the brink |
Disopred | 0 | 2 |
POODLE-I | 3 | 2 |
POODLE-L | 0 | 0 |
POODLE-S (B-factors) | 3 | 2 |
POODLE-S (missing residues) | 4 | 2 |
IUPred (short) | 0 | 2 |
IUPred (long) | 0 | 0 |
IUPred (structural information) | 0 | 0 |
Meta-Disorder | 0 | 0 |
Comparison of the different POODLE variants
POODLE-L does not find any disordered regions. This is the result we expected, because our protein does not possess any disordered regions.
Both POODLE-S variants found several short disordered regions, which is a false positive result. Interestingly, the missing-residues variant predicts more disordered regions than the high B-factor variant.
POODLE-I found the same result as POODLE-S with high B-factor, which was expected, because POODLE-I combines POODLE-L and POODLE-S (high B-factor).
The predictions of short disordered regions are therefore wrong; only the prediction of POODLE-L is correct.
In general, these predictions are used when nothing is known about the protein, so normally we would not know that a prediction is wrong. We therefore treat the results as if they were trustworthy and check whether the predicted disordered regions overlap with the functionally important residues, because disordered regions are often functionally important. We check this for POODLE-S (missing residues) and POODLE-I only, because POODLE-S (high B-factor) shows the same result as POODLE-I.
functional residues | disordered | |||
---|---|---|---|---|
residue position | amino acid | function | POODLE-S (missing) | POODLE-I |
323 | E | active site | ordered | ordered |
115 | N | Glycosylation | ordered | ordered |
157 | N | Glycosylation | ordered | ordered |
259 | N | Glycosylation | ordered | ordered |
58 (connected with 104) | C | Disulfide bond | disordered | ordered |
104 (connected with 58) | C | Disulfide bond | disordered | ordered |
277 (connected with 328) | C | Disulfide bond | ordered | ordered |
328 (connected with 277) | C | Disulfide bond | ordered | ordered |
505 (connected with 522) | C | Disulfide bond | ordered | ordered |
522 (connected with 505) | C | Disulfide bond | ordered | ordered |
As you can see in the table above, only one disulfide bond is located in a predicted disordered region, and only according to POODLE-S (missing residues). All other functionally important residues are located in ordered regions. This is a further hint that the predictions are wrong.
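The overlap check behind this table can be sketched as follows. The residue positions and functions are taken from the table. The disordered intervals are hypothetical placeholders, because the actual predicted intervals are only listed in the detailed results, and the helper name is an assumption.

```python
# Functionally important residues (positions and functions from the table above).
functional = {
    323: "active site",
    115: "glycosylation", 157: "glycosylation", 259: "glycosylation",
    58: "disulfide bond", 104: "disulfide bond",
    277: "disulfide bond", 328: "disulfide bond",
    505: "disulfide bond", 522: "disulfide bond",
}

# Hypothetical predicted disordered intervals (1-based, inclusive).
# Placeholders for illustration only, not real POODLE output.
predicted_disorder = [(50, 60), (100, 110)]

def is_disordered(position, regions):
    """True if `position` falls inside any (start, end) interval."""
    return any(start <= position <= end for start, end in regions)

status = {
    pos: "disordered" if is_disordered(pos, predicted_disorder) else "ordered"
    for pos in sorted(functional)
}

for pos, label in status.items():
    print(f"{pos:>4}  {functional[pos]:<15} {label}")
```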
Comparison of the different methods
To compare the results of the different methods, we counted how many residues each method predicts as disordered. In our case every such residue is a wrong prediction.
methods | |||||||||
Disopred | POODLE-I | POODLE-L | POODLE-S (missing) | POODLE-S (B-factor) | IUPred (short) | IUPred (long) | IUPred (structure) | Meta-Disorder | |
#wrong predicted residues | 5 | 23 | 0 | 47 | 24 | 3 | 0 | 0 | 0 |
POODLE-L, IUPred (long), IUPred (structure) and Meta-Disorder predict no disordered residues, which is correct.
The worst prediction result came from POODLE-S (missing), which predicts 47 residues as disordered, followed by POODLE-S (B-factor) (24 wrongly predicted residues) and POODLE-I (23 wrongly predicted residues).
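Counting the residues predicted as disordered is simple once each prediction is reduced to a list of intervals. The following is a minimal sketch, assuming 1-based inclusive intervals and a sequence length of 529 residues (the HEXA length used elsewhere on this page). The intervals are made up for illustration and the function name is hypothetical.

```python
def count_disordered(regions, length):
    """Count distinct residues covered by the (start, end) intervals.

    Overlapping intervals are only counted once, and positions outside
    1..length are ignored.
    """
    flagged = set()
    for start, end in regions:
        flagged.update(range(max(start, 1), min(end, length) + 1))
    return len(flagged)

# Made-up intervals: two overlapping ones at the N-terminus, one at the C-terminus.
example_regions = [(1, 3), (2, 5), (528, 529)]
print(count_disordered(example_regions, 529))
```

Because the protein is assumed to be fully ordered, this count is directly the number of wrongly predicted residues for a method.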
Back to [Tay-Sachs Disease]
Prediction of transmembrane alpha-helices and signal peptides
Because most of the proteins we used in this practical are not membrane proteins, we got five additional proteins for the transmembrane and signal peptide analyses.
Additional proteins:
name | organism | location | transmembrane protein | sequence |
BACR_HALSA | Halobacterium salinarium (Archaea) | Cell membrane | Multi-pass membrane protein | [P02945.fasta] |
RET4_HUMAN | Human (Homo sapiens) | extracellular space | No | [P02753.fasta] |
INSL5_HUMAN | Human (Homo sapiens) | extracellular region | No | [Q9Y5Q6.fasta] |
LAMP1_HUMAN | Human (Homo sapiens) | Cell membrane | Single-pass membrane protein | [P11279.fasta] |
A4_HUMAN | Human (Homo sapiens) | Cell membrane | Single-pass membrane protein | [P05067.fasta] |
The detailed output for the different proteins and the different prediction methods can be found here:
- [HEXA_HUMAN]
- [BACR_HALSA]
- [RET4_HUMAN]
- [INSL5_HUMAN]
- [LAMP1_HUMAN]
- [A4_HUMAN]
Results
Transmembrane Helices
TMHMM | Phobius | PolyPhobius | OCTOPUS | SPOCTOPUS | |||||||||||
protein | start position | end position | location | start position | end position | location | start position | end position | location | start position | end position | location | start position | end position | location |
HEXA HUMAN | 1 | 529 | outside | 23 | 529 | outside | 20 | 520 | outside | 1 | 2 | inside | 22 | 529 | outside |
3 | 23 | TM helix | |||||||||||||
24 | 529 | outside | |||||||||||||
BACR HALSA | 1 | 22 | outside | 1 | 22 | outside | 1 | 22 | outside | ||||||
23 | 42 | TM Helix | 23 | 42 | TM helix | 22 | 43 | TM helix | 23 | 43 | TM helix | 23 | 43 | TM helix | |
43 | 54 | inside | 43 | 53 | inside | 44 | 54 | inside | 44 | 54 | inside | 44 | 54 | inside | |
55 | 77 | TM Helix | 54 | 76 | TM helix | 55 | 77 | TM helix | 55 | 75 | TM helix | 55 | 75 | TM helix | |
78 | 91 | outside | 77 | 95 | outside | 78 | 94 | outside | 76 | 95 | outside | 76 | 95 | outside | |
92 | 114 | TM Helix | 96 | 114 | TM helix | 95 | 114 | TM helix | 96 | 116 | TM helix | 96 | 116 | TM helix | |
115 | 120 | inside | 115 | 120 | inside | 115 | 120 | inside | 117 | 121 | inside | 117 | 120 | inside | |
121 | 143 | TM Helix | 121 | 142 | TM helix | 121 | 141 | TM helix | 122 | 142 | TM helix | 121 | 141 | TM helix | |
144 | 147 | outside | 143 | 147 | outside | 142 | 147 | outside | 143 | 147 | outside | 142 | 147 | outside | |
148 | 170 | TM Helix | 148 | 169 | TM helix | 148 | 166 | TM helix | 148 | 168 | TM helix | 148 | 168 | TM helix | |
171 | 189 | inside | 170 | 189 | inside | 167 | 186 | inside | 169 | 185 | inside | 169 | 185 | inside | |
190 | 212 | TM Helix | 190 | 212 | TM helix | 187 | 205 | TM helix | 186 | 206 | TM helix | 186 | 206 | TM helix | |
213 | 262 | outside | 213 | 217 | outside | 206 | 215 | outside | 207 | 216 | outside | 207 | 216 | outside | |
218 | 237 | TM helix | 216 | 237 | TM helix | 217 | 237 | TM helix | 217 | 237 | TM helix | ||||
238 | 262 | inside | 238 | 262 | inside | 238 | 262 | inside | 238 | 262 | inside | ||||
RET4 HUMAN | 1 | 1 | inside | ||||||||||||
2 | 23 | TM helix | |||||||||||||
1 | 201 | outside | 19 | 201 | outside | 19 | 201 | outside | 24 | 201 | outside | 20 | 201 | outside | |
INSL5 HUMAN | 1 | 1 | inside | ||||||||||||
2 | 32 | TM helix | |||||||||||||
1 | 135 | outside | 23 | 135 | outside | 23 | 135 | outside | 33 | 135 | outside | 24 | 135 | outside | |
LAMP1 HUMAN | 1 | 10 | inside | 1 | 10 | inside | |||||||||
11 | 33 | TM Helix | 11 | 31 | TM helix | ||||||||||
34 | 383 | outside | 29 | 381 | outside | 29 | 381 | outside | 32 | 383 | outside | 30 | 383 | outside | |
384 | 406 | TM Helix | 382 | 405 | TM helix | 382 | 405 | TM helix | 384 | 404 | TM helix | 384 | 404 | TM helix | |
407 | 417 | inside | 406 | 417 | outside | 406 | 417 | outside | 405 | 417 | outside | 405 | 417 | outside | |
A4 HUMAN | 1 | 5 | outside | ||||||||||||
6 | 11 | R | |||||||||||||
1 | 700 | outside | 18 | 700 | outside | 18 | 700 | outside | 12 | 701 | outside | 19 | 701 | outside | |
701 | 723 | TM Helix | 701 | 723 | TM helix | 701 | 723 | TM helix | 702 | 722 | TM helix | 702 | 722 | TM helix | |
724 | 770 | inside | 724 | 770 | inside | 724 | 770 | inside | 723 | 770 | inside | 723 | 770 | inside |
The table above summarises the results of the different methods which predict transmembrane helices. As you can see, OCTOPUS often predicts a transmembrane helix where none of the other methods predicts one. Phobius, PolyPhobius and SPOCTOPUS always show very similar results, whereas TMHMM and OCTOPUS differ from these results.
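One quick way to compare such topology predictions is to count the transmembrane segments per method. The sketch below uses the BACR_HALSA segments for TMHMM and Phobius as read from the table above; the column assignment in the flattened table is our reading, and the function names are hypothetical.

```python
# (start, end, location) segments for BACR_HALSA, as read from the table above.
tmhmm = [
    (1, 22, "outside"), (23, 42, "TM"), (43, 54, "inside"), (55, 77, "TM"),
    (78, 91, "outside"), (92, 114, "TM"), (115, 120, "inside"),
    (121, 143, "TM"), (144, 147, "outside"), (148, 170, "TM"),
    (171, 189, "inside"), (190, 212, "TM"), (213, 262, "outside"),
]
phobius = [
    (1, 22, "outside"), (23, 42, "TM"), (43, 53, "inside"), (54, 76, "TM"),
    (77, 95, "outside"), (96, 114, "TM"), (115, 120, "inside"),
    (121, 142, "TM"), (143, 147, "outside"), (148, 169, "TM"),
    (170, 189, "inside"), (190, 212, "TM"), (213, 217, "outside"),
    (218, 237, "TM"), (238, 262, "inside"),
]

def count_tm(segments):
    """Number of segments labelled as transmembrane helix."""
    return sum(1 for _, _, label in segments if label == "TM")

def is_contiguous(segments):
    """True if every segment starts right after the previous one ends."""
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(segments, segments[1:]))

for name, segs in (("TMHMM", tmhmm), ("Phobius", phobius)):
    print(name, count_tm(segs), "TM helices, contiguous:", is_contiguous(segs))
```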
Signal Peptide
Phobius | PolyPhobius | SPOCTOPUS | TargetP | SignalP | |||||
protein | start position | end position | start position | end position | start position | end position | location | start position | end position |
HEXA HUMAN | 1 | 22 | 1 | 19 | 7 | 21 | secretory pathway | 1 | 22 |
BACR HALSA | no prediction available | secretory pathway | 1 | 38 | |||||
RET4 HUMAN | 1 | 18 | 1 | 18 | 6 | 19 | secretory pathway | 1 | 18 |
INSL5 HUMAN | 1 | 22 | 1 | 22 | 6 | 23 | secretory pathway | 1 | 22 |
LAMP1 HUMAN | 1 | 28 | 1 | 28 | 12 | 29 | secretory pathway | 1 | 28 |
A4 HUMAN | 1 | 17 | 1 | 17 | 5 | 18 | secretory pathway | 1 | 15 |
The table above lists the signal peptide predictions of the different methods. At first glance, almost every method predicts a signal peptide for every protein, although the predicted end positions differ. Phobius, PolyPhobius and SPOCTOPUS returned no prediction for BACR_HALSA. Furthermore, TargetP does not predict the position of the signal peptide; it only predicts the location of the protein.
Comparison of the different methods
We decided to split the comparison of the methods, because it is unfair to directly compare a method which cannot predict signal peptides with a method which can. Therefore, we split the comparison into three parts: one for transmembrane helices, one for signal peptides and one for the combination of both.
- Comparison of transmembrane helix prediction
Here we compared TMHMM, OCTOPUS and the transmembrane predictions of SPOCTOPUS, Phobius and PolyPhobius. In this comparison we skipped the first residues, which are signal peptides, because the transmembrane-only prediction methods can predict this region as a transmembrane helix, which is wrong.
For this comparison we counted the residues wrongly predicted as transmembrane, as outside and as inside.
methods | |||||||
TMHMM | Phobius | PolyPhobius | OCTOPUS | SPOCTOPUS | Transmembrane protein | ||
HEXA_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 0 | 0 | no |
#wrong outside | 0 | 0 | 0 | 0 | 0 | ||
#wrong inside | 0 | 0 | 0 | 0 | 0 | |
#wrong sum | 0 | 0 | 0 | 0 | 0 | ||
%wrong predicted | 0% | 0% | 0% | 0% | 0% | ||
BACR_HALSA | #wrong transmembrane | 24 | 20 | 12 | 16 | 11 | yes (7 transmembrane helices) |
#wrong outside | 46 | 5 | 3 | 4 | 6 | ||
#wrong inside | 4 | 4 | 2 | 0 | 0 | ||
#wrong sum | 74 | 29 | 17 | 20 | 17 | ||
%wrong predicted | 29% | 11% | 6% | 8% | 6% | ||
RET4_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 5 | 0 | no |
#wrong outside | 0 | 0 | 0 | 0 | 0 | ||
#wrong inside | 0 | 0 | 0 | 0 | 0 | ||
#wrong sum | 0 | 0 | 0 | 5 | 0 | ||
%wrong predicted | 0% | 0% | 0% | 2% | 0% | ||
INSL5_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 10 | 0 | no |
#wrong outside | 0 | 0 | 0 | 0 | 0 | ||
#wrong inside | 0 | 0 | 0 | 0 | 0 | ||
#wrong sum | 0 | 0 | 0 | 10 | 0 | ||
%wrong predicted | 0% | 0% | 0% | 8% | 0% | ||
LAMP1_HUMAN | #wrong transmembrane | 5 | 3 | 4 | 3 | 1 | yes (single-spanning) |
#wrong outside | 2 | 0 | 0 | 1 | 1 | ||
#wrong inside | 0 | 0 | 0 | 1 | 1 | ||
#wrong sum | 7 | 3 | 4 | 5 | 3 | ||
%wrong predicted | 2% | 0% | 1% | 1% | 0% | ||
A4_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 0 | 0 | yes (single-spanning) |
#wrong outside | 1 | 1 | 1 | 1 | 2 | ||
#wrong inside | 0 | 0 | 0 | 1 | 1 | ||
#wrong sum | 1 | 1 | 1 | 2 | 3 | ||
%wrong predicted | 0% | 0% | 0% | 0% | 0% | ||
Average number of wrong predicted residues | |||||||
13.6 | 5.5 | 3.6 | 7 | 3.8 |
TMHMM is the worst prediction method. This can also be seen in the example of BACR_HALSA, where TMHMM is the only prediction method which does not recognize all 7 transmembrane helices.
SPOCTOPUS and PolyPhobius are the best prediction methods.
In general, the prediction of transmembrane helices works quite well and almost all predictions are very close to the real protein.
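The per-residue error counts used in this comparison can be sketched as follows. The predicted segments are the TMHMM result for A4_HUMAN from the results table. The reference topology and the signal peptide length of 17 are assumptions made for this illustration, and the function names are hypothetical.

```python
from collections import Counter

def expand(segments, length):
    """Turn (start, end, label) segments (1-based, inclusive) into a per-residue list."""
    labels = [None] * length
    for start, end, label in segments:
        for pos in range(start, end + 1):
            labels[pos - 1] = label
    return labels

def wrong_counts(predicted, reference, skip=0):
    """Count mismatching residues, grouped by the (wrong) predicted label.

    The first `skip` residues (the signal peptide) are ignored.
    """
    counts = Counter()
    for pred, ref in zip(predicted[skip:], reference[skip:]):
        if pred != ref:
            counts[pred] += 1
    return counts

length = 770
# TMHMM prediction for A4_HUMAN, from the results table.
predicted = expand([(1, 700, "outside"), (701, 723, "TM"), (724, 770, "inside")], length)
# Assumed reference topology, for illustration only.
reference = expand([(1, 699, "outside"), (700, 723, "TM"), (724, 770, "inside")], length)

wrong = wrong_counts(predicted, reference, skip=17)
total = sum(wrong.values())
print(dict(wrong), "sum:", total, "percent:", round(100 * total / length, 1))
```

Grouping the errors by the predicted label gives the three rows of the table (wrong transmembrane, wrong outside, wrong inside) from a single pass over the sequence.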
- Comparison of signal peptide prediction
Now we compared TargetP and SignalP, which only predict signal peptides, together with the signal peptide predictions of SPOCTOPUS, Phobius and PolyPhobius.
TargetP does not predict the start and end position of the signal peptide; it predicts only the location of the protein.
methods | |||||||
real position | Phobius | PolyPhobius | SPOCTOPUS | TargetP | SignalP | ||
HEXA_HUMAN | stop position | 22 | 22 | 19 | 21 | no prediction | 22 |
#wrong residues | 0 | 3 | 3 | no prediction | 0 | ||
location | secretory pathway | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
BACR_HALSA | stop position | not available | no prediction | no prediction | no prediction | no prediction | no consensus prediction |
#wrong predicted | not available | not available | not available | not available | no prediction | not available | |
location | membrane | not available | not available | not available | secretory pathway | non-signal peptide | |
RET4_HUMAN | stop position | 18 | 18 | 18 | 19 | no prediction | 18 |
#wrong predicted | 0 | 0 | 1 | no prediction | 0 | ||
location | secretory pathway | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
INSL5_HUMAN | stop position | 22 | 22 | 22 | 22 | no prediction | 22 |
#wrong residues | 0 | 0 | 0 | no prediction | 0 | ||
location | secretory pathway | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
LAMP1_HUMAN | stop position | 28 | 28 | 28 | 29 | no prediction | 28 |
#wrong residues | 0 | 0 | 1 | no prediction | 0 | ||
location | transmembrane helix | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
A4_HUMAN | stop position | 17 | 17 | 17 | 18 | no prediction | 17 |
#wrong residues | 0 | 0 | 1 | no prediction | 0 | ||
location | transmembrane helix | secretory pathway | secretory pathway | no prediction | secretory pathway | secretory pathway | |
Average number of wrong prediction | |||||||
sum of wrong predicted residues | 0 | 3 | 2 | no prediction | 0 | ||
#right predicted locations / #predicted locations | 3/5 | 3/5 | no prediction | 3/5 | no prediction |
SPOCTOPUS and SignalP do not predict the location of the protein; they only predict the start and stop position of the signal peptide. In addition, SignalP predicts whether the sequence contains a signal peptide at all.
In contrast, TargetP only predicts the location of the protein, not the start and stop position of the signal peptide. Only Phobius and PolyPhobius predict both.
Therefore, it is difficult to compare the different methods. Phobius and PolyPhobius are more powerful than the other prediction methods, because they predict both. On average they predict the location and the position as well as the other prediction methods. None of the methods identified the transmembrane proteins as such; all methods assigned them to the secretory pathway. It is therefore useful to use Phobius or PolyPhobius, because they predict more than the other methods. Furthermore, both methods can also predict transmembrane helices.
The results of Phobius were slightly better than the results of PolyPhobius.
We also want to mention that SignalP lets you choose between predictions for eukaryotes, gram-positive bacteria and gram-negative bacteria. Our analysis included BACR_HALSA, which is an archaeal protein. We tested all three SignalP variants on this protein and all three failed: BACR_HALSA does not possess a signal peptide, but every variant predicts one. Only the eukaryotic variant recognized a signal anchor for BACR_HALSA, whereas the other two variants could not give a prediction of the location.
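The page does not spell out how the wrongly predicted signal peptide residues were counted. One plausible rule, assumed here, is the absolute difference between the predicted and the real stop position. The sketch applies this rule to the RET4_HUMAN row of the table above; the variable names are hypothetical.

```python
# Stop positions for RET4_HUMAN, taken from the comparison table above.
real_stop = 18
predicted_stop = {
    "Phobius": 18,
    "PolyPhobius": 18,
    "SPOCTOPUS": 19,
    "SignalP": 18,
}

# Assumed counting rule: residues between the predicted and the real stop position.
wrong_residues = {
    method: abs(stop - real_stop) for method, stop in predicted_stop.items()
}

for method, wrong in wrong_residues.items():
    print(f"{method:<12} {wrong}")
```

This rule ignores the predicted start position, so it may not reproduce every row of the table.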
- Comparison of the combined methods
Finally, we compared the combined methods. SPOCTOPUS, Phobius and PolyPhobius can predict transmembrane helices as well as signal peptides. Therefore, we combined our two previous comparisons.
methods | ||||
Phobius | PolyPhobius | SPOCTOPUS | ||
HEXA_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 0 | 3 | 2 | |
location | right | right | no prediction | |
BACR_HALSA | #wrong predicted residues (TM) | 29 | 17 | 17 |
#wrong predicted residues (SP) | n.a. | n.a. | n.a. | |
location | n.a | n.a | no prediction | |
RET4_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 0 | 0 | 0 | |
location | right | right | no prediction | |
INSL5_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 0 | 0 | 1 | |
location | right | right | no prediction | |
LAMP1_HUMAN | #wrong predicted residues (TM) | 3 | 4 | 3 |
#wrong predicted residues (SP) | 0 | 0 | 0 | |
location | wrong | wrong | no prediction | |
A4_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 1 | 1 | 3 | |
location | wrong | wrong | no prediction | |
Average | ||||
avg(#wrong predicted residues (TM)) | 5.3 | 3.5 | 3.3 | |
avg(#wrong predicted residues (SP)) | 0.1 | 0.6 | 1 | |
#location (right predicted) / #location(predicted) | 3/5 | 3/5 | no prediction |
In general, PolyPhobius gave the best results. Although it predicts the signal peptide stop position slightly worse than Phobius, its transmembrane prediction is clearly better than that of Phobius. The predictions of SPOCTOPUS are also good, but SPOCTOPUS does not predict the location of the protein.
Therefore, PolyPhobius seems to be a good choice; on average it is the best method for combined transmembrane and signal peptide prediction.
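The averages in the last rows of the table can be recomputed from the per-protein values. The sketch below does this for the transmembrane errors, using the numbers from the table above in the order HEXA, BACR, RET4, INSL5, LAMP1, A4. The variable names are hypothetical.

```python
from statistics import mean

# Wrongly predicted transmembrane residues per protein, from the table above
# (order: HEXA, BACR, RET4, INSL5, LAMP1, A4).
tm_errors = {
    "Phobius": [0, 29, 0, 0, 3, 0],
    "PolyPhobius": [0, 17, 0, 0, 4, 0],
    "SPOCTOPUS": [0, 17, 0, 0, 3, 0],
}

averages = {method: mean(values) for method, values in tm_errors.items()}

for method, avg in averages.items():
    print(f"{method:<12} {avg:.2f}")
```

The recomputed means are about 5.33, 3.50 and 3.33, which agree with the table values of 5.3, 3.5 and 3.3 if the second decimal is dropped rather than rounded.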
Back to [Tay-Sachs Disease]
Signal Peptide
Phobius | PolyPhobius | SPOCTOPUS | TargetP | SignalP | |||||
protein | start position | end position | start position | end position | start position | end position | location | start position | end position |
HEXA HUMAN | 1 | 22 | 1 | 19 | 7 | 21 | secretory pathway | 1 | 22 |
BACR HALSA | no prediction available | secretory pathway | 1 | 38 | |||||
RET4 HUMAN | 1 | 18 | 1 | 18 | 6 | 19 | secretory pathway | 1 | 18 |
INSL5 HUMAN | 1 | 22 | 1 | 22 | 6 | 23 | secretory pathway | 1 | 22 |
LAMP1 HUMAN | 1 | 28 | 1 | 28 | 12 | 29 | secretory pathway | 1 | 28 |
A4 HUMAN | 1 | 17 | 1 | 17 | 5 | 18 | secretory pathway | 1 | 15 |
The table above lists the signal peptide predictions of the different methods.
Comparison of the different methods
We decided to split the comparison of the methods, because it is unfair to directly compare a method which cannot predict signal peptides with a method which can. Therefore, we split the comparison into three parts: one for transmembrane helices, one for signal peptides and one for the combination of both.
- Comparison of transmembrane helix prediction
Here we compared TMHMM, OCTOPUS and the transmembrane predictions of SPOCTOPUS, Phobius and PolyPhobius. In this comparison we skipped the first residues, which are signal peptides, because the transmembrane-only prediction methods can predict this region as a transmembrane helix, which is wrong.
For this comparison we counted the residues wrongly predicted as transmembrane, as outside and as inside.
methods | |||||||
TMHMM | Phobius | PolyPhobius | OCTOPUS | SPOCTOPUS | Transmembrane protein | ||
HEXA_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 0 | 0 | no |
#wrong outside | 0 | 0 | 0 | 0 | 0 | ||
#wrong inside | 0 | 0 | 0 | 0 | 0 | ||
#wrong sum | 0 | 0 | 0 | 0 | 0 | ||
%wrong predicted | 0% | 0% | 0% | 0% | 0% | ||
BACR_HALSA | #wrong transmembrane | 24 | 20 | 12 | 16 | 11 | yes (7 transmembrane helices) |
#wrong outside | 46 | 5 | 3 | 4 | 6 | ||
#wrong inside | 4 | 4 | 2 | 0 | 0 | ||
#wrong sum | 74 | 29 | 17 | 20 | 17 | ||
%wrong predicted | 29% | 11% | 6% | 8% | 6% | ||
RET4_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 5 | 0 | no |
#wrong outside | 0 | 0 | 0 | 0 | 0 | ||
#wrong inside | 0 | 0 | 0 | 0 | 0 | ||
#wrong sum | 0 | 0 | 0 | 5 | 0 | ||
%wrong predicted | 0% | 0% | 0% | 2% | 0% | ||
INSL5_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 10 | 0 | no |
#wrong outside | 0 | 0 | 0 | 0 | 0 | ||
#wrong inside | 0 | 0 | 0 | 0 | 0 | ||
#wrong sum | 0 | 0 | 0 | 10 | 0 | ||
%wrong predicted | 0% | 0% | 0% | 8% | 0% | ||
LAMP1_HUMAN | #wrong transmembrane | 5 | 3 | 4 | 3 | 1 | yes (single-spanning) |
#wrong outside | 2 | 0 | 0 | 1 | 1 | ||
#wrong inside | 0 | 0 | 0 | 1 | 1 | ||
#wrong sum | 7 | 3 | 4 | 5 | 3 | ||
%wrong predicted | 2% | 0% | 1% | 1% | 0% | ||
A4_HUMAN | #wrong transmembrane | 0 | 0 | 0 | 0 | 0 | yes (single-spanning) |
#wrong outside | 1 | 1 | 1 | 1 | 2 | ||
#wrong inside | 0 | 0 | 0 | 1 | 1 | ||
#wrong sum | 1 | 1 | 1 | 2 | 3 | ||
%wrong predicted | 0% | 0% | 0% | 0% | 0% | ||
Average number of wrong predicted residues | |||||||
13.6 | 5.5 | 3.6 | 7 | 3.8 |
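The averages in the last row can be reproduced from the "#wrong sum" rows of the table. The following is a minimal sketch, assuming the average is simply the mean of the six per-protein sums; the names `wrong_sum` and `average_errors` are illustrative and not part of the original analysis. The table values appear to be truncated to one decimal rather than rounded.

```python
# "#wrong sum" per protein, copied from the table above.
# Order: HEXA, BACR, RET4, INSL5, LAMP1, A4
wrong_sum = {
    "TMHMM":       [0, 74, 0, 0, 7, 1],
    "Phobius":     [0, 29, 0, 0, 3, 1],
    "PolyPhobius": [0, 17, 0, 0, 4, 1],
    "OCTOPUS":     [0, 20, 5, 10, 5, 2],
    "SPOCTOPUS":   [0, 17, 0, 0, 3, 3],
}

def average_errors(sums):
    """Mean number of wrongly predicted residues per method (assumed definition)."""
    return {method: sum(values) / len(values) for method, values in sums.items()}

averages = average_errors(wrong_sum)

# Print the methods from fewest to most wrongly predicted residues.
for method, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{method:12s} {avg:.2f}")
```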
TMHMM is the worst prediction method. This is also visible for BACR_HALSA, where TMHMM is the only prediction method that does not recognize the 7 transmembrane helices.
SPOCTOPUS and PolyPhobius are the best prediction methods.
In general, the prediction of transmembrane helices works quite well and almost all predictions are very close to the real protein.
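The per-residue counting used in this comparison can be sketched as follows. This is only an illustration under stated assumptions: topologies are encoded as strings with one character per residue (`M` for membrane, `o` for outside, `i` for inside), a mismatch is attributed to the state that was predicted, and the leading signal peptide residues are skipped via a `skip` parameter. The function name, the encoding and the toy sequences are hypothetical and not taken from the original analysis.

```python
from collections import Counter

def count_wrong(predicted, reference, skip=0):
    """Count mismatching residues, grouped by the predicted state.

    Assumed encoding: 'M' = membrane, 'o' = outside, 'i' = inside.
    skip: number of leading residues (e.g. a signal peptide) to ignore.
    Returns (per-state counts, total wrong, percent wrong).
    """
    if len(predicted) != len(reference):
        raise ValueError("topology strings must have equal length")
    wrong = Counter({"M": 0, "o": 0, "i": 0})
    for pred_state, ref_state in zip(predicted[skip:], reference[skip:]):
        if pred_state != ref_state:
            wrong[pred_state] += 1
    total = sum(wrong.values())
    compared = len(reference) - skip
    percent = 100.0 * total / compared if compared else 0.0
    return dict(wrong), total, percent

# Toy example (not real data): the predicted helix starts one residue early
# and the last residue is placed on the wrong side of the membrane.
reference = "oooooMMMMMiiiii"
predicted = "ooooMMMMMMiiiio"
per_state, total, percent = count_wrong(predicted, reference)
print(per_state, total, round(percent, 1))
```

Attributing each error to the predicted state is one possible convention; counting by the reference state instead would give the same total but a different split between the three rows.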
- Comparison of signal peptide prediction
Next we compared TargetP and SignalP, which only predict signal peptides, together with SPOCTOPUS, Phobius and PolyPhobius.
TargetP does not predict the start and end position of the signal peptide; it only predicts the location of the protein.
methods | |||||||
real position | Phobius | PolyPhobius | SPOCTOPUS | TargetP | SignalP | ||
HEXA_HUMAN | stop position | 22 | 22 | 19 | 21 | no prediction | 22 |
#wrong residues | 0 | 3 | 3 | no prediction | 0 | ||
location | secretory pathway | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
BACR_HALSA | stop position | not available | no prediction | no prediction | no prediction | no prediction | no consensus prediction |
#wrong residues | not available | not available | not available | not available | no prediction | not available | |
location | membrane | not available | not available | not available | secretory pathway | non-signal peptide | |
RET4_HUMAN | stop position | 18 | 18 | 18 | 19 | no prediction | 18 |
#wrong residues | 0 | 0 | 1 | no prediction | 0 | ||
location | secretory pathway | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
INSL5_HUMAN | stop position | 22 | 22 | 22 | 22 | no prediction | 22 |
#wrong residues | 0 | 0 | 0 | no prediction | 0 | ||
location | secretory pathway | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
LAMP1_HUMAN | stop position | 28 | 28 | 28 | 29 | no prediction | 28 |
#wrong residues | 0 | 0 | 1 | no prediction | 0 | ||
location | transmembrane helix | secretory pathway | secretory pathway | no prediction | secretory pathway | no prediction | |
A4_HUMAN | stop position | 17 | 17 | 17 | 18 | no prediction | 17 |
#wrong residues | 0 | 0 | 1 | no prediction | 0 | ||
location | transmembrane helix | secretory pathway | secretory pathway | no prediction | secretory pathway | secretory pathway | |
Average number of wrong predictions | |||||||
sum of wrong predicted residues | 0 | 3 | 2 | no prediction | 0 | ||
#right predicted locations / #predicted locations | 3/5 | 3/5 | no prediction | 3/5 | no prediction |
SPOCTOPUS and SignalP do not predict the location of the protein; they only predict the start and stop position of the signal peptide. Furthermore, SignalP predicts whether the sequence contains a signal peptide at all.
In contrast, TargetP only predicts the location of the protein, not the start and stop position of the signal peptide. Only Phobius and PolyPhobius predict both.
Therefore, it is difficult to compare the different methods. First of all, Phobius and PolyPhobius are more powerful than the other prediction methods, because they predict both. On average they predict the location and the position as well as the other prediction methods. None of the methods recognized the transmembrane proteins; all methods assigned them to the secretory pathway. Therefore, it is useful to use Phobius or PolyPhobius, because they predict more than the other methods. Furthermore, both methods can also predict transmembrane helices.
The results of Phobius were slightly better than the results of PolyPhobius.
We also want to mention that SignalP lets you choose between prediction modes for eukaryotes, Gram-positive bacteria and Gram-negative bacteria. Our analysis also included BACR_HALSA, which is an archaeal protein. We tested all three prediction modes for this protein and all three failed: BACR_HALSA does not possess a signal peptide, but every mode predicts one. Only the eukaryotic mode recognized a signal anchor for BACR_HALSA, whereas the other two modes could not give a prediction of the location.
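The two measures used in the signal peptide table can be sketched as follows. The sketch assumes that "#wrong residues" is the absolute difference between the predicted and the annotated stop position, and that the location score is the number of matching locations over the number of proteins with an annotated location. The stop positions and locations are copied from the table for RET4_HUMAN, INSL5_HUMAN, LAMP1_HUMAN and A4_HUMAN; whether the HEXA_HUMAN counts in the table follow the same definition is not clear from the text. All function and variable names are illustrative.

```python
# Annotated and predicted signal peptide stop positions (from the table above).
annotated_stop = {"RET4_HUMAN": 18, "INSL5_HUMAN": 22, "LAMP1_HUMAN": 28, "A4_HUMAN": 17}
predicted_stop = {
    "Phobius":     {"RET4_HUMAN": 18, "INSL5_HUMAN": 22, "LAMP1_HUMAN": 28, "A4_HUMAN": 17},
    "PolyPhobius": {"RET4_HUMAN": 18, "INSL5_HUMAN": 22, "LAMP1_HUMAN": 28, "A4_HUMAN": 17},
    "SPOCTOPUS":   {"RET4_HUMAN": 19, "INSL5_HUMAN": 22, "LAMP1_HUMAN": 29, "A4_HUMAN": 18},
}

def stop_errors(predicted, annotated):
    """Assumed definition: |predicted stop - annotated stop| for each protein."""
    return {protein: abs(predicted[protein] - annotated[protein]) for protein in annotated}

# Annotated locations (BACR_HALSA is left out, no method prediction is available).
annotated_location = {
    "HEXA_HUMAN": "secretory pathway",
    "RET4_HUMAN": "secretory pathway",
    "INSL5_HUMAN": "secretory pathway",
    "LAMP1_HUMAN": "transmembrane helix",
    "A4_HUMAN": "transmembrane helix",
}
# Phobius assigns every one of these proteins to the secretory pathway.
phobius_location = {protein: "secretory pathway" for protein in annotated_location}

def location_accuracy(predicted, annotated):
    """Return (number of correct locations, number of compared proteins)."""
    hits = sum(predicted[protein] == annotated[protein] for protein in annotated)
    return hits, len(annotated)

for method, stops in predicted_stop.items():
    print(method, stop_errors(stops, annotated_stop))
print("Phobius location: %d/%d" % location_accuracy(phobius_location, annotated_location))
```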
- Comparison of the combined methods
The last thing we wanted to compare was the combined methods. SPOCTOPUS, Phobius and PolyPhobius can predict transmembrane helices as well as signal peptides. Therefore, we combined our two previous comparisons.
methods | ||||
Phobius | PolyPhobius | SPOCTOPUS | ||
HEXA_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 0 | 3 | 2 | |
location | right | right | no prediction | |
BACR_HALSA | #wrong predicted residues (TM) | 29 | 17 | 17 |
#wrong predicted residues (SP) | n.a. | n.a. | n.a. | |
location | n.a | n.a | no prediction | |
RET4_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 0 | 0 | 0 | |
location | right | right | no prediction | |
INSL5_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 0 | 0 | 1 | |
location | right | right | no prediction | |
LAMP1_HUMAN | #wrong predicted residues (TM) | 3 | 4 | 3 |
#wrong predicted residues (SP) | 0 | 0 | 0 | |
location | wrong | wrong | no prediction | |
A4_HUMAN | #wrong predicted residues (TM) | 0 | 0 | 0 |
#wrong predicted residues (SP) | 1 | 1 | 3 | |
location | wrong | wrong | no prediction | |
Average | ||||
avg(#wrong predicted residues (TM)) | 5.3 | 3.5 | 3.3 | |
avg(#wrong predicted residues (SP)) | 0.1 | 0.6 | 1 | |
#location (right predicted) / #location(predicted) | 3/5 | 3/5 | no prediction |
In general, PolyPhobius gave the best results. Although it predicts the signal peptide stop position slightly worse than Phobius, its transmembrane prediction is significantly better. The predictions of SPOCTOPUS are also good, but SPOCTOPUS does not predict the location of the protein.
Therefore, PolyPhobius seems a good choice: on average it is the best method for combined transmembrane and signal peptide prediction.
Prediction of GO terms
Before starting our analysis, we checked the GO annotations for the six sequences, which can be found [here].
A detailed list of the GO annotation terms of each protein can be found [here].
Results
We created a separate result page for each protein. Unfortunately, the results cannot be summarized briefly, so please refer to the individual result pages for the detailed output.
- [HEXA HUMAN]
- [BACR HALSA]
- [RET4 HUMAN]
- [INSL5 HUMAN]
- [LAMP1 HUMAN]
- [A4 HUMAN]
Comparison of the different methods
It is difficult to compare these methods. First of all, two of the methods (GOPET and Pfam) are homology-based, whereas ProtFun predicts ab initio, so the results are expected to differ. Second, each method has a different prediction focus and names its results slightly differently. Only GOPET predicts exact GO numbers; the other two methods only predict approximate functions and processes.
Therefore, to compare the results, we calculated the fraction of correct predictions among all predictions of a method, and the ratio between correct predictions and annotated GO terms.
methods | |||||
GOPET terms | GOPET GOids | Pfam | ProtFun | ||
HEXA_HUMAN | #true positive | 7 | 7 | 2 | 31 |
#false negative | 1 | 1 | 0 | 3 | |
#predictions | 8 | 8 | 2 | 34 | |
#GO terms | 25 | ||||
true positive (in %) | 0.87 | 0.87 | 1 | 0.91 | |
ratio true positive/annotated GO terms | 0.28 | 0.28 | 0.08 | not possible | |
BACR_HALSA | #true positive | 2 | 1 | 1 | 30 |
#false negative | 1 | 2 | 0 | 4 | |
#predictions | 3 | 3 | 1 | 34 | |
#GO terms | 12 | ||||
true positive (in %) | 0.66 | 0.33 | 1 | 0.88 | |
ratio true positive/annotated GO terms | 0.16 | 0.08 | 0.08 | not possible | |
RET4_HUMAN | #true positive | 5 | 5 | 1 | 30 |
#false negative | 3 | 3 | 0 | 4 | |
#predictions | 8 | 8 | 1 | 34 | |
#GO terms | 41 | ||||
true positive (in %) | 0.62 | 0.62 | 1 | 0.88 | |
ratio true positive/annotated GO terms | 0.12 | 0.12 | 0.02 | not possible | |
INSL5_HUMAN | #true positive | 1 | 1 | 1 | 32 |
#false negative | 0 | 0 | 0 | 2 | |
#predictions | 1 | 1 | 1 | 34 | |
#GO terms | 4 | ||||
true positive (in %) | 1 | 1 | 1 | 0.94 | |
ratio true positive/annotated GO terms | 0.25 | 0.25 | 0.25 | not possible | |
LAMP1_HUMAN | #true positive | 0 | 0 | 1 | 33 |
#false negative | 2 | 2 | 0 | 1 | |
#predictions | 2 | 2 | 1 | 34 | |
#GO terms | 17 | ||||
true positive (in %) | 0 | 0 | 1 | 0.97 | |
ratio true positive/annotated GO terms | 0 | 0 | 0.05 | not possible | |
A4_HUMAN | #true positive | 7 | 7 | 6 | 33 |
#false negative | 6 | 6 | 0 | 1 | |
#predictions | 13 | 13 | 6 | 34 | |
#GO terms | 78 | ||||
true positive (in %) | 0.53 | 0.53 | 1 | 0.97 | |
ratio true positive/annotated GO terms | 0.08 | 0.08 | 0.07 | not possible |
As the table above shows, each method predicts only a small subset of the annotated GO terms. In general, GOPET seems to be the best method, because it is the only method that predicts actual GO terms, and overall it mostly has the best ratio of true positives to annotated GO terms. Furthermore, it also predicts more GO terms than the other methods.
It was not possible to calculate the ratio between true positives and annotated GO terms for ProtFun, because this method uses a fixed set of predefined terms and only predicts the probability that the protein belongs to each of them.
In general, GO term prediction does not work very well, and the prediction results only give hints about the function and localization of the protein.
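The two measures used in the comparison table can be written down as a short sketch. It assumes that the "true positive" row is the number of true positives divided by the number of predictions, and that the ratio row is the number of true positives divided by the number of annotated GO terms; the function name `go_scores` is illustrative. The example values are the HEXA_HUMAN rows for GOPET and Pfam from the table, which appears to truncate its values to two decimals.

```python
def go_scores(true_positives, predictions, annotated):
    """Return (fraction of correct predictions, ratio of true positives to annotated terms)."""
    fraction_correct = true_positives / predictions if predictions else 0.0
    ratio_annotated = true_positives / annotated if annotated else 0.0
    return fraction_correct, ratio_annotated

# HEXA_HUMAN: 25 annotated GO terms.
gopet = go_scores(true_positives=7, predictions=8, annotated=25)
pfam = go_scores(true_positives=2, predictions=2, annotated=25)

print("GOPET: fraction correct %.3f, ratio to annotated %.2f" % gopet)
print("Pfam:  fraction correct %.3f, ratio to annotated %.2f" % pfam)
```

The example illustrates why the two measures have to be read together: Pfam has a perfect fraction of correct predictions but covers fewer of the annotated terms than GOPET.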
Back to [Tay-Sachs Disease]