Difference between revisions of "ASPA Sequence Based Predictions"
(→DSSP) |
(→ProtFun 2.2) |
||
(35 intermediate revisions by the same user not shown) | |||
Line 2: | Line 2: | ||
===PsiPred=== |
===PsiPred=== |
||
+ | |||
+ | For a description of PsiPred, see [[Psipred]]. |
||
[[File:3ef7a0ac-0a08-412e-bf9f-e54bac6babd0.psi 1.png|200px|thumb|right|PsiPred results for Aspartoacyclase]] |
[[File:3ef7a0ac-0a08-412e-bf9f-e54bac6babd0.psi 1.png|200px|thumb|right|PsiPred results for Aspartoacyclase]] |
||
Line 82: | Line 84: | ||
DSSP does not predict secondary structure from amino acid sequences; instead, it uses a 3D structure (a PDB file) to deduce the secondary structure from the 3D structure. To this end, DSSP examines the phi and psi angles and the C alpha positions in the protein backbone and H-bonds present in the structure; these are used to define "n-turns", which are H-bonds between the NH and CO groups of amino acids with sequence separations of 3-5 residues, and "bridges" with greater sequence separations. Repeating 4-turns are used to identify helices, repeating bridges identify beta sheets. |
DSSP does not predict secondary structure from amino acid sequences; instead, it uses a 3D structure (a PDB file) to deduce the secondary structure from the 3D structure. To this end, DSSP examines the phi and psi angles and the C alpha positions in the protein backbone and H-bonds present in the structure; these are used to define "n-turns", which are H-bonds between the NH and CO groups of amino acids with sequence separations of 3-5 residues, and "bridges" with greater sequence separations. Repeating 4-turns are used to identify helices, repeating bridges identify beta sheets. |
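The helix rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the input format (a set of 0-based index pairs for backbone H-bonds) and the function name `assign_helices` are hypothetical, and the real DSSP derives H-bonds from an electrostatic energy criterion and handles many more cases (bridges, bends, other helix types).

```python
def assign_helices(n_residues, hbonds, n=4):
    """Mark residues as helix ('H') where two consecutive n-turns overlap.

    hbonds: set of (i, j) pairs meaning the CO group of residue i is
    H-bonded to the NH group of residue j (hypothetical input format).
    """
    # An n-turn starts at residue i if CO(i) is bonded to NH(i + n).
    turn_starts = {i for (i, j) in hbonds if j - i == n}
    states = ["-"] * n_residues
    for i in range(1, n_residues):
        # Minimal helix: n-turns starting at i-1 and i make i..i+n-1 helical.
        if (i - 1) in turn_starts and i in turn_starts:
            for k in range(i, min(i + n, n_residues)):
                states[k] = "H"
    return "".join(states)


# Toy example: H-bonds i -> i+4 for i = 2..7 in a 14-residue chain.
toy_hbonds = {(i, i + 4) for i in range(2, 8)}
print(assign_helices(14, toy_hbonds))
```

Because the assignment only needs the H-bond pattern of a solved structure, no training data or sequence profile is involved, which is the key difference from PsiPred and JPred3.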
||
− | Input: A 3D structure (a PDB file) |
+ | Input: A 3D structure (a PDB file, ID 2o53 in our case) |
Output: (from [http://swift.cmbi.ru.nl/gv/dssp/DSSP_2.html]) |
Output: (from [http://swift.cmbi.ru.nl/gv/dssp/DSSP_2.html]) |
||
Line 169: | Line 171: | ||
603 - 604 |
603 - 604 |
||
603 - 604 |
603 - 604 |
||
− | 603 - 604 AA |
+ | 603 - 604 AA |
Clearly solvent accessible: A; involved in symmetry contacts: * |
Clearly solvent accessible: A; involved in symmetry contacts: * |
||
+ | </pre> |
||
+ | All in all, the two prediction methods Psipred and JPred3 did a good job; they managed to predict most of the main secondary structure elements, with only minor variations in the length and position of the individual helices/sheets and very minor variations between each other. A somewhat more detailed result from DSSP is to be expected, as it has markedly better information to work with and merely assigns the secondary structure instead of actually predicting it. |
||
− | pdb id: 2o53 |
||
==Prediction of disordered regions== |
==Prediction of disordered regions== |
||
===DISOPRED=== |
===DISOPRED=== |
||
+ | |||
+ | DISOPRED predicts native disorder in proteins. It was published in 2004 by Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF and Jones DT. |
||
+ | Reference: [http://www.sciencedirect.com/science/article/pii/S0022283604001482] |
||
+ | |||
+ | DISOPRED uses linear support vector machines (SVMs) to predict disorder in a given protein sequence. A set of 750 proteins with high-quality structures was used as training data; to this end, PSI-BLAST profiles were generated by searching the training sequences against a filtered sequence database. The resulting profiles were used to train the SVMs. |
||
[[File:Aa9f21ee-f0a3-4cd4-9cf0-366fe1b5377e.dis 1.diso.png|400px|thumb|right|DISOPRED result graph for Aspartoacyclase]] |
[[File:Aa9f21ee-f0a3-4cd4-9cf0-366fe1b5377e.dis 1.diso.png|400px|thumb|right|DISOPRED result graph for Aspartoacyclase]] |
||
Line 218: | Line 226: | ||
===POODLE=== |
===POODLE=== |
||
+ | |||
+ | POODLE (Prediction Of Order and Disorder by machine LEarning) is a series of programs published between 2005 and 2008. We used the latest variant, POODLE-I, which was published in 2008 by S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi. |
||
+ | |||
+ | Reference: |
||
+ | S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi, "Disordered region prediction by integrating POODLE series", CASP8 Proceedings 2008, 14-15. |
||
+ | |||
+ | Input: Protein amino acid sequence |
||
+ | |||
+ | POODLE-I is an integrated variant of other flavors of POODLE (-S and -L for short/long regions of disorder and -W for proteins that are mostly disordered) and several other tools like Psipred, JNet etc. It employs a rather involved [http://mbs.cbrc.jp/poodle/images/workflow.png workflow]. |
||
+ | |||
+ | Custom-formatted output for Aspartoacyclase: |
||
[[File: Aspa disopred.png]] |
[[File: Aspa disopred.png]] |
||
Line 378: | Line 397: | ||
===IUPRED=== |
===IUPRED=== |
||
+ | |||
+ | IUPRED is a program for the prediction of intrinsically unstructured regions in proteins. It was published in 2005 by Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon. |
||
+ | |||
+ | Reference: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content |
||
+ | Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon, Bioinformatics (2005) 21, 3433-3434. |
||
+ | |||
+ | IUPRED predicts disordered regions by estimating the capacity of the amino acid chain to form stabilizing contacts. The underlying assumption is that proteins intrinsically unable to do so have distinct sequences that can be identified via their unfavorable energy values. To this end a 20x20 predictor matrix was calculated from a set of globular proteins with known structure. IUPRED uses this matrix to derive a tendency to be intrinsically unstructured from the amino acid composition alone. |
||
+ | |||
+ | Input: An amino acid sequence. |
||
+ | |||
+ | IUPRED comes in three flavors: Long Disorder, which specializes in finding long stretches of disorder, Short Disorder, which does the same for short stretches of disorder, and structured regions, which predicts regions lacking disorder. |
||
+ | |||
====Long Disorder==== |
====Long Disorder==== |
||
Line 636: | Line 667: | ||
[[File:Aspa iupred3.png]] |
[[File:Aspa iupred3.png]] |
||
− | IUPRED predicts one |
+ | IUPRED predicts one structured region comprising the whole input sequence. |
− | |||
+ | ====Results==== |
||
+ | IUPRED predicts no significant disorder in Aspartoacyclase. The disorder tendency stays below 0.5 in all cases (except for short stretches of about 3-5 residues at each end of the sequence in short disorder mode, which are negligible) and the structured regions mode predicts one continuous structured region spanning all of the protein sequence. This makes sense when looking at the 3D structure: Aspartoacyclase is a rather densely packed globular structure, which according to the assumptions that IUPRED makes has a strong tendency to form many inter-residue contacts and to stabilize itself thereby, markedly reducing the tendency for disorder in the process. |
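The 0.5 cutoff used in this interpretation can be applied mechanically to a per-residue score list. The following is a minimal sketch; the function name and the toy scores are hypothetical and are not the actual IUPRED output for the protein.

```python
def disordered_segments(scores, threshold=0.5):
    """Return 1-based (start, end) runs of residues scoring above threshold."""
    segments = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos  # a new disordered run begins here
        elif start is not None:
            segments.append((start, pos - 1))  # the run ended at the previous residue
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # run extends to the C-terminus
    return segments


# Hypothetical scores: short disordered stretches at both ends only.
toy_scores = [0.7, 0.6, 0.55, 0.3, 0.2, 0.1, 0.2, 0.4, 0.6, 0.8]
print(disordered_segments(toy_scores))
```

Runs of only a few residues at the termini, as in the toy data, correspond to the stretches dismissed as negligible above.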
||
===Meta-Disorder=== |
===Meta-Disorder=== |
||
+ | |||
+ | Meta-Disorder, as the name implies, employs a set of so-called orthogonal disorder predictors in order to combine their strengths and mitigate their weak points. It was published in 2009 by Avner Schlessinger, Marco Punta, Guy Yachdav, Laszlo Kajan and Burkhard Rost. |
||
+ | |||
+ | Reference: [http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0004433 Paper] |
||
+ | |||
+ | As with the previous methods, Meta-Disorder predicts disorder from the amino acid sequence alone; results from the predictors IUPRED, DISOPRED, NORSnet and Ucon are molded into one final result using a neural network. |
||
+ | |||
+ | Results for Aspartoacyclase: |
||
+ | |||
<pre>Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st |
<pre>Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st |
||
1 M 0.33 - 0.99 D 0.17 - 0.551 1 D |
1 M 0.33 - 0.99 D 0.17 - 0.551 1 D |
||
Line 973: | Line 1,014: | ||
MD2st - two-state prediction by MD |
MD2st - two-state prediction by MD |
||
</pre> |
</pre> |
||
+ | |||
+ | |||
+ | The last column indicates whether or not disorder was predicted at the current position. Meta-Disorder predicts a total of four disorder positions, which are not significant. This coincides with the predictions of the other programs employed previously - not altogether surprising, since Meta-Disorder draws its predictions from two of them. |
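Counting the disordered positions from the table can be automated by reading the last column (MD2st). The sketch below assumes whitespace-separated rows with the eleven columns shown in the header; the function name is hypothetical, the first data row is taken from the output above, and the second row is invented purely for illustration.

```python
def md_disordered_positions(table_text):
    """Return residue numbers whose MD2st column (last field) is 'D'."""
    positions = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows have 11 columns and start with the residue number;
        # the header row and legend lines are skipped by this check.
        if len(fields) == 11 and fields[0].isdigit() and fields[-1] == "D":
            positions.append(int(fields[0]))
    return positions


sample = """Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st
1 M 0.33 - 0.99 D 0.17 - 0.551 1 D
2 T 0.25 - 0.80 D 0.10 - 0.420 3 -
"""  # second data row is hypothetical
print(md_disordered_positions(sample))
```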
||
==Prediction of transmembrane alpha-helices and signal peptides== |
==Prediction of transmembrane alpha-helices and signal peptides== |
||
+ | |||
+ | The results of this task are unequivocal: Aspartoacyclase does not contain any transmembrane regions. From a biological point of view this was to be expected, as Aspartoacyclase is known to be located in the cytosol. |
||
===TMHMM=== |
===TMHMM=== |
||
Since the VM version could not be made to work, we used the server at http://www.cbs.dtu.dk/services/TMHMM/. |
Since the VM version could not be made to work, we used the server at http://www.cbs.dtu.dk/services/TMHMM/. |
||
+ | |||
+ | TMHMM uses a hidden Markov model to predict transmembrane helices in proteins. It was published in 1998 by E. L.L. Sonnhammer, G. von Heijne, and A. Krogh. |
||
+ | |||
+ | Reference: [http://people.binf.ku.dk/~krogh/publications/ps/SonnhammerEtal98.pdf Original paper] |
||
+ | |||
+ | The hidden Markov model used by TMHMM mirrors the biological structure, with states for helix turns, helix caps and loops on either side of the membrane; these states are also designed to model membrane insertion. The HMM probabilities were estimated using both a maximum likelihood method and a discriminative method. |
||
+ | |||
+ | Results for Aspartoacyclase very clearly show absence of any sort of transmembrane structure, which is biologically sound. |
||
[[File:Sp P45381 ACY2 HUMAN.gif]] |
[[File:Sp P45381 ACY2 HUMAN.gif]] |
||
Line 989: | Line 1,043: | ||
http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output |
http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output |
||
+ | |||
+ | |||
+ | ====BACR_HALSA==== |
||
+ | <pre># BACR_HALSA Length: 262 |
||
+ | # BACR_HALSA Number of predicted TMHs: 6 |
||
+ | # BACR_HALSA Exp number of AAs in TMHs: 140.4032 |
||
+ | # BACR_HALSA Exp number, first 60 AAs: 26.1196 |
||
+ | # BACR_HALSA Total prob of N-in: 0.01887 |
||
+ | # BACR_HALSA POSSIBLE N-term signal sequence |
||
+ | BACR_HALSA TMHMM2.0 outside 1 22 |
||
+ | BACR_HALSA TMHMM2.0 TMhelix 23 42 |
||
+ | BACR_HALSA TMHMM2.0 inside 43 54 |
||
+ | BACR_HALSA TMHMM2.0 TMhelix 55 77 |
||
+ | BACR_HALSA TMHMM2.0 outside 78 91 |
||
+ | BACR_HALSA TMHMM2.0 TMhelix 92 114 |
||
+ | BACR_HALSA TMHMM2.0 inside 115 120 |
||
+ | BACR_HALSA TMHMM2.0 TMhelix 121 143 |
||
+ | BACR_HALSA TMHMM2.0 outside 144 147 |
||
+ | BACR_HALSA TMHMM2.0 TMhelix 148 170 |
||
+ | BACR_HALSA TMHMM2.0 inside 171 189 |
||
+ | BACR_HALSA TMHMM2.0 TMhelix 190 212 |
||
+ | BACR_HALSA TMHMM2.0 outside 213 262</pre> |
||
+ | |||
+ | |||
+ | ====RET4_HUMAN==== |
||
+ | <pre># RET4_HUMAN Length: 201 |
||
+ | # RET4_HUMAN Number of predicted TMHs: 0 |
||
+ | # RET4_HUMAN Exp number of AAs in TMHs: 0.01196 |
||
+ | # RET4_HUMAN Exp number, first 60 AAs: 0.01179 |
||
+ | # RET4_HUMAN Total prob of N-in: 0.01909 |
||
+ | RET4_HUMAN TMHMM2.0 outside 1 201</pre> |
||
+ | |||
+ | |||
+ | ====INSL5_HUMAN==== |
||
+ | <pre># INSL5_HUMAN Length: 135 |
||
+ | # INSL5_HUMAN Number of predicted TMHs: 0 |
||
+ | # INSL5_HUMAN Exp number of AAs in TMHs: 0.50415 |
||
+ | # INSL5_HUMAN Exp number, first 60 AAs: 0.50415 |
||
+ | # INSL5_HUMAN Total prob of N-in: 0.03772 |
||
+ | INSL5_HUMAN TMHMM2.0 outside 1 135</pre> |
||
+ | |||
+ | ====LAMP1_HUMAN==== |
||
+ | <pre># LAMP1_HUMAN Length: 417 |
||
+ | # LAMP1_HUMAN Number of predicted TMHs: 2 |
||
+ | # LAMP1_HUMAN Exp number of AAs in TMHs: 44.89582 |
||
+ | # LAMP1_HUMAN Exp number, first 60 AAs: 22.24286 |
||
+ | # LAMP1_HUMAN Total prob of N-in: 0.99287 |
||
+ | # LAMP1_HUMAN POSSIBLE N-term signal sequence |
||
+ | LAMP1_HUMAN TMHMM2.0 inside 1 10 |
||
+ | LAMP1_HUMAN TMHMM2.0 TMhelix 11 33 |
||
+ | LAMP1_HUMAN TMHMM2.0 outside 34 383 |
||
+ | LAMP1_HUMAN TMHMM2.0 TMhelix 384 406 |
||
+ | LAMP1_HUMAN TMHMM2.0 inside 407 417</pre> |
||
+ | |||
+ | ====A4_HUMAN==== |
||
+ | <pre># A4_HUMAN Length: 770 |
||
+ | # A4_HUMAN Number of predicted TMHs: 1 |
||
+ | # A4_HUMAN Exp number of AAs in TMHs: 22.72525 |
||
+ | # A4_HUMAN Exp number, first 60 AAs: 0.0027 |
||
+ | # A4_HUMAN Total prob of N-in: 0.00015 |
||
+ | A4_HUMAN TMHMM2.0 outside 1 700 |
||
+ | A4_HUMAN TMHMM2.0 TMhelix 701 723 |
||
+ | A4_HUMAN TMHMM2.0 inside 724 770</pre> |
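The TMHMM listings above are easy to process programmatically. As a minimal sketch, the helper below (a hypothetical function, not part of TMHMM) extracts the predicted helix ranges from the whitespace-separated result lines; it assumes the five-column layout shown in the outputs above.

```python
def parse_tmhmm_helices(text):
    """Return (start, end) tuples for every TMhelix line in TMHMM output."""
    helices = []
    for line in text.splitlines():
        fields = line.split()
        # Result lines look like: <id> TMHMM2.0 <region> <start> <end>
        if len(fields) == 5 and fields[1] == "TMHMM2.0" and fields[2] == "TMhelix":
            helices.append((int(fields[3]), int(fields[4])))
    return helices


a4_output = """# A4_HUMAN Length: 770
# A4_HUMAN Number of predicted TMHs: 1
A4_HUMAN TMHMM2.0 outside 1 700
A4_HUMAN TMHMM2.0 TMhelix 701 723
A4_HUMAN TMHMM2.0 inside 724 770
"""
print(parse_tmhmm_helices(a4_output))
```

Applied to the BACR_HALSA listing, the same function would return six ranges, matching the "Number of predicted TMHs" header line.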
||
===Phobius & PolyPhobius=== |
===Phobius & PolyPhobius=== |
||
+ | |||
+ | Phobius is a program for the prediction of transmembrane regions with special emphasis on reducing confusion with signal peptides. It was published in 2004 by Käll L, Krogh A, Sonnhammer EL. |
||
+ | |||
+ | Reference: [http://www.ncbi.nlm.nih.gov/pubmed/15111065?dopt=Abstract Paper] |
||
+ | |||
+ | Signal peptides and transmembrane proteins share a great deal of similarity and are often confused by predictors for either class; Phobius aims to predict both and to discriminate between them. It employs a hidden Markov model to do this, modelling the different sequence regions pertaining to either class. |
||
+ | |||
+ | Input: An amino acid sequence. |
||
+ | |||
+ | Again, neither signal nor transmembrane regions were detected in Aspartoacyclase. |
||
+ | |||
[[File:Aspa phobius.png]] |
[[File:Aspa phobius.png]] |
||
+ | |||
+ | |||
+ | ====BACR_HALSA==== |
||
+ | <pre>ID |
||
+ | FT TOPO_DOM 1 22 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 23 42 |
||
+ | FT TOPO_DOM 43 53 CYTOPLASMIC. |
||
+ | FT TRANSMEM 54 76 |
||
+ | FT TOPO_DOM 77 95 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 96 114 |
||
+ | FT TOPO_DOM 115 120 CYTOPLASMIC. |
||
+ | FT TRANSMEM 121 142 |
||
+ | FT TOPO_DOM 143 147 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 148 169 |
||
+ | FT TOPO_DOM 170 189 CYTOPLASMIC. |
||
+ | FT TRANSMEM 190 212 |
||
+ | FT TOPO_DOM 213 217 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 218 237 |
||
+ | FT TOPO_DOM 238 262 CYTOPLASMIC. |
||
+ | //</pre> |
||
+ | |||
+ | With PolyPhobius: |
||
+ | <pre>ID BACR_HALSA |
||
+ | FT TOPO_DOM 1 21 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 22 43 |
||
+ | FT TOPO_DOM 44 54 CYTOPLASMIC. |
||
+ | FT TRANSMEM 55 77 |
||
+ | FT TOPO_DOM 78 94 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 95 114 |
||
+ | FT TOPO_DOM 115 120 CYTOPLASMIC. |
||
+ | FT TRANSMEM 121 141 |
||
+ | FT TOPO_DOM 142 147 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 148 166 |
||
+ | FT TOPO_DOM 167 186 CYTOPLASMIC. |
||
+ | FT TRANSMEM 187 205 |
||
+ | FT TOPO_DOM 206 215 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 216 237 |
||
+ | FT TOPO_DOM 238 262 CYTOPLASMIC. |
||
+ | //</pre> |
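To quantify how much the two BACR_HALSA predictions above differ, the helix boundaries can be compared pairwise. This is a small sketch: the coordinate lists are copied from the TRANSMEM lines of the two outputs, while the helper name `boundary_shifts` is hypothetical.

```python
# TRANSMEM ranges copied from the Phobius and PolyPhobius outputs above.
phobius = [(23, 42), (54, 76), (96, 114), (121, 142),
           (148, 169), (190, 212), (218, 237)]
polyphobius = [(22, 43), (55, 77), (95, 114), (121, 141),
               (148, 166), (187, 205), (216, 237)]


def boundary_shifts(a, b):
    """Return (start shift, end shift) in residues for each helix pair."""
    if len(a) != len(b):
        raise ValueError("predictions contain different numbers of helices")
    return [(abs(s1 - s2), abs(e1 - e2)) for (s1, e1), (s2, e2) in zip(a, b)]


shifts = boundary_shifts(phobius, polyphobius)
print(shifts)
print(max(max(pair) for pair in shifts))
```

Both tools report seven helices, and most boundaries move by at most three residues; the largest shift is at the end of the sixth helix.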
||
+ | |||
+ | ====RET4_HUMAN==== |
||
+ | <pre>ID RET4_HUMAN |
||
+ | FT SIGNAL 1 18 |
||
+ | FT REGION 1 2 N-REGION. |
||
+ | FT REGION 3 13 H-REGION. |
||
+ | FT REGION 14 18 C-REGION. |
||
+ | FT TOPO_DOM 19 201 NON CYTOPLASMIC. |
||
+ | //</pre> |
||
+ | |||
+ | With PolyPhobius: |
||
+ | <pre>ID RET4_HUMAN |
||
+ | FT SIGNAL 1 18 |
||
+ | FT REGION 1 3 N-REGION. |
||
+ | FT REGION 4 13 H-REGION. |
||
+ | FT REGION 14 18 C-REGION. |
||
+ | FT TOPO_DOM 19 201 NON CYTOPLASMIC. |
||
+ | //</pre> |
||
+ | |||
+ | ====INSL5_HUMAN==== |
||
+ | <pre>ID |
||
+ | FT SIGNAL 1 22 |
||
+ | FT REGION 1 5 N-REGION. |
||
+ | FT REGION 6 17 H-REGION. |
||
+ | FT REGION 18 22 C-REGION. |
||
+ | FT TOPO_DOM 23 135 NON CYTOPLASMIC. |
||
+ | //</pre> |
||
+ | |||
+ | With PolyPhobius: |
||
+ | <pre>ID INSL5_HUMAN |
||
+ | FT SIGNAL 1 22 |
||
+ | FT REGION 1 4 N-REGION. |
||
+ | FT REGION 5 16 H-REGION. |
||
+ | FT REGION 17 22 C-REGION. |
||
+ | FT TOPO_DOM 23 135 NON CYTOPLASMIC. |
||
+ | //</pre> |
||
+ | |||
+ | ====LAMP1_HUMAN==== |
||
+ | <pre>ID |
||
+ | FT SIGNAL 1 28 |
||
+ | FT REGION 1 10 N-REGION. |
||
+ | FT REGION 11 22 H-REGION. |
||
+ | FT REGION 23 28 C-REGION. |
||
+ | FT TOPO_DOM 29 381 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 382 405 |
||
+ | FT TOPO_DOM 406 417 CYTOPLASMIC. |
||
+ | //</pre> |
||
+ | |||
+ | With PolyPhobius: |
||
+ | <pre>ID LAMP1_HUMAN |
||
+ | FT SIGNAL 1 28 |
||
+ | FT REGION 1 9 N-REGION. |
||
+ | FT REGION 10 22 H-REGION. |
||
+ | FT REGION 23 28 C-REGION. |
||
+ | FT TOPO_DOM 29 381 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 382 405 |
||
+ | FT TOPO_DOM 406 417 CYTOPLASMIC. |
||
+ | //</pre> |
||
+ | |||
+ | ====A4_HUMAN==== |
||
+ | <pre>ID A4_HUMAN |
||
+ | FT SIGNAL 1 17 |
||
+ | FT REGION 1 1 N-REGION. |
||
+ | FT REGION 2 12 H-REGION. |
||
+ | FT REGION 13 17 C-REGION. |
||
+ | FT TOPO_DOM 18 700 NON CYTOPLASMIC. |
||
+ | FT TRANSMEM 701 723 |
||
+ | FT TOPO_DOM 724 770 CYTOPLASMIC. |
||
+ | //</pre> |
||
===OCTOPUS & SPOCTOPUS=== |
===OCTOPUS & SPOCTOPUS=== |
||
+ | |||
− | No transmembrane regions were predicted by both methods. |
||
+ | OCTOPUS uses a combination of hidden Markov models and neural networks to predict transmembrane regions. It was published in 2008 by Håkan Viklund and Arne Elofsson. |
||
+ | |||
+ | Reference: [http://www.ncbi.nlm.nih.gov/pubmed/15111065?ordinalpos=6&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_DefaultReportPanel.Pubmed_RVDocSum Original paper] |
||
+ | |||
+ | OCTOPUS first creates a sequence profile by running BLAST with the input sequence. Neural networks are then used to predict the propensity for each residue to be located in a transmembrane region or in certain structure patterns on either side of the membrane. The resulting propensities are fed to a hidden Markov model, which calculates the most likely topology. |
||
+ | |||
+ | SPOCTOPUS extends OCTOPUS with a preprocessor that uses a neural network to assess the probability that the first 70 residues of the input sequence contain a signal peptide sequence. If this scores high enough, a hidden Markov model is used to ascertain the exact offset of the signal region. |
||
+ | |||
+ | No transmembrane/signal regions were predicted for Aspartoacyclase. |
||
+ | |||
+ | |||
+ | ====BACR_HALSA==== |
||
+ | <pre>OCTOPUS predicted topology: |
||
+ | oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM |
||
+ | MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiii |
||
+ | MMMMMMMMMMMMMMMMMMMMMoooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii |
||
+ | iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii |
||
+ | iiiiiiiiiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM |
||
+ | MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiM |
||
+ | MMMMMMMMMMMMMMMMMMMMooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii |
||
+ | iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii |
||
+ | iiiiiiiiiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | ====RET4_HUMAN==== |
||
+ | <pre>OCTOPUS predicted topology: |
||
+ | iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooo</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnnSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooo</pre> |
||
+ | |||
+ | ====INSL5_HUMAN==== |
||
+ | <pre>OCTOPUS predicted topology: |
||
+ | iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooo</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooo</pre> |
||
+ | |||
+ | ====LAMP1_HUMAN==== |
||
+ | <pre>OCTOPUS predicted topology: |
||
+ | iiiiiiiiiMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnnnnnnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | ====A4_HUMAN==== |
||
+ | <pre>OCTOPUS predicted topology: |
||
+ | ooooRRRRRRoooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM |
||
+ | Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM |
||
+ | Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii |
||
+ | </pre> |
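The per-residue topology strings above can be collapsed into residue ranges for easier comparison with the other tools. The following is a minimal sketch with a hypothetical helper name; it assumes the wrapped output lines have been joined into one string first, and the toy string is invented for illustration.

```python
from itertools import groupby


def topology_segments(topology):
    """Collapse a per-residue topology string into (state, start, end) runs.

    Positions are 1-based and inclusive; the state letter is whatever
    character the predictor emitted for that stretch.
    """
    segments = []
    pos = 1
    for state, run in groupby(topology):
        length = len(list(run))
        segments.append((state, pos, pos + length - 1))
        pos += length
    return segments


# Toy topology string: inside loop, membrane helix, outside loop.
print(topology_segments("iiMMMMMooo"))
```

Run on the joined RET4_HUMAN strings, this makes the difference between the two tools explicit: the N-terminal `M` run reported by OCTOPUS appears as an `n`/`S` run in the SPOCTOPUS output.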
||
===SignalP=== |
===SignalP=== |
||
+ | SignalP is a method for the detection of signal peptides. It was first published in 1997 by Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne. |
||
− | ====HMM==== |
||
+ | |||
+ | Reference: [http://www.ncbi.nlm.nih.gov/pubmed/9051728?dopt=Abstract Original paper], [http://www.ncbi.nlm.nih.gov/pubmed/15223320?dopt=Abstract current version] |
||
+ | |||
+ | SignalP comes in two flavours: one using a neural network, the other using a hidden Markov model. It supports discriminating between cleaved and uncleaved signal peptides and supports both prokaryotic and eukaryotic input. |
||
+ | |||
+ | Input: A protein sequence. |
||
+ | |||
+ | Neither flavour detected any signal sequence in Aspartoacyclase. |
||
+ | |||
+ | ====Aspartoacyclase: HMM==== |
||
[[File:ASPA Plot.hmm.1.gif]] |
[[File:ASPA Plot.hmm.1.gif]] |
||
− | ====Neural Network==== |
+ | ====Aspartoacyclase: Neural Network==== |
[[File:ASPA Plot.nn.1.gif]] |
[[File:ASPA Plot.nn.1.gif]] |
||
+ | |||
+ | |||
+ | ====BACR_HALSA==== |
||
+ | Neural Network: |
||
+ | <pre>BACR_HALSA length = 70 |
||
+ | # Measure Position Value Cutoff signal peptide? |
||
+ | max. C 16 0.220 0.32 NO |
||
+ | max. Y 39 0.196 0.33 NO |
||
+ | max. S 31 0.970 0.87 YES |
||
+ | mean S 1-38 0.426 0.48 NO |
||
+ | D 1-38 0.311 0.43 NO |
||
+ | # Most likely cleavage site between pos. 38 and 39: GTL-YF</pre> |
||
+ | |||
+ | HMM: |
||
+ | <pre>Prediction: Signal anchor |
||
+ | Signal peptide probability: 0.017 |
||
+ | Signal anchor probability: 0.859 |
||
+ | Max cleavage site probability: 0.004 between pos. 15 and 16 |
||
+ | </pre> |
||
+ | |||
+ | ====RET4_HUMAN==== |
||
+ | Neural Network: |
||
+ | <pre>RET4_HUMAN length = 70 |
||
+ | # Measure Position Value Cutoff signal peptide? |
||
+ | max. C 19 0.929 0.32 YES |
||
+ | max. Y 19 0.901 0.33 YES |
||
+ | max. S 1 0.994 0.87 YES |
||
+ | mean S 1-18 0.938 0.48 YES |
||
+ | D 1-18 0.920 0.43 YES |
||
+ | # Most likely cleavage site between pos. 18 and 19: GRA-ER</pre> |
||
+ | |||
+ | HMM: |
||
+ | <pre>RET4_HUMAN |
||
+ | Prediction: Signal peptide |
||
+ | Signal peptide probability: 1.000 |
||
+ | Signal anchor probability: 0.000 |
||
+ | Max cleavage site probability: 0.979 between pos. 18 and 19</pre> |
||
+ | |||
+ | |||
+ | ====INSL5_HUMAN==== |
||
+ | Neural Network: |
||
+ | <pre>INSL5_HUMAN length = 70 |
||
+ | # Measure Position Value Cutoff signal peptide? |
||
+ | max. C 23 0.855 0.32 YES |
||
+ | max. Y 23 0.778 0.33 YES |
||
+ | max. S 13 0.987 0.87 YES |
||
+ | mean S 1-22 0.852 0.48 YES |
||
+ | D 1-22 0.815 0.43 YES |
||
+ | # Most likely cleavage site between pos. 22 and 23: VRS-KE</pre> |
||
+ | |||
+ | HMM: |
||
+ | <pre>INSL5_HUMAN |
||
+ | Prediction: Signal peptide |
||
+ | Signal peptide probability: 0.999 |
||
+ | Signal anchor probability: 0.000 |
||
+ | Max cleavage site probability: 0.911 between pos. 22 and 23</pre> |
||
+ | |||
+ | |||
+ | ====LAMP1_HUMAN==== |
||
+ | Neural Network: |
||
+ | <pre>LAMP1_HUMAN length = 70 |
||
+ | # Measure Position Value Cutoff signal peptide? |
||
+ | max. C 29 0.978 0.32 YES |
||
+ | max. Y 29 0.903 0.33 YES |
||
+ | max. S 19 0.999 0.87 YES |
||
+ | mean S 1-28 0.960 0.48 YES |
||
+ | D 1-28 0.932 0.43 YES |
||
+ | # Most likely cleavage site between pos. 28 and 29: ASA-AM</pre> |
||
+ | |||
+ | HMM: |
||
+ | <pre>LAMP1_HUMAN |
||
+ | Prediction: Signal peptide |
||
+ | Signal peptide probability: 1.000 |
||
+ | Signal anchor probability: 0.000 |
||
+ | Max cleavage site probability: 0.847 between pos. 28 and 29</pre> |
||
+ | |||
+ | |||
+ | ====A4_HUMAN==== |
||
+ | Neural Network: |
||
+ | <pre>A4_HUMAN length = 70 |
||
+ | # Measure Position Value Cutoff signal peptide? |
||
+ | max. C 18 0.891 0.32 YES |
||
+ | max. Y 18 0.850 0.33 YES |
||
+ | max. S 2 0.992 0.87 YES |
||
+ | mean S 1-17 0.967 0.48 YES |
||
+ | D 1-17 0.909 0.43 YES |
||
+ | # Most likely cleavage site between pos. 17 and 18: ARA-LE</pre> |
||
+ | |||
+ | HMM: |
||
+ | <pre>A4_HUMAN |
||
+ | Prediction: Signal peptide |
||
+ | Signal peptide probability: 1.000 |
||
+ | Signal anchor probability: 0.000 |
||
+ | Max cleavage site probability: 0.993 between pos. 17 and 18 |
||
+ | </pre> |
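In the neural network outputs above, the D value is consistent with being the average of the max. Y and mean S scores, and the final call follows from comparing D with its cutoff. The sketch below reproduces that arithmetic; the function name is hypothetical, and the cutoff of 0.43 is simply the one printed in the tables.

```python
def signalp_nn_call(max_y, mean_s, d_cutoff=0.43):
    """Return (D score, signal peptide call) from the max. Y and mean S scores."""
    d_score = (max_y + mean_s) / 2  # D is the average of the two scores
    return d_score, d_score > d_cutoff


# Values copied from the BACR_HALSA and RET4_HUMAN tables above.
print(signalp_nn_call(0.196, 0.426))
print(signalp_nn_call(0.901, 0.938))
```

For BACR_HALSA this gives 0.311, below the cutoff, even though the max. S score alone exceeds its own threshold; for RET4_HUMAN it gives about 0.92, well above the cutoff.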
||
===TargetP=== |
===TargetP=== |
||
+ | |||
+ | TargetP is a program for predicting the subcellular location of proteins, based on targeting signals in their sequence. It was published in 2000 by Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne. |
||
+ | |||
+ | Reference: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne. J. Mol. Biol., 300: 1005-1016, 2000. |
||
+ | |||
+ | TargetP confines its analysis to the N-terminal part of the sequence; it can discriminate between proteins destined for the mitochondrion, the chloroplast (plants only, for obvious reasons), the secretory pathway or another location. |
||
+ | |||
+ | The prediction for Aspartoacyclase was "other location", which is plausible, as the enzyme is known to reside in the cytosol. |
||
+ | |||
<pre>### targetp v1.1 prediction results ################################## |
<pre>### targetp v1.1 prediction results ################################## |
||
Number of query sequences: 1 |
Number of query sequences: 1 |
||
Line 1,017: | Line 1,467: | ||
http://www.cbs.dtu.dk/services/TargetP-1.1/output.php |
http://www.cbs.dtu.dk/services/TargetP-1.1/output.php |
||
+ | |||
+ | |||
+ | ====BACR_HALSA==== |
||
+ | <pre>Number of query sequences: 1 |
||
+ | Cleavage site predictions not included. |
||
+ | Using NON-PLANT networks. |
||
+ | |||
+ | Name Len mTP SP other Loc RC |
||
+ | ---------------------------------------------------------------------- |
||
+ | BACR_HALSA 262 0.019 0.897 0.562 S 4 |
||
+ | ---------------------------------------------------------------------- |
||
+ | cutoff 0.000 0.000 0.000 |
||
+ | </pre> |
||
+ | |||
+ | ====RET4_HUMAN==== |
||
+ | <pre>Number of query sequences: 1 |
||
+ | Cleavage site predictions not included. |
||
+ | Using NON-PLANT networks. |
||
+ | |||
+ | Name Len mTP SP other Loc RC |
||
+ | ---------------------------------------------------------------------- |
||
+ | RET4_HUMAN 201 0.242 0.928 0.020 S 2 |
||
+ | ---------------------------------------------------------------------- |
||
+ | cutoff 0.000 0.000 0.000 |
||
+ | </pre> |
||
+ | |||
+ | ====INSL5_HUMAN==== |
||
+ | <pre>Number of query sequences: 1 |
||
+ | Cleavage site predictions not included. |
||
+ | Using NON-PLANT networks. |
||
+ | |||
+ | Name Len mTP SP other Loc RC |
||
+ | ---------------------------------------------------------------------- |
||
+ | INSL5_HUMAN 135 0.074 0.899 0.037 S 1 |
||
+ | ---------------------------------------------------------------------- |
||
+ | cutoff 0.000 0.000 0.000 |
||
+ | </pre> |
||
+ | |||
+ | ====LAMP1_HUMAN==== |
||
+ | <pre>Number of query sequences: 1 |
||
+ | Cleavage site predictions not included. |
||
+ | Using NON-PLANT networks. |
||
+ | |||
+ | Name Len mTP SP other Loc RC |
||
+ | ---------------------------------------------------------------------- |
||
+ | LAMP1_HUMAN 417 0.043 0.953 0.017 S 1 |
||
+ | ---------------------------------------------------------------------- |
||
+ | cutoff 0.000 0.000 0.000 |
||
+ | </pre> |
||
+ | |||
+ | ====A4_HUMAN==== |
||
+ | <pre>Number of query sequences: 1 |
||
+ | Cleavage site predictions not included. |
||
+ | Using NON-PLANT networks. |
||
+ | |||
+ | Name Len mTP SP other Loc RC |
||
+ | ---------------------------------------------------------------------- |
||
+ | A4_HUMAN 770 0.035 0.937 0.084 S 1 |
||
+ | ---------------------------------------------------------------------- |
||
+ | cutoff 0.000 0.000 0.000 |
||
+ | </pre> |
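The Loc and RC columns in the tables above can be derived from the three scores: Loc is the highest-scoring class, and RC (reliability class, 1 = most reliable) reflects the gap between the highest and second-highest score in steps of 0.2. The sketch below is a hypothetical helper that reproduces this for the non-plant case; the class labels `M`, `S` and `_` are assumptions following the single-letter codes used in the Loc column.

```python
def targetp_call(scores):
    """Return (location, reliability class) from a dict of TargetP scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_loc, best), (_, second) = ranked[0], ranked[1]
    diff = best - second
    # Reliability class: the larger the gap, the lower (better) the class.
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return best_loc, rc


# Scores copied from the BACR_HALSA and RET4_HUMAN tables above.
print(targetp_call({"M": 0.019, "S": 0.897, "_": 0.562}))
print(targetp_call({"M": 0.242, "S": 0.928, "_": 0.020}))
```

This also shows why the BACR_HALSA call deserves less trust than the others: its "other" score is high, so the gap is small and the reliability class is only 4.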
||
+ | |||
+ | |||
+ | ===Analysis=== |
||
+ | |||
+ | ====BACR_HALSA==== |
||
+ | |||
+ | TM Prediction: |
||
+ | TMHMM predicts one helix fewer than the other tools (the one at ca. 216-237); other than that, all methods agree on 7 TM helices with insignificant differences. The PDB structure shows that this is correct. |
||
+ | |||
+ | Signal peptide prediction: |
||
+ | SignalP predicts it to be a signal peptide (NN mode) and a signal anchor (HMM mode); according to the information we found on the protein, both predictions are faulty. |
||
+ | |||
+ | Target prediction: |
||
+ | TargetP predicted that this protein carries a signal peptide and enters the secretory pathway; this is not correct. |
||
+ | |||
+ | ====RET4_HUMAN==== |
||
+ | |||
+ | TM Prediction: |
||
+ | Only Octopus predicts a TM helix; this is a mis-identified signal sequence. The other programs predict no TM helices, which is correct. |
||
+ | |||
+ | Signal peptide prediction: |
||
+ | Phobius predicts a signal peptide, as do Spoctopus and SignalP, with the cleavage site after position 18. According to Uniprot, this is correct. |
||
+ | |||
+ | Target prediction: |
||
+ | TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; this is correct. |
||
+ | |||
+ | ====INSL5_HUMAN==== |
||
+ | |||
+ | TM Prediction: |
||
+ | Only Octopus predicts a transmembrane element, which is a mis-identified signal sequence. |
||
+ | |||
+ | Signal peptide prediction: |
||
+ | Phobius predicts a signal sequence with the cleavage site at 22; Spoctopus predicts the cleavage site at 23; SignalP predicts it to be between 22 and 23. |
||
+ | |||
+ | Target prediction: |
||
+ | TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; this is correct. |
||
+ | |||
+ | ====LAMP1_HUMAN==== |
||
+ | |||
+ | TM Prediction: |
||
+ | TMHMM detects two TM helices; so does Octopus. One TM helix is detected as a signal sequence by Spoctopus and Phobius. |
||
+ | |||
+ | Signal peptide prediction: |
||
+ | Phobius, Spoctopus and SignalP find a signal sequence with the cleavage site at 28-29. According to Uniprot, this is correct. |
||
+ | |||
+ | Target prediction: |
||
+ | TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; since it is membrane-located, this is not correct. |
||
+ | |||
+ | ====A4_HUMAN==== |
||
+ | |||
+ | TM Prediction: |
||
+ | One TM helix from 701 to 723 is predicted by all programs (the end is 722 in the case of Octopus and Spoctopus). Octopus also predicts one short reentrant sequence, which is a mis-identified signal sequence. |
||
+ | |||
+ | Signal peptide prediction: |
||
+ | Spoctopus, SignalP and Phobius all report a signal sequence with the cleavage site at 17-19. According to Uniprot, this is correct. |
||
+ | |||
+ | Target prediction: |
||
+ | TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; this is wrong, as it is membrane-associated. |
||
==Prediction of GO terms== |
==Prediction of GO terms== |
||
===GOPET=== |
===GOPET=== |
||
+ | |||
+ | GOPET is a tool aimed at automatically assigning Gene Ontology terms to proteins. It was published in 2006 by Arunachalam Vinayagam, Coral del Val, Falk Schubert, Roland Eils, Karl-Heinz Glatting, Sándor Suhai and Rainer König. |
||
+ | |||
+ | Reference: [http://www.biomedcentral.com/1471-2105/7/161 Paper] |
||
+ | |||
+ | The input sequence is first BLASTed against a database of proteins with known GO terms; a support vector machine is then used to discriminate between correct and false terms. |
||
+ | |||
+ | Results for Aspartoacylase, all of which agree with current knowledge of the enzyme: |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
Line 1,038: | Line 1,615: | ||
|- |
|- |
||
|} |
|} |
||
+ | |||
+ | |||
+ | Other proteins: |
||
+ | |||
+ | [[File:Aspa other goped.gif]] |
||
===Pfam=== |
===Pfam=== |
||
+ | |||
+ | Pfam is a large database of protein families. It was established in 1998 at the Wellcome Trust Sanger Institute. |
||
+ | |||
+ | It comprises two databases: Pfam-A, a manually curated, high-quality database with a limited number of entries, and the much larger, automatically generated Pfam-B. |
||
+ | |||
+ | Reference: The Pfam protein families database: R.D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J.E. Pollington, O.L. Gavin, P. Gunesekaran, G. Ceric, K. Forslund, L. Holm, E.L. Sonnhammer, S.R. Eddy, A. Bateman |
||
+ | |||
+ | The result for Aspartoacylase is spot-on: |
||
+ | |||
[[File:Aspa pfam significant.png]] |
[[File:Aspa pfam significant.png]] |
||
+ | |||
+ | |||
+ | ====BACR_HALSA==== |
||
+ | [[File:Aspa bacr.png]] |
||
+ | |||
+ | ====RET4_HUMAN==== |
||
+ | [[File:Aspa ret4.png]] |
||
+ | |||
+ | ====INSL5_HUMAN==== |
||
+ | [[File:Aspa insl.png]] |
||
+ | |||
+ | ====LAMP1_HUMAN==== |
||
+ | [[File:Aspa lamp1.png]] |
||
+ | |||
+ | ====A4_HUMAN==== |
||
+ | [[File:Aspa a4.png]] |
||
===ProtFun 2.2=== |
===ProtFun 2.2=== |
||
+ | |||
− | ... |
||
+ | ProtFun is a program for ab-initio protein function prediction. It was published in 2002 by Juhl Jensen et al. |
||
+ | |||
+ | Reference: [http://www.cbs.dtu.dk/services/ProtFun/abstract.php Paper Abstract] |
||
+ | |||
+ | The software queries a number of existing prediction servers for a wide range of features, from isoelectric point to posttranslational modifications, and deduces the protein's function from these data. |
||
+ | |||
+ | Results for Aspartoacylase: |
||
+ | |||
+ | <pre>############## ProtFun 2.2 predictions ############## |
||
+ | |||
+ | >sp_P45381_A |
||
+ | |||
+ | # Functional category Prob Odds |
||
+ | Amino_acid_biosynthesis 0.071 3.233 |
||
+ | Biosynthesis_of_cofactors 0.144 2.003 |
||
+ | Cell_envelope 0.033 0.535 |
||
+ | Cellular_processes 0.137 1.875 |
||
+ | Central_intermediary_metabolism => 0.334 5.309 |
||
+ | Energy_metabolism 0.226 2.511 |
||
+ | Fatty_acid_metabolism 0.022 1.663 |
||
+ | Purines_and_pyrimidines 0.367 1.512 |
||
+ | Regulatory_functions 0.021 0.128 |
||
+ | Replication_and_transcription 0.167 0.625 |
||
+ | Translation 0.113 2.559 |
||
+ | Transport_and_binding 0.017 0.042 |
||
+ | |||
+ | # Enzyme/nonenzyme Prob Odds |
||
+ | Enzyme => 0.703 2.454 |
||
+ | Nonenzyme 0.297 0.416 |
||
+ | |||
+ | # Enzyme class Prob Odds |
||
+ | Oxidoreductase (EC 1.-.-.-) 0.111 0.534 |
||
+ | Transferase (EC 2.-.-.-) 0.202 0.585 |
||
+ | Hydrolase (EC 3.-.-.-) 0.115 0.363 |
||
+ | Lyase (EC 4.-.-.-) 0.031 0.662 |
||
+ | Isomerase (EC 5.-.-.-) => 0.084 2.637 |
||
+ | Ligase (EC 6.-.-.-) 0.074 1.460 |
||
+ | |||
+ | # Gene Ontology category Prob Odds |
||
+ | Signal_transducer 0.053 0.246 |
||
+ | Receptor 0.004 0.024 |
||
+ | Hormone 0.001 0.206 |
||
+ | Structural_protein 0.001 0.041 |
||
+ | Transporter 0.025 0.230 |
||
+ | Ion_channel 0.015 0.257 |
||
+ | Voltage-gated_ion_channel 0.004 0.173 |
||
+ | Cation_channel 0.011 0.234 |
||
+ | Transcription 0.100 0.785 |
||
+ | Transcription_regulation 0.039 0.313 |
||
+ | Stress_response 0.010 0.117 |
||
+ | Immune_response 0.061 0.720 |
||
+ | Growth_factor 0.006 0.450 |
||
+ | Metal_ion_transport 0.009 0.020 |
||
+ | </pre> |
||
+ | |||
+ | |||
+ | Other proteins: |
||
+ | <pre>############## ProtFun 2.2 predictions ############## |
||
+ | |||
+ | >LAMP1_HUMAN |
||
+ | |||
+ | # Functional category Prob Odds |
||
+ | Amino_acid_biosynthesis 0.011 0.484 |
||
+ | Biosynthesis_of_cofactors 0.053 0.735 |
||
+ | Cell_envelope => 0.804 13.186 |
||
+ | Cellular_processes 0.027 0.373 |
||
+ | Central_intermediary_metabolism 0.138 2.188 |
||
+ | Energy_metabolism 0.037 0.411 |
||
+ | Fatty_acid_metabolism 0.016 1.265 |
||
+ | Purines_and_pyrimidines 0.533 2.195 |
||
+ | Regulatory_functions 0.015 0.090 |
||
+ | Replication_and_transcription 0.019 0.073 |
||
+ | Translation 0.027 0.613 |
||
+ | Transport_and_binding 0.834 2.033 |
||
+ | |||
+ | # Enzyme/nonenzyme Prob Odds |
||
+ | Enzyme 0.276 0.965 |
||
+ | Nonenzyme => 0.724 1.014 |
||
+ | |||
+ | # Enzyme class Prob Odds |
||
+ | Oxidoreductase (EC 1.-.-.-) 0.039 0.187 |
||
+ | Transferase (EC 2.-.-.-) 0.046 0.134 |
||
+ | Hydrolase (EC 3.-.-.-) 0.058 0.184 |
||
+ | Lyase (EC 4.-.-.-) 0.020 0.430 |
||
+ | Isomerase (EC 5.-.-.-) 0.010 0.321 |
||
+ | Ligase (EC 6.-.-.-) 0.017 0.326 |
||
+ | |||
+ | # Gene Ontology category Prob Odds |
||
+ | Signal_transducer 0.396 1.849 |
||
+ | Receptor 0.282 1.659 |
||
+ | Hormone 0.001 0.206 |
||
+ | Structural_protein 0.011 0.408 |
||
+ | Transporter 0.024 0.222 |
||
+ | Ion_channel 0.008 0.147 |
||
+ | Voltage-gated_ion_channel 0.002 0.111 |
||
+ | Cation_channel 0.010 0.215 |
||
+ | Transcription 0.032 0.247 |
||
+ | Transcription_regulation 0.018 0.142 |
||
+ | Stress_response 0.246 2.795 |
||
+ | Immune_response => 0.371 4.368 |
||
+ | Growth_factor 0.013 0.956 |
||
+ | Metal_ion_transport 0.009 0.020 |
||
+ | |||
+ | // |
||
+ | |||
+ | |||
+ | >RET4_HUMAN |
||
+ | |||
+ | # Functional category Prob Odds |
||
+ | Amino_acid_biosynthesis 0.017 0.751 |
||
+ | Biosynthesis_of_cofactors 0.044 0.610 |
||
+ | Cell_envelope => 0.804 13.186 |
||
+ | Cellular_processes 0.075 1.021 |
||
+ | Central_intermediary_metabolism 0.197 3.128 |
||
+ | Energy_metabolism 0.043 0.475 |
||
+ | Fatty_acid_metabolism 0.016 1.265 |
||
+ | Purines_and_pyrimidines 0.275 1.131 |
||
+ | Regulatory_functions 0.013 0.080 |
||
+ | Replication_and_transcription 0.022 0.084 |
||
+ | Translation 0.032 0.721 |
||
+ | Transport_and_binding 0.800 1.951 |
||
+ | |||
+ | # Enzyme/nonenzyme Prob Odds |
||
+ | Enzyme => 0.544 1.900 |
||
+ | Nonenzyme 0.456 0.639 |
||
+ | |||
+ | # Enzyme class Prob Odds |
||
+ | Oxidoreductase (EC 1.-.-.-) 0.095 0.458 |
||
+ | Transferase (EC 2.-.-.-) 0.038 0.109 |
||
+ | Hydrolase (EC 3.-.-.-) 0.235 0.742 |
||
+ | Lyase (EC 4.-.-.-) => 0.059 1.264 |
||
+ | Isomerase (EC 5.-.-.-) 0.010 0.321 |
||
+ | Ligase (EC 6.-.-.-) 0.017 0.326 |
||
+ | |||
+ | # Gene Ontology category Prob Odds |
||
+ | Signal_transducer 0.202 0.942 |
||
+ | Receptor 0.147 0.862 |
||
+ | Hormone 0.004 0.667 |
||
+ | Structural_protein 0.002 0.058 |
||
+ | Transporter 0.025 0.232 |
||
+ | Ion_channel 0.016 0.288 |
||
+ | Voltage-gated_ion_channel 0.003 0.148 |
||
+ | Cation_channel 0.010 0.215 |
||
+ | Transcription 0.027 0.207 |
||
+ | Transcription_regulation 0.025 0.196 |
||
+ | Stress_response 0.161 1.829 |
||
+ | Immune_response => 0.239 2.813 |
||
+ | Growth_factor 0.023 1.617 |
||
+ | Metal_ion_transport 0.009 0.020 |
||
+ | |||
+ | // |
||
+ | |||
+ | |||
+ | >BACR_HALSA |
||
+ | |||
+ | # Functional category Prob Odds |
||
+ | Amino_acid_biosynthesis 0.033 1.495 |
||
+ | Biosynthesis_of_cofactors 0.186 2.589 |
||
+ | Cell_envelope 0.029 0.483 |
||
+ | Cellular_processes 0.051 0.694 |
||
+ | Central_intermediary_metabolism 0.045 0.711 |
||
+ | Energy_metabolism 0.138 1.537 |
||
+ | Fatty_acid_metabolism 0.016 1.265 |
||
+ | Purines_and_pyrimidines 0.302 1.244 |
||
+ | Regulatory_functions 0.013 0.080 |
||
+ | Replication_and_transcription 0.019 0.073 |
||
+ | Translation 0.059 1.339 |
||
+ | Transport_and_binding => 0.791 1.929 |
||
+ | |||
+ | # Enzyme/nonenzyme Prob Odds |
||
+ | Enzyme 0.199 0.696 |
||
+ | Nonenzyme => 0.801 1.122 |
||
+ | |||
+ | # Enzyme class Prob Odds |
||
+ | Oxidoreductase (EC 1.-.-.-) 0.114 0.549 |
||
+ | Transferase (EC 2.-.-.-) 0.031 0.091 |
||
+ | Hydrolase (EC 3.-.-.-) 0.057 0.180 |
||
+ | Lyase (EC 4.-.-.-) 0.020 0.430 |
||
+ | Isomerase (EC 5.-.-.-) 0.010 0.321 |
||
+ | Ligase (EC 6.-.-.-) 0.017 0.326 |
||
+ | |||
+ | # Gene Ontology category Prob Odds |
||
+ | Signal_transducer 0.258 1.205 |
||
+ | Receptor 0.355 2.087 |
||
+ | Hormone 0.001 0.206 |
||
+ | Structural_protein 0.006 0.200 |
||
+ | Transporter => 0.440 4.036 |
||
+ | Ion_channel 0.010 0.169 |
||
+ | Voltage-gated_ion_channel 0.004 0.172 |
||
+ | Cation_channel 0.078 1.689 |
||
+ | Transcription 0.026 0.205 |
||
+ | Transcription_regulation 0.028 0.226 |
||
+ | Stress_response 0.012 0.139 |
||
+ | Immune_response 0.011 0.128 |
||
+ | Growth_factor 0.010 0.727 |
||
+ | Metal_ion_transport 0.049 0.106 |
||
+ | |||
+ | // |
||
+ | |||
+ | |||
+ | >INSL5_HUMAN |
||
+ | |||
+ | # Functional category Prob Odds |
||
+ | Amino_acid_biosynthesis 0.011 0.484 |
||
+ | Biosynthesis_of_cofactors 0.040 0.558 |
||
+ | Cell_envelope => 0.756 12.393 |
||
+ | Cellular_processes 0.033 0.448 |
||
+ | Central_intermediary_metabolism 0.048 0.755 |
||
+ | Energy_metabolism 0.036 0.397 |
||
+ | Fatty_acid_metabolism 0.016 1.265 |
||
+ | Purines_and_pyrimidines 0.144 0.592 |
||
+ | Regulatory_functions 0.014 0.087 |
||
+ | Replication_and_transcription 0.020 0.075 |
||
+ | Translation 0.032 0.735 |
||
+ | Transport_and_binding 0.834 2.033 |
||
+ | |||
+ | # Enzyme/nonenzyme Prob Odds |
||
+ | Enzyme 0.209 0.729 |
||
+ | Nonenzyme => 0.791 1.109 |
||
+ | |||
+ | # Enzyme class Prob Odds |
||
+ | Oxidoreductase (EC 1.-.-.-) 0.056 0.268 |
||
+ | Transferase (EC 2.-.-.-) 0.031 0.091 |
||
+ | Hydrolase (EC 3.-.-.-) 0.062 0.195 |
||
+ | Lyase (EC 4.-.-.-) 0.020 0.430 |
||
+ | Isomerase (EC 5.-.-.-) 0.010 0.321 |
||
+ | Ligase (EC 6.-.-.-) 0.017 0.327 |
||
+ | |||
+ | # Gene Ontology category Prob Odds |
||
+ | Signal_transducer 0.374 1.746 |
||
+ | Receptor 0.128 0.750 |
||
+ | Hormone => 0.247 37.936 |
||
+ | Structural_protein 0.001 0.041 |
||
+ | Transporter 0.025 0.228 |
||
+ | Ion_channel 0.010 0.168 |
||
+ | Voltage-gated_ion_channel 0.003 0.131 |
||
+ | Cation_channel 0.010 0.215 |
||
+ | Transcription 0.054 0.425 |
||
+ | Transcription_regulation 0.091 0.724 |
||
+ | Stress_response 0.099 1.128 |
||
+ | Immune_response 0.178 2.090 |
||
+ | Growth_factor 0.061 4.379 |
||
+ | Metal_ion_transport 0.009 0.020 |
||
+ | |||
+ | // |
||
+ | |||
+ | |||
+ | >A4_HUMAN |
||
+ | |||
+ | # Functional category Prob Odds |
||
+ | Amino_acid_biosynthesis 0.020 0.921 |
||
+ | Biosynthesis_of_cofactors 0.261 3.623 |
||
+ | Cell_envelope => 0.804 13.186 |
||
+ | Cellular_processes 0.053 0.730 |
||
+ | Central_intermediary_metabolism 0.184 2.920 |
||
+ | Energy_metabolism 0.023 0.259 |
||
+ | Fatty_acid_metabolism 0.016 1.265 |
||
+ | Purines_and_pyrimidines 0.417 1.716 |
||
+ | Regulatory_functions 0.013 0.084 |
||
+ | Replication_and_transcription 0.029 0.109 |
||
+ | Translation 0.027 0.613 |
||
+ | Transport_and_binding 0.827 2.016 |
||
+ | |||
+ | # Enzyme/nonenzyme Prob Odds |
||
+ | Enzyme => 0.392 1.368 |
||
+ | Nonenzyme 0.608 0.852 |
||
+ | |||
+ | # Enzyme class Prob Odds |
||
+ | Oxidoreductase (EC 1.-.-.-) 0.024 0.114 |
||
+ | Transferase (EC 2.-.-.-) 0.208 0.603 |
||
+ | Hydrolase (EC 3.-.-.-) 0.190 0.600 |
||
+ | Lyase (EC 4.-.-.-) 0.020 0.430 |
||
+ | Isomerase (EC 5.-.-.-) 0.010 0.324 |
||
+ | Ligase (EC 6.-.-.-) 0.048 0.946 |
||
+ | |||
+ | # Gene Ontology category Prob Odds |
||
+ | Signal_transducer 0.126 0.586 |
||
+ | Receptor 0.036 0.211 |
||
+ | Hormone 0.001 0.206 |
||
+ | Structural_protein => 0.034 1.205 |
||
+ | Transporter 0.024 0.222 |
||
+ | Ion_channel 0.009 0.162 |
||
+ | Voltage-gated_ion_channel 0.002 0.108 |
||
+ | Cation_channel 0.010 0.215 |
||
+ | Transcription 0.043 0.335 |
||
+ | Transcription_regulation 0.018 0.143 |
||
+ | Stress_response 0.076 0.862 |
||
+ | Immune_response 0.016 0.183 |
||
+ | Growth_factor 0.005 0.372 |
||
+ | Metal_ion_transport 0.009 0.020 |
||
+ | |||
+ | //</pre> |
Latest revision as of 03:12, 14 June 2011
Prediction of Secondary Structure Elements
PsiPred
For a description of PsiPred, see Psipred.
# PSIPRED HFORMAT (PSIPRED V3.0)

Conf: 987522213466199993246776008999999984450000587389976339987971
Pred: CCCCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCCCCEEEEEECCHHHHHH
  AA: MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK
              10        20        30        40        50        60

Conf: 998788998878786647999999984999999999988199999997428994187898
Pred: CCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCEEEECCCCCC
  AA: CTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTS
              70        80        90       100       110       120

Conf: 999505864599448999999998762999737862048886301220027861499667
Pred: CCCCEEEEECCCCHHHHHHHHHHHHHCCCCCEEEEECCCCCCCCHHHHCCCCCCEEEEEC
  AA: NMGCTLILEDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVG
             130       140       150       160       170       180

Conf: 877898808999999999999998976406998899973479998113515579877700
Pred: CCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEEECCCCCCCCCE
  AA: PQPQGVLRADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIA
             190       200       210       220       230       240

Conf: 552467669998546888832213699778518622057770372000011102000100
Pred: EEECCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCCEEEEECCCCHHCCCCHHHEECCE
  AA: AIIHPNLQDQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK
             250       260       270       280       290       300

Conf: 3544256113309
Pred: EEEEECCCEEECC
  AA: LTLNAKSIRCCLH
             310
JPred3
JPred3 was published in 2008 by Christian Cole, Jonathan D. Barber and Geoffrey J. Barton; the original JPred server dates from 1998.
Reference: Original paper, current version
JPred3 uses the JNet 2.0 algorithm to make its predictions. This algorithm generates profiles using PSI-Blast (which is used to build a position-specific scoring matrix) and HMMer (which is used to construct HMM profiles). Both the position-specific scoring matrix and the HMM profiles are used to predict secondary structure and solvent accessibility.
Input: A protein sequence or a pre-made MSA; a PDB database is also needed, but the JPred3 server provides it.
MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYID
------------EEEEEEEE------HHHHHHHHHH---------EEEEEEEE-HHHHHH-----H

CDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLILEDSR
---------------------HHHHHHHHHHHHHH-------EEEEEE-----------EEEE---

NNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLRADILDQMRKM
-HHHHHHHHHHHH------EEEEEE---------HHEE----EEEEE---------HHHHHHHHHH

IKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQDQDWKPLHPGDPMFLT
HHHHHHHHHHH----------EEEEEEEEEE----------EEEE----------------HHE--

LDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSIRCCLH
----EEEE----EEEEEEE-----HHH-HHHHHHHHEEE-----EEEE-
DSSP
DSSP (Define Secondary Structure of Proteins) is a program for secondary structure assignment, published in 1983 by Wolfgang Kabsch and Chris Sander. Reference: Original paper
DSSP does not predict secondary structure from amino acid sequences; instead, it assigns the secondary structure from a known 3D structure (a PDB file). To this end, DSSP examines the H-bonds present in the protein backbone, along with the phi and psi angles and the C alpha positions. The H-bonds are used to define "n-turns", which are H-bonds between the CO group of residue i and the NH group of residue i+n with n = 3, 4 or 5, and "bridges", which are H-bonds with greater sequence separations. Repeating 4-turns identify alpha helices; repeating bridges identify beta sheets.
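The two rules described above can be sketched in a few lines. This is a simplified illustration, not the DSSP source code: it shows the electrostatic H-bond energy of Kabsch and Sander (partial charges 0.42e and 0.20e, factor 332, cutoff -0.5 kcal/mol) and the rule that two consecutive n-turns form a minimal helix. The function names and the representation of H-bonds as (i, i+n) residue-index pairs are assumptions made for this example.

```python
def hbond_energy(r_on, r_ch, r_oh, r_cn):
    """Electrostatic H-bond energy in kcal/mol; distances in Angstrom."""
    q1, q2, f = 0.42, 0.20, 332.0
    return q1 * q2 * f * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(r_on, r_ch, r_oh, r_cn, cutoff=-0.5):
    """An H-bond is accepted when the energy falls below the cutoff."""
    return hbond_energy(r_on, r_ch, r_oh, r_cn) < cutoff


def helix_residues(hbonds, n=4):
    """Return residues in helices built from consecutive n-turns.

    hbonds: set of (i, j) pairs meaning CO(i) is H-bonded to NH(j).
    Two consecutive n-turns at i-1 and i make residues i..i+n-1 helical.
    """
    turns = {i for (i, j) in hbonds if j - i == n}
    helix = set()
    for i in turns:
        if i - 1 in turns:
            helix.update(range(i, i + n))
    return helix


# Three consecutive 4-turns (hypothetical H-bond list)
print(sorted(helix_residues({(1, 5), (2, 6), (3, 7)})))
```

A single isolated turn yields no helix in this sketch, which mirrors why DSSP reports such residues as T (turn) rather than H.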
Input: A 3D structure (a PDB file, ID 2o53 in our case)
Output: (from [1])
H = alpha helix
B = residue in isolated beta-bridge
E = extended strand, participates in beta ladder
G = 3-helix (3/10 helix)
I = 5 helix (pi helix)
T = hydrogen bonded turn
S = bend
The results differ from those of the two secondary structure predictors, as the PDB file contains a dimer, whereas the Uniprot sequence covers only one chain (which is sensible, since both chains are essentially identical).
The assignment shows slight differences between the two chains; we assume that these stem from slight differences in the actual 3D structure of the two chains as well as from H-bonds between them.
10 20 30 40 50 60 | | | | | | 1 - 60 EHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYIDCD 1 - 60 SSSSSS TTTT HHHHHHHHHHTT 333 TT SSSSSST HHHHHTTTT TTT 1 - 60 1 - 60 AA AA A AA AAA AA AAAA A A AA AAAA AAAA 70 80 90 100 110 120 | | | | | | 61 - 120 LNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLIL 61 - 120 333 THHHHTT TTT HHHHHHHHHHHHH TTTTTT TSSSSSSS TTT SSSSSS 61 - 120 ** * * ** ** * * 61 - 120 A AAA AAAAAAAAA A A AA AAA AA A 130 140 150 160 170 180 | | | | | | 121 - 180 EDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLR 121 - 180 T TT HHHHHHHHHHHHHHTTT SSSSS TT 3333TTSSSSSSSST TT 121 - 180 * ** ** 121 - 180 A A A AA AAA AAAA A AAAAAA AA A A A 190 200 210 220 230 240 | | | | | | 181 - 240 ADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQ 181 - 240 HHHHHHHHHHHHHHHHHHHHHHTT SSSSSSSSSSSS TTT TSS TTTT 181 - 240 181 - 240 A AA AA AA AA AAAA AA A A A AAA A AAAAA A AA 250 260 270 280 290 300 | | | | | | 241 - 300 DQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSI 241 - 300 T TTT TTTSSSS TT SSS TTT SSSTTT 333TTTT TSSSSSSSSSSS 241 - 300 241 - 300 AA AA AAAAA A AAAAAA AAAAA A AAA A AAA A A 301 - 302 RC 301 - 302 301 - 302 301 - 302 AA 310 320 330 340 350 360 | | | | | | 303 - 362 EHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYIDCD 303 - 362 SSSSSS TTTT HHHHHHHHHHHH 3333 TT SSSSSST HHHHHTT T TTT 303 - 362 303 - 362 AA AA AA AAA AA AAAA A A A AA AAAA AAAA 370 380 390 400 410 420 | | | | | | 363 - 422 LNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLIL 363 - 422 333 THHHHTT TTT HHHHHHHHHHHHH TTTTTT TSSSSSSS TTT SSSSSS 363 - 422 ** *** * ** **** 363 - 422 A AAA AAAAAAAAA A A AA AAA AA A 430 440 450 460 470 480 | | | | | | 423 - 482 EDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLR 423 - 482 T TT HHHHHHHHHHHHHHTTT SSSSS TTTT 3333TTSSSSSSSS TT 423 - 482 423 - 482 A A A AA AA AAAA A AAAAAA AA A A 490 500 510 520 530 540 | | | | | | 483 - 542 ADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQ 
483 - 542 HHHHHHHHHHHHHHHHHHHHHHTT SSSSSSSSSSSS TTT TSS TTTT 483 - 542 *** * 483 - 542 A AA AA A AA A AA AA A A A AAA A AAAAA A A AA 550 560 570 580 590 600 | | | | | | 543 - 602 DQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSI 543 - 602 T TTT TTTSSSS TT SSS TTT SSSTTT THHHHTT TSSSSSSSSSSS 543 - 602 * 543 - 602 AA AA AAAAA A AAAAAA AAAAA A AAA A AAAA A AA 603 - 604 RC 603 - 604 603 - 604 603 - 604 AA Clearly solvent accessible: A; involved in symmetry contacts: *
All in all, the two prediction methods Psipred and JPred3 did a good job; they predicted most of the main secondary structure elements, with only minor variations in the length and position of the individual helices and sheets, and very minor differences between each other. A somewhat more detailed result from DSSP is to be expected, as it has decidedly better information to work with and merely assigns the secondary structure instead of actually predicting it.
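The agreement between two predictions can be quantified as the fraction of positions with the same state. The following is a minimal sketch; the function names are hypothetical, and it assumes that the "-" in the JPred3 output stands for coil, so that both strings use the three states H, E and C.

```python
def to_three_state(ss):
    """Map '-' (assumed to mean coil) to 'C' so both strings use H/E/C."""
    return ss.upper().replace("-", "C")


def agreement(pred_a, pred_b):
    """Fraction of positions at which two predictions assign the same state."""
    a, b = to_three_state(pred_a), to_three_state(pred_b)
    if len(a) != len(b):
        raise ValueError("predictions must cover the same residues")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy strings, not the real predictions
print(agreement("CCEEHH", "--EEHH"))
```

Applied to the full Psipred and JPred3 strings for the 313 residues, this gives a single number for how closely the two methods agree.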
Prediction of disordered regions
DISOPRED
DISOPRED predicts native disorder in proteins. It was published in 2004 by Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF and Jones DT. Reference: [2]
DISOPRED uses linear support vector machines to predict disorder in a given protein sequence. A set of 750 proteins with high-quality structures was used as training data; to this end, PSI-Blast profiles were generated by aligning the training structures against a filtered database of protein structures. The resulting profiles were used to train the SVMs.
DISOPRED predictions for a false positive rate threshold of: 2%

conf: 999999999877640000000000000000000000000000000000000000000000
pred: **********..................................................
  AA: MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK
              10        20        30        40        50        60

conf: 000000000000000356777788777654200000000000000000000000000000
pred: ......................**....................................
  AA: CTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTS
              70        80        90       100       110       120

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: NMGCTLILEDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVG
             130       140       150       160       170       180

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: PQPQGVLRADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIA
             190       200       210       220       230       240

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: AIIHPNLQDQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK
             250       260       270       280       290       300

conf: 0000000000002
pred: .............
  AA: LTLNAKSIRCCLH
             310

Asterisks (*) represent disorder predictions and dots (.) prediction of order. The confidence estimates give a rough indication of the probability that each residue is disordered.
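To turn the asterisk/dot line into residue ranges, the runs of asterisks can be collected. This is a small sketch with a hypothetical helper name; it assumes the "pred:" lines have been concatenated into one string without spaces.

```python
from itertools import groupby


def disordered_segments(pred):
    """Return 1-based (start, end) ranges of consecutive '*' characters."""
    segments = []
    pos = 1
    for symbol, run in groupby(pred):
        length = len(list(run))
        if symbol == "*":
            segments.append((pos, pos + length - 1))
        pos += length
    return segments


# Toy prediction string, not the real output
print(disordered_segments("**..*"))
```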
POODLE
POODLE (Prediction Of Order and Disorder by machine LEarning) is a series of programs published between 2005 and 2008. We used the latest variant, POODLE-I, which was published in 2008 by S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi.
Reference: S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi, "Disordered region prediction by integrating POODLE series", CASP8 Proceedings 2008, 14-15.
Input: Protein amino acid sequence
POODLE-I is an integrated variant of other flavors of POODLE (-S and -L for short/long regions of disorder and -W for proteins that are mostly disordered) and several other tools like Psipred, JNet etc. It employs a rather involved workflow.
Custom-formatted output for Aspartoacylase:
POS 1 M T S C H I A E E H I -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.461 0.444 0.413 0.401 0.418 0.461 0.537 0.644 0.693 0.62 0.468 POS 12 Q K V A I F G G T H G -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.321 0.238 0.177 0.146 0.128 0.116 0.106 0.104 0.111 0.126 0.132 POS 23 N E L T G V F L V K H -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.131 0.118 0.098 0.073 0.053 0.041 0.036 0.035 0.035 0.036 0.036 POS 34 W L E N G A E I Q R T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.038 0.045 0.06 0.081 0.099 0.119 0.133 0.146 0.147 0.143 0.129 POS 45 G L E V K P F I T N P -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.111 0.09 0.073 0.062 0.054 0.047 0.039 0.033 0.033 0.037 0.041 POS 56 R A V K K C T R Y I D -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.043 0.047 0.054 0.062 0.068 0.071 0.073 0.07 0.067 0.069 0.075 POS 67 C D L N R I F D L E N -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.08 0.081 0.078 0.075 0.073 0.072 0.076 0.094 0.127 0.176 0.249 POS 78 L G K K M S E D L P Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.403 0.554 0.737 0.766 0.804 0.755 0.682 0.65 0.632 0.636 0.583 POS 89 E V R R A Q E I N H L -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.505 0.448 0.348 0.262 0.201 0.16 0.131 0.11 0.103 0.104 0.111 POS 100 F G P K D S E D S Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.116 0.117 0.108 0.089 0.067 0.049 0.039 0.034 0.033 0.035 POS 110 D I I F D L H N T T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.038 0.041 0.043 0.043 0.042 0.041 0.042 0.045 0.052 0.06 POS 120 S N M G C T L I L E -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.07 0.081 0.092 0.101 0.109 0.111 0.107 0.096 0.085 0.072 POS 130 D S R N N F L I Q M -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.06 0.051 0.046 0.04 0.036 0.033 0.032 0.031 0.031 0.031 POS 140 F H Y I K T S L A P -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.033 0.036 0.04 0.043 0.046 0.049 0.05 0.053 0.055 0.059 POS 150 L P C Y V Y L I E H -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.065 0.073 0.088 0.103 0.115 0.119 0.118 0.111 0.104 0.104 POS 160 P S L K Y A T T R S -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.121 0.147 0.19 0.229 0.264 0.263 0.245 0.196 0.149 0.098 POS 170 I A K Y P V G I E V -1 -1 -1 
-1 -1 -1 -1 -1 -1 -1 0.069 0.053 0.051 0.057 0.066 0.08 0.093 0.102 0.103 0.099 POS 180 G P Q P Q G V L R A -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.096 0.094 0.095 0.095 0.099 0.098 0.099 0.095 0.094 0.086 POS 190 D I L D Q M R K M I -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.074 0.059 0.048 0.04 0.038 0.038 0.038 0.039 0.04 0.043 POS 200 K H A L D F I H H F -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.046 0.049 0.051 0.052 0.056 0.064 0.077 0.092 0.112 0.142 POS 210 N E G K E F P P C A -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.17 0.198 0.21 0.311 0.281 0.248 0.105 0.084 0.072 0.071 POS 220 I E V Y K I I E K V -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.069 0.065 0.06 0.056 0.052 0.054 0.062 0.076 0.105 0.141 POS 230 D Y P R D E N G E I -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.176 0.203 0.224 0.227 0.217 0.209 0.228 0.248 0.271 0.282 POS 240 A A I I H P N L Q D -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.289 0.269 0.24 0.208 0.188 0.169 0.155 0.152 0.167 0.193 POS 250 Q D W K P L H P G D -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.222 0.236 0.235 0.21 0.175 0.136 0.11 0.097 0.099 0.104 POS 260 P M F L T L D G K T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.107 0.108 0.104 0.095 0.084 0.077 0.073 0.082 0.102 0.125 POS 270 I P L G G D C T V Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.144 0.162 0.169 0.166 0.156 0.149 0.133 0.117 0.099 0.089 POS 280 P V F V N E A A Y Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.077 0.072 0.067 0.064 0.066 0.082 0.122 0.184 0.241 0.279 POS 290 E K K E A F A K T T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.282 0.264 0.236 0.229 0.238 0.257 0.263 0.252 0.231 0.222 POS 300 K L T L N A K S I R -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.246 0.278 0.305 0.31 0.303 0.277 0.263 0.371 0.382 0.38 POS 310 C C L H -1 -1 -1 -1 0.348 0.51 0.496 0.489
IUPRED
IUPRED is a program for the prediction of intrinsically unstructured regions in proteins. It was published in 2005 by Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon.
Reference: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon, Bioinformatics (2005) 21, 3433-3434.
IUPRED predicts disordered regions by estimating the capacity of the amino acid chain to form stabilizing contacts. The underlying assumption is that proteins intrinsically unable to do so have distinct sequences that can be identified via their unfavorable energy values. To this end a 20x20 predictor matrix was calculated from a set of globular proteins with known structure. IUPRED uses this matrix to derive a tendency to be intrinsically unstructured from the amino acid composition alone.
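The idea of estimating an energy from the amino acid composition alone can be sketched as follows. This is only an illustration of the principle: the two-letter alphabet and the matrix values are invented placeholders, not IUPRED's 20x20 parameters, and the window handling is a simplification chosen for this example.

```python
def neighbourhood_composition(neighbours, alphabet):
    """Fraction of each residue type among the neighbouring residues."""
    if not neighbours:
        return {a: 0.0 for a in alphabet}
    return {a: neighbours.count(a) / len(neighbours) for a in alphabet}


def estimated_energies(seq, matrix, alphabet, half_window=2):
    """Per-residue energy: matrix row of the residue times local composition."""
    energies = []
    for k, res in enumerate(seq):
        lo = max(0, k - half_window)
        hi = min(len(seq), k + half_window + 1)
        neighbours = seq[lo:k] + seq[k + 1:hi]  # exclude the residue itself
        comp = neighbourhood_composition(neighbours, alphabet)
        energies.append(sum(matrix[res][a] * comp[a] for a in alphabet))
    return energies


# Placeholder matrix for a toy alphabet (hypothetical values)
toy_matrix = {"A": {"A": -1.0, "B": 0.0}, "B": {"A": 0.0, "B": 1.0}}
print(estimated_energies("AAB", toy_matrix, "AB", half_window=1))
```

In this picture, residues whose neighbourhood yields unfavorable (high) energies are the candidates for disorder.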
Input: An amino acid sequence.
IUPRED comes in three flavors: Long Disorder, which specializes in finding long stretches of disorder, Short Disorder, which does the same for short stretches of disorder, and structured regions, which predicts regions lacking disorder.
Long Disorder
POS 1 M T S C H I A E E H I 0.3215 0.3426 0.2817 0.2783 0.2064 0.1275 0.1554 0.1823 0.2094 0.2364 0.2575 POS 12 Q K V A I F G G T H G 0.2988 0.3087 0.2364 0.3215 0.3149 0.3321 0.2609 0.1823 0.1275 0.1206 0.1759 POS 23 N E L T G V F L V K H 0.1028 0.0676 0.1070 0.1298 0.1881 0.2575 0.2715 0.1969 0.2034 0.2034 0.2064 POS 34 W L E N G A E I Q R T 0.1942 0.1206 0.1914 0.1399 0.1373 0.2064 0.2002 0.1969 0.2541 0.2715 0.2951 POS 45 G L E V K P F I T N P 0.3840 0.4256 0.3460 0.3321 0.3286 0.2609 0.2503 0.3249 0.2292 0.1583 0.1611 POS 56 R A V K K C T R Y I D 0.0985 0.1554 0.0929 0.1373 0.1424 0.0765 0.0749 0.1229 0.0749 0.0780 0.0719 POS 67 C D L N R I F D L E N 0.0424 0.0506 0.0734 0.0734 0.0605 0.1048 0.1115 0.1184 0.1229 0.2064 0.1323 POS 78 L G K K M S E D L P Y 0.2258 0.1643 0.2364 0.2292 0.2002 0.2884 0.4087 0.3215 0.4119 0.3948 0.3053 POS 89 E V R R A Q E I N H L 0.2849 0.2849 0.3149 0.3182 0.3631 0.3667 0.3667 0.3631 0.4441 0.3286 0.4220 POS 100 F G P K D S E D S Y 0.3215 0.3053 0.1969 0.2034 0.1611 0.1501 0.1476 0.2224 0.2164 0.2164 POS 110 D I I F D L H N T T 0.3053 0.3704 0.3704 0.2609 0.2680 0.1823 0.1184 0.0662 0.0690 0.0734 POS 120 S N M G C T L I L E 0.1229 0.1184 0.1914 0.2817 0.2849 0.2034 0.2064 0.2193 0.1759 0.0985 POS 130 D S R N N F L I Q M 0.0948 0.0581 0.0327 0.0398 0.0414 0.0719 0.0414 0.0581 0.1092 0.1092 POS 140 F H Y I K T S L A P 0.1162 0.0662 0.0398 0.0269 0.0163 0.0105 0.0115 0.0184 0.0275 0.0300 POS 150 L P C Y V Y L I E H 0.0372 0.0433 0.0424 0.0405 0.0581 0.0618 0.0618 0.0618 0.1007 0.0749 POS 160 P S L K Y A T T R S 0.0543 0.0870 0.0443 0.0888 0.1007 0.1424 0.1449 0.2292 0.2470 0.2328 POS 170 I A K Y P V G I E V 0.2575 0.2503 0.2752 0.3667 0.3704 0.3948 0.3426 0.3356 0.3019 0.3149 POS 180 G P Q P Q G V L R A 0.2328 0.2328 0.2752 0.2951 0.3321 0.3087 0.3631 0.3182 0.3182 0.2918 POS 190 D I L D Q M R K M I 0.3494 0.3182 0.2164 0.2129 0.1115 0.0605 0.0592 0.0870 0.0734 0.0780 POS 200 K H A L D F I H H F 0.1048 0.0967 0.1501 0.2364 0.1349 
0.1399 0.1942 0.1206 0.1048 0.0817 POS 210 N E G K E F P P C A 0.1449 0.1048 0.0618 0.0734 0.0704 0.0389 0.0835 0.1349 0.0948 0.1028 POS 220 I E V Y K I I E K V 0.1115 0.1184 0.1092 0.1184 0.1323 0.1275 0.2129 0.2094 0.1229 0.1731 POS 230 D Y P R D E N G E I 0.1731 0.1759 0.1028 0.1476 0.2470 0.2609 0.2680 0.3631 0.3566 0.3740 POS 240 A A I I H P N L Q D 0.4476 0.3392 0.4256 0.4256 0.3460 0.3356 0.3392 0.3249 0.3392 0.3460 POS 250 Q D W K P L H P G D 0.3910 0.3215 0.2783 0.3631 0.3667 0.3774 0.3566 0.3392 0.4220 0.3321 POS 260 P M F L T L D G K T 0.3426 0.2541 0.2436 0.3426 0.3566 0.2470 0.3286 0.2680 0.1643 0.1852 POS 270 I P L G G D C T V Y 0.1298 0.0631 0.0543 0.1048 0.1731 0.1449 0.1881 0.1115 0.0646 0.0734 POS 280 P V F V N E A A Y Y 0.0690 0.1092 0.1048 0.1399 0.0765 0.0646 0.0581 0.1028 0.1007 0.1373 POS 290 E K K E A F A K T T 0.1449 0.1298 0.1184 0.2034 0.2364 0.2164 0.2002 0.1583 0.1823 0.1852 POS 300 K L T L N A K S I R 0.1881 0.1184 0.1184 0.1184 0.0888 0.0851 0.1349 0.1349 0.1137 0.0870 POS 310 C C L H 0.0631 0.0473 0.0734 0.0483
Short Disorder
POS 1 M T S C H I A E E H I 0.8886 0.7772 0.7418 0.6984 0.5992 0.5296 0.4149 0.2748 0.2333 0.1921 0.1566
POS 12 Q K V A I F G G T H G 0.1805 0.1844 0.1732 0.2700 0.1766 0.2531 0.2913 0.2080 0.1292 0.0832 0.0965
POS 23 N E L T G V F L V K H 0.0909 0.0991 0.0935 0.0660 0.1088 0.1766 0.1766 0.1495 0.1566 0.0991 0.1041
POS 34 W L E N G A E I Q R T 0.1041 0.0909 0.1456 0.0935 0.0935 0.0909 0.1416 0.2385 0.2080 0.1416 0.1322
POS 45 G L E V K P F I T N P 0.1844 0.2963 0.2820 0.2558 0.1921 0.2167 0.2041 0.1998 0.1921 0.1921 0.1380
POS 56 R A V K K C T R Y I D 0.0935 0.1495 0.0935 0.1416 0.0935 0.0771 0.1322 0.1416 0.0935 0.0965 0.0464
POS 67 C D L N R I F D L E N 0.0490 0.0567 0.0542 0.0554 0.0567 0.0935 0.0813 0.0858 0.1292 0.1958 0.1322
POS 78 L G K K M S E D L P Y 0.2385 0.1732 0.2558 0.1958 0.1878 0.2432 0.2963 0.3184 0.4149 0.3359 0.3399
POS 89 E V R R A Q E I N H L 0.4116 0.3491 0.2820 0.2913 0.3535 0.3399 0.3456 0.3399 0.4333 0.4078 0.4825
POS 100 F G P K D S E D S Y 0.3992 0.4651 0.4149 0.3578 0.3005 0.2820 0.1878 0.2483 0.2385 0.2385
POS 110 D I I F D L H N T T 0.2963 0.3668 0.3630 0.2865 0.2963 0.2209 0.2122 0.1292 0.0789 0.0441
POS 120 S N M G C T L I L E 0.0771 0.0771 0.1117 0.1635 0.2333 0.2209 0.1878 0.1292 0.0771 0.0858
POS 130 D S R N N F L I Q M 0.0660 0.0336 0.0316 0.0226 0.0102 0.0179 0.0179 0.0327 0.0279 0.0363
POS 140 F H Y I K T S L A P 0.0387 0.0173 0.0218 0.0128 0.0078 0.0044 0.0055 0.0059 0.0055 0.0055
POS 150 L P C Y V Y L I E H 0.0070 0.0167 0.0194 0.0200 0.0387 0.0212 0.0160 0.0157 0.0327 0.0455
POS 160 P S L K Y A T T R S 0.0387 0.0414 0.0279 0.0441 0.0455 0.0884 0.0991 0.1602 0.1635 0.1602
POS 170 I A K Y P V G I E V 0.1088 0.0965 0.1150 0.1878 0.2167 0.3146 0.3535 0.2865 0.2080 0.2080
POS 180 G P Q P Q G V L R A 0.1766 0.2748 0.2333 0.1667 0.2531 0.2385 0.2748 0.2748 0.3630 0.3184
POS 190 D I L D Q M R K M I 0.3146 0.3096 0.2748 0.2292 0.1292 0.1292 0.0744 0.0701 0.1150 0.1117
POS 200 K H A L D F I H H F 0.0723 0.0701 0.1150 0.1732 0.1602 0.1602 0.1205 0.1456 0.1766 0.1532
POS 210 N E G K E F P P C A 0.1958 0.1322 0.1416 0.1178 0.1205 0.1088 0.1205 0.1240 0.1380 0.1322
POS 220 I E V Y K I I E K V 0.1566 0.1698 0.1060 0.1266 0.1178 0.1240 0.2080 0.1766 0.1566 0.2292
POS 230 D Y P R D E N G E I 0.1878 0.2432 0.2041 0.2041 0.2122 0.2122 0.3225 0.3992 0.3005 0.3359
POS 240 A A I I H P N L Q D 0.4149 0.4245 0.5173 0.4078 0.4116 0.4245 0.3359 0.3263 0.3578 0.3399
POS 250 Q D W K P L H P G D 0.4282 0.4825 0.4703 0.4651 0.4600 0.4600 0.3399 0.3578 0.4333 0.4078
POS 260 P M F L T L D G K T 0.3885 0.2913 0.3053 0.3096 0.3184 0.2820 0.3630 0.3005 0.2657 0.1998
POS 270 I P L G G D C T V Y 0.1205 0.1205 0.1041 0.1018 0.1117 0.1150 0.1844 0.1380 0.1117 0.0701
POS 280 P V F V N E A A Y Y 0.0425 0.0744 0.0567 0.0909 0.0965 0.0744 0.0387 0.0464 0.0441 0.0607
POS 290 E K K E A F A K T T 0.0991 0.0771 0.0660 0.1041 0.0935 0.0935 0.0660 0.0832 0.1041 0.1041
POS 300 K L T L N A K S I R 0.1698 0.1041 0.0660 0.0441 0.0405 0.0701 0.1602 0.2333 0.2865 0.3456
POS 310 C C L H 0.4037 0.4556 0.5802 0.6334
Structured Regions
IUPRED predicts one structured region comprised of the whole input sequence.
Results
IUPRED predicts no significant disorder in Aspartoacyclase. The disorder tendency stays below 0.5 almost everywhere; the exceptions, all in short disorder mode, are a few residues at each end of the sequence and residue 242, which barely exceeds the cutoff (0.5173). These are negligible, and the structured regions mode predicts one continuous structured region spanning the whole protein sequence. This makes sense in light of the 3D structure: Aspartoacyclase is a rather densely packed globular protein, which under the assumptions that IUPRED makes has a strong tendency to form many stabilizing inter-residue contacts, markedly reducing the tendency for disorder.
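The 0.5 cutoff used in this assessment can be applied mechanically to the per-residue scores. The following is a minimal sketch and not part of IUPRED itself; the function name `disordered_segments` and its `offset` parameter are our own illustrative choices, and the two score lists are the first eleven and last four short-disorder values listed above.

```python
from typing import List, Tuple


def disordered_segments(scores: List[float], threshold: float = 0.5,
                        offset: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) residue ranges whose score exceeds the threshold.

    `offset` is the residue number of the first score in the list.
    """
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i  # a new stretch above the cutoff begins here
        elif start is not None:
            segments.append((start + offset, i - 1 + offset))
            start = None
    if start is not None:  # stretch runs to the end of the list
        segments.append((start + offset, len(scores) - 1 + offset))
    return segments


# Short-disorder scores copied from the output above.
n_term = [0.8886, 0.7772, 0.7418, 0.6984, 0.5992, 0.5296,
          0.4149, 0.2748, 0.2333, 0.1921, 0.1566]   # residues 1-11
c_term = [0.4037, 0.4556, 0.5802, 0.6334]           # residues 310-313

print(disordered_segments(n_term))
print(disordered_segments(c_term, offset=310))
```

Running the same function over the full score list would also report any isolated residue above the cutoff in the interior of the sequence.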
Meta-Disorder
Meta-Disorder, as the name implies, employs a set of so-called orthogonal disorder predictors in order to combine their strengths and mitigate their weak points. It was published in 2009 by Avner Schlessinger, Marco Punta, Guy Yachdav, Laszlo Kajan and Burkhard Rost.
Reference: Paper
As with the previous methods, Meta-Disorder predicts disorder from the amino acid sequence alone; results from the predictors IUPRED, DISOPRED, NORSnet and Ucon are molded into one final result using a neural network.
Results for Aspartoacyclase:
Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st 1 M 0.33 - 0.99 D 0.17 - 0.551 1 D 2 T 0.26 - 0.78 D 0.25 - 0.531 0 D 3 S 0.16 - 0.72 D 0.35 - 0.535 0 D 4 C 0.23 - 0.65 D 0.33 - 0.505 0 - 5 H 0.20 - 0.48 D 0.25 - 0.475 1 - 6 I 0.16 - 0.55 D 0.30 - 0.465 1 - 7 A 0.34 - 0.56 D 0.40 - 0.444 2 - 8 E 0.28 - 0.67 D 0.30 - 0.424 3 - 9 E 0.21 - 0.73 D 0.38 - 0.404 3 - 10 H 0.15 - 0.70 D 0.30 - 0.374 4 - 11 I 0.15 - 0.59 D 0.29 - 0.354 5 - 12 Q 0.15 - 0.60 D 0.28 - 0.313 6 - 13 K 0.14 - 0.51 D 0.23 - 0.263 8 - 14 V 0.14 - 0.30 - 0.19 - 0.253 8 - 15 A 0.16 - 0.24 - 0.19 - 0.250 9 - 16 I 0.13 - 0.20 - 0.24 - 0.242 9 - 17 F 0.10 - 0.13 - 0.23 - 0.250 9 - 18 G 0.13 - 0.18 - 0.21 - 0.242 9 - 19 G 0.10 - 0.24 - 0.20 - 0.253 8 - 20 T 0.07 - 0.34 - 0.20 - 0.253 8 - 21 H 0.06 - 0.26 - 0.26 - 0.260 8 - 22 G 0.06 - 0.39 - 0.29 - 0.253 8 - 23 N 0.06 - 0.48 D 0.22 - 0.250 9 - 24 E 0.06 - 0.47 D 0.18 - 0.242 9 - 25 L 0.11 - 0.43 - 0.16 - 0.242 9 - 26 T 0.12 - 0.39 - 0.20 - 0.253 8 - 27 G 0.10 - 0.32 - 0.20 - 0.242 9 - 28 V 0.08 - 0.28 - 0.15 - 0.242 9 - 29 F 0.12 - 0.35 - 0.13 - 0.242 9 - 30 L 0.14 - 0.28 - 0.15 - 0.242 9 - 31 V 0.09 - 0.30 - 0.16 - 0.253 8 - 32 K 0.07 - 0.40 - 0.16 - 0.263 8 - 33 H 0.06 - 0.40 - 0.18 - 0.293 7 - 34 W 0.08 - 0.38 - 0.29 - 0.273 8 - 35 L 0.09 - 0.45 - 0.30 - 0.283 7 - 36 E 0.09 - 0.56 D 0.41 - 0.313 6 - 37 N 0.12 - 0.62 D 0.32 - 0.313 6 - 38 G 0.16 - 0.62 D 0.35 - 0.330 6 - 39 A 0.11 - 0.64 D 0.46 - 0.313 6 - 40 E 0.10 - 0.66 D 0.47 - 0.323 6 - 41 I 0.09 - 0.65 D 0.47 - 0.323 6 - 42 Q 0.10 - 0.64 D 0.36 - 0.293 7 - 43 R 0.09 - 0.61 D 0.50 - 0.273 8 - 44 T 0.08 - 0.61 D 0.56 - 0.273 8 - 45 G 0.08 - 0.53 D 0.34 - 0.263 8 - 46 L 0.09 - 0.43 - 0.35 - 0.260 8 - 47 E 0.10 - 0.33 - 0.32 - 0.253 8 - 48 V 0.07 - 0.23 - 0.32 - 0.250 9 - 49 K 0.06 - 0.17 - 0.34 - 0.253 8 - 50 P 0.08 - 0.18 - 0.37 - 0.263 8 - 51 F 0.08 - 0.17 - 0.49 - 0.273 8 - 52 I 0.07 - 0.21 - 0.33 - 0.273 8 - 53 T 0.06 - 0.28 - 0.53 - 0.303 7 - 54 N 0.07 - 0.28 - 
0.53 - 0.303 7 - 55 P 0.09 - 0.36 - 0.37 - 0.313 6 - 56 R 0.08 - 0.41 - 0.51 - 0.313 6 - 57 A 0.10 - 0.40 - 0.66 D 0.280 7 - 58 V 0.13 - 0.40 - 0.51 - 0.263 8 - 59 K 0.16 - 0.48 D 0.37 - 0.263 8 - 60 K 0.19 - 0.47 D 0.40 - 0.263 8 - 61 C 0.18 - 0.47 D 0.29 - 0.253 8 - 62 T 0.16 - 0.55 D 0.35 - 0.263 8 - 63 R 0.18 - 0.51 D 0.31 - 0.253 8 - 64 Y 0.22 - 0.47 D 0.25 - 0.273 8 - 65 I 0.23 - 0.47 D 0.20 - 0.260 8 - 66 D 0.23 - 0.56 D 0.21 - 0.263 8 - 67 C 0.25 - 0.57 D 0.16 - 0.263 8 - 68 D 0.30 - 0.43 - 0.18 - 0.263 8 - 69 L 0.29 - 0.40 - 0.18 - 0.260 8 - 70 N 0.28 - 0.40 - 0.25 - 0.263 8 - 71 R 0.40 - 0.39 - 0.23 - 0.273 8 - 72 I 0.46 - 0.43 - 0.22 - 0.280 7 - 73 F 0.46 - 0.37 - 0.19 - 0.273 8 - 74 D 0.37 - 0.46 - 0.32 - 0.310 6 - 75 L 0.33 - 0.57 D 0.40 - 0.390 4 - 76 E 0.36 - 0.61 D 0.30 - 0.444 2 - 77 N 0.44 - 0.62 D 0.41 - 0.465 1 - 78 L 0.38 - 0.66 D 0.65 D 0.531 0 D 79 G 0.30 - 0.70 D 0.64 D 0.485 1 - 80 K 0.35 - 0.69 D 0.64 D 0.515 0 - 81 K 0.23 - 0.69 D 0.59 D 0.475 1 - 82 M 0.23 - 0.66 D 0.42 - 0.444 2 - 83 S 0.28 - 0.69 D 0.64 D 0.449 2 - 84 E 0.34 - 0.72 D 0.56 - 0.485 1 - 85 D 0.29 - 0.74 D 0.45 - 0.424 3 - 86 L 0.20 - 0.64 D 0.35 - 0.424 3 - 87 P 0.20 - 0.64 D 0.45 - 0.404 3 - 88 Y 0.17 - 0.55 D 0.46 - 0.384 4 - 89 E 0.14 - 0.50 D 0.46 - 0.364 5 - 90 V 0.13 - 0.45 - 0.30 - 0.333 6 - 91 R 0.12 - 0.43 - 0.43 - 0.320 6 - 92 R 0.11 - 0.40 - 0.36 - 0.293 7 - 93 A 0.11 - 0.34 - 0.36 - 0.283 7 - 94 Q 0.10 - 0.45 - 0.22 - 0.290 7 - 95 E 0.12 - 0.41 - 0.25 - 0.303 7 - 96 I 0.09 - 0.34 - 0.26 - 0.283 7 - 97 N 0.11 - 0.40 - 0.33 - 0.313 6 - 98 H 0.10 - 0.49 D 0.39 - 0.313 6 - 99 L 0.10 - 0.47 D 0.38 - 0.313 6 - 100 F 0.13 - 0.47 D 0.38 - 0.293 7 - 101 G 0.14 - 0.54 D 0.58 D 0.323 6 - 102 P 0.13 - 0.61 D 0.58 D 0.333 6 - 103 K 0.13 - 0.60 D 0.47 - 0.323 6 - 104 D 0.11 - 0.61 D 0.71 D 0.323 6 - 105 S 0.10 - 0.65 D 0.73 D 0.283 7 - 106 E 0.10 - 0.70 D 0.62 D 0.283 7 - 107 D 0.12 - 0.70 D 0.42 - 0.273 8 - 108 S 0.11 - 0.64 D 0.37 - 0.270 8 - 109 Y 0.12 - 0.50 D 0.23 - 
0.253 8 - 110 D 0.13 - 0.39 - 0.20 - 0.242 9 - 111 I 0.16 - 0.29 - 0.18 - 0.240 9 - 112 I 0.15 - 0.20 - 0.16 - 0.240 9 - 113 F 0.14 - 0.20 - 0.16 - 0.240 9 - 114 D 0.17 - 0.21 - 0.20 - 0.242 9 - 115 L 0.21 - 0.20 - 0.19 - 0.253 8 - 116 H 0.17 - 0.28 - 0.19 - 0.273 8 - 117 N 0.11 - 0.48 D 0.23 - 0.283 7 - 118 T 0.13 - 0.39 - 0.24 - 0.283 7 - 119 T 0.13 - 0.41 - 0.21 - 0.273 8 - 120 S 0.15 - 0.46 - 0.21 - 0.273 8 - 121 N 0.22 - 0.54 D 0.18 - 0.263 8 - 122 M 0.25 - 0.51 D 0.14 - 0.260 8 - 123 G 0.30 - 0.51 D 0.16 - 0.253 8 - 124 C 0.26 - 0.42 - 0.18 - 0.250 9 - 125 T 0.29 - 0.40 - 0.18 - 0.242 9 - 126 L 0.24 - 0.34 - 0.18 - 0.253 8 - 127 I 0.17 - 0.28 - 0.23 - 0.260 8 - 128 L 0.13 - 0.28 - 0.25 - 0.263 8 - 129 E 0.14 - 0.41 - 0.24 - 0.253 8 - 130 D 0.14 - 0.54 D 0.18 - 0.253 8 - 131 S 0.10 - 0.59 D 0.19 - 0.273 8 - 132 R 0.07 - 0.68 D 0.27 - 0.280 7 - 133 N 0.05 - 0.64 D 0.28 - 0.273 8 - 134 N 0.06 - 0.61 D 0.18 - 0.273 8 - 135 F 0.07 - 0.53 D 0.15 - 0.260 8 - 136 L 0.08 - 0.47 D 0.13 - 0.242 9 - 137 I 0.10 - 0.47 D 0.13 - 0.242 9 - 138 Q 0.16 - 0.42 - 0.13 - 0.242 9 - 139 M 0.15 - 0.34 - 0.13 - 0.242 9 - 140 F 0.11 - 0.32 - 0.14 - 0.250 9 - 141 H 0.13 - 0.41 - 0.14 - 0.263 8 - 142 Y 0.16 - 0.36 - 0.16 - 0.263 8 - 143 I 0.12 - 0.34 - 0.16 - 0.263 8 - 144 K 0.11 - 0.46 - 0.15 - 0.283 7 - 145 T 0.07 - 0.54 D 0.20 - 0.273 8 - 146 S 0.07 - 0.55 D 0.17 - 0.253 8 - 147 L 0.09 - 0.56 D 0.17 - 0.242 9 - 148 A 0.10 - 0.57 D 0.17 - 0.250 9 - 149 P 0.09 - 0.60 D 0.13 - 0.253 8 - 150 L 0.11 - 0.51 D 0.13 - 0.242 9 - 151 P 0.13 - 0.44 - 0.14 - 0.242 9 - 152 C 0.12 - 0.38 - 0.13 - 0.240 9 - 153 Y 0.13 - 0.31 - 0.13 - 0.240 9 - 154 V 0.18 - 0.28 - 0.13 - 0.242 9 - 155 Y 0.17 - 0.33 - 0.14 - 0.250 9 - 156 L 0.21 - 0.47 D 0.13 - 0.253 8 - 157 I 0.22 - 0.54 D 0.15 - 0.263 8 - 158 E 0.17 - 0.58 D 0.15 - 0.283 7 - 159 H 0.16 - 0.62 D 0.17 - 0.303 7 - 160 P 0.13 - 0.65 D 0.21 - 0.323 6 - 161 S 0.14 - 0.59 D 0.29 - 0.303 7 - 162 L 0.17 - 0.58 D 0.41 - 0.303 7 - 163 K 0.21 - 0.56 D 0.36 - 
0.293 7 - 164 Y 0.32 - 0.51 D 0.29 - 0.273 8 - 165 A 0.31 - 0.47 D 0.26 - 0.273 8 - 166 T 0.28 - 0.45 - 0.32 - 0.273 8 - 167 T 0.22 - 0.41 - 0.33 - 0.273 8 - 168 R 0.15 - 0.47 D 0.26 - 0.273 8 - 169 S 0.14 - 0.47 D 0.28 - 0.280 7 - 170 I 0.12 - 0.46 - 0.29 - 0.273 8 - 171 A 0.12 - 0.47 D 0.22 - 0.283 7 - 172 K 0.11 - 0.58 D 0.27 - 0.290 7 - 173 Y 0.13 - 0.47 D 0.20 - 0.263 8 - 174 P 0.11 - 0.38 - 0.19 - 0.253 8 - 175 V 0.10 - 0.26 - 0.26 - 0.250 9 - 176 G 0.09 - 0.24 - 0.31 - 0.250 9 - 177 I 0.13 - 0.33 - 0.25 - 0.250 9 - 178 E 0.20 - 0.28 - 0.37 - 0.253 8 - 179 V 0.26 - 0.41 - 0.33 - 0.253 8 - 180 G 0.20 - 0.45 - 0.33 - 0.270 8 - 181 P 0.17 - 0.59 D 0.25 - 0.283 7 - 182 Q 0.12 - 0.49 D 0.35 - 0.283 7 - 183 P 0.12 - 0.51 D 0.28 - 0.273 8 - 184 Q 0.13 - 0.54 D 0.42 - 0.273 8 - 185 G 0.10 - 0.51 D 0.33 - 0.263 8 - 186 V 0.12 - 0.55 D 0.22 - 0.253 8 - 187 L 0.17 - 0.54 D 0.24 - 0.253 8 - 188 R 0.14 - 0.48 D 0.24 - 0.263 8 - 189 A 0.11 - 0.53 D 0.18 - 0.273 8 - 190 D 0.11 - 0.51 D 0.19 - 0.273 8 - 191 I 0.08 - 0.41 - 0.31 - 0.283 7 - 192 L 0.07 - 0.42 - 0.33 - 0.263 8 - 193 D 0.06 - 0.47 D 0.24 - 0.273 8 - 194 Q 0.08 - 0.45 - 0.33 - 0.270 8 - 195 M 0.04 - 0.34 - 0.26 - 0.263 8 - 196 R 0.04 - 0.43 - 0.34 - 0.273 8 - 197 K 0.05 - 0.44 - 0.34 - 0.263 8 - 198 M 0.06 - 0.29 - 0.34 - 0.263 8 - 199 I 0.06 - 0.28 - 0.22 - 0.253 8 - 200 K 0.07 - 0.34 - 0.22 - 0.263 8 - 201 H 0.07 - 0.32 - 0.20 - 0.253 8 - 202 A 0.08 - 0.28 - 0.15 - 0.250 9 - 203 L 0.08 - 0.35 - 0.15 - 0.253 8 - 204 D 0.09 - 0.43 - 0.19 - 0.263 8 - 205 F 0.11 - 0.41 - 0.18 - 0.263 8 - 206 I 0.12 - 0.45 - 0.18 - 0.253 8 - 207 H 0.15 - 0.59 D 0.23 - 0.270 8 - 208 H 0.18 - 0.59 D 0.40 - 0.290 7 - 209 F 0.22 - 0.58 D 0.24 - 0.283 7 - 210 N 0.27 - 0.63 D 0.37 - 0.293 7 - 211 E 0.27 - 0.66 D 0.53 - 0.313 6 - 212 G 0.28 - 0.68 D 0.44 - 0.313 6 - 213 K 0.26 - 0.70 D 0.46 - 0.323 6 - 214 E 0.26 - 0.71 D 0.50 - 0.323 6 - 215 F 0.20 - 0.70 D 0.56 - 0.303 7 - 216 P 0.21 - 0.69 D 0.37 - 0.293 7 - 217 P 0.24 - 0.69 D 0.28 - 
0.280 7 - 218 C 0.14 - 0.66 D 0.28 - 0.263 8 - 219 A 0.14 - 0.58 D 0.19 - 0.263 8 - 220 I 0.15 - 0.52 D 0.19 - 0.263 8 - 221 E 0.11 - 0.47 D 0.22 - 0.270 8 - 222 V 0.11 - 0.34 - 0.26 - 0.273 8 - 223 Y 0.12 - 0.30 - 0.28 - 0.280 7 - 224 K 0.08 - 0.37 - 0.34 - 0.280 7 - 225 I 0.09 - 0.32 - 0.33 - 0.273 8 - 226 I 0.07 - 0.35 - 0.29 - 0.283 7 - 227 E 0.09 - 0.43 - 0.38 - 0.313 6 - 228 K 0.09 - 0.49 D 0.61 D 0.333 6 - 229 V 0.12 - 0.49 D 0.58 D 0.337 6 - 230 D 0.16 - 0.53 D 0.75 D 0.354 5 - 231 Y 0.14 - 0.52 D 0.84 D 0.343 5 - 232 P 0.12 - 0.57 D 0.84 D 0.313 6 - 233 R 0.13 - 0.66 D 0.59 D 0.303 7 - 234 D 0.15 - 0.69 D 0.70 D 0.310 6 - 235 E 0.10 - 0.71 D 0.59 D 0.293 7 - 236 N 0.12 - 0.71 D 0.62 D 0.303 7 - 237 G 0.17 - 0.67 D 0.44 - 0.293 7 - 238 E 0.22 - 0.60 D 0.40 - 0.283 7 - 239 I 0.17 - 0.53 D 0.36 - 0.270 8 - 240 A 0.16 - 0.38 - 0.30 - 0.260 8 - 241 A 0.19 - 0.29 - 0.22 - 0.253 8 - 242 I 0.16 - 0.28 - 0.24 - 0.263 8 - 243 I 0.22 - 0.33 - 0.24 - 0.263 8 - 244 H 0.25 - 0.34 - 0.34 - 0.293 7 - 245 P 0.14 - 0.48 D 0.41 - 0.323 6 - 246 N 0.16 - 0.53 D 0.30 - 0.343 5 - 247 L 0.16 - 0.58 D 0.53 - 0.343 5 - 248 Q 0.16 - 0.61 D 0.71 D 0.374 4 - 249 D 0.22 - 0.64 D 0.59 D 0.354 5 - 250 Q 0.30 - 0.64 D 0.51 - 0.364 5 - 251 D 0.34 - 0.62 D 0.52 - 0.333 6 - 252 W 0.33 - 0.52 D 0.65 D 0.313 6 - 253 K 0.22 - 0.58 D 0.68 D 0.283 7 - 254 P 0.21 - 0.58 D 0.63 D 0.283 7 - 255 L 0.18 - 0.54 D 0.45 - 0.263 8 - 256 H 0.16 - 0.68 D 0.27 - 0.263 8 - 257 P 0.18 - 0.69 D 0.28 - 0.273 8 - 258 G 0.19 - 0.57 D 0.21 - 0.270 8 - 259 D 0.28 - 0.54 D 0.16 - 0.290 7 - 260 P 0.25 - 0.54 D 0.23 - 0.270 8 - 261 M 0.19 - 0.40 - 0.24 - 0.273 8 - 262 F 0.16 - 0.34 - 0.29 - 0.253 8 - 263 L 0.13 - 0.37 - 0.30 - 0.253 8 - 264 T 0.10 - 0.46 - 0.20 - 0.242 9 - 265 L 0.14 - 0.56 D 0.20 - 0.253 8 - 266 D 0.13 - 0.61 D 0.20 - 0.263 8 - 267 G 0.11 - 0.62 D 0.26 - 0.280 7 - 268 K 0.10 - 0.60 D 0.34 - 0.283 7 - 269 T 0.10 - 0.60 D 0.40 - 0.273 8 - 270 I 0.08 - 0.46 - 0.41 - 0.250 9 - 271 P 0.07 - 0.43 - 0.35 - 
0.250 9 - 272 L 0.12 - 0.46 - 0.29 - 0.242 9 - 273 G 0.10 - 0.53 D 0.18 - 0.242 9 - 274 G 0.08 - 0.52 D 0.19 - 0.250 9 - 275 D 0.05 - 0.62 D 0.19 - 0.253 8 - 276 C 0.06 - 0.68 D 0.15 - 0.263 8 - 277 T 0.10 - 0.59 D 0.16 - 0.263 8 - 278 V 0.08 - 0.52 D 0.17 - 0.263 8 - 279 Y 0.09 - 0.33 - 0.20 - 0.242 9 - 280 P 0.10 - 0.27 - 0.17 - 0.242 9 - 281 V 0.12 - 0.23 - 0.21 - 0.242 9 - 282 F 0.12 - 0.18 - 0.17 - 0.253 8 - 283 V 0.09 - 0.24 - 0.16 - 0.263 8 - 284 N 0.05 - 0.28 - 0.23 - 0.303 7 - 285 E 0.06 - 0.39 - 0.28 - 0.384 4 - 286 A 0.09 - 0.35 - 0.46 - 0.404 3 - 287 A 0.08 - 0.43 - 0.72 D 0.418 3 - 288 Y 0.08 - 0.37 - 0.79 D 0.374 4 - 289 Y 0.09 - 0.55 D 0.61 D 0.354 5 - 290 E 0.10 - 0.41 - 0.49 - 0.333 6 - 291 K 0.10 - 0.50 D 0.65 D 0.323 6 - 292 K 0.07 - 0.47 D 0.66 D 0.323 6 - 293 E 0.07 - 0.36 - 0.79 D 0.333 6 - 294 A 0.06 - 0.29 - 0.95 D 0.354 5 - 295 F 0.08 - 0.27 - 0.82 D 0.333 6 - 296 A 0.09 - 0.32 - 0.70 D 0.323 6 - 297 K 0.09 - 0.42 - 0.41 - 0.343 5 - 298 T 0.08 - 0.42 - 0.36 - 0.343 5 - 299 T 0.10 - 0.51 D 0.36 - 0.414 3 - 300 K 0.09 - 0.54 D 0.64 D 0.455 2 - 301 L 0.09 - 0.48 D 0.70 D 0.394 4 - 302 T 0.15 - 0.55 D 0.43 - 0.404 3 - 303 L 0.15 - 0.48 D 0.46 - 0.374 4 - 304 N 0.12 - 0.48 D 0.34 - 0.374 4 - 305 A 0.13 - 0.47 D 0.19 - 0.374 4 - 306 K 0.22 - 0.55 D 0.19 - 0.384 4 - 307 S 0.07 - 0.53 D 0.18 - 0.434 2 - 308 I 0.07 - 0.41 - 0.23 - 0.424 3 - 309 R 0.07 - 0.37 - 0.21 - 0.414 3 - 310 C 0.11 - 0.60 D 0.19 - 0.414 3 - 311 C 0.14 - 0.67 D 0.14 - 0.394 4 - 312 L 0.15 - 0.72 D 0.13 - 0.424 3 - 313 H 0.44 - 0.80 D 0.13 - 0.485 1 - Key for output ---------------- Number - residue number Residue - amino-acid type NORSnet - raw score by NORSnet (prediction of unstructured loops) NORS2st - two-state prediction by NORSnet; D=disordered PROFbval - raw score by PROFbval (prediction of residue flexibility from sequence) Bval2st - two-state prediction by PROFbval Ucon - raw score by Ucon (prediction of protein disorder using predicted internal contacts) Ucon2st - 
two-state prediction by Ucon MD - raw score by MD (prediction of protein disorder using orthogonal sources) MD_rel - reliability of the prediction by MD; values range from 0-9. 9=strong prediction MD2st - two-state prediction by MD
The last column indicates whether disorder was predicted at the given position. Meta-Disorder predicts a total of four disordered positions (residues 1-3 and 78), which is not significant. This coincides with the predictions of the other programs employed previously, which is not altogether surprising, since Meta-Disorder draws its predictions from two of them.
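Counting the disordered positions by hand is error-prone for 313 rows. As a hedged sketch, assuming each residue row has the eleven whitespace-separated fields named in the key above, the last column can be read programmatically; `md_disordered_positions` is a hypothetical helper name, and the sample rows are copied from the output above.

```python
def md_disordered_positions(rows):
    """Return residue numbers whose MD2st column (last field) is 'D'."""
    positions = []
    for row in rows:
        fields = row.split()
        if len(fields) != 11:
            continue  # skip header, key, or malformed lines
        if fields[10] == "D":
            positions.append(int(fields[0]))
    return positions


# A few rows copied from the Meta-Disorder output above.
sample_rows = [
    "1 M 0.33 - 0.99 D 0.17 - 0.551 1 D",
    "2 T 0.26 - 0.78 D 0.25 - 0.531 0 D",
    "3 S 0.16 - 0.72 D 0.35 - 0.535 0 D",
    "4 C 0.23 - 0.65 D 0.33 - 0.505 0 -",
    "78 L 0.38 - 0.66 D 0.65 D 0.531 0 D",
    "80 K 0.35 - 0.69 D 0.64 D 0.515 0 -",
]

print(md_disordered_positions(sample_rows))
```

The sketch reads the two-state column as given and does not try to re-derive it from MD_raw.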
Prediction of transmembrane alpha-helices and signal peptides
The results of this task are unequivocal: Aspartoacyclase does not contain any transmembrane regions. From a biological point of view this was to be expected, as Aspartoacyclase is known to be located in the cytosol.
TMHMM
Since the VM version could not be made to work, we used the server at http://www.cbs.dtu.dk/services/TMHMM/.
TMHMM uses a hidden Markov model to predict transmembrane helices in proteins. It was published in 1998 by E. L. L. Sonnhammer, G. von Heijne, and A. Krogh.
Reference: Original paper
The hidden Markov model used by TMHMM models the biological structure with states for the helix core, the helix caps and the loops on either side of the membrane; these states are specially designed to model membrane insertion as well. The HMM probabilities were estimated using both a maximum likelihood method and a discriminative method.
Results for Aspartoacyclase clearly show the absence of any transmembrane structure, which is biologically sound.
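To illustrate how a hidden Markov model assigns a state to each residue, here is a toy Viterbi decoder. It is only a sketch under strong assumptions: it has two states instead of TMHMM's much larger architecture, it reduces residues to a crude hydrophobic/polar alphabet, and all probabilities are made-up values chosen for illustration, not TMHMM's trained parameters.

```python
import math

# Assumption: a crude hydrophobic alphabet; every other residue counts as polar.
HYDROPHOBIC = set("AILMFVWC")

STATES = ["loop", "helix"]
# Illustrative (invented) parameters, not taken from TMHMM.
START = {"loop": 0.9, "helix": 0.1}
TRANS = {"loop": {"loop": 0.9, "helix": 0.1},
         "helix": {"loop": 0.1, "helix": 0.9}}
EMIT = {"loop": {"h": 0.3, "p": 0.7},
        "helix": {"h": 0.8, "p": 0.2}}


def viterbi(seq):
    """Return the most probable state path for an amino acid sequence."""
    if not seq:
        return []
    obs = ["h" if aa in HYDROPHOBIC else "p" for aa in seq]
    # score[s]: best log-probability of any path ending in state s
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    back = []  # one dict of back-pointers per step after the first
    for o in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + math.log(TRANS[r][s]))
            score[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


toy_path = viterbi("KKKKKK" + "L" * 20 + "KKKKKK")
print("".join("M" if s == "helix" else "-" for s in toy_path))
```

On this artificial sequence the hydrophobic stretch is decoded as a helix and the flanking lysines as loops. Real predictors add states for helix caps and distinguish the two sides of the membrane.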
# sp_P45381_ACY2_HUMAN Length: 313
# sp_P45381_ACY2_HUMAN Number of predicted TMHs: 0
# sp_P45381_ACY2_HUMAN Exp number of AAs in TMHs: 0.2005
# sp_P45381_ACY2_HUMAN Exp number, first 60 AAs: 0.01618
# sp_P45381_ACY2_HUMAN Total prob of N-in: 0.03827
sp_P45381_ACY2_HUMAN TMHMM2.0 outside 1 313
http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output
BACR_HALSA
# BACR_HALSA Length: 262
# BACR_HALSA Number of predicted TMHs: 6
# BACR_HALSA Exp number of AAs in TMHs: 140.4032
# BACR_HALSA Exp number, first 60 AAs: 26.1196
# BACR_HALSA Total prob of N-in: 0.01887
# BACR_HALSA POSSIBLE N-term signal sequence
BACR_HALSA TMHMM2.0 outside 1 22
BACR_HALSA TMHMM2.0 TMhelix 23 42
BACR_HALSA TMHMM2.0 inside 43 54
BACR_HALSA TMHMM2.0 TMhelix 55 77
BACR_HALSA TMHMM2.0 outside 78 91
BACR_HALSA TMHMM2.0 TMhelix 92 114
BACR_HALSA TMHMM2.0 inside 115 120
BACR_HALSA TMHMM2.0 TMhelix 121 143
BACR_HALSA TMHMM2.0 outside 144 147
BACR_HALSA TMHMM2.0 TMhelix 148 170
BACR_HALSA TMHMM2.0 inside 171 189
BACR_HALSA TMHMM2.0 TMhelix 190 212
BACR_HALSA TMHMM2.0 outside 213 262
RET4_HUMAN
# RET4_HUMAN Length: 201
# RET4_HUMAN Number of predicted TMHs: 0
# RET4_HUMAN Exp number of AAs in TMHs: 0.01196
# RET4_HUMAN Exp number, first 60 AAs: 0.01179
# RET4_HUMAN Total prob of N-in: 0.01909
RET4_HUMAN TMHMM2.0 outside 1 201
INSL5_HUMAN
# INSL5_HUMAN Length: 135
# INSL5_HUMAN Number of predicted TMHs: 0
# INSL5_HUMAN Exp number of AAs in TMHs: 0.50415
# INSL5_HUMAN Exp number, first 60 AAs: 0.50415
# INSL5_HUMAN Total prob of N-in: 0.03772
INSL5_HUMAN TMHMM2.0 outside 1 135
LAMP1_HUMAN
# LAMP1_HUMAN Length: 417
# LAMP1_HUMAN Number of predicted TMHs: 2
# LAMP1_HUMAN Exp number of AAs in TMHs: 44.89582
# LAMP1_HUMAN Exp number, first 60 AAs: 22.24286
# LAMP1_HUMAN Total prob of N-in: 0.99287
# LAMP1_HUMAN POSSIBLE N-term signal sequence
LAMP1_HUMAN TMHMM2.0 inside 1 10
LAMP1_HUMAN TMHMM2.0 TMhelix 11 33
LAMP1_HUMAN TMHMM2.0 outside 34 383
LAMP1_HUMAN TMHMM2.0 TMhelix 384 406
LAMP1_HUMAN TMHMM2.0 inside 407 417
A4_HUMAN
# A4_HUMAN Length: 770
# A4_HUMAN Number of predicted TMHs: 1
# A4_HUMAN Exp number of AAs in TMHs: 22.72525
# A4_HUMAN Exp number, first 60 AAs: 0.0027
# A4_HUMAN Total prob of N-in: 0.00015
A4_HUMAN TMHMM2.0 outside 1 700
A4_HUMAN TMHMM2.0 TMhelix 701 723
A4_HUMAN TMHMM2.0 inside 724 770
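The TMHMM region lines all follow the pattern "ID TMHMM2.0 region start end", so they are easy to collect for later comparison with other tools. The following is a minimal sketch assuming one record per line; `parse_tmhmm` is a hypothetical helper name, and the sample text is the LAMP1_HUMAN output shown above.

```python
def parse_tmhmm(text):
    """Collect (region, start, end) tuples per sequence ID from TMHMM 2.0 output."""
    regions = {}
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # summary/comment lines carry no region coordinates
        fields = line.split()
        # Region lines look like: <id> TMHMM2.0 <region> <start> <end>
        if len(fields) == 5 and fields[1] == "TMHMM2.0":
            regions.setdefault(fields[0], []).append(
                (fields[2], int(fields[3]), int(fields[4])))
    return regions


lamp1_output = """\
# LAMP1_HUMAN Length: 417
# LAMP1_HUMAN Number of predicted TMHs: 2
LAMP1_HUMAN TMHMM2.0 inside 1 10
LAMP1_HUMAN TMHMM2.0 TMhelix 11 33
LAMP1_HUMAN TMHMM2.0 outside 34 383
LAMP1_HUMAN TMHMM2.0 TMhelix 384 406
LAMP1_HUMAN TMHMM2.0 inside 407 417
"""

lamp1_regions = parse_tmhmm(lamp1_output)["LAMP1_HUMAN"]
lamp1_helices = [(s, e) for kind, s, e in lamp1_regions if kind == "TMhelix"]
print(lamp1_helices)
```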
Phobius & PolyPhobius
Phobius is a program for the prediction of transmembrane regions, with special emphasis on reducing confusion with signal peptides. It was published in 2005 by Käll L, Krogh A, Sonnhammer EL.
Reference: Paper
Signal peptides and transmembrane helices share a great deal of similarity and are often confused by predictors for either class; Phobius aims to predict both and to discriminate between them. It employs a hidden Markov model to do this, modelling the different sequence regions pertaining to either class.
Input: An amino acid sequence.
Again, neither signal nor transmembrane regions were detected in Aspartoacyclase.
BACR_HALSA
ID
FT TOPO_DOM 1 22 NON CYTOPLASMIC.
FT TRANSMEM 23 42
FT TOPO_DOM 43 53 CYTOPLASMIC.
FT TRANSMEM 54 76
FT TOPO_DOM 77 95 NON CYTOPLASMIC.
FT TRANSMEM 96 114
FT TOPO_DOM 115 120 CYTOPLASMIC.
FT TRANSMEM 121 142
FT TOPO_DOM 143 147 NON CYTOPLASMIC.
FT TRANSMEM 148 169
FT TOPO_DOM 170 189 CYTOPLASMIC.
FT TRANSMEM 190 212
FT TOPO_DOM 213 217 NON CYTOPLASMIC.
FT TRANSMEM 218 237
FT TOPO_DOM 238 262 CYTOPLASMIC.
//
With PolyPhobius:
ID BACR_HALSA
FT TOPO_DOM 1 21 NON CYTOPLASMIC.
FT TRANSMEM 22 43
FT TOPO_DOM 44 54 CYTOPLASMIC.
FT TRANSMEM 55 77
FT TOPO_DOM 78 94 NON CYTOPLASMIC.
FT TRANSMEM 95 114
FT TOPO_DOM 115 120 CYTOPLASMIC.
FT TRANSMEM 121 141
FT TOPO_DOM 142 147 NON CYTOPLASMIC.
FT TRANSMEM 148 166
FT TOPO_DOM 167 186 CYTOPLASMIC.
FT TRANSMEM 187 205
FT TOPO_DOM 206 215 NON CYTOPLASMIC.
FT TRANSMEM 216 237
FT TOPO_DOM 238 262 CYTOPLASMIC.
//
RET4_HUMAN
ID RET4_HUMAN
FT SIGNAL 1 18
FT REGION 1 2 N-REGION.
FT REGION 3 13 H-REGION.
FT REGION 14 18 C-REGION.
FT TOPO_DOM 19 201 NON CYTOPLASMIC.
//
With PolyPhobius:
ID RET4_HUMAN
FT SIGNAL 1 18
FT REGION 1 3 N-REGION.
FT REGION 4 13 H-REGION.
FT REGION 14 18 C-REGION.
FT TOPO_DOM 19 201 NON CYTOPLASMIC.
//
INSL5_HUMAN
ID
FT SIGNAL 1 22
FT REGION 1 5 N-REGION.
FT REGION 6 17 H-REGION.
FT REGION 18 22 C-REGION.
FT TOPO_DOM 23 135 NON CYTOPLASMIC.
//
With PolyPhobius:
ID INSL5_HUMAN
FT SIGNAL 1 22
FT REGION 1 4 N-REGION.
FT REGION 5 16 H-REGION.
FT REGION 17 22 C-REGION.
FT TOPO_DOM 23 135 NON CYTOPLASMIC.
//
LAMP1_HUMAN
ID
FT SIGNAL 1 28
FT REGION 1 10 N-REGION.
FT REGION 11 22 H-REGION.
FT REGION 23 28 C-REGION.
FT TOPO_DOM 29 381 NON CYTOPLASMIC.
FT TRANSMEM 382 405
FT TOPO_DOM 406 417 CYTOPLASMIC.
//
With PolyPhobius:
ID LAMP1_HUMAN
FT SIGNAL 1 28
FT REGION 1 9 N-REGION.
FT REGION 10 22 H-REGION.
FT REGION 23 28 C-REGION.
FT TOPO_DOM 29 381 NON CYTOPLASMIC.
FT TRANSMEM 382 405
FT TOPO_DOM 406 417 CYTOPLASMIC.
//
A4_HUMAN
ID A4_HUMAN
FT SIGNAL 1 17
FT REGION 1 1 N-REGION.
FT REGION 2 12 H-REGION.
FT REGION 13 17 C-REGION.
FT TOPO_DOM 18 700 NON CYTOPLASMIC.
FT TRANSMEM 701 723
FT TOPO_DOM 724 770 CYTOPLASMIC.
//
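The Phobius and PolyPhobius records above use feature lines of the form "FT kind start end", which a regular expression can pick out whether the record is printed on one line or several. This is a hedged sketch; `parse_phobius` is our own helper name, and the record is the PolyPhobius output for LAMP1_HUMAN copied from above.

```python
import re

# Matches the four feature kinds that occur in the records above.
FT_PATTERN = re.compile(r"FT\s+(SIGNAL|REGION|TOPO_DOM|TRANSMEM)\s+(\d+)\s+(\d+)")


def parse_phobius(record):
    """Return (kind, start, end) tuples for every FT feature in a record."""
    return [(kind, int(start), int(end))
            for kind, start, end in FT_PATTERN.findall(record)]


lamp1_record = (
    "ID LAMP1_HUMAN FT SIGNAL 1 28 FT REGION 1 9 N-REGION. "
    "FT REGION 10 22 H-REGION. FT REGION 23 28 C-REGION. "
    "FT TOPO_DOM 29 381 NON CYTOPLASMIC. FT TRANSMEM 382 405 "
    "FT TOPO_DOM 406 417 CYTOPLASMIC. //"
)

lamp1_features = parse_phobius(lamp1_record)
lamp1_signal = [(s, e) for kind, s, e in lamp1_features if kind == "SIGNAL"]
lamp1_tm = [(s, e) for kind, s, e in lamp1_features if kind == "TRANSMEM"]
print(lamp1_signal, lamp1_tm)
```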
OCTOPUS & SPOCTOPUS
OCTOPUS uses a combination of hidden Markov models and neural networks to predict transmembrane regions. It was published in 2008 by Håkan Viklund and Arne Elofsson.
Reference: Original paper
OCTOPUS first creates a sequence profile by running BLAST with the input sequence. Neural networks are then used to predict the propensity of each residue to be located in a transmembrane region or in certain structural patterns on either side of the membrane. The resulting propensities are fed to a hidden Markov model, which calculates the most likely topology.
SPOCTOPUS extends OCTOPUS with a preprocessor that uses a neural network to assess the probability that the first 70 residues of the input sequence contain a signal peptide. If this scores high enough, a hidden Markov model is used to determine the exact extent of the signal region.
No transmembrane/signal regions were predicted for Aspartoacyclase.
BACR_HALSA
OCTOPUS predicted topology: oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiii MMMMMMMMMMMMMMMMMMMMMoooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii iiiiiiiiiiiiiiiiiiiiii
SPOCTOPUS predicted topology: oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiM MMMMMMMMMMMMMMMMMMMMooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii iiiiiiiiiiiiiiiiiiiiii
RET4_HUMAN
OCTOPUS predicted topology: iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooo
SPOCTOPUS predicted topology: nnnnSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooo
INSL5_HUMAN
OCTOPUS predicted topology: iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooo
SPOCTOPUS predicted topology: nnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooo
LAMP1_HUMAN
OCTOPUS predicted topology: iiiiiiiiiMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii
SPOCTOPUS predicted topology: nnnnnnnnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii
A4_HUMAN
OCTOPUS predicted topology: ooooRRRRRRoooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
SPOCTOPUS predicted topology: nnnSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
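The per-residue topology strings above are easier to compare with the other tools once they are collapsed into residue ranges. The sketch below does this with a run-length encoding; `topology_segments` is a hypothetical helper, and the example string is rebuilt from the run lengths of the SPOCTOPUS prediction for RET4_HUMAN shown above (4 n, 14 S, 183 o) rather than pasted in full.

```python
from itertools import groupby


def topology_segments(topology):
    """Collapse a per-residue topology string into (label, start, end) runs."""
    topology = "".join(topology.split())  # drop the spaces between 60-residue blocks
    segments = []
    position = 1
    for label, run in groupby(topology):
        length = len(list(run))
        segments.append((label, position, position + length - 1))
        position += length
    return segments


# SPOCTOPUS topology for RET4_HUMAN, rebuilt from its run lengths.
ret4_spoctopus = "n" * 4 + "S" * 14 + "o" * 183

ret4_segments = topology_segments(ret4_spoctopus)
print(ret4_segments)
```

The labels are passed through unchanged, so the same function works for the i, o, M, R, n and S characters that occur in the strings above.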
SignalP
SignalP is a method for the detection of signal peptides. It was first published in 1997 by Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne.
Reference: Original paper, current version
SignalP comes in two flavours: one using a neural network, the other using a hidden Markov model. It can discriminate between cleaved and uncleaved signal peptides and supports both prokaryotic and eukaryotic input.
Input: A protein sequence.
Neither flavour detected any signal sequence in Aspartoacyclase.
Aspartoacyclase: HMM
Aspartoacyclase: Neural Network
BACR_HALSA
Neural Network:
BACR_HALSA length = 70
# Measure  Position  Value  Cutoff  signal peptide?
max. C     16        0.220  0.32    NO
max. Y     39        0.196  0.33    NO
max. S     31        0.970  0.87    YES
mean S     1-38      0.426  0.48    NO
D          1-38      0.311  0.43    NO
# Most likely cleavage site between pos. 38 and 39: GTL-YF
HMM:
Prediction: Signal anchor
Signal peptide probability: 0.017
Signal anchor probability: 0.859
Max cleavage site probability: 0.004 between pos. 15 and 16
RET4_HUMAN
Neural Network:
RET4_HUMAN length = 70
# Measure  Position  Value  Cutoff  signal peptide?
max. C     19        0.929  0.32    YES
max. Y     19        0.901  0.33    YES
max. S     1         0.994  0.87    YES
mean S     1-18      0.938  0.48    YES
D          1-18      0.920  0.43    YES
# Most likely cleavage site between pos. 18 and 19: GRA-ER
HMM:
RET4_HUMAN
Prediction: Signal peptide
Signal peptide probability: 1.000
Signal anchor probability: 0.000
Max cleavage site probability: 0.979 between pos. 18 and 19
INSL5_HUMAN
Neural Network:
INSL5_HUMAN length = 70
# Measure  Position  Value  Cutoff  signal peptide?
max. C     23        0.855  0.32    YES
max. Y     23        0.778  0.33    YES
max. S     13        0.987  0.87    YES
mean S     1-22      0.852  0.48    YES
D          1-22      0.815  0.43    YES
# Most likely cleavage site between pos. 22 and 23: VRS-KE
HMM:
INSL5_HUMAN
Prediction: Signal peptide
Signal peptide probability: 0.999
Signal anchor probability: 0.000
Max cleavage site probability: 0.911 between pos. 22 and 23
LAMP1_HUMAN
Neural Network:
LAMP1_HUMAN length = 70
# Measure  Position  Value  Cutoff  signal peptide?
max. C     29        0.978  0.32    YES
max. Y     29        0.903  0.33    YES
max. S     19        0.999  0.87    YES
mean S     1-28      0.960  0.48    YES
D          1-28      0.932  0.43    YES
# Most likely cleavage site between pos. 28 and 29: ASA-AM
HMM:
LAMP1_HUMAN
Prediction: Signal peptide
Signal peptide probability: 1.000
Signal anchor probability: 0.000
Max cleavage site probability: 0.847 between pos. 28 and 29
A4_HUMAN
Neural Network:
A4_HUMAN length = 70
# Measure  Position  Value  Cutoff  signal peptide?
max. C     18        0.891  0.32    YES
max. Y     18        0.850  0.33    YES
max. S     2         0.992  0.87    YES
mean S     1-17      0.967  0.48    YES
D          1-17      0.909  0.43    YES
# Most likely cleavage site between pos. 17 and 18: ARA-LE
HMM:
A4_HUMAN
Prediction: Signal peptide
Signal peptide probability: 1.000
Signal anchor probability: 0.000
Max cleavage site probability: 0.993 between pos. 17 and 18
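In the neural network outputs above, the D value is the arithmetic mean of the mean S and max. Y values; for RET4_HUMAN, (0.938 + 0.901) / 2 ≈ 0.920. The sketch below reproduces this relationship; the function names are our own, and the default cutoff of 0.43 is simply the D cutoff printed in the outputs above.

```python
def d_score(mean_s, max_y):
    """D score as the average of the mean S and max. Y scores."""
    return (mean_s + max_y) / 2.0


def has_signal_peptide(mean_s, max_y, cutoff=0.43):
    """Two-state call: signal peptide if the D score exceeds the cutoff."""
    return d_score(mean_s, max_y) > cutoff


# (mean S, max. Y, reported D) copied from the outputs above.
reported = {
    "BACR_HALSA": (0.426, 0.196, 0.311),
    "RET4_HUMAN": (0.938, 0.901, 0.920),
    "INSL5_HUMAN": (0.852, 0.778, 0.815),
    "LAMP1_HUMAN": (0.960, 0.903, 0.932),
    "A4_HUMAN": (0.967, 0.850, 0.909),
}

for name, (mean_s, max_y, d_reported) in reported.items():
    print(name, round(d_score(mean_s, max_y), 4), d_reported,
          has_signal_peptide(mean_s, max_y))
```

The recomputed values agree with the reported ones to within the three printed decimals.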
TargetP
TargetP is a tool for predicting the subcellular location of proteins, based on targeting signals in their sequence. It was published in 2000 by Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne.
Reference: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne. J. Mol. Biol., 300: 1005-1016, 2000.
TargetP confines its analysis to the N-terminal part of the sequence. It can discriminate between proteins destined for the mitochondrion, the chloroplast (plants only), the secretory pathway, or another location.
The prediction for Aspartoacyclase was "other location", which is plausible, as the enzyme is known to reside in the cytosol.
### targetp v1.1 prediction results ##################################
Number of query sequences: 1
Cleavage site predictions not included.
Using NON-PLANT networks.
Name                  Len  mTP    SP     other  Loc  RC
----------------------------------------------------------------------
sp_P45381_ACY2_HUMAN  313  0.073  0.109  0.898  _    2
----------------------------------------------------------------------
cutoff                     0.000  0.000  0.000
http://www.cbs.dtu.dk/services/TargetP-1.1/output.php
BACR_HALSA
Number of query sequences: 1
Cleavage site predictions not included.
Using NON-PLANT networks.
Name        Len  mTP    SP     other  Loc  RC
----------------------------------------------------------------------
BACR_HALSA  262  0.019  0.897  0.562  S    4
----------------------------------------------------------------------
cutoff           0.000  0.000  0.000
RET4_HUMAN
Number of query sequences: 1
Cleavage site predictions not included.
Using NON-PLANT networks.
Name        Len  mTP    SP     other  Loc  RC
----------------------------------------------------------------------
RET4_HUMAN  201  0.242  0.928  0.020  S    2
----------------------------------------------------------------------
cutoff           0.000  0.000  0.000
INSL5_HUMAN
Number of query sequences: 1
Cleavage site predictions not included.
Using NON-PLANT networks.
Name         Len  mTP    SP     other  Loc  RC
----------------------------------------------------------------------
INSL5_HUMAN  135  0.074  0.899  0.037  S    1
----------------------------------------------------------------------
cutoff            0.000  0.000  0.000
LAMP1_HUMAN
Number of query sequences: 1
Cleavage site predictions not included.
Using NON-PLANT networks.
Name         Len  mTP    SP     other  Loc  RC
----------------------------------------------------------------------
LAMP1_HUMAN  417  0.043  0.953  0.017  S    1
----------------------------------------------------------------------
cutoff            0.000  0.000  0.000
A4_HUMAN
Number of query sequences: 1
Cleavage site predictions not included.
Using NON-PLANT networks.
Name      Len  mTP    SP     other  Loc  RC
----------------------------------------------------------------------
A4_HUMAN  770  0.035  0.937  0.084  S    1
----------------------------------------------------------------------
cutoff         0.000  0.000  0.000
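The Loc and RC columns in the TargetP tables above can be reproduced from the three scores: with all cutoffs at 0.000, Loc is the highest-scoring class, and the reliability class RC reflects the gap between the two highest scores (1 for a gap above 0.8, down to 5 for a gap below 0.2). The following sketch encodes that rule for the non-plant networks; `targetp_call` is a hypothetical helper name, and it does not model user-defined cutoffs.

```python
def targetp_call(mtp, sp, other):
    """Return (Loc, RC) from non-plant TargetP scores, assuming zero cutoffs."""
    scores = {"M": mtp, "S": sp, "_": other}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    loc = ranked[0][0]
    diff = ranked[0][1] - ranked[1][1]  # gap between best and second-best score
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return loc, rc


# Scores copied from the tables above: (mTP, SP, other).
print(targetp_call(0.073, 0.109, 0.898))  # sp_P45381_ACY2_HUMAN
print(targetp_call(0.019, 0.897, 0.562))  # BACR_HALSA
```

A low reliability such as the RC of 4 for BACR_HALSA signals that the winning class is only narrowly ahead, which is worth keeping in mind when judging the prediction.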
Analysis
BACR_HALSA
TM Prediction: TMHMM misses one helix that the other tools predict (approximately residues 216-237); apart from that, all methods agree on 7 TM helices with only minor differences in the helix boundaries. The PDB structure shows that seven helices is correct.
Signal peptide prediction: In NN mode, SignalP exceeds the cutoff only for the max. S score, while the mean S and D scores stay below theirs; in HMM mode it predicts a signal anchor. According to the information we found on the protein, neither a signal peptide nor a signal anchor is correct.
Target prediction: TargetP predicted this protein to carry a signal peptide and to enter the secretory pathway; this is not correct.
RET4_HUMAN
TM Prediction: Only Octopus predicts a TM helix; this is a mis-identified signal sequence. The other programs predict no TM helices, which is correct.
Signal peptide prediction: Phobius, Spoctopus and SignalP all predict a signal peptide with the cleavage site at position 18. According to UniProt, this is correct.
Target prediction: TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; this is correct.
INSL5_HUMAN
TM Prediction: Only Octopus predicts a transmembrane element, which is a mis-identified signal sequence.
Signal peptide prediction: Phobius predicts a signal sequence with the cleavage site at 22; Spoctopus predicts the cleavage site at 23; SignalP predicts it to be between 22 and 23.
Target prediction: TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; this is correct.
LAMP1_HUMAN
TM Prediction: TMHMM detects two TM helices; so does Octopus. One TM helix is detected as a signal sequence by Spoctopus and Phobius.
Signal peptide prediction: Phobius, Spoctopus and SignalP find a signal sequence with the cleavage site at 28-29. According to UniProt, this is correct.
Target prediction: TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; since it is membrane-located, this is not correct.
A4_HUMAN
TM Prediction: One TM helix from 701 to 723 is predicted by all programs (the end is 722 in the case of Octopus and Spoctopus). Octopus additionally predicts one short reentrant sequence, which is a mis-identified signal sequence.
Signal peptide prediction: Spoctopus, SignalP and Phobius all report a signal sequence with the cleavage site at 17-19. According to UniProt, this is correct.
Target prediction: TargetP predicts that this protein carries a signal peptide and enters the secretory pathway; this is wrong, as the protein is membrane-associated.
Prediction of GO terms
GOPET
GOPET is a tool aimed at automatically assigning Gene Ontology terms to proteins. It was published in 2006 by Arunachalam Vinayagam, Coral del Val, Falk Schubert, Roland Eils, Karl-Heinz Glatting, Sándor Suhai and Rainer König.
Reference: Paper
The input sequence is first BLASTed against a database of proteins with known GO terms; a support vector machine is then used to discriminate between correct and false terms.
Results for aspartoacylase, all of which agree well with current knowledge of the enzyme:
GOid | Aspect | Confidence | GO Term |
---|---|---|---|
GO:0016787 | F | 96% | hydrolase activity |
GO:0004046 | F | 82% | aminoacylase activity |
GO:0019807 | F | 82% | aspartoacylase activity |
GO:0016788 | F | 81% | hydrolase activity, acting on ester bonds |
Other proteins:
Pfam
Pfam is a large database of protein families. It was established in 1998 at the Wellcome Trust Sanger Institute.
It comprises two databases: Pfam-A, a manually curated, high-quality database with a limited number of entries, and the much larger, automatically generated Pfam-B.
Reference: The Pfam protein families database: R.D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J.E. Pollington, O.L. Gavin, P. Gunesekaran, G. Ceric, K. Forslund, L. Holm, E.L. Sonnhammer, S.R. Eddy, A. Bateman
The result for aspartoacylase is spot-on:
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
ProtFun 2.2
ProtFun is a program for ab-initio protein function prediction. It was published in 2002 by Juhl Jensen et al.
Reference: Paper Abstract
The software queries a number of existing prediction servers for a wide range of features, from isoelectric point to posttranslational modifications, and deduces the protein's function from this data.
Results for aspartoacylase:
############## ProtFun 2.2 predictions ##############
>sp_P45381_A
# Functional category              Prob     Odds
Amino_acid_biosynthesis            0.071   3.233
Biosynthesis_of_cofactors          0.144   2.003
Cell_envelope                      0.033   0.535
Cellular_processes                 0.137   1.875
Central_intermediary_metabolism => 0.334   5.309
Energy_metabolism                  0.226   2.511
Fatty_acid_metabolism              0.022   1.663
Purines_and_pyrimidines            0.367   1.512
Regulatory_functions               0.021   0.128
Replication_and_transcription      0.167   0.625
Translation                        0.113   2.559
Transport_and_binding              0.017   0.042
# Enzyme/nonenzyme                 Prob     Odds
Enzyme                          => 0.703   2.454
Nonenzyme                          0.297   0.416
# Enzyme class                     Prob     Odds
Oxidoreductase (EC 1.-.-.-)        0.111   0.534
Transferase (EC 2.-.-.-)           0.202   0.585
Hydrolase (EC 3.-.-.-)             0.115   0.363
Lyase (EC 4.-.-.-)                 0.031   0.662
Isomerase (EC 5.-.-.-)          => 0.084   2.637
Ligase (EC 6.-.-.-)                0.074   1.460
# Gene Ontology category           Prob     Odds
Signal_transducer                  0.053   0.246
Receptor                           0.004   0.024
Hormone                            0.001   0.206
Structural_protein                 0.001   0.041
Transporter                        0.025   0.230
Ion_channel                        0.015   0.257
Voltage-gated_ion_channel          0.004   0.173
Cation_channel                     0.011   0.234
Transcription                      0.100   0.785
Transcription_regulation           0.039   0.313
Stress_response                    0.010   0.117
Immune_response                    0.061   0.720
Growth_factor                      0.006   0.450
Metal_ion_transport                0.009   0.020
Other proteins:
############## ProtFun 2.2 predictions ##############
>LAMP1_HUMAN
# Functional category              Prob     Odds
Amino_acid_biosynthesis            0.011   0.484
Biosynthesis_of_cofactors          0.053   0.735
Cell_envelope                   => 0.804  13.186
Cellular_processes                 0.027   0.373
Central_intermediary_metabolism    0.138   2.188
Energy_metabolism                  0.037   0.411
Fatty_acid_metabolism              0.016   1.265
Purines_and_pyrimidines            0.533   2.195
Regulatory_functions               0.015   0.090
Replication_and_transcription      0.019   0.073
Translation                        0.027   0.613
Transport_and_binding              0.834   2.033
# Enzyme/nonenzyme                 Prob     Odds
Enzyme                             0.276   0.965
Nonenzyme                       => 0.724   1.014
# Enzyme class                     Prob     Odds
Oxidoreductase (EC 1.-.-.-)        0.039   0.187
Transferase (EC 2.-.-.-)           0.046   0.134
Hydrolase (EC 3.-.-.-)             0.058   0.184
Lyase (EC 4.-.-.-)                 0.020   0.430
Isomerase (EC 5.-.-.-)             0.010   0.321
Ligase (EC 6.-.-.-)                0.017   0.326
# Gene Ontology category           Prob     Odds
Signal_transducer                  0.396   1.849
Receptor                           0.282   1.659
Hormone                            0.001   0.206
Structural_protein                 0.011   0.408
Transporter                        0.024   0.222
Ion_channel                        0.008   0.147
Voltage-gated_ion_channel          0.002   0.111
Cation_channel                     0.010   0.215
Transcription                      0.032   0.247
Transcription_regulation           0.018   0.142
Stress_response                    0.246   2.795
Immune_response                 => 0.371   4.368
Growth_factor                      0.013   0.956
Metal_ion_transport                0.009   0.020
//
>RET4_HUMAN
# Functional category              Prob     Odds
Amino_acid_biosynthesis            0.017   0.751
Biosynthesis_of_cofactors          0.044   0.610
Cell_envelope                   => 0.804  13.186
Cellular_processes                 0.075   1.021
Central_intermediary_metabolism    0.197   3.128
Energy_metabolism                  0.043   0.475
Fatty_acid_metabolism              0.016   1.265
Purines_and_pyrimidines            0.275   1.131
Regulatory_functions               0.013   0.080
Replication_and_transcription      0.022   0.084
Translation                        0.032   0.721
Transport_and_binding              0.800   1.951
# Enzyme/nonenzyme                 Prob     Odds
Enzyme                          => 0.544   1.900
Nonenzyme                          0.456   0.639
# Enzyme class                     Prob     Odds
Oxidoreductase (EC 1.-.-.-)        0.095   0.458
Transferase (EC 2.-.-.-)           0.038   0.109
Hydrolase (EC 3.-.-.-)             0.235   0.742
Lyase (EC 4.-.-.-)              => 0.059   1.264
Isomerase (EC 5.-.-.-)             0.010   0.321
Ligase (EC 6.-.-.-)                0.017   0.326
# Gene Ontology category           Prob     Odds
Signal_transducer                  0.202   0.942
Receptor                           0.147   0.862
Hormone                            0.004   0.667
Structural_protein                 0.002   0.058
Transporter                        0.025   0.232
Ion_channel                        0.016   0.288
Voltage-gated_ion_channel          0.003   0.148
Cation_channel                     0.010   0.215
Transcription                      0.027   0.207
Transcription_regulation           0.025   0.196
Stress_response                    0.161   1.829
Immune_response                 => 0.239   2.813
Growth_factor                      0.023   1.617
Metal_ion_transport                0.009   0.020
//
>BACR_HALSA
# Functional category              Prob     Odds
Amino_acid_biosynthesis            0.033   1.495
Biosynthesis_of_cofactors          0.186   2.589
Cell_envelope                      0.029   0.483
Cellular_processes                 0.051   0.694
Central_intermediary_metabolism    0.045   0.711
Energy_metabolism                  0.138   1.537
Fatty_acid_metabolism              0.016   1.265
Purines_and_pyrimidines            0.302   1.244
Regulatory_functions               0.013   0.080
Replication_and_transcription      0.019   0.073
Translation                        0.059   1.339
Transport_and_binding           => 0.791   1.929
# Enzyme/nonenzyme                 Prob     Odds
Enzyme                             0.199   0.696
Nonenzyme                       => 0.801   1.122
# Enzyme class                     Prob     Odds
Oxidoreductase (EC 1.-.-.-)        0.114   0.549
Transferase (EC 2.-.-.-)           0.031   0.091
Hydrolase (EC 3.-.-.-)             0.057   0.180
Lyase (EC 4.-.-.-)                 0.020   0.430
Isomerase (EC 5.-.-.-)             0.010   0.321
Ligase (EC 6.-.-.-)                0.017   0.326
# Gene Ontology category           Prob     Odds
Signal_transducer                  0.258   1.205
Receptor                           0.355   2.087
Hormone                            0.001   0.206
Structural_protein                 0.006   0.200
Transporter                     => 0.440   4.036
Ion_channel                        0.010   0.169
Voltage-gated_ion_channel          0.004   0.172
Cation_channel                     0.078   1.689
Transcription                      0.026   0.205
Transcription_regulation           0.028   0.226
Stress_response                    0.012   0.139
Immune_response                    0.011   0.128
Growth_factor                      0.010   0.727
Metal_ion_transport                0.049   0.106
//
>INSL5_HUMAN
# Functional category              Prob     Odds
Amino_acid_biosynthesis            0.011   0.484
Biosynthesis_of_cofactors          0.040   0.558
Cell_envelope                   => 0.756  12.393
Cellular_processes                 0.033   0.448
Central_intermediary_metabolism    0.048   0.755
Energy_metabolism                  0.036   0.397
Fatty_acid_metabolism              0.016   1.265
Purines_and_pyrimidines            0.144   0.592
Regulatory_functions               0.014   0.087
Replication_and_transcription      0.020   0.075
Translation                        0.032   0.735
Transport_and_binding              0.834   2.033
# Enzyme/nonenzyme                 Prob     Odds
Enzyme                             0.209   0.729
Nonenzyme                       => 0.791   1.109
# Enzyme class                     Prob     Odds
Oxidoreductase (EC 1.-.-.-)        0.056   0.268
Transferase (EC 2.-.-.-)           0.031   0.091
Hydrolase (EC 3.-.-.-)             0.062   0.195
Lyase (EC 4.-.-.-)                 0.020   0.430
Isomerase (EC 5.-.-.-)             0.010   0.321
Ligase (EC 6.-.-.-)                0.017   0.327
# Gene Ontology category           Prob     Odds
Signal_transducer                  0.374   1.746
Receptor                           0.128   0.750
Hormone                         => 0.247  37.936
Structural_protein                 0.001   0.041
Transporter                        0.025   0.228
Ion_channel                        0.010   0.168
Voltage-gated_ion_channel          0.003   0.131
Cation_channel                     0.010   0.215
Transcription                      0.054   0.425
Transcription_regulation           0.091   0.724
Stress_response                    0.099   1.128
Immune_response                    0.178   2.090
Growth_factor                      0.061   4.379
Metal_ion_transport                0.009   0.020
//
>A4_HUMAN
# Functional category              Prob     Odds
Amino_acid_biosynthesis            0.020   0.921
Biosynthesis_of_cofactors          0.261   3.623
Cell_envelope                   => 0.804  13.186
Cellular_processes                 0.053   0.730
Central_intermediary_metabolism    0.184   2.920
Energy_metabolism                  0.023   0.259
Fatty_acid_metabolism              0.016   1.265
Purines_and_pyrimidines            0.417   1.716
Regulatory_functions               0.013   0.084
Replication_and_transcription      0.029   0.109
Translation                        0.027   0.613
Transport_and_binding              0.827   2.016
# Enzyme/nonenzyme                 Prob     Odds
Enzyme                          => 0.392   1.368
Nonenzyme                          0.608   0.852
# Enzyme class                     Prob     Odds
Oxidoreductase (EC 1.-.-.-)        0.024   0.114
Transferase (EC 2.-.-.-)           0.208   0.603
Hydrolase (EC 3.-.-.-)             0.190   0.600
Lyase (EC 4.-.-.-)                 0.020   0.430
Isomerase (EC 5.-.-.-)             0.010   0.324
Ligase (EC 6.-.-.-)                0.048   0.946
# Gene Ontology category           Prob     Odds
Signal_transducer                  0.126   0.586
Receptor                           0.036   0.211
Hormone                            0.001   0.206
Structural_protein              => 0.034   1.205
Transporter                        0.024   0.222
Ion_channel                        0.009   0.162
Voltage-gated_ion_channel          0.002   0.108
Cation_channel                     0.010   0.215
Transcription                      0.043   0.335
Transcription_regulation           0.018   0.143
Stress_response                    0.076   0.862
Immune_response                    0.016   0.183
Growth_factor                      0.005   0.372
Metal_ion_transport                0.009   0.020
//
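In each ProtFun report, the entry that the program selects within a section is flagged with `=>`. To compare the selected classes across proteins, they can be pulled out of a report automatically. The following is a minimal Python sketch, assuming the report is available with one entry per line (a section header starting with `# `, then `name [=>] prob odds` rows). The function name `marked_predictions` and the `SAMPLE` excerpt (taken from the aspartoacylase report) are ours and are not part of ProtFun.

```python
SAMPLE = """\
############## ProtFun 2.2 predictions ##############
>sp_P45381_A
# Functional category Prob Odds
Central_intermediary_metabolism => 0.334 5.309
Purines_and_pyrimidines 0.367 1.512
# Enzyme/nonenzyme Prob Odds
Enzyme => 0.703 2.454
Nonenzyme 0.297 0.416
# Enzyme class Prob Odds
Hydrolase (EC 3.-.-.-) 0.115 0.363
Isomerase (EC 5.-.-.-) => 0.084 2.637
"""


def marked_predictions(report):
    """Map each report section to its '=>'-flagged (name, prob, odds) entry."""
    marked = {}
    section = None
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("# "):
            # Section header, e.g. "# Enzyme class Prob Odds" -> "Enzyme class".
            section = line[2:].rsplit("Prob", 1)[0].strip()
        elif "=>" in line and section is not None:
            name, values = line.split("=>")
            prob, odds = (float(v) for v in values.split())
            marked[section] = (name.strip(), prob, odds)
    return marked


for section, (name, prob, odds) in marked_predictions(SAMPLE).items():
    print(f"{section}: {name} (prob {prob}, odds {odds})")
```

The sketch reads the flag from the report instead of recomputing it, because the flagged row is not always the one with the highest probability: for aspartoacylase, Purines_and_pyrimidines has the higher probability (0.367) but Central_intermediary_metabolism is flagged. Sections without a flagged row, such as the Gene Ontology category for aspartoacylase, are simply absent from the result.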