Difference between revisions of "ASPA Sequence Based Predictions"

From Bioinformatikpedia
   
 
===PsiPred===
 
  +
  +
For a description of PsiPred, see [[Psipred]].
   
 
[[File:3ef7a0ac-0a08-412e-bf9f-e54bac6babd0.psi 1.png|200px|thumb|right|PsiPred results for Aspartoacyclase]]
 
   
 
===JPred3===
 
  +
  +
JPred was first published in 1998; the current version, JPred3, was published in 2008 by Christian Cole, Jonathan D. Barber and Geoffrey J. Barton.
  +
  +
Reference: [http://bioinformatics.oxfordjournals.org/content/14/10/892.short Original paper], [http://nar.oxfordjournals.org/content/36/suppl_2/W197.abstract current version]
  +
  +
JPred3 uses the JNet 2.0 algorithm to make its predictions. This algorithm generates profiles using PSI-Blast (to build a position-specific scoring matrix) and HMMer (to construct HMM profiles). Both the position-specific scoring matrix and the HMM profiles are used to predict secondary structure and solvent accessibility.

Input: A protein sequence or a pre-made MSA; a PDB database is also needed, but it is provided by the JPred3 server.
  +
 
<pre>MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYID
 
<pre>MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYID
 
------------EEEEEEEE------HHHHHHHHHH---------EEEEEEEE-HHHHHH-----H
 
------------EEEEEEEE------HHHHHHHHHH---------EEEEEEEE-HHHHHH-----H
   
 
===DSSP===
 
  +
DSSP (Define Secondary Structure of Proteins) is a program for secondary structure assignment. It was published in 1983 by Wolfgang Kabsch and Chris Sander.
  +
Reference: [http://onlinelibrary.wiley.com/doi/10.1002/bip.360221211/abstract Original paper]
  +
  +
DSSP does not predict secondary structure from the amino acid sequence; instead, it assigns secondary structure to a known 3D structure (a PDB file). To this end, DSSP examines the backbone geometry (phi and psi angles, C alpha positions) and the H-bonds present in the structure. The H-bonds are used to define "n-turns", which are H-bonds between the CO group of residue i and the NH group of residue i+n (n = 3, 4 or 5), and "bridges", which connect residues with greater sequence separations. Repeating 4-turns identify alpha helices; repeating bridges identify beta sheets.
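
The turn-and-helix logic described above can be sketched roughly as follows. This is a minimal illustration and not the DSSP implementation: it assumes the backbone H-bonds have already been determined and are passed in as a set of (i, j) residue-index pairs (CO of residue i bonded to NH of residue j). The function names and the example H-bond list are made up for this sketch.

```python
def find_turns(hbonds, n):
    """Return the sorted start positions i of all n-turns, i.e. H-bonds CO(i) -> NH(i+n)."""
    return sorted(i for (i, j) in hbonds if j - i == n)


def find_helix_residues(hbonds, n=4):
    """Mark residues as helical where two consecutive n-turns start at i-1 and i.

    Following the Kabsch & Sander definition, such a minimal helix covers
    residues i .. i+n-1; overlapping minimal helices merge into longer ones.
    """
    turns = set(find_turns(hbonds, n))
    helical = set()
    for i in turns:
        if i - 1 in turns:
            helical.update(range(i, i + n))
    return sorted(helical)


# Hypothetical H-bond list: three consecutive 4-turns starting at residues 1, 2 and 3.
example_hbonds = {(1, 5), (2, 6), (3, 7)}
print(find_turns(example_hbonds, 4))        # [1, 2, 3]
print(find_helix_residues(example_hbonds))  # [2, 3, 4, 5, 6]
```

A single isolated 4-turn would yield no helix here, which mirrors the requirement that turns must repeat.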
  +
  +
Input: A 3D structure (a PDB file, ID 2o53 in our case)
  +
  +
Output: (from [http://swift.cmbi.ru.nl/gv/dssp/DSSP_2.html])
  +
<pre> H = alpha helix
  +
B = residue in isolated beta-bridge
  +
E = extended strand, participates in beta ladder
  +
G = 3-helix (3/10 helix)
  +
I = 5 helix (pi helix)
  +
T = hydrogen bonded turn
  +
S = bend </pre>
  +
  +
The results differ from those of the two secondary structure predictors, as the PDB file contains a dimer, whereas the Uniprot sequence only contains one chain (which is sensible, since both chains are essentially identical).

The assignment shows slight differences between the two chains; we assume that the reasons for this are slight differences in the actual 3D structure of the two chains as well as H-bonds between them.
  +
 
<pre> 10 20 30 40 50 60
 
<pre> 10 20 30 40 50 60
 
| | | | | |
 
| | | | | |
 
603 - 604
 
603 - 604
 
603 - 604
 
603 - 604
603 - 604 AA</pre>
+
603 - 604 AA
   
 
Clearly solvent accessible: A; involved in symmetry contacts: *
 
Clearly solvent accessible: A; involved in symmetry contacts: *
  +
</pre>
   
  +
All in all, the two prediction methods Psipred and JPred3 did a good job: they predicted most of the main secondary structure elements, with only minor variations in the length and position of the individual helices and strands and very minor differences between each other. A somewhat more detailed result from DSSP is to be expected, as it has markedly better information to work with and merely assigns the secondary structure instead of actually predicting it.
pdb id: 2o53
 
   
 
==Prediction of disordered regions==
 
   
 
===DISOPRED===
 
  +
  +
DISOPRED predicts native disorder in proteins. It was published in 2004 by Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF and Jones DT.
  +
Reference: [http://www.sciencedirect.com/science/article/pii/S0022283604001482]
  +
  +
DISOPRED uses linear support vector machines to predict disorder in a given protein sequence. A set of 750 proteins with high-quality structures was used as training data: PSI-Blast profiles were generated by searching the training sequences against a filtered sequence database, and the resulting profiles were used to train the SVMs.
   
 
[[File:Aa9f21ee-f0a3-4cd4-9cf0-366fe1b5377e.dis 1.diso.png|400px|thumb|right|DISOPRED result graph for Aspartoacyclase]]
 
   
 
===POODLE===
 
  +
  +
POODLE (Prediction Of Order and Disorder by machine LEarning) is a series of programs published between 2005 and 2008. We used the latest variant, POODLE-I, which was published in 2008 by S. Hirose, K. Shimizu, N. Inoue, S. Kanai and T. Noguchi.

Reference:
S. Hirose, K. Shimizu, N. Inoue, S. Kanai and T. Noguchi, "Disordered region prediction by integrating POODLE series", CASP8 Proceedings 2008, 14-15.
  +
  +
Input: Protein amino acid sequence
  +
  +
POODLE-I integrates the other flavors of POODLE (-S and -L for short and long regions of disorder, -W for proteins that are mostly disordered) with several other tools such as Psipred and JNet. It employs a rather involved [http://mbs.cbrc.jp/poodle/images/workflow.png workflow].
  +
  +
Custom-formatted output for Aspartoacyclase:
   
 
[[File: Aspa disopred.png]]
 
   
 
===IUPRED===
 
  +
  +
IUPRED is a program for the prediction of intrinsically unstructured regions in proteins. It was published in 2005 by Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon.
  +
  +
Reference: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content
  +
Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon, Bioinformatics (2005) 21, 3433-3434.
  +
  +
IUPRED predicts disordered regions by estimating the capacity of the amino acid chain to form stabilizing contacts. The underlying assumption is that proteins intrinsically unable to do so have distinct sequences that can be identified via their unfavorable energy values. To this end a 20x20 predictor matrix was calculated from a set of globular proteins with known structure. IUPRED uses this matrix to derive a tendency to be intrinsically unstructured from the amino acid composition alone.
  +
  +
Input: An amino acid sequence.
  +
  +
IUPRED comes in three flavors: Long Disorder, which specializes in finding long stretches of disorder, Short Disorder, which does the same for short stretches of disorder, and structured regions, which predicts regions lacking disorder.
  +
 
====Long Disorder====
 
 
[[File:Aspa iupred3.png]]
 
IUPRED predicts one structured region comprising the whole input sequence.
 
   
   
   
  +
====Results====
  +
IUPRED predicts no significant disorder in Aspartoacyclase. The disorder tendency stays below 0.5 throughout, except for short stretches of about 3-5 residues at each end of the sequence in short disorder mode, which are negligible. The structured regions mode predicts one continuous structured region spanning the whole protein sequence. This makes sense in light of the 3D structure: Aspartoacyclase is a rather densely packed globular protein which, under the assumptions IUPRED makes, forms many stabilizing inter-residue contacts and therefore has a markedly reduced tendency for disorder.
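
The 0.5 threshold used in this reading of the results can be applied programmatically. The following is a small sketch under stated assumptions: the per-residue scores are assumed to be available as a plain list of floats, a score at or above the threshold is treated as disordered, and the function name, the `min_length` filter and the toy scores are ours, not part of IUPRED.

```python
def disordered_regions(scores, threshold=0.5, min_length=1):
    """Return (start, end) pairs (1-based, inclusive) of runs with score >= threshold."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos  # a new disordered run begins here
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos - 1))
            start = None
    # close a run that extends to the end of the sequence
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Made-up scores for illustration only, not actual IUPRED output.
toy_scores = [0.7, 0.6, 0.3, 0.2, 0.1, 0.2, 0.55, 0.8]
print(disordered_regions(toy_scores))                # [(1, 2), (7, 8)]
print(disordered_regions(toy_scores, min_length=3))  # []
```

Raising `min_length` is one way to express that very short stretches at the sequence ends are considered negligible.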
   
 
===Meta-Disorder===
 
  +
  +
Meta-Disorder, as the name implies, employs a set of so-called orthogonal disorder predictors in order to combine their strengths and mitigate their weak points. It was published in 2009 by Avner Schlessinger, Marco Punta, Guy Yachdav, Laszlo Kajan and Burkhard Rost.
  +
  +
Reference: [http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0004433 Paper]
  +
  +
As with the previous methods, Meta-Disorder predicts disorder from the amino acid sequence alone; results from the predictors IUPRED, DISOPRED, NORSnet and Ucon are molded into one final result using a neural network.
  +
  +
Results for Aspartoacyclase:
  +
 
<pre>Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st
 
<pre>Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st
 
1 M 0.33 - 0.99 D 0.17 - 0.551 1 D
 
1 M 0.33 - 0.99 D 0.17 - 0.551 1 D
 
MD2st - two-state prediction by MD
 
MD2st - two-state prediction by MD
 
</pre>
 
</pre>
  +
  +
  +
The last column indicates whether or not disorder was predicted at the current position. Meta-Disorder predicts a total of four disordered positions, which is not significant. This agrees with the predictions of the programs used previously, which is not altogether surprising, since Meta-Disorder draws on two of them.
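
Counting the disordered positions from the table can be automated. The sketch below assumes the whitespace-separated column layout shown in the header line of the output above; the function name is ours, and the second data row in the example is invented for illustration.

```python
MD_COLUMNS = ["Number", "Residue", "NORSnet", "NORS2st", "PROFbval", "bval2st",
              "Ucon", "Ucon2st", "MD_raw", "MD_rel", "MD2st"]


def disordered_positions(md_output):
    """Return (position, residue) for every row whose MD2st column is 'D'."""
    positions = []
    for line in md_output.splitlines():
        fields = line.split()
        # skip the header, legend lines and anything that is not a data row
        if len(fields) != len(MD_COLUMNS) or not fields[0].isdigit():
            continue
        row = dict(zip(MD_COLUMNS, fields))
        if row["MD2st"] == "D":
            positions.append((int(row["Number"]), row["Residue"]))
    return positions


# First data row taken from the output above; the second row is made up.
example = """Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st
1 M 0.33 - 0.99 D 0.17 - 0.551 1 D
2 T 0.30 - 0.90 D 0.15 - 0.400 2 -
"""
print(disordered_positions(example))  # [(1, 'M')]
```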
   
 
==Prediction of transmembrane alpha-helices and signal peptides==
 
  +
  +
The results of this task are unequivocal: Aspartoacyclase does not contain any transmembrane regions. From a biological point of view this was to be expected, as Aspartoacyclase is known to be located in the cytosol.
   
 
===TMHMM===
 
  +
Since the VM version could not be made to work, we used the server at http://www.cbs.dtu.dk/services/TMHMM/.
...
 
  +
  +
TMHMM uses a hidden Markov model to predict transmembrane helices in proteins. It was published in 1998 by E. L. L. Sonnhammer, G. von Heijne and A. Krogh.
  +
  +
Reference: [http://people.binf.ku.dk/~krogh/publications/ps/SonnhammerEtal98.pdf Original paper]
  +
  +
The hidden Markov model used by TMHMM mirrors the biological structure with states for helix turns, helix caps and loops on either side of the membrane; the architecture is also designed to model membrane insertion. The HMM parameters were estimated using both a maximum likelihood method and a discriminative method.
  +
  +
Results for Aspartoacyclase very clearly show absence of any sort of transmembrane structure, which is biologically sound.
  +
  +
[[File:Sp P45381 ACY2 HUMAN.gif]]
  +
  +
<pre># sp_P45381_ACY2_HUMAN Length: 313
  +
# sp_P45381_ACY2_HUMAN Number of predicted TMHs: 0
  +
# sp_P45381_ACY2_HUMAN Exp number of AAs in TMHs: 0.2005
  +
# sp_P45381_ACY2_HUMAN Exp number, first 60 AAs: 0.01618
  +
# sp_P45381_ACY2_HUMAN Total prob of N-in: 0.03827
  +
sp_P45381_ACY2_HUMAN TMHMM2.0 outside 1 313</pre>
  +
  +
http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output
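
For comparing several proteins it can help to read this output programmatically. The following is a minimal sketch, assuming the plain-text layout shown above ("#" summary lines followed by one line per predicted segment); the function name and the returned data layout are our own choices.

```python
def parse_tmhmm(text):
    """Split TMHMM long-format output into summary values and topology segments."""
    summary = {}
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # e.g. "# <id> Number of predicted TMHs: 0"; lines without ':' are ignored
            if ":" in line:
                key, value = line[1:].rsplit(":", 1)
                parts = key.split(None, 1)  # drop the leading sequence identifier
                if len(parts) == 2:
                    summary[parts[1].strip()] = value.strip()
            continue
        fields = line.split()
        if len(fields) == 5:
            _seq_id, _version, location, start, end = fields
            segments.append((location, int(start), int(end)))
    return summary, segments


# Output for Aspartoacyclase as shown above.
aspa_output = """# sp_P45381_ACY2_HUMAN Length: 313
# sp_P45381_ACY2_HUMAN Number of predicted TMHs: 0
# sp_P45381_ACY2_HUMAN Exp number of AAs in TMHs: 0.2005
# sp_P45381_ACY2_HUMAN Exp number, first 60 AAs: 0.01618
# sp_P45381_ACY2_HUMAN Total prob of N-in: 0.03827
sp_P45381_ACY2_HUMAN TMHMM2.0 outside 1 313
"""
summary, segments = parse_tmhmm(aspa_output)
helices = [s for s in segments if s[0] == "TMhelix"]
print(summary["Number of predicted TMHs"])  # 0
print(segments)                             # [('outside', 1, 313)]
print(len(helices))                         # 0
```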
  +
  +
  +
====BACR_HALSA====
  +
<pre># BACR_HALSA Length: 262
  +
# BACR_HALSA Number of predicted TMHs: 6
  +
# BACR_HALSA Exp number of AAs in TMHs: 140.4032
  +
# BACR_HALSA Exp number, first 60 AAs: 26.1196
  +
# BACR_HALSA Total prob of N-in: 0.01887
  +
# BACR_HALSA POSSIBLE N-term signal sequence
  +
BACR_HALSA TMHMM2.0 outside 1 22
  +
BACR_HALSA TMHMM2.0 TMhelix 23 42
  +
BACR_HALSA TMHMM2.0 inside 43 54
  +
BACR_HALSA TMHMM2.0 TMhelix 55 77
  +
BACR_HALSA TMHMM2.0 outside 78 91
  +
BACR_HALSA TMHMM2.0 TMhelix 92 114
  +
BACR_HALSA TMHMM2.0 inside 115 120
  +
BACR_HALSA TMHMM2.0 TMhelix 121 143
  +
BACR_HALSA TMHMM2.0 outside 144 147
  +
BACR_HALSA TMHMM2.0 TMhelix 148 170
  +
BACR_HALSA TMHMM2.0 inside 171 189
  +
BACR_HALSA TMHMM2.0 TMhelix 190 212
  +
BACR_HALSA TMHMM2.0 outside 213 262</pre>
  +
  +
  +
====RET4_HUMAN====
  +
<pre># RET4_HUMAN Length: 201
  +
# RET4_HUMAN Number of predicted TMHs: 0
  +
# RET4_HUMAN Exp number of AAs in TMHs: 0.01196
  +
# RET4_HUMAN Exp number, first 60 AAs: 0.01179
  +
# RET4_HUMAN Total prob of N-in: 0.01909
  +
RET4_HUMAN TMHMM2.0 outside 1 201</pre>
  +
  +
  +
====INSL5_HUMAN====
  +
<pre># INSL5_HUMAN Length: 135
  +
# INSL5_HUMAN Number of predicted TMHs: 0
  +
# INSL5_HUMAN Exp number of AAs in TMHs: 0.50415
  +
# INSL5_HUMAN Exp number, first 60 AAs: 0.50415
  +
# INSL5_HUMAN Total prob of N-in: 0.03772
  +
INSL5_HUMAN TMHMM2.0 outside 1 135</pre>
  +
  +
====LAMP1_HUMAN====
  +
<pre># LAMP1_HUMAN Length: 417
  +
# LAMP1_HUMAN Number of predicted TMHs: 2
  +
# LAMP1_HUMAN Exp number of AAs in TMHs: 44.89582
  +
# LAMP1_HUMAN Exp number, first 60 AAs: 22.24286
  +
# LAMP1_HUMAN Total prob of N-in: 0.99287
  +
# LAMP1_HUMAN POSSIBLE N-term signal sequence
  +
LAMP1_HUMAN TMHMM2.0 inside 1 10
  +
LAMP1_HUMAN TMHMM2.0 TMhelix 11 33
  +
LAMP1_HUMAN TMHMM2.0 outside 34 383
  +
LAMP1_HUMAN TMHMM2.0 TMhelix 384 406
  +
LAMP1_HUMAN TMHMM2.0 inside 407 417</pre>
  +
  +
====A4_HUMAN====
  +
<pre># A4_HUMAN Length: 770
  +
# A4_HUMAN Number of predicted TMHs: 1
  +
# A4_HUMAN Exp number of AAs in TMHs: 22.72525
  +
# A4_HUMAN Exp number, first 60 AAs: 0.0027
  +
# A4_HUMAN Total prob of N-in: 0.00015
  +
A4_HUMAN TMHMM2.0 outside 1 700
  +
A4_HUMAN TMHMM2.0 TMhelix 701 723
  +
A4_HUMAN TMHMM2.0 inside 724 770</pre>
   
 
===Phobius & PolyPhobius===
 
  +
...
 
  +
Phobius is a program for the prediction of transmembrane regions with special emphasis on reducing confusion with signal peptides. It was published in 2004 by Käll L, Krogh A and Sonnhammer EL.
  +
  +
Reference: [http://www.ncbi.nlm.nih.gov/pubmed/15111065?dopt=Abstract Paper]
  +
  +
Signal peptides and transmembrane helices share a great deal of similarity and are often confused by predictors for either class; Phobius aims to predict both and to discriminate between them. To do this, it employs a hidden Markov model that models the different sequence regions pertaining to either class.
  +
  +
Input: An amino acid sequence.
  +
  +
Again, neither signal nor transmembrane regions were detected in Aspartoacyclase.
  +
  +
[[File:Aspa phobius.png]]
  +
  +
  +
====BACR_HALSA====
  +
<pre>ID
  +
FT TOPO_DOM 1 22 NON CYTOPLASMIC.
  +
FT TRANSMEM 23 42
  +
FT TOPO_DOM 43 53 CYTOPLASMIC.
  +
FT TRANSMEM 54 76
  +
FT TOPO_DOM 77 95 NON CYTOPLASMIC.
  +
FT TRANSMEM 96 114
  +
FT TOPO_DOM 115 120 CYTOPLASMIC.
  +
FT TRANSMEM 121 142
  +
FT TOPO_DOM 143 147 NON CYTOPLASMIC.
  +
FT TRANSMEM 148 169
  +
FT TOPO_DOM 170 189 CYTOPLASMIC.
  +
FT TRANSMEM 190 212
  +
FT TOPO_DOM 213 217 NON CYTOPLASMIC.
  +
FT TRANSMEM 218 237
  +
FT TOPO_DOM 238 262 CYTOPLASMIC.
  +
//</pre>
  +
  +
With PolyPhobius:
  +
<pre>ID BACR_HALSA
  +
FT TOPO_DOM 1 21 NON CYTOPLASMIC.
  +
FT TRANSMEM 22 43
  +
FT TOPO_DOM 44 54 CYTOPLASMIC.
  +
FT TRANSMEM 55 77
  +
FT TOPO_DOM 78 94 NON CYTOPLASMIC.
  +
FT TRANSMEM 95 114
  +
FT TOPO_DOM 115 120 CYTOPLASMIC.
  +
FT TRANSMEM 121 141
  +
FT TOPO_DOM 142 147 NON CYTOPLASMIC.
  +
FT TRANSMEM 148 166
  +
FT TOPO_DOM 167 186 CYTOPLASMIC.
  +
FT TRANSMEM 187 205
  +
FT TOPO_DOM 206 215 NON CYTOPLASMIC.
  +
FT TRANSMEM 216 237
  +
FT TOPO_DOM 238 262 CYTOPLASMIC.
  +
//</pre>
  +
  +
====RET4_HUMAN====
  +
<pre>ID RET4_HUMAN
  +
FT SIGNAL 1 18
  +
FT REGION 1 2 N-REGION.
  +
FT REGION 3 13 H-REGION.
  +
FT REGION 14 18 C-REGION.
  +
FT TOPO_DOM 19 201 NON CYTOPLASMIC.
  +
//</pre>
  +
  +
With PolyPhobius:
  +
<pre>ID RET4_HUMAN
  +
FT SIGNAL 1 18
  +
FT REGION 1 3 N-REGION.
  +
FT REGION 4 13 H-REGION.
  +
FT REGION 14 18 C-REGION.
  +
FT TOPO_DOM 19 201 NON CYTOPLASMIC.
  +
//</pre>
  +
  +
====INSL5_HUMAN====
  +
<pre>ID
  +
FT SIGNAL 1 22
  +
FT REGION 1 5 N-REGION.
  +
FT REGION 6 17 H-REGION.
  +
FT REGION 18 22 C-REGION.
  +
FT TOPO_DOM 23 135 NON CYTOPLASMIC.
  +
//</pre>
  +
  +
With PolyPhobius:
  +
<pre>ID INSL5_HUMAN
  +
FT SIGNAL 1 22
  +
FT REGION 1 4 N-REGION.
  +
FT REGION 5 16 H-REGION.
  +
FT REGION 17 22 C-REGION.
  +
FT TOPO_DOM 23 135 NON CYTOPLASMIC.
  +
//</pre>
  +
  +
====LAMP1_HUMAN====
  +
<pre>ID
  +
FT SIGNAL 1 28
  +
FT REGION 1 10 N-REGION.
  +
FT REGION 11 22 H-REGION.
  +
FT REGION 23 28 C-REGION.
  +
FT TOPO_DOM 29 381 NON CYTOPLASMIC.
  +
FT TRANSMEM 382 405
  +
FT TOPO_DOM 406 417 CYTOPLASMIC.
  +
//</pre>
  +
  +
With PolyPhobius:
  +
<pre>ID LAMP1_HUMAN
  +
FT SIGNAL 1 28
  +
FT REGION 1 9 N-REGION.
  +
FT REGION 10 22 H-REGION.
  +
FT REGION 23 28 C-REGION.
  +
FT TOPO_DOM 29 381 NON CYTOPLASMIC.
  +
FT TRANSMEM 382 405
  +
FT TOPO_DOM 406 417 CYTOPLASMIC.
  +
//</pre>
  +
  +
====A4_HUMAN====
  +
<pre>ID A4_HUMAN
  +
FT SIGNAL 1 17
  +
FT REGION 1 1 N-REGION.
  +
FT REGION 2 12 H-REGION.
  +
FT REGION 13 17 C-REGION.
  +
FT TOPO_DOM 18 700 NON CYTOPLASMIC.
  +
FT TRANSMEM 701 723
  +
FT TOPO_DOM 724 770 CYTOPLASMIC.
  +
//</pre>
   
 
===OCTOPUS & SPOCTOPUS===
 
  +
...
 
  +
OCTOPUS uses a combination of hidden Markov models and neural networks to predict transmembrane regions. It was published in 2008 by Håkan Viklund and Arne Elofsson.
  +
  +
Reference: [http://www.ncbi.nlm.nih.gov/pubmed/15111065?ordinalpos=6&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_DefaultReportPanel.Pubmed_RVDocSum Original paper]
  +
  +
OCTOPUS first creates a sequence profile by running BLAST with the input sequence. Neural networks are then used to predict the propensity of each residue to be located in a transmembrane region or in certain structure patterns on either side of the membrane. The resulting propensities are fed to a hidden Markov model, which calculates the most likely topology.
  +
  +
SPOCTOPUS extends OCTOPUS with a preprocessor that uses a neural network to assess the probability that the first 70 residues of the input sequence contain a signal peptide. If this score is high enough, a hidden Markov model is used to determine the exact extent of the signal region.
  +
  +
No transmembrane/signal regions were predicted for Aspartoacyclase.
  +
  +
  +
====BACR_HALSA====
  +
<pre>OCTOPUS predicted topology:
  +
oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM
  +
MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiii
  +
MMMMMMMMMMMMMMMMMMMMMoooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii
  +
iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii
  +
iiiiiiiiiiiiiiiiiiiiii</pre>
  +
  +
<pre>SPOCTOPUS predicted topology:
  +
oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM
  +
MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiM
  +
MMMMMMMMMMMMMMMMMMMMooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii
  +
iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii
  +
iiiiiiiiiiiiiiiiiiiiii</pre>
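
Topology strings like these are easier to compare once they are collapsed into segments. The following sketch assumes only the one-letter-per-residue format shown above (for example "M" for membrane, "i" and "o" for the two sides); the function names and the toy string are ours.

```python
from itertools import groupby


def topology_segments(topology):
    """Collapse a per-residue topology string into (state, start, end) runs (1-based)."""
    topology = "".join(topology.split())  # tolerate line breaks in the pasted output
    segments = []
    position = 1
    for state, run in groupby(topology):
        length = len(list(run))
        segments.append((state, position, position + length - 1))
        position += length
    return segments


def count_tm_helices(topology):
    """Count the runs labelled 'M' (membrane-spanning segments)."""
    return sum(1 for state, _, _ in topology_segments(topology) if state == "M")


# Toy string for illustration; real predictions are much longer.
toy = "oooMMMMMiii\nMMMMMoo"
print(topology_segments(toy))
# [('o', 1, 3), ('M', 4, 8), ('i', 9, 11), ('M', 12, 16), ('o', 17, 18)]
print(count_tm_helices(toy))  # 2
```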
  +
  +
====RET4_HUMAN====
  +
<pre>OCTOPUS predicted topology:
  +
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooooooooo</pre>
  +
  +
<pre>SPOCTOPUS predicted topology:
  +
nnnnSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooooooooo</pre>
  +
  +
====INSL5_HUMAN====
  +
<pre>OCTOPUS predicted topology:
  +
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooo</pre>
  +
  +
<pre>SPOCTOPUS predicted topology:
  +
nnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooo</pre>
  +
  +
====LAMP1_HUMAN====
  +
<pre>OCTOPUS predicted topology:
  +
iiiiiiiiiMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii</pre>
  +
  +
<pre>SPOCTOPUS predicted topology:
  +
nnnnnnnnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii</pre>
  +
  +
====A4_HUMAN====
  +
<pre>OCTOPUS predicted topology:
  +
ooooRRRRRRoooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM
  +
Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii</pre>
  +
  +
<pre>SPOCTOPUS predicted topology:
  +
nnnSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
  +
ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM
  +
Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
  +
</pre>
   
 
===SignalP===
 
  +
...
 
  +
SignalP is a method for the detection of signal peptides. It was first published in 1997 by Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne.
  +
  +
Reference: [http://www.ncbi.nlm.nih.gov/pubmed/9051728?dopt=Abstract Original paper], [http://www.ncbi.nlm.nih.gov/pubmed/15223320?dopt=Abstract current version]
  +
  +
SignalP comes in two flavours: one using a neural network, the other using a hidden Markov model. It discriminates between cleaved and uncleaved signal peptides and supports both prokaryotic and eukaryotic input.
  +
  +
Input: A protein sequence.
  +
  +
Neither flavour detected any signal sequence in Aspartoacyclase.
  +
  +
====Aspartoacyclase: HMM====
  +
[[File:ASPA Plot.hmm.1.gif]]
  +
  +
====Aspartoacyclase: Neural Network====
  +
[[File:ASPA Plot.nn.1.gif]]
  +
  +
  +
====BACR_HALSA====
  +
Neural Network:
  +
<pre>BACR_HALSA length = 70
  +
# Measure Position Value Cutoff signal peptide?
  +
max. C 16 0.220 0.32 NO
  +
max. Y 39 0.196 0.33 NO
  +
max. S 31 0.970 0.87 YES
  +
mean S 1-38 0.426 0.48 NO
  +
D 1-38 0.311 0.43 NO
  +
# Most likely cleavage site between pos. 38 and 39: GTL-YF</pre>
  +
  +
HMM:
  +
<pre>Prediction: Signal anchor
  +
Signal peptide probability: 0.017
  +
Signal anchor probability: 0.859
  +
Max cleavage site probability: 0.004 between pos. 15 and 16
  +
</pre>
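
In the neural-network output, the D value appears to be the average of the "mean S" and "max. Y" values, and the final yes/no call follows from comparing D with its cutoff; the numbers shown on this page are consistent with that reading. A small sketch, with function names of our own choosing and the cutoff of 0.43 taken from the output above (whether the comparison is strict is an assumption):

```python
def d_score(mean_s, max_y):
    """Average of the mean S score and the maximal Y score."""
    return (mean_s + max_y) / 2


def is_signal_peptide(mean_s, max_y, cutoff=0.43):
    """Apply the D-score cutoff shown in the neural-network output."""
    return d_score(mean_s, max_y) >= cutoff


# Values taken from the neural-network output for BACR_HALSA above.
print(round(d_score(0.426, 0.196), 3))   # 0.311
print(is_signal_peptide(0.426, 0.196))   # False
```

This explains why BACR_HALSA is not called a signal peptide by the D measure even though its "max. S" value alone exceeds the respective cutoff.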
  +
  +
====RET4_HUMAN====
  +
Neural Network:
  +
<pre>RET4_HUMAN length = 70
  +
# Measure Position Value Cutoff signal peptide?
  +
max. C 19 0.929 0.32 YES
  +
max. Y 19 0.901 0.33 YES
  +
max. S 1 0.994 0.87 YES
  +
mean S 1-18 0.938 0.48 YES
  +
D 1-18 0.920 0.43 YES
  +
# Most likely cleavage site between pos. 18 and 19: GRA-ER</pre>
  +
  +
HMM:
  +
<pre>RET4_HUMAN
  +
Prediction: Signal peptide
  +
Signal peptide probability: 1.000
  +
Signal anchor probability: 0.000
  +
Max cleavage site probability: 0.979 between pos. 18 and 19</pre>
  +
  +
  +
====INSL5_HUMAN====
  +
Neural Network:
  +
<pre>INSL5_HUMAN length = 70
  +
# Measure Position Value Cutoff signal peptide?
  +
max. C 23 0.855 0.32 YES
  +
max. Y 23 0.778 0.33 YES
  +
max. S 13 0.987 0.87 YES
  +
mean S 1-22 0.852 0.48 YES
  +
D 1-22 0.815 0.43 YES
  +
# Most likely cleavage site between pos. 22 and 23: VRS-KE</pre>
  +
  +
HMM:
  +
<pre>INSL5_HUMAN
  +
Prediction: Signal peptide
  +
Signal peptide probability: 0.999
  +
Signal anchor probability: 0.000
  +
Max cleavage site probability: 0.911 between pos. 22 and 23</pre>
  +
  +
  +
====LAMP1_HUMAN====
  +
Neural Network:
  +
<pre>LAMP1_HUMAN length = 70
  +
# Measure Position Value Cutoff signal peptide?
  +
max. C 29 0.978 0.32 YES
  +
max. Y 29 0.903 0.33 YES
  +
max. S 19 0.999 0.87 YES
  +
mean S 1-28 0.960 0.48 YES
  +
D 1-28 0.932 0.43 YES
  +
# Most likely cleavage site between pos. 28 and 29: ASA-AM</pre>
  +
  +
HMM:
  +
<pre>LAMP1_HUMAN
  +
Prediction: Signal peptide
  +
Signal peptide probability: 1.000
  +
Signal anchor probability: 0.000
  +
Max cleavage site probability: 0.847 between pos. 28 and 29</pre>
  +
  +
  +
====A4_HUMAN====
  +
Neural Network:
  +
<pre>A4_HUMAN length = 70
  +
# Measure Position Value Cutoff signal peptide?
  +
max. C 18 0.891 0.32 YES
  +
max. Y 18 0.850 0.33 YES
  +
max. S 2 0.992 0.87 YES
  +
mean S 1-17 0.967 0.48 YES
  +
D 1-17 0.909 0.43 YES
  +
# Most likely cleavage site between pos. 17 and 18: ARA-LE</pre>
  +
  +
HMM:
  +
<pre>A4_HUMAN
  +
Prediction: Signal peptide
  +
Signal peptide probability: 1.000
  +
Signal anchor probability: 0.000
  +
Max cleavage site probability: 0.993 between pos. 17 and 18
  +
</pre>
   
 
===TargetP===
 
  +
...
 
  +
TargetP is a program for predicting the subcellular location of proteins, based on targeting signals in their sequence. It was published in 2000 by Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne.

Reference: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne. J. Mol. Biol., 300: 1005-1016, 2000.

TargetP confines its analysis to the N-terminal part of the sequence. It can discriminate between proteins destined for the mitochondrion, the chloroplast (plants only), the secretory pathway or another location.
  +
  +
The prediction for Aspartoacyclase was "other location", which is plausible, as the enzyme is known to reside in the cytosol.
  +
  +
<pre>### targetp v1.1 prediction results ##################################
  +
Number of query sequences: 1
  +
Cleavage site predictions not included.
  +
Using NON-PLANT networks.
  +
  +
Name Len mTP SP other Loc RC
  +
----------------------------------------------------------------------
  +
sp_P45381_ACY2_HUMAN 313 0.073 0.109 0.898 _ 2
  +
----------------------------------------------------------------------
  +
cutoff 0.000 0.000 0.000</pre>
  +
  +
http://www.cbs.dtu.dk/services/TargetP-1.1/output.php
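
The Loc and RC columns can be reproduced from the three scores. The sketch below reflects our understanding of the TargetP 1.1 output description: Loc is the category with the highest score, and the reliability class RC is derived from the difference between the highest and second-highest score (1 is most reliable). The function names are ours, and the handling of values exactly on a class boundary is an assumption.

```python
def predicted_location(mtp, sp, other):
    """Return 'M', 'S' or '_' for the highest of the three non-plant scores."""
    labels = {"M": mtp, "S": sp, "_": other}
    return max(labels, key=labels.get)


def reliability_class(mtp, sp, other):
    """Map the gap between the two best scores to a class from 1 (best) to 5."""
    ordered = sorted((mtp, sp, other), reverse=True)
    diff = ordered[0] - ordered[1]
    if diff > 0.8:
        return 1
    if diff > 0.6:
        return 2
    if diff > 0.4:
        return 3
    if diff > 0.2:
        return 4
    return 5


# Scores for Aspartoacyclase from the output above: mTP 0.073, SP 0.109, other 0.898.
print(predicted_location(0.073, 0.109, 0.898))  # _
print(reliability_class(0.073, 0.109, 0.898))   # 2
```

With the scores listed for the other proteins on this page, the same rule yields the RC values shown in their output.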
  +
  +
  +
====BACR_HALSA====
  +
<pre>Number of query sequences: 1
  +
Cleavage site predictions not included.
  +
Using NON-PLANT networks.
  +
  +
Name Len mTP SP other Loc RC
  +
----------------------------------------------------------------------
  +
BACR_HALSA 262 0.019 0.897 0.562 S 4
  +
----------------------------------------------------------------------
  +
cutoff 0.000 0.000 0.000
  +
</pre>
  +
  +
====RET4_HUMAN====
  +
<pre>Number of query sequences: 1
  +
Cleavage site predictions not included.
  +
Using NON-PLANT networks.
  +
  +
Name Len mTP SP other Loc RC
  +
----------------------------------------------------------------------
  +
RET4_HUMAN 201 0.242 0.928 0.020 S 2
  +
----------------------------------------------------------------------
  +
cutoff 0.000 0.000 0.000
  +
</pre>
  +
  +
====INSL5_HUMAN====
  +
<pre>Number of query sequences: 1
  +
Cleavage site predictions not included.
  +
Using NON-PLANT networks.
  +
  +
Name Len mTP SP other Loc RC
  +
----------------------------------------------------------------------
  +
INSL5_HUMAN 135 0.074 0.899 0.037 S 1
  +
----------------------------------------------------------------------
  +
cutoff 0.000 0.000 0.000
  +
</pre>
  +
  +
====LAMP1_HUMAN====
  +
<pre>Number of query sequences: 1
  +
Cleavage site predictions not included.
  +
Using NON-PLANT networks.
  +
  +
Name Len mTP SP other Loc RC
  +
----------------------------------------------------------------------
  +
LAMP1_HUMAN 417 0.043 0.953 0.017 S 1
  +
----------------------------------------------------------------------
  +
cutoff 0.000 0.000 0.000
  +
</pre>
  +
  +
====A4_HUMAN====
  +
<pre>Number of query sequences: 1
  +
Cleavage site predictions not included.
  +
Using NON-PLANT networks.
  +
  +
Name Len mTP SP other Loc RC
  +
----------------------------------------------------------------------
  +
A4_HUMAN 770 0.035 0.937 0.084 S 1
  +
----------------------------------------------------------------------
  +
cutoff 0.000 0.000 0.000
  +
</pre>
  +
  +
  +
===Analysis===
  +
  +
====BACR_HALSA====
  +
  +
TM Prediction:
  +
TMHMM predicts one helix fewer than the other tools (it misses the helix at approx. 216-237); the other methods agree on 7 TM helices with insignificant differences in the boundaries. The PDB structure shows that 7 helices is correct.
  +
  +
Signal peptide prediction:
SignalP predicts a signal peptide (NN mode) and a signal anchor (HMM mode); according to the information we found on the protein, both predictions are wrong.
  +
  +
Target prediction:
  +
TargetP predicted that this protein carries a signal peptide and enters the secretory pathway; this is not correct.
  +
  +
====RET4_HUMAN====
  +
  +
TM Prediction:
  +
Only Octopus predicts a TM helix; this is a mis-identified signal sequence. The other programs predict no TM helices, which is correct.
  +
  +
Signal peptide prediction:
  +
Phobius predicts a signal peptide; so do Spoctopus and SignalP, with the cleavage site at position 18. According to Uniprot, this is correct.
  +
  +
Target prediction:
  +
TargetP predicted that this protein carries a signal peptide for the secretory pathway; this is correct.
  +
  +
====INSL5_HUMAN====
  +
  +
TM Prediction:
  +
Only Octopus predicts a transmembrane element, which is a mis-identified signal sequence.
  +
  +
Signal peptide prediction:
  +
Phobius predicts a signal sequence with the cleavage site at 22; Spoctopus predicts the cleavage site at 23; SignalP predicts it to be between 22 and 23.
  +
  +
Target prediction:
  +
TargetP predicted that this protein carries a signal peptide for the secretory pathway; this is correct.
  +
  +
====LAMP1_HUMAN====
  +
  +
TM Prediction:
  +
TMHMM detects two TM helices; so does Octopus. Spoctopus and Phobius identify one of these as a signal sequence instead.
  +
  +
Signal peptide prediction:
  +
Phobius, Spoctopus and SignalP find a signal sequence with the cleavage site at 28-29. This is correct, according to Uniprot.
  +
  +
Target prediction:
  +
TargetP predicted that this protein carries a signal peptide for the secretory pathway; since the protein is membrane-located, this is not correct.
  +
  +
====A4_HUMAN====
  +
  +
TM Prediction:
  +
All programs predict one TM helix from 701 to 723 (ending at 722 in the case of Octopus and Spoctopus). Octopus additionally predicts one short reentrant region, which is a mis-identified signal sequence.
  +
  +
Signal peptide prediction:
  +
Spoctopus, SignalP and Phobius all report a signal sequence with the cleavage site at 17-19. According to UniProt, this is correct.
  +
  +
Target prediction:
  +
TargetP predicted that this protein carries a signal peptide and enters the secretory pathway; this is wrong, as the protein is membrane-associated.
   
 
==Prediction of GO terms==
 
   
 
===GOPET===
 
  +
...
 
  +
GOPET is a tool aimed at automatically assigning Gene Ontology terms to proteins. It was published in 2006 by Arunachalam Vinayagam, Coral del Val, Falk Schubert, Roland Eils, Karl-Heinz Glatting, Sándor Suhai and Rainer König.
  +
  +
Reference: [http://www.biomedcentral.com/1471-2105/7/161 Paper]
  +
  +
The input sequence is first BLASTed against a database of proteins with known GO terms; a support vector machine is then used to discriminate between correct and false terms.
  +
  +
Results for Aspartoacylase, all of which agree with current knowledge of the enzyme:
  +
  +
{| border="1" style="text-align:center; border-spacing:0;"
  +
|-
  +
! GOid
  +
! Aspect
  +
! Confidence
  +
! GO Term
  +
|-
  +
| GO:0016787 || F || 96% || hydrolase activity
  +
|-
  +
| GO:0004046 || F || 82% || aminoacylase activity
  +
|-
  +
| GO:0019807 || F || 82% || aspartoacylase activity
  +
|-
  +
| GO:0016788 || F || 81% || hydrolase activity, acting on ester bonds
  +
|-
  +
|}
  +
  +
  +
Other proteins:
  +
  +
[[File:Aspa other goped.gif]]
   
 
===Pfam===
 
  +
...
 
  +
Pfam is a large database of protein families. It was established in 1998 at the Wellcome Trust Sanger Institute.
  +
  +
It comprises two databases: Pfam-A, a manually curated, high-quality database with a limited number of entries, and the much larger, automatically generated Pfam-B.
  +
  +
Reference: The Pfam protein families database: R.D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J.E. Pollington, O.L. Gavin, P. Gunasekaran, G. Ceric, K. Forslund, L. Holm, E.L. Sonnhammer, S.R. Eddy, A. Bateman
  +
  +
The result for Aspartoacylase is spot-on:
  +
  +
[[File:Aspa pfam significant.png]]
  +
  +
  +
====BACR_HALSA====
  +
[[File:Aspa bacr.png]]
  +
  +
====RET4_HUMAN====
  +
[[File:Aspa ret4.png]]
  +
  +
====INSL5_HUMAN====
  +
[[File:Aspa insl.png]]
  +
  +
====LAMP1_HUMAN====
  +
[[File:Aspa lamp1.png]]
  +
  +
====A4_HUMAN====
  +
[[File:Aspa a4.png]]
   
 
===ProtFun 2.2===
 
  +
...
 
  +
ProtFun is a program for ab initio protein function prediction. It was published in 2002 by Lars Juhl Jensen et al.
  +
  +
Reference: [http://www.cbs.dtu.dk/services/ProtFun/abstract.php Paper Abstract]
  +
  +
The software queries a number of existing prediction servers for a wide range of features, from isoelectric point to posttranslational modifications, and deduces the protein's function from these data.
  +
  +
Results for Aspartoacylase:
  +
  +
<pre>############## ProtFun 2.2 predictions ##############
  +
  +
>sp_P45381_A
  +
  +
# Functional category Prob Odds
  +
Amino_acid_biosynthesis 0.071 3.233
  +
Biosynthesis_of_cofactors 0.144 2.003
  +
Cell_envelope 0.033 0.535
  +
Cellular_processes 0.137 1.875
  +
Central_intermediary_metabolism => 0.334 5.309
  +
Energy_metabolism 0.226 2.511
  +
Fatty_acid_metabolism 0.022 1.663
  +
Purines_and_pyrimidines 0.367 1.512
  +
Regulatory_functions 0.021 0.128
  +
Replication_and_transcription 0.167 0.625
  +
Translation 0.113 2.559
  +
Transport_and_binding 0.017 0.042
  +
  +
# Enzyme/nonenzyme Prob Odds
  +
Enzyme => 0.703 2.454
  +
Nonenzyme 0.297 0.416
  +
  +
# Enzyme class Prob Odds
  +
Oxidoreductase (EC 1.-.-.-) 0.111 0.534
  +
Transferase (EC 2.-.-.-) 0.202 0.585
  +
Hydrolase (EC 3.-.-.-) 0.115 0.363
  +
Lyase (EC 4.-.-.-) 0.031 0.662
  +
Isomerase (EC 5.-.-.-) => 0.084 2.637
  +
Ligase (EC 6.-.-.-) 0.074 1.460
  +
  +
# Gene Ontology category Prob Odds
  +
Signal_transducer 0.053 0.246
  +
Receptor 0.004 0.024
  +
Hormone 0.001 0.206
  +
Structural_protein 0.001 0.041
  +
Transporter 0.025 0.230
  +
Ion_channel 0.015 0.257
  +
Voltage-gated_ion_channel 0.004 0.173
  +
Cation_channel 0.011 0.234
  +
Transcription 0.100 0.785
  +
Transcription_regulation 0.039 0.313
  +
Stress_response 0.010 0.117
  +
Immune_response 0.061 0.720
  +
Growth_factor 0.006 0.450
  +
Metal_ion_transport 0.009 0.020
  +
</pre>
  +
  +
  +
Other proteins:
  +
<pre>############## ProtFun 2.2 predictions ##############
  +
  +
>LAMP1_HUMAN
  +
  +
# Functional category Prob Odds
  +
Amino_acid_biosynthesis 0.011 0.484
  +
Biosynthesis_of_cofactors 0.053 0.735
  +
Cell_envelope => 0.804 13.186
  +
Cellular_processes 0.027 0.373
  +
Central_intermediary_metabolism 0.138 2.188
  +
Energy_metabolism 0.037 0.411
  +
Fatty_acid_metabolism 0.016 1.265
  +
Purines_and_pyrimidines 0.533 2.195
  +
Regulatory_functions 0.015 0.090
  +
Replication_and_transcription 0.019 0.073
  +
Translation 0.027 0.613
  +
Transport_and_binding 0.834 2.033
  +
  +
# Enzyme/nonenzyme Prob Odds
  +
Enzyme 0.276 0.965
  +
Nonenzyme => 0.724 1.014
  +
  +
# Enzyme class Prob Odds
  +
Oxidoreductase (EC 1.-.-.-) 0.039 0.187
  +
Transferase (EC 2.-.-.-) 0.046 0.134
  +
Hydrolase (EC 3.-.-.-) 0.058 0.184
  +
Lyase (EC 4.-.-.-) 0.020 0.430
  +
Isomerase (EC 5.-.-.-) 0.010 0.321
  +
Ligase (EC 6.-.-.-) 0.017 0.326
  +
  +
# Gene Ontology category Prob Odds
  +
Signal_transducer 0.396 1.849
  +
Receptor 0.282 1.659
  +
Hormone 0.001 0.206
  +
Structural_protein 0.011 0.408
  +
Transporter 0.024 0.222
  +
Ion_channel 0.008 0.147
  +
Voltage-gated_ion_channel 0.002 0.111
  +
Cation_channel 0.010 0.215
  +
Transcription 0.032 0.247
  +
Transcription_regulation 0.018 0.142
  +
Stress_response 0.246 2.795
  +
Immune_response => 0.371 4.368
  +
Growth_factor 0.013 0.956
  +
Metal_ion_transport 0.009 0.020
  +
  +
//
  +
  +
  +
>RET4_HUMAN
  +
  +
# Functional category Prob Odds
  +
Amino_acid_biosynthesis 0.017 0.751
  +
Biosynthesis_of_cofactors 0.044 0.610
  +
Cell_envelope => 0.804 13.186
  +
Cellular_processes 0.075 1.021
  +
Central_intermediary_metabolism 0.197 3.128
  +
Energy_metabolism 0.043 0.475
  +
Fatty_acid_metabolism 0.016 1.265
  +
Purines_and_pyrimidines 0.275 1.131
  +
Regulatory_functions 0.013 0.080
  +
Replication_and_transcription 0.022 0.084
  +
Translation 0.032 0.721
  +
Transport_and_binding 0.800 1.951
  +
  +
# Enzyme/nonenzyme Prob Odds
  +
Enzyme => 0.544 1.900
  +
Nonenzyme 0.456 0.639
  +
  +
# Enzyme class Prob Odds
  +
Oxidoreductase (EC 1.-.-.-) 0.095 0.458
  +
Transferase (EC 2.-.-.-) 0.038 0.109
  +
Hydrolase (EC 3.-.-.-) 0.235 0.742
  +
Lyase (EC 4.-.-.-) => 0.059 1.264
  +
Isomerase (EC 5.-.-.-) 0.010 0.321
  +
Ligase (EC 6.-.-.-) 0.017 0.326
  +
  +
# Gene Ontology category Prob Odds
  +
Signal_transducer 0.202 0.942
  +
Receptor 0.147 0.862
  +
Hormone 0.004 0.667
  +
Structural_protein 0.002 0.058
  +
Transporter 0.025 0.232
  +
Ion_channel 0.016 0.288
  +
Voltage-gated_ion_channel 0.003 0.148
  +
Cation_channel 0.010 0.215
  +
Transcription 0.027 0.207
  +
Transcription_regulation 0.025 0.196
  +
Stress_response 0.161 1.829
  +
Immune_response => 0.239 2.813
  +
Growth_factor 0.023 1.617
  +
Metal_ion_transport 0.009 0.020
  +
  +
//
  +
  +
  +
>BACR_HALSA
  +
  +
# Functional category Prob Odds
  +
Amino_acid_biosynthesis 0.033 1.495
  +
Biosynthesis_of_cofactors 0.186 2.589
  +
Cell_envelope 0.029 0.483
  +
Cellular_processes 0.051 0.694
  +
Central_intermediary_metabolism 0.045 0.711
  +
Energy_metabolism 0.138 1.537
  +
Fatty_acid_metabolism 0.016 1.265
  +
Purines_and_pyrimidines 0.302 1.244
  +
Regulatory_functions 0.013 0.080
  +
Replication_and_transcription 0.019 0.073
  +
Translation 0.059 1.339
  +
Transport_and_binding => 0.791 1.929
  +
  +
# Enzyme/nonenzyme Prob Odds
  +
Enzyme 0.199 0.696
  +
Nonenzyme => 0.801 1.122
  +
  +
# Enzyme class Prob Odds
  +
Oxidoreductase (EC 1.-.-.-) 0.114 0.549
  +
Transferase (EC 2.-.-.-) 0.031 0.091
  +
Hydrolase (EC 3.-.-.-) 0.057 0.180
  +
Lyase (EC 4.-.-.-) 0.020 0.430
  +
Isomerase (EC 5.-.-.-) 0.010 0.321
  +
Ligase (EC 6.-.-.-) 0.017 0.326
  +
  +
# Gene Ontology category Prob Odds
  +
Signal_transducer 0.258 1.205
  +
Receptor 0.355 2.087
  +
Hormone 0.001 0.206
  +
Structural_protein 0.006 0.200
  +
Transporter => 0.440 4.036
  +
Ion_channel 0.010 0.169
  +
Voltage-gated_ion_channel 0.004 0.172
  +
Cation_channel 0.078 1.689
  +
Transcription 0.026 0.205
  +
Transcription_regulation 0.028 0.226
  +
Stress_response 0.012 0.139
  +
Immune_response 0.011 0.128
  +
Growth_factor 0.010 0.727
  +
Metal_ion_transport 0.049 0.106
  +
  +
//
  +
  +
  +
>INSL5_HUMAN
  +
  +
# Functional category Prob Odds
  +
Amino_acid_biosynthesis 0.011 0.484
  +
Biosynthesis_of_cofactors 0.040 0.558
  +
Cell_envelope => 0.756 12.393
  +
Cellular_processes 0.033 0.448
  +
Central_intermediary_metabolism 0.048 0.755
  +
Energy_metabolism 0.036 0.397
  +
Fatty_acid_metabolism 0.016 1.265
  +
Purines_and_pyrimidines 0.144 0.592
  +
Regulatory_functions 0.014 0.087
  +
Replication_and_transcription 0.020 0.075
  +
Translation 0.032 0.735
  +
Transport_and_binding 0.834 2.033
  +
  +
# Enzyme/nonenzyme Prob Odds
  +
Enzyme 0.209 0.729
  +
Nonenzyme => 0.791 1.109
  +
  +
# Enzyme class Prob Odds
  +
Oxidoreductase (EC 1.-.-.-) 0.056 0.268
  +
Transferase (EC 2.-.-.-) 0.031 0.091
  +
Hydrolase (EC 3.-.-.-) 0.062 0.195
  +
Lyase (EC 4.-.-.-) 0.020 0.430
  +
Isomerase (EC 5.-.-.-) 0.010 0.321
  +
Ligase (EC 6.-.-.-) 0.017 0.327
  +
  +
# Gene Ontology category Prob Odds
  +
Signal_transducer 0.374 1.746
  +
Receptor 0.128 0.750
  +
Hormone => 0.247 37.936
  +
Structural_protein 0.001 0.041
  +
Transporter 0.025 0.228
  +
Ion_channel 0.010 0.168
  +
Voltage-gated_ion_channel 0.003 0.131
  +
Cation_channel 0.010 0.215
  +
Transcription 0.054 0.425
  +
Transcription_regulation 0.091 0.724
  +
Stress_response 0.099 1.128
  +
Immune_response 0.178 2.090
  +
Growth_factor 0.061 4.379
  +
Metal_ion_transport 0.009 0.020
  +
  +
//
  +
  +
  +
>A4_HUMAN
  +
  +
# Functional category Prob Odds
  +
Amino_acid_biosynthesis 0.020 0.921
  +
Biosynthesis_of_cofactors 0.261 3.623
  +
Cell_envelope => 0.804 13.186
  +
Cellular_processes 0.053 0.730
  +
Central_intermediary_metabolism 0.184 2.920
  +
Energy_metabolism 0.023 0.259
  +
Fatty_acid_metabolism 0.016 1.265
  +
Purines_and_pyrimidines 0.417 1.716
  +
Regulatory_functions 0.013 0.084
  +
Replication_and_transcription 0.029 0.109
  +
Translation 0.027 0.613
  +
Transport_and_binding 0.827 2.016
  +
  +
# Enzyme/nonenzyme Prob Odds
  +
Enzyme => 0.392 1.368
  +
Nonenzyme 0.608 0.852
  +
  +
# Enzyme class Prob Odds
  +
Oxidoreductase (EC 1.-.-.-) 0.024 0.114
  +
Transferase (EC 2.-.-.-) 0.208 0.603
  +
Hydrolase (EC 3.-.-.-) 0.190 0.600
  +
Lyase (EC 4.-.-.-) 0.020 0.430
  +
Isomerase (EC 5.-.-.-) 0.010 0.324
  +
Ligase (EC 6.-.-.-) 0.048 0.946
  +
  +
# Gene Ontology category Prob Odds
  +
Signal_transducer 0.126 0.586
  +
Receptor 0.036 0.211
  +
Hormone 0.001 0.206
  +
Structural_protein => 0.034 1.205
  +
Transporter 0.024 0.222
  +
Ion_channel 0.009 0.162
  +
Voltage-gated_ion_channel 0.002 0.108
  +
Cation_channel 0.010 0.215
  +
Transcription 0.043 0.335
  +
Transcription_regulation 0.018 0.143
  +
Stress_response 0.076 0.862
  +
Immune_response 0.016 0.183
  +
Growth_factor 0.005 0.372
  +
Metal_ion_transport 0.009 0.020
  +
  +
//</pre>
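
The ProtFun listings above are easier to compare across proteins once the arrow-marked category of each section has been extracted. The following is a minimal sketch, assuming the plain-text layout shown above (section headers ending in "Prob Odds", data rows ending in two numbers, "=>" marking the chosen row). The function names <code>parse_protfun</code> and <code>selected</code> are our own and not part of ProtFun.

```python
def parse_protfun(text):
    """Parse one ProtFun entry into {section: [(name, prob, odds, chosen), ...]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, the ">identifier" line and the "//" terminator.
        if not line or line.startswith((">", "//")):
            continue
        if line.startswith("#"):
            # Section headers look like "# Enzyme class Prob Odds";
            # the banner line made of '#' characters is ignored.
            words = line.strip("# ").split()
            if words[-2:] == ["Prob", "Odds"]:
                current = " ".join(words[:-2])
                sections[current] = []
            continue
        tokens = line.split()
        if current is None or len(tokens) < 3:
            continue
        try:
            prob, odds = float(tokens[-2]), float(tokens[-1])
        except ValueError:
            continue  # not a data row
        name = " ".join(t for t in tokens[:-2] if t != "=>")
        sections[current].append((name, prob, odds, "=>" in tokens))
    return sections


def selected(sections):
    """Return the arrow-marked category of every section."""
    return {sec: name
            for sec, rows in sections.items()
            for name, _, _, chosen in rows if chosen}


sample = """# Enzyme/nonenzyme Prob Odds
Enzyme => 0.703 2.454
Nonenzyme 0.297 0.416
# Enzyme class Prob Odds
Oxidoreductase (EC 1.-.-.-) 0.111 0.534
Isomerase (EC 5.-.-.-) => 0.084 2.637
"""
print(selected(parse_protfun(sample)))
```

For a listing with several proteins, the text can first be split at the "//" separators and each part parsed on its own.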

Latest revision as of 03:12, 14 June 2011

Prediction of Secondary Structure Elements

PsiPred

For a description of PsiPred, see Psipred.

PsiPred results for Aspartoacylase
# PSIPRED HFORMAT (PSIPRED V3.0)

Conf: 987522213466199993246776008999999984450000587389976339987971
Pred: CCCCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCCCCEEEEEECCHHHHHH
  AA: MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK
              10        20        30        40        50        60


Conf: 998788998878786647999999984999999999988199999997428994187898
Pred: CCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCEEEECCCCCC
  AA: CTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTS
              70        80        90       100       110       120


Conf: 999505864599448999999998762999737862048886301220027861499667
Pred: CCCCEEEEECCCCHHHHHHHHHHHHHCCCCCEEEEECCCCCCCCHHHHCCCCCCEEEEEC
  AA: NMGCTLILEDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVG
             130       140       150       160       170       180


Conf: 877898808999999999999998976406998899973479998113515579877700
Pred: CCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEEECCCCCCCCCE
  AA: PQPQGVLRADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIA
             190       200       210       220       230       240


Conf: 552467669998546888832213699778518622057770372000011102000100
Pred: EEECCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCCEEEEECCCCHHCCCCHHHEECCE
  AA: AIIHPNLQDQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK
             250       260       270       280       290       300


Conf: 3544256113309
Pred: EEEEECCCEEECC
  AA: LTLNAKSIRCCLH
             310 
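
To work with this output programmatically, the 60-residue blocks have to be joined into one string per track. The sketch below assumes the HFORMAT layout shown above (lines starting with "Conf:", "Pred:" and "AA:"); the helper names <code>parse_hformat</code> and <code>segments</code> are hypothetical and not part of PsiPred.

```python
from itertools import groupby


def parse_hformat(text):
    """Join the Conf/Pred/AA blocks of a PSIPRED HFORMAT file into single strings."""
    tracks = {"Conf": [], "Pred": [], "AA": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Header and ruler lines contain no known "key:" prefix and are skipped.
        if sep and key in tracks:
            tracks[key].append(value.strip())
    return {key: "".join(parts) for key, parts in tracks.items()}


def segments(pred):
    """Run-length encode a prediction string into (state, start, end), 1-based."""
    result = []
    position = 1
    for state, run in groupby(pred):
        length = len(list(run))
        result.append((state, position, position + length - 1))
        position += length
    return result


example = """# PSIPRED HFORMAT (PSIPRED V3.0)

Conf: 987522
Pred: CCEEEH
  AA: MTSCHI
              10
"""
tracks = parse_hformat(example)
print(segments(tracks["Pred"]))
```

The segment list makes it straightforward to read off the start and end of each predicted helix or strand.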

JPred3

JPred was originally published in 1998; the current version, JPred3, was published in 2008 by Christian Cole, Jonathan D. Barber and Geoffrey J. Barton.

Reference: Original paper, current version

JPred3 uses the JNet 2.0 algorithm to make its predictions. This algorithm generates profiles using PSI-Blast (which builds a position-specific scoring matrix) and HMMer (which constructs HMM profiles). Both the position-specific scoring matrix and the HMM profiles are used to predict secondary structure and solvent accessibility.

Input: A protein sequence or a pre-made MSA; a PDB database is also needed, but it is provided by the JPred3 server.

MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYID
------------EEEEEEEE------HHHHHHHHHH---------EEEEEEEE-HHHHHH-----H



CDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLILEDSR
---------------------HHHHHHHHHHHHHH-------EEEEEE-----------EEEE---



NNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLRADILDQMRKM
-HHHHHHHHHHHH------EEEEEE---------HHEE----EEEEE---------HHHHHHHHHH



IKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQDQDWKPLHPGDPMFLT
HHHHHHHHHHH----------EEEEEEEEEE----------EEEE----------------HHE--



LDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSIRCCLH
----EEEE----EEEEEEE-----HHH-HHHHHHHHEEE-----EEEE-

DSSP

DSSP (Define Secondary Structure of Proteins) is a program for secondary structure assignment; it was published in 1983 by Wolfgang Kabsch and Chris Sander. Reference: Original paper

DSSP does not predict secondary structure from amino acid sequences; instead, it assigns secondary structure from a 3D structure (a PDB file). To this end, DSSP examines the phi and psi angles and the C alpha positions in the protein backbone as well as the H-bonds present in the structure. The H-bonds are used to define "n-turns", which are H-bonds between the CO and NH groups of amino acids with sequence separations of 3-5 residues, and "bridges", which are H-bonds between residues with greater sequence separations. Repeating 4-turns identify alpha helices; repeating bridges identify beta sheets.
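
The step from turns to helices can be sketched in a few lines. This is a simplified illustration, not the DSSP implementation: it assumes the H-bonds have already been detected and are given as a set of index pairs (i, j), meaning that the CO group of residue i is bonded to the NH group of residue j (0-based). It follows the minimal-helix rule of Kabsch and Sander, in which two consecutive n-turns starting at residues i-1 and i make residues i to i+n-1 helical. The function name <code>helix_from_turns</code> is our own.

```python
def helix_from_turns(hbonds, length, n=4):
    """Mark helical residues from consecutive n-turns.

    hbonds: set of (i, j) pairs, CO of residue i bonded to NH of residue j.
    length: number of residues in the chain.
    n:      turn size (4 for alpha helices).
    """
    # An n-turn starts at residue i if there is an H-bond from i to i + n.
    turn = [(i, i + n) in hbonds for i in range(length)]
    helix = [False] * length
    for i in range(1, length):
        # Two consecutive turns at i-1 and i define a minimal helix i .. i+n-1.
        if turn[i - 1] and turn[i]:
            for k in range(i, min(i + n, length)):
                helix[k] = True
    return helix


# Three consecutive 4-turns in a hypothetical 10-residue chain.
bonds = {(0, 4), (1, 5), (2, 6)}
print("".join("H" if h else "-" for h in helix_from_turns(bonds, 10)))
```

A single isolated turn does not produce a helix under this rule, which is why a lone H-bond shows up as "T" rather than "H" in the output legend.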

Input: A 3D structure (a PDB file, ID 2o53 in our case)

Output: (from [1])

    H = alpha helix
    B = residue in isolated beta-bridge
    E = extended strand, participates in beta ladder
    G = 3-helix (3/10 helix)
    I = 5 helix (pi helix)
    T = hydrogen bonded turn
    S = bend 

The results differ from those of the two secondary structure predictors because the PDB file contains a dimer, whereas the UniProt sequence contains only one chain (which is sensible, since both chains are essentially identical).

The assignment shows slight differences between the two chains; we assume that these arise from slight differences in the actual 3D structures of the two chains as well as from H-bonds between them.

                     10        20        30        40        50        60
                      |         |         |         |         |         |
    1 -   60 EHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYIDCD
    1 -   60     SSSSSS TTTT HHHHHHHHHHTT  333  TT SSSSSST HHHHHTTTT TTT
    1 -   60
    1 -   60 AA AA               A  AA AAA AA AAAA A A     AA  AAAA AAAA
                     70        80        90       100       110       120
                      |         |         |         |         |         |
   61 -  120 LNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLIL
   61 -  120 333  THHHHTT   TTT HHHHHHHHHHHHH  TTTTTT TSSSSSSS TTT SSSSSS
   61 -  120       **  * * **            **   * *
   61 -  120   A  AAA  AAAAAAAAA   A   A  AA  AAA AA             A
                    130       140       150       160       170       180
                      |         |         |         |         |         |
  121 -  180 EDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLR
  121 -  180 T TT HHHHHHHHHHHHHHTTT SSSSS   TT     3333TTSSSSSSSST  TT
  121 -  180              *  ** **
  121 -  180  A A A      AA AAA AAAA A    AAAAAA        AA       A  A   A
                    190       200       210       220       230       240
                      |         |         |         |         |         |
  181 -  240 ADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQ
  181 -  240 HHHHHHHHHHHHHHHHHHHHHHTT     SSSSSSSSSSSS     TTT   TSS TTTT
  181 -  240
  181 -  240  A  AA AA  AA     AA  AAAA AA A A  A AAA A AAAAA A      AA
                    250       260       270       280       290       300
                      |         |         |         |         |         |
  241 -  300 DQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSI
  241 -  300 T TTT   TTTSSSS TT  SSS  TTT  SSSTTT 333TTTT TSSSSSSSSSSS
  241 -  300
  241 -  300 AA  AA AAAAA  A  AAAAAA AAAAA A          AAA    A  AAA A A


  301 -  302 RC
  301 -  302
  301 -  302
  301 -  302 AA
 
                  310       320       330       340       350       360
                    |         |         |         |         |         |
  303 -  362 EHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYIDCD
  303 -  362     SSSSSS TTTT HHHHHHHHHHHH 3333  TT SSSSSST HHHHHTT T TTT
  303 -  362
  303 -  362 AA AA                  AA AAA AA AAAA A A A   AA  AAAA AAAA
                  370       380       390       400       410       420
                    |         |         |         |         |         |
  363 -  422 LNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLIL
  363 -  422 333  THHHHTT   TTT HHHHHHHHHHHHH  TTTTTT TSSSSSSS TTT SSSSSS
  363 -  422       **  ***  *            **    ****
  363 -  422   A  AAA  AAAAAAAAA   A   A  AA  AAA AA             A
                  430       440       450       460       470       480
                    |         |         |         |         |         |
  423 -  482 EDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLR
  423 -  482 T TT HHHHHHHHHHHHHHTTT SSSSS  TTTT    3333TTSSSSSSSS   TT
  423 -  482
  423 -  482  A A A      AA AA  AAAA A    AAAAAA        AA       A      A
                  490       500       510       520       530       540
                    |         |         |         |         |         |
  483 -  542 ADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQ
  483 -  542 HHHHHHHHHHHHHHHHHHHHHHTT     SSSSSSSSSSSS     TTT   TSS TTTT
  483 -  542                             *** *
  483 -  542  A  AA AA  A      AA  A AA AA A A  A AAA A AAAAA A A    AA
                  550       560       570       580       590       600
                    |         |         |         |         |         |
  543 -  602 DQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSI
  543 -  602 T TTT   TTTSSSS TT  SSS  TTT  SSSTTT THHHHTT TSSSSSSSSSSS
  543 -  602                                                        *
  543 -  602 AA  AA AAAAA  A  AAAAAA AAAAA A          AAA    A AAAA A AA


  603 -  604 RC
  603 -  604
  603 -  604
  603 -  604 AA

Clearly solvent accessible: A; involved in symmetry contacts: *

All in all, the two prediction methods PsiPred and JPred3 did a good job: they predicted most of the main secondary structure elements, with only minor variations in the length and position of the individual helices and strands, and they differ only slightly from each other. A somewhat more detailed result from DSSP is to be expected, as it has considerably better information to work with and merely assigns the secondary structure instead of predicting it.
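
The agreement between the two predictors can be quantified as the fraction of residues that receive the same state. The sketch below does this for the first 60 residues of the PsiPred and JPred3 outputs shown above, written in run-length form to keep the strings short. The function names are our own, and mapping the JPred3 coil symbol "-" to the PsiPred symbol "C" is an assumption about how the two alphabets correspond.

```python
def normalise(ss):
    """Map the JPred3 coil symbol '-' to the PsiPred coil symbol 'C'."""
    return ss.replace("-", "C")


def agreement(a, b):
    """Fraction of positions at which two secondary structure strings agree."""
    a, b = normalise(a), normalise(b)
    if len(a) != len(b) or not a:
        raise ValueError("strings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Residues 1-60 of the two predictions above, as runs of identical states.
psipred = "C" * 12 + "E" * 6 + "C" * 6 + "H" * 12 + "C" * 10 + "E" * 6 + "C" * 2 + "H" * 6
jpred = "-" * 12 + "E" * 8 + "-" * 6 + "H" * 10 + "-" * 9 + "E" * 8 + "-" * 1 + "H" * 6

print(f"{agreement(psipred, jpred):.2f}")
```

For these 60 residues the two predictions agree at 54 positions (0.90); the six disagreements all lie at the ends of helices and strands.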

Prediction of disordered regions

DISOPRED

DISOPRED predicts native disorder in proteins. It was published in 2004 by Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF and Jones DT. Reference: [2]

DISOPRED uses linear support vector machines to predict disorder in a given protein sequence. A set of 750 proteins with high-quality structures was used as training data; to this end, PSI-Blast profiles were generated by aligning the training structures against a filtered database of protein structures. The resulting profiles were used to train the SVMs.

DISOPRED result graph for Aspartoacylase
DISOPRED predictions for a false positive rate threshold of: 2%

conf: 999999999877640000000000000000000000000000000000000000000000
pred: **********..................................................
  AA: MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK
              10        20        30        40        50        60

conf: 000000000000000356777788777654200000000000000000000000000000
pred: ......................**....................................
  AA: CTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTS
              70        80        90       100       110       120

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: NMGCTLILEDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVG
             130       140       150       160       170       180

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: PQPQGVLRADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIA
             190       200       210       220       230       240

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: AIIHPNLQDQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK
             250       260       270       280       290       300

conf: 0000000000002
pred: .............
  AA: LTLNAKSIRCCLH
             310

Asterisks (*) represent disorder predictions and dots (.) 
prediction of order. The confidence estimates give a rough
indication of the probability that each residue is disordered.
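
The disordered stretches can be read off the "pred" track automatically. The following minimal sketch assumes that the "pred" lines of all blocks have been concatenated into one string; the function name <code>disordered_regions</code> is our own, and the example string is a made-up toy input rather than the prediction above.

```python
import re


def disordered_regions(pred):
    """Return 1-based, inclusive (start, end) ranges of '*' runs in a pred string."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"\*+", pred)]


# Toy input: two disordered stretches separated by ordered residues.
print(disordered_regions("**....***."))
```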

POODLE

POODLE (Prediction Of Order and Disorder by machine LEarning) is a series of programs published between 2005 and 2008. We used the latest variant, POODLE-I, which was published in 2008 by S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi.

Reference: S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi, "Disordered region prediction by integrating POODLE series", CASP8 Proceedings 2008, 14-15.

Input: Protein amino acid sequence

POODLE-I integrates the other flavors of POODLE (-S and -L for short and long regions of disorder, and -W for proteins that are mostly disordered) with several other tools such as PsiPred and JNet. It employs a rather involved workflow.

Custom-formatted output for Aspartoacylase:

Aspa disopred.png


POS 1    M      T      S      C      H      I      A      E      E      H      I      
         -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
         0.461  0.444  0.413  0.401  0.418  0.461  0.537  0.644  0.693  0.62   0.468  


POS 12    Q      K      V      A      I      F      G      G      T      H      G      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.321  0.238  0.177  0.146  0.128  0.116  0.106  0.104  0.111  0.126  0.132  


POS 23    N      E      L      T      G      V      F      L      V      K      H      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.131  0.118  0.098  0.073  0.053  0.041  0.036  0.035  0.035  0.036  0.036  


POS 34    W      L      E      N      G      A      E      I      Q      R      T      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.038  0.045  0.06   0.081  0.099  0.119  0.133  0.146  0.147  0.143  0.129  


POS 45    G      L      E      V      K      P      F      I      T      N      P      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.111  0.09   0.073  0.062  0.054  0.047  0.039  0.033  0.033  0.037  0.041  


POS 56    R      A      V      K      K      C      T      R      Y      I      D      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.043  0.047  0.054  0.062  0.068  0.071  0.073  0.07   0.067  0.069  0.075  


POS 67    C      D      L      N      R      I      F      D      L      E      N      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.08   0.081  0.078  0.075  0.073  0.072  0.076  0.094  0.127  0.176  0.249  


POS 78    L      G      K      K      M      S      E      D      L      P      Y      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.403  0.554  0.737  0.766  0.804  0.755  0.682  0.65   0.632  0.636  0.583  


POS 89    E      V      R      R      A      Q      E      I      N      H      L      
          -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
          0.505  0.448  0.348  0.262  0.201  0.16   0.131  0.11   0.103  0.104  0.111  


POS 100    F      G      P      K      D      S      E      D      S      Y      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.116  0.117  0.108  0.089  0.067  0.049  0.039  0.034  0.033  0.035  


POS 110    D      I      I      F      D      L      H      N      T      T      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.038  0.041  0.043  0.043  0.042  0.041  0.042  0.045  0.052  0.06   


POS 120    S      N      M      G      C      T      L      I      L      E      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.07   0.081  0.092  0.101  0.109  0.111  0.107  0.096  0.085  0.072  


POS 130    D      S      R      N      N      F      L      I      Q      M      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.06   0.051  0.046  0.04   0.036  0.033  0.032  0.031  0.031  0.031  


POS 140    F      H      Y      I      K      T      S      L      A      P      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.033  0.036  0.04   0.043  0.046  0.049  0.05   0.053  0.055  0.059  


POS 150    L      P      C      Y      V      Y      L      I      E      H      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.065  0.073  0.088  0.103  0.115  0.119  0.118  0.111  0.104  0.104  


POS 160    P      S      L      K      Y      A      T      T      R      S      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.121  0.147  0.19   0.229  0.264  0.263  0.245  0.196  0.149  0.098  


POS 170    I      A      K      Y      P      V      G      I      E      V      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.069  0.053  0.051  0.057  0.066  0.08   0.093  0.102  0.103  0.099  


POS 180    G      P      Q      P      Q      G      V      L      R      A      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.096  0.094  0.095  0.095  0.099  0.098  0.099  0.095  0.094  0.086  


POS 190    D      I      L      D      Q      M      R      K      M      I      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.074  0.059  0.048  0.04   0.038  0.038  0.038  0.039  0.04   0.043  


POS 200    K      H      A      L      D      F      I      H      H      F      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.046  0.049  0.051  0.052  0.056  0.064  0.077  0.092  0.112  0.142  


POS 210    N      E      G      K      E      F      P      P      C      A      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.17   0.198  0.21   0.311  0.281  0.248  0.105  0.084  0.072  0.071  


POS 220    I      E      V      Y      K      I      I      E      K      V      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.069  0.065  0.06   0.056  0.052  0.054  0.062  0.076  0.105  0.141  


POS 230    D      Y      P      R      D      E      N      G      E      I      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.176  0.203  0.224  0.227  0.217  0.209  0.228  0.248  0.271  0.282  


POS 240    A      A      I      I      H      P      N      L      Q      D      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.289  0.269  0.24   0.208  0.188  0.169  0.155  0.152  0.167  0.193  


POS 250    Q      D      W      K      P      L      H      P      G      D      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.222  0.236  0.235  0.21   0.175  0.136  0.11   0.097  0.099  0.104  


POS 260    P      M      F      L      T      L      D      G      K      T      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.107  0.108  0.104  0.095  0.084  0.077  0.073  0.082  0.102  0.125  


POS 270    I      P      L      G      G      D      C      T      V      Y      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.144  0.162  0.169  0.166  0.156  0.149  0.133  0.117  0.099  0.089  


POS 280    P      V      F      V      N      E      A      A      Y      Y      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.077  0.072  0.067  0.064  0.066  0.082  0.122  0.184  0.241  0.279  


POS 290    E      K      K      E      A      F      A      K      T      T      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.282  0.264  0.236  0.229  0.238  0.257  0.263  0.252  0.231  0.222  


POS 300    K      L      T      L      N      A      K      S      I      R      
           -1     -1     -1     -1     -1     -1     -1     -1     -1     -1     
           0.246  0.278  0.305  0.31   0.303  0.277  0.263  0.371  0.382  0.38   


POS 310    C      C      L      H      
           -1     -1     -1     -1     
           0.348  0.51   0.496  0.489  

IUPRED

IUPRED is a program for the prediction of intrinsically unstructured regions in proteins. It was published in 2005 by Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon.

Reference: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon, Bioinformatics (2005) 21, 3433-3434.

IUPRED predicts disordered regions by estimating the capacity of the amino acid chain to form stabilizing contacts. The underlying assumption is that proteins intrinsically unable to do so have distinct sequences that can be identified via their unfavorable energy values. To this end a 20x20 predictor matrix was calculated from a set of globular proteins with known structure. IUPRED uses this matrix to derive a tendency to be intrinsically unstructured from the amino acid composition alone.

Input: An amino acid sequence.

IUPRED offers three modes: Long Disorder, which specializes in finding long stretches of disorder; Short Disorder, which does the same for short stretches of disorder; and Structured Regions, which predicts regions lacking disorder.

Long Disorder

Aspa iupred1.png

POS 1    M      T      S      C      H      I      A      E      E      H      I      
         0.3215  0.3426  0.2817  0.2783  0.2064  0.1275  0.1554  0.1823  0.2094  0.2364  0.2575  


POS 12    Q      K      V      A      I      F      G      G      T      H      G      
          0.2988  0.3087  0.2364  0.3215  0.3149  0.3321  0.2609  0.1823  0.1275  0.1206  0.1759  


POS 23    N      E      L      T      G      V      F      L      V      K      H      
          0.1028  0.0676  0.1070  0.1298  0.1881  0.2575  0.2715  0.1969  0.2034  0.2034  0.2064  


POS 34    W      L      E      N      G      A      E      I      Q      R      T      
          0.1942  0.1206  0.1914  0.1399  0.1373  0.2064  0.2002  0.1969  0.2541  0.2715  0.2951  


POS 45    G      L      E      V      K      P      F      I      T      N      P      
          0.3840  0.4256  0.3460  0.3321  0.3286  0.2609  0.2503  0.3249  0.2292  0.1583  0.1611  


POS 56    R      A      V      K      K      C      T      R      Y      I      D      
          0.0985  0.1554  0.0929  0.1373  0.1424  0.0765  0.0749  0.1229  0.0749  0.0780  0.0719  


POS 67    C      D      L      N      R      I      F      D      L      E      N      
          0.0424  0.0506  0.0734  0.0734  0.0605  0.1048  0.1115  0.1184  0.1229  0.2064  0.1323  


POS 78    L      G      K      K      M      S      E      D      L      P      Y      
          0.2258  0.1643  0.2364  0.2292  0.2002  0.2884  0.4087  0.3215  0.4119  0.3948  0.3053  


POS 89    E      V      R      R      A      Q      E      I      N      H      L      
          0.2849  0.2849  0.3149  0.3182  0.3631  0.3667  0.3667  0.3631  0.4441  0.3286  0.4220  


POS 100    F      G      P      K      D      S      E      D      S      Y      
           0.3215  0.3053  0.1969  0.2034  0.1611  0.1501  0.1476  0.2224  0.2164  0.2164  


POS 110    D      I      I      F      D      L      H      N      T      T      
           0.3053  0.3704  0.3704  0.2609  0.2680  0.1823  0.1184  0.0662  0.0690  0.0734  


POS 120    S      N      M      G      C      T      L      I      L      E      
           0.1229  0.1184  0.1914  0.2817  0.2849  0.2034  0.2064  0.2193  0.1759  0.0985  


POS 130    D      S      R      N      N      F      L      I      Q      M      
           0.0948  0.0581  0.0327  0.0398  0.0414  0.0719  0.0414  0.0581  0.1092  0.1092  


POS 140    F      H      Y      I      K      T      S      L      A      P      
           0.1162  0.0662  0.0398  0.0269  0.0163  0.0105  0.0115  0.0184  0.0275  0.0300  


POS 150    L      P      C      Y      V      Y      L      I      E      H      
           0.0372  0.0433  0.0424  0.0405  0.0581  0.0618  0.0618  0.0618  0.1007  0.0749  


POS 160    P      S      L      K      Y      A      T      T      R      S      
           0.0543  0.0870  0.0443  0.0888  0.1007  0.1424  0.1449  0.2292  0.2470  0.2328  


POS 170    I      A      K      Y      P      V      G      I      E      V      
           0.2575  0.2503  0.2752  0.3667  0.3704  0.3948  0.3426  0.3356  0.3019  0.3149  


POS 180    G      P      Q      P      Q      G      V      L      R      A      
           0.2328  0.2328  0.2752  0.2951  0.3321  0.3087  0.3631  0.3182  0.3182  0.2918  


POS 190    D      I      L      D      Q      M      R      K      M      I      
           0.3494  0.3182  0.2164  0.2129  0.1115  0.0605  0.0592  0.0870  0.0734  0.0780  


POS 200    K      H      A      L      D      F      I      H      H      F      
           0.1048  0.0967  0.1501  0.2364  0.1349  0.1399  0.1942  0.1206  0.1048  0.0817  


POS 210    N      E      G      K      E      F      P      P      C      A      
           0.1449  0.1048  0.0618  0.0734  0.0704  0.0389  0.0835  0.1349  0.0948  0.1028  


POS 220    I      E      V      Y      K      I      I      E      K      V      
           0.1115  0.1184  0.1092  0.1184  0.1323  0.1275  0.2129  0.2094  0.1229  0.1731  


POS 230    D      Y      P      R      D      E      N      G      E      I      
           0.1731  0.1759  0.1028  0.1476  0.2470  0.2609  0.2680  0.3631  0.3566  0.3740  


POS 240    A      A      I      I      H      P      N      L      Q      D      
           0.4476  0.3392  0.4256  0.4256  0.3460  0.3356  0.3392  0.3249  0.3392  0.3460  


POS 250    Q      D      W      K      P      L      H      P      G      D      
           0.3910  0.3215  0.2783  0.3631  0.3667  0.3774  0.3566  0.3392  0.4220  0.3321  


POS 260    P      M      F      L      T      L      D      G      K      T      
           0.3426  0.2541  0.2436  0.3426  0.3566  0.2470  0.3286  0.2680  0.1643  0.1852  


POS 270    I      P      L      G      G      D      C      T      V      Y      
           0.1298  0.0631  0.0543  0.1048  0.1731  0.1449  0.1881  0.1115  0.0646  0.0734  


POS 280    P      V      F      V      N      E      A      A      Y      Y      
           0.0690  0.1092  0.1048  0.1399  0.0765  0.0646  0.0581  0.1028  0.1007  0.1373  


POS 290    E      K      K      E      A      F      A      K      T      T      
           0.1449  0.1298  0.1184  0.2034  0.2364  0.2164  0.2002  0.1583  0.1823  0.1852  


POS 300    K      L      T      L      N      A      K      S      I      R      
           0.1881  0.1184  0.1184  0.1184  0.0888  0.0851  0.1349  0.1349  0.1137  0.0870 


POS 310    C      C      L      H      
           0.0631  0.0473  0.0734  0.0483  

Short Disorder

Aspa iupred2.png

POS 1    M      T      S      C      H      I      A      E      E      H      I      
         0.8886  0.7772  0.7418  0.6984  0.5992  0.5296  0.4149  0.2748  0.2333  0.1921  0.1566  


POS 12    Q      K      V      A      I      F      G      G      T      H      G      
          0.1805  0.1844  0.1732  0.2700  0.1766  0.2531  0.2913  0.2080  0.1292  0.0832  0.0965  


POS 23    N      E      L      T      G      V      F      L      V      K      H      
          0.0909  0.0991  0.0935  0.0660  0.1088  0.1766  0.1766  0.1495  0.1566  0.0991  0.1041  


POS 34    W      L      E      N      G      A      E      I      Q      R      T      
          0.1041  0.0909  0.1456  0.0935  0.0935  0.0909  0.1416  0.2385  0.2080  0.1416  0.1322  


POS 45    G      L      E      V      K      P      F      I      T      N      P      
          0.1844  0.2963  0.2820  0.2558  0.1921  0.2167  0.2041  0.1998  0.1921  0.1921  0.1380  


POS 56    R      A      V      K      K      C      T      R      Y      I      D      
          0.0935  0.1495  0.0935  0.1416  0.0935  0.0771  0.1322  0.1416  0.0935  0.0965  0.0464  


POS 67    C      D      L      N      R      I      F      D      L      E      N      
          0.0490  0.0567  0.0542  0.0554  0.0567  0.0935  0.0813  0.0858  0.1292  0.1958  0.1322  


POS 78    L      G      K      K      M      S      E      D      L      P      Y      
          0.2385  0.1732  0.2558  0.1958  0.1878  0.2432  0.2963  0.3184  0.4149  0.3359  0.3399  


POS 89    E      V      R      R      A      Q      E      I      N      H      L      
          0.4116  0.3491  0.2820  0.2913  0.3535  0.3399  0.3456  0.3399  0.4333  0.4078  0.4825  


POS 100    F      G      P      K      D      S      E      D      S      Y      
           0.3992  0.4651  0.4149  0.3578  0.3005  0.2820  0.1878  0.2483  0.2385  0.2385  


POS 110    D      I      I      F      D      L      H      N      T      T      
           0.2963  0.3668  0.3630  0.2865  0.2963  0.2209  0.2122  0.1292  0.0789  0.0441  


POS 120    S      N      M      G      C      T      L      I      L      E      
           0.0771  0.0771  0.1117  0.1635  0.2333  0.2209  0.1878  0.1292  0.0771  0.0858  


POS 130    D      S      R      N      N      F      L      I      Q      M      
           0.0660  0.0336  0.0316  0.0226  0.0102  0.0179  0.0179  0.0327  0.0279  0.0363  


POS 140    F      H      Y      I      K      T      S      L      A      P      
           0.0387  0.0173  0.0218  0.0128  0.0078  0.0044  0.0055  0.0059  0.0055  0.0055  


POS 150    L      P      C      Y      V      Y      L      I      E      H      
           0.0070  0.0167  0.0194  0.0200  0.0387  0.0212  0.0160  0.0157  0.0327  0.0455  


POS 160    P      S      L      K      Y      A      T      T      R      S      
           0.0387  0.0414  0.0279  0.0441  0.0455  0.0884  0.0991  0.1602  0.1635  0.1602  


POS 170    I      A      K      Y      P      V      G      I      E      V      
           0.1088  0.0965  0.1150  0.1878  0.2167  0.3146  0.3535  0.2865  0.2080  0.2080  


POS 180    G      P      Q      P      Q      G      V      L      R      A      
           0.1766  0.2748  0.2333  0.1667  0.2531  0.2385  0.2748  0.2748  0.3630  0.3184  


POS 190    D      I      L      D      Q      M      R      K      M      I      
           0.3146  0.3096  0.2748  0.2292  0.1292  0.1292  0.0744  0.0701  0.1150  0.1117  


POS 200    K      H      A      L      D      F      I      H      H      F      
           0.0723  0.0701  0.1150  0.1732  0.1602  0.1602  0.1205  0.1456  0.1766  0.1532  


POS 210    N      E      G      K      E      F      P      P      C      A      
           0.1958  0.1322  0.1416  0.1178  0.1205  0.1088  0.1205  0.1240  0.1380  0.1322  


POS 220    I      E      V      Y      K      I      I      E      K      V      
           0.1566  0.1698  0.1060  0.1266  0.1178  0.1240  0.2080  0.1766  0.1566  0.2292  


POS 230    D      Y      P      R      D      E      N      G      E      I      
           0.1878  0.2432  0.2041  0.2041  0.2122  0.2122  0.3225  0.3992  0.3005  0.3359  


POS 240    A      A      I      I      H      P      N      L      Q      D      
           0.4149  0.4245  0.5173  0.4078  0.4116  0.4245  0.3359  0.3263  0.3578  0.3399  


POS 250    Q      D      W      K      P      L      H      P      G      D      
           0.4282  0.4825  0.4703  0.4651  0.4600  0.4600  0.3399  0.3578  0.4333  0.4078  


POS 260    P      M      F      L      T      L      D      G      K      T      
           0.3885  0.2913  0.3053  0.3096  0.3184  0.2820  0.3630  0.3005  0.2657  0.1998  


POS 270    I      P      L      G      G      D      C      T      V      Y      
           0.1205  0.1205  0.1041  0.1018  0.1117  0.1150  0.1844  0.1380  0.1117  0.0701  


POS 280    P      V      F      V      N      E      A      A      Y      Y      
           0.0425  0.0744  0.0567  0.0909  0.0965  0.0744  0.0387  0.0464  0.0441  0.0607  


POS 290    E      K      K      E      A      F      A      K      T      T      
           0.0991  0.0771  0.0660  0.1041  0.0935  0.0935  0.0660  0.0832  0.1041  0.1041  


POS 300    K      L      T      L      N      A      K      S      I      R      
           0.1698  0.1041  0.0660  0.0441  0.0405  0.0701  0.1602  0.2333  0.2865  0.3456  


POS 310    C      C      L      H      
           0.4037  0.4556  0.5802  0.6334  


Structured Regions

Aspa iupred3.png

IUPRED predicts one structured region comprising the whole input sequence.


Results

IUPRED predicts no significant disorder in Aspartoacylase. In long disorder mode the disorder tendency stays below 0.5 throughout. In short disorder mode it exceeds 0.5 only for the first six and the last two residues and for a single internal residue (I242, 0.5173), all of which are negligible. The structured regions mode predicts one continuous structured region spanning the entire protein sequence. This agrees with the 3D structure: Aspartoacylase is a densely packed globular protein, which, under the assumptions that IUPRED makes, forms many stabilizing inter-residue contacts and therefore has a markedly reduced tendency for disorder.
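The 0.5 cutoff used in this reading of the results can be applied mechanically. The following is a small sketch; the helper name `disordered_segments` is our own and not part of IUPRED.

```python
def disordered_segments(scores, threshold=0.5):
    """Return (start, end) pairs, 1-based and inclusive, where score > threshold."""
    segments = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold and start is None:
            start = position
        elif score <= threshold and start is not None:
            segments.append((start, position - 1))
            start = None
    if start is not None:
        # The sequence ends inside a disordered stretch.
        segments.append((start, len(scores)))
    return segments


# Short Disorder scores for residues 1-11, copied from the table above.
n_terminus = [0.8886, 0.7772, 0.7418, 0.6984, 0.5992, 0.5296,
              0.4149, 0.2748, 0.2333, 0.1921, 0.1566]
print(disordered_segments(n_terminus))  # prints [(1, 6)]
```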

Meta-Disorder

Meta-Disorder, as the name implies, employs a set of so-called orthogonal disorder predictors in order to combine their strengths and mitigate their weak points. It was published in 2009 by Avner Schlessinger, Marco Punta, Guy Yachdav, Laszlo Kajan and Burkhard Rost.

Reference: Paper

As with the previous methods, Meta-Disorder predicts disorder from the amino acid sequence alone; results from the predictors IUPRED, DISOPRED, NORSnet and Ucon are combined into one final result using a neural network.

Results for Aspartoacylase:

Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw   MD_rel  MD2st 
    1	M	0.33	-	0.99	D	0.17	-	0.551	1	D
    2	T	0.26	-	0.78	D	0.25	-	0.531	0	D
    3	S	0.16	-	0.72	D	0.35	-	0.535	0	D
    4	C	0.23	-	0.65	D	0.33	-	0.505	0	-
    5	H	0.20	-	0.48	D	0.25	-	0.475	1	-
    6	I	0.16	-	0.55	D	0.30	-	0.465	1	-
    7	A	0.34	-	0.56	D	0.40	-	0.444	2	-
    8	E	0.28	-	0.67	D	0.30	-	0.424	3	-
    9	E	0.21	-	0.73	D	0.38	-	0.404	3	-
   10	H	0.15	-	0.70	D	0.30	-	0.374	4	-
   11	I	0.15	-	0.59	D	0.29	-	0.354	5	-
   12	Q	0.15	-	0.60	D	0.28	-	0.313	6	-
   13	K	0.14	-	0.51	D	0.23	-	0.263	8	-
   14	V	0.14	-	0.30	-	0.19	-	0.253	8	-
   15	A	0.16	-	0.24	-	0.19	-	0.250	9	-
   16	I	0.13	-	0.20	-	0.24	-	0.242	9	-
   17	F	0.10	-	0.13	-	0.23	-	0.250	9	-
   18	G	0.13	-	0.18	-	0.21	-	0.242	9	-
   19	G	0.10	-	0.24	-	0.20	-	0.253	8	-
   20	T	0.07	-	0.34	-	0.20	-	0.253	8	-
   21	H	0.06	-	0.26	-	0.26	-	0.260	8	-
   22	G	0.06	-	0.39	-	0.29	-	0.253	8	-
   23	N	0.06	-	0.48	D	0.22	-	0.250	9	-
   24	E	0.06	-	0.47	D	0.18	-	0.242	9	-
   25	L	0.11	-	0.43	-	0.16	-	0.242	9	-
   26	T	0.12	-	0.39	-	0.20	-	0.253	8	-
   27	G	0.10	-	0.32	-	0.20	-	0.242	9	-
   28	V	0.08	-	0.28	-	0.15	-	0.242	9	-
   29	F	0.12	-	0.35	-	0.13	-	0.242	9	-
   30	L	0.14	-	0.28	-	0.15	-	0.242	9	-
   31	V	0.09	-	0.30	-	0.16	-	0.253	8	-
   32	K	0.07	-	0.40	-	0.16	-	0.263	8	-
   33	H	0.06	-	0.40	-	0.18	-	0.293	7	-
   34	W	0.08	-	0.38	-	0.29	-	0.273	8	-
   35	L	0.09	-	0.45	-	0.30	-	0.283	7	-
   36	E	0.09	-	0.56	D	0.41	-	0.313	6	-
   37	N	0.12	-	0.62	D	0.32	-	0.313	6	-
   38	G	0.16	-	0.62	D	0.35	-	0.330	6	-
   39	A	0.11	-	0.64	D	0.46	-	0.313	6	-
   40	E	0.10	-	0.66	D	0.47	-	0.323	6	-
   41	I	0.09	-	0.65	D	0.47	-	0.323	6	-
   42	Q	0.10	-	0.64	D	0.36	-	0.293	7	-
   43	R	0.09	-	0.61	D	0.50	-	0.273	8	-
   44	T	0.08	-	0.61	D	0.56	-	0.273	8	-
   45	G	0.08	-	0.53	D	0.34	-	0.263	8	-
   46	L	0.09	-	0.43	-	0.35	-	0.260	8	-
   47	E	0.10	-	0.33	-	0.32	-	0.253	8	-
   48	V	0.07	-	0.23	-	0.32	-	0.250	9	-
   49	K	0.06	-	0.17	-	0.34	-	0.253	8	-
   50	P	0.08	-	0.18	-	0.37	-	0.263	8	-
   51	F	0.08	-	0.17	-	0.49	-	0.273	8	-
   52	I	0.07	-	0.21	-	0.33	-	0.273	8	-
   53	T	0.06	-	0.28	-	0.53	-	0.303	7	-
   54	N	0.07	-	0.28	-	0.53	-	0.303	7	-
   55	P	0.09	-	0.36	-	0.37	-	0.313	6	-
   56	R	0.08	-	0.41	-	0.51	-	0.313	6	-
   57	A	0.10	-	0.40	-	0.66	D	0.280	7	-
   58	V	0.13	-	0.40	-	0.51	-	0.263	8	-
   59	K	0.16	-	0.48	D	0.37	-	0.263	8	-
   60	K	0.19	-	0.47	D	0.40	-	0.263	8	-
   61	C	0.18	-	0.47	D	0.29	-	0.253	8	-
   62	T	0.16	-	0.55	D	0.35	-	0.263	8	-
   63	R	0.18	-	0.51	D	0.31	-	0.253	8	-
   64	Y	0.22	-	0.47	D	0.25	-	0.273	8	-
   65	I	0.23	-	0.47	D	0.20	-	0.260	8	-
   66	D	0.23	-	0.56	D	0.21	-	0.263	8	-
   67	C	0.25	-	0.57	D	0.16	-	0.263	8	-
   68	D	0.30	-	0.43	-	0.18	-	0.263	8	-
   69	L	0.29	-	0.40	-	0.18	-	0.260	8	-
   70	N	0.28	-	0.40	-	0.25	-	0.263	8	-
   71	R	0.40	-	0.39	-	0.23	-	0.273	8	-
   72	I	0.46	-	0.43	-	0.22	-	0.280	7	-
   73	F	0.46	-	0.37	-	0.19	-	0.273	8	-
   74	D	0.37	-	0.46	-	0.32	-	0.310	6	-
   75	L	0.33	-	0.57	D	0.40	-	0.390	4	-
   76	E	0.36	-	0.61	D	0.30	-	0.444	2	-
   77	N	0.44	-	0.62	D	0.41	-	0.465	1	-
   78	L	0.38	-	0.66	D	0.65	D	0.531	0	D
   79	G	0.30	-	0.70	D	0.64	D	0.485	1	-
   80	K	0.35	-	0.69	D	0.64	D	0.515	0	-
   81	K	0.23	-	0.69	D	0.59	D	0.475	1	-
   82	M	0.23	-	0.66	D	0.42	-	0.444	2	-
   83	S	0.28	-	0.69	D	0.64	D	0.449	2	-
   84	E	0.34	-	0.72	D	0.56	-	0.485	1	-
   85	D	0.29	-	0.74	D	0.45	-	0.424	3	-
   86	L	0.20	-	0.64	D	0.35	-	0.424	3	-
   87	P	0.20	-	0.64	D	0.45	-	0.404	3	-
   88	Y	0.17	-	0.55	D	0.46	-	0.384	4	-
   89	E	0.14	-	0.50	D	0.46	-	0.364	5	-
   90	V	0.13	-	0.45	-	0.30	-	0.333	6	-
   91	R	0.12	-	0.43	-	0.43	-	0.320	6	-
   92	R	0.11	-	0.40	-	0.36	-	0.293	7	-
   93	A	0.11	-	0.34	-	0.36	-	0.283	7	-
   94	Q	0.10	-	0.45	-	0.22	-	0.290	7	-
   95	E	0.12	-	0.41	-	0.25	-	0.303	7	-
   96	I	0.09	-	0.34	-	0.26	-	0.283	7	-
   97	N	0.11	-	0.40	-	0.33	-	0.313	6	-
   98	H	0.10	-	0.49	D	0.39	-	0.313	6	-
   99	L	0.10	-	0.47	D	0.38	-	0.313	6	-
  100	F	0.13	-	0.47	D	0.38	-	0.293	7	-
  101	G	0.14	-	0.54	D	0.58	D	0.323	6	-
  102	P	0.13	-	0.61	D	0.58	D	0.333	6	-
  103	K	0.13	-	0.60	D	0.47	-	0.323	6	-
  104	D	0.11	-	0.61	D	0.71	D	0.323	6	-
  105	S	0.10	-	0.65	D	0.73	D	0.283	7	-
  106	E	0.10	-	0.70	D	0.62	D	0.283	7	-
  107	D	0.12	-	0.70	D	0.42	-	0.273	8	-
  108	S	0.11	-	0.64	D	0.37	-	0.270	8	-
  109	Y	0.12	-	0.50	D	0.23	-	0.253	8	-
  110	D	0.13	-	0.39	-	0.20	-	0.242	9	-
  111	I	0.16	-	0.29	-	0.18	-	0.240	9	-
  112	I	0.15	-	0.20	-	0.16	-	0.240	9	-
  113	F	0.14	-	0.20	-	0.16	-	0.240	9	-
  114	D	0.17	-	0.21	-	0.20	-	0.242	9	-
  115	L	0.21	-	0.20	-	0.19	-	0.253	8	-
  116	H	0.17	-	0.28	-	0.19	-	0.273	8	-
  117	N	0.11	-	0.48	D	0.23	-	0.283	7	-
  118	T	0.13	-	0.39	-	0.24	-	0.283	7	-
  119	T	0.13	-	0.41	-	0.21	-	0.273	8	-
  120	S	0.15	-	0.46	-	0.21	-	0.273	8	-
  121	N	0.22	-	0.54	D	0.18	-	0.263	8	-
  122	M	0.25	-	0.51	D	0.14	-	0.260	8	-
  123	G	0.30	-	0.51	D	0.16	-	0.253	8	-
  124	C	0.26	-	0.42	-	0.18	-	0.250	9	-
  125	T	0.29	-	0.40	-	0.18	-	0.242	9	-
  126	L	0.24	-	0.34	-	0.18	-	0.253	8	-
  127	I	0.17	-	0.28	-	0.23	-	0.260	8	-
  128	L	0.13	-	0.28	-	0.25	-	0.263	8	-
  129	E	0.14	-	0.41	-	0.24	-	0.253	8	-
  130	D	0.14	-	0.54	D	0.18	-	0.253	8	-
  131	S	0.10	-	0.59	D	0.19	-	0.273	8	-
  132	R	0.07	-	0.68	D	0.27	-	0.280	7	-
  133	N	0.05	-	0.64	D	0.28	-	0.273	8	-
  134	N	0.06	-	0.61	D	0.18	-	0.273	8	-
  135	F	0.07	-	0.53	D	0.15	-	0.260	8	-
  136	L	0.08	-	0.47	D	0.13	-	0.242	9	-
  137	I	0.10	-	0.47	D	0.13	-	0.242	9	-
  138	Q	0.16	-	0.42	-	0.13	-	0.242	9	-
  139	M	0.15	-	0.34	-	0.13	-	0.242	9	-
  140	F	0.11	-	0.32	-	0.14	-	0.250	9	-
  141	H	0.13	-	0.41	-	0.14	-	0.263	8	-
  142	Y	0.16	-	0.36	-	0.16	-	0.263	8	-
  143	I	0.12	-	0.34	-	0.16	-	0.263	8	-
  144	K	0.11	-	0.46	-	0.15	-	0.283	7	-
  145	T	0.07	-	0.54	D	0.20	-	0.273	8	-
  146	S	0.07	-	0.55	D	0.17	-	0.253	8	-
  147	L	0.09	-	0.56	D	0.17	-	0.242	9	-
  148	A	0.10	-	0.57	D	0.17	-	0.250	9	-
  149	P	0.09	-	0.60	D	0.13	-	0.253	8	-
  150	L	0.11	-	0.51	D	0.13	-	0.242	9	-
  151	P	0.13	-	0.44	-	0.14	-	0.242	9	-
  152	C	0.12	-	0.38	-	0.13	-	0.240	9	-
  153	Y	0.13	-	0.31	-	0.13	-	0.240	9	-
  154	V	0.18	-	0.28	-	0.13	-	0.242	9	-
  155	Y	0.17	-	0.33	-	0.14	-	0.250	9	-
  156	L	0.21	-	0.47	D	0.13	-	0.253	8	-
  157	I	0.22	-	0.54	D	0.15	-	0.263	8	-
  158	E	0.17	-	0.58	D	0.15	-	0.283	7	-
  159	H	0.16	-	0.62	D	0.17	-	0.303	7	-
  160	P	0.13	-	0.65	D	0.21	-	0.323	6	-
  161	S	0.14	-	0.59	D	0.29	-	0.303	7	-
  162	L	0.17	-	0.58	D	0.41	-	0.303	7	-
  163	K	0.21	-	0.56	D	0.36	-	0.293	7	-
  164	Y	0.32	-	0.51	D	0.29	-	0.273	8	-
  165	A	0.31	-	0.47	D	0.26	-	0.273	8	-
  166	T	0.28	-	0.45	-	0.32	-	0.273	8	-
  167	T	0.22	-	0.41	-	0.33	-	0.273	8	-
  168	R	0.15	-	0.47	D	0.26	-	0.273	8	-
  169	S	0.14	-	0.47	D	0.28	-	0.280	7	-
  170	I	0.12	-	0.46	-	0.29	-	0.273	8	-
  171	A	0.12	-	0.47	D	0.22	-	0.283	7	-
  172	K	0.11	-	0.58	D	0.27	-	0.290	7	-
  173	Y	0.13	-	0.47	D	0.20	-	0.263	8	-
  174	P	0.11	-	0.38	-	0.19	-	0.253	8	-
  175	V	0.10	-	0.26	-	0.26	-	0.250	9	-
  176	G	0.09	-	0.24	-	0.31	-	0.250	9	-
  177	I	0.13	-	0.33	-	0.25	-	0.250	9	-
  178	E	0.20	-	0.28	-	0.37	-	0.253	8	-
  179	V	0.26	-	0.41	-	0.33	-	0.253	8	-
  180	G	0.20	-	0.45	-	0.33	-	0.270	8	-
  181	P	0.17	-	0.59	D	0.25	-	0.283	7	-
  182	Q	0.12	-	0.49	D	0.35	-	0.283	7	-
  183	P	0.12	-	0.51	D	0.28	-	0.273	8	-
  184	Q	0.13	-	0.54	D	0.42	-	0.273	8	-
  185	G	0.10	-	0.51	D	0.33	-	0.263	8	-
  186	V	0.12	-	0.55	D	0.22	-	0.253	8	-
  187	L	0.17	-	0.54	D	0.24	-	0.253	8	-
  188	R	0.14	-	0.48	D	0.24	-	0.263	8	-
  189	A	0.11	-	0.53	D	0.18	-	0.273	8	-
  190	D	0.11	-	0.51	D	0.19	-	0.273	8	-
  191	I	0.08	-	0.41	-	0.31	-	0.283	7	-
  192	L	0.07	-	0.42	-	0.33	-	0.263	8	-
  193	D	0.06	-	0.47	D	0.24	-	0.273	8	-
  194	Q	0.08	-	0.45	-	0.33	-	0.270	8	-
  195	M	0.04	-	0.34	-	0.26	-	0.263	8	-
  196	R	0.04	-	0.43	-	0.34	-	0.273	8	-
  197	K	0.05	-	0.44	-	0.34	-	0.263	8	-
  198	M	0.06	-	0.29	-	0.34	-	0.263	8	-
  199	I	0.06	-	0.28	-	0.22	-	0.253	8	-
  200	K	0.07	-	0.34	-	0.22	-	0.263	8	-
  201	H	0.07	-	0.32	-	0.20	-	0.253	8	-
  202	A	0.08	-	0.28	-	0.15	-	0.250	9	-
  203	L	0.08	-	0.35	-	0.15	-	0.253	8	-
  204	D	0.09	-	0.43	-	0.19	-	0.263	8	-
  205	F	0.11	-	0.41	-	0.18	-	0.263	8	-
  206	I	0.12	-	0.45	-	0.18	-	0.253	8	-
  207	H	0.15	-	0.59	D	0.23	-	0.270	8	-
  208	H	0.18	-	0.59	D	0.40	-	0.290	7	-
  209	F	0.22	-	0.58	D	0.24	-	0.283	7	-
  210	N	0.27	-	0.63	D	0.37	-	0.293	7	-
  211	E	0.27	-	0.66	D	0.53	-	0.313	6	-
  212	G	0.28	-	0.68	D	0.44	-	0.313	6	-
  213	K	0.26	-	0.70	D	0.46	-	0.323	6	-
  214	E	0.26	-	0.71	D	0.50	-	0.323	6	-
  215	F	0.20	-	0.70	D	0.56	-	0.303	7	-
  216	P	0.21	-	0.69	D	0.37	-	0.293	7	-
  217	P	0.24	-	0.69	D	0.28	-	0.280	7	-
  218	C	0.14	-	0.66	D	0.28	-	0.263	8	-
  219	A	0.14	-	0.58	D	0.19	-	0.263	8	-
  220	I	0.15	-	0.52	D	0.19	-	0.263	8	-
  221	E	0.11	-	0.47	D	0.22	-	0.270	8	-
  222	V	0.11	-	0.34	-	0.26	-	0.273	8	-
  223	Y	0.12	-	0.30	-	0.28	-	0.280	7	-
  224	K	0.08	-	0.37	-	0.34	-	0.280	7	-
  225	I	0.09	-	0.32	-	0.33	-	0.273	8	-
  226	I	0.07	-	0.35	-	0.29	-	0.283	7	-
  227	E	0.09	-	0.43	-	0.38	-	0.313	6	-
  228	K	0.09	-	0.49	D	0.61	D	0.333	6	-
  229	V	0.12	-	0.49	D	0.58	D	0.337	6	-
  230	D	0.16	-	0.53	D	0.75	D	0.354	5	-
  231	Y	0.14	-	0.52	D	0.84	D	0.343	5	-
  232	P	0.12	-	0.57	D	0.84	D	0.313	6	-
  233	R	0.13	-	0.66	D	0.59	D	0.303	7	-
  234	D	0.15	-	0.69	D	0.70	D	0.310	6	-
  235	E	0.10	-	0.71	D	0.59	D	0.293	7	-
  236	N	0.12	-	0.71	D	0.62	D	0.303	7	-
  237	G	0.17	-	0.67	D	0.44	-	0.293	7	-
  238	E	0.22	-	0.60	D	0.40	-	0.283	7	-
  239	I	0.17	-	0.53	D	0.36	-	0.270	8	-
  240	A	0.16	-	0.38	-	0.30	-	0.260	8	-
  241	A	0.19	-	0.29	-	0.22	-	0.253	8	-
  242	I	0.16	-	0.28	-	0.24	-	0.263	8	-
  243	I	0.22	-	0.33	-	0.24	-	0.263	8	-
  244	H	0.25	-	0.34	-	0.34	-	0.293	7	-
  245	P	0.14	-	0.48	D	0.41	-	0.323	6	-
  246	N	0.16	-	0.53	D	0.30	-	0.343	5	-
  247	L	0.16	-	0.58	D	0.53	-	0.343	5	-
  248	Q	0.16	-	0.61	D	0.71	D	0.374	4	-
  249	D	0.22	-	0.64	D	0.59	D	0.354	5	-
  250	Q	0.30	-	0.64	D	0.51	-	0.364	5	-
  251	D	0.34	-	0.62	D	0.52	-	0.333	6	-
  252	W	0.33	-	0.52	D	0.65	D	0.313	6	-
  253	K	0.22	-	0.58	D	0.68	D	0.283	7	-
  254	P	0.21	-	0.58	D	0.63	D	0.283	7	-
  255	L	0.18	-	0.54	D	0.45	-	0.263	8	-
  256	H	0.16	-	0.68	D	0.27	-	0.263	8	-
  257	P	0.18	-	0.69	D	0.28	-	0.273	8	-
  258	G	0.19	-	0.57	D	0.21	-	0.270	8	-
  259	D	0.28	-	0.54	D	0.16	-	0.290	7	-
  260	P	0.25	-	0.54	D	0.23	-	0.270	8	-
  261	M	0.19	-	0.40	-	0.24	-	0.273	8	-
  262	F	0.16	-	0.34	-	0.29	-	0.253	8	-
  263	L	0.13	-	0.37	-	0.30	-	0.253	8	-
  264	T	0.10	-	0.46	-	0.20	-	0.242	9	-
  265	L	0.14	-	0.56	D	0.20	-	0.253	8	-
  266	D	0.13	-	0.61	D	0.20	-	0.263	8	-
  267	G	0.11	-	0.62	D	0.26	-	0.280	7	-
  268	K	0.10	-	0.60	D	0.34	-	0.283	7	-
  269	T	0.10	-	0.60	D	0.40	-	0.273	8	-
  270	I	0.08	-	0.46	-	0.41	-	0.250	9	-
  271	P	0.07	-	0.43	-	0.35	-	0.250	9	-
  272	L	0.12	-	0.46	-	0.29	-	0.242	9	-
  273	G	0.10	-	0.53	D	0.18	-	0.242	9	-
  274	G	0.08	-	0.52	D	0.19	-	0.250	9	-
  275	D	0.05	-	0.62	D	0.19	-	0.253	8	-
  276	C	0.06	-	0.68	D	0.15	-	0.263	8	-
  277	T	0.10	-	0.59	D	0.16	-	0.263	8	-
  278	V	0.08	-	0.52	D	0.17	-	0.263	8	-
  279	Y	0.09	-	0.33	-	0.20	-	0.242	9	-
  280	P	0.10	-	0.27	-	0.17	-	0.242	9	-
  281	V	0.12	-	0.23	-	0.21	-	0.242	9	-
  282	F	0.12	-	0.18	-	0.17	-	0.253	8	-
  283	V	0.09	-	0.24	-	0.16	-	0.263	8	-
  284	N	0.05	-	0.28	-	0.23	-	0.303	7	-
  285	E	0.06	-	0.39	-	0.28	-	0.384	4	-
  286	A	0.09	-	0.35	-	0.46	-	0.404	3	-
  287	A	0.08	-	0.43	-	0.72	D	0.418	3	-
  288	Y	0.08	-	0.37	-	0.79	D	0.374	4	-
  289	Y	0.09	-	0.55	D	0.61	D	0.354	5	-
  290	E	0.10	-	0.41	-	0.49	-	0.333	6	-
  291	K	0.10	-	0.50	D	0.65	D	0.323	6	-
  292	K	0.07	-	0.47	D	0.66	D	0.323	6	-
  293	E	0.07	-	0.36	-	0.79	D	0.333	6	-
  294	A	0.06	-	0.29	-	0.95	D	0.354	5	-
  295	F	0.08	-	0.27	-	0.82	D	0.333	6	-
  296	A	0.09	-	0.32	-	0.70	D	0.323	6	-
  297	K	0.09	-	0.42	-	0.41	-	0.343	5	-
  298	T	0.08	-	0.42	-	0.36	-	0.343	5	-
  299	T	0.10	-	0.51	D	0.36	-	0.414	3	-
  300	K	0.09	-	0.54	D	0.64	D	0.455	2	-
  301	L	0.09	-	0.48	D	0.70	D	0.394	4	-
  302	T	0.15	-	0.55	D	0.43	-	0.404	3	-
  303	L	0.15	-	0.48	D	0.46	-	0.374	4	-
  304	N	0.12	-	0.48	D	0.34	-	0.374	4	-
  305	A	0.13	-	0.47	D	0.19	-	0.374	4	-
  306	K	0.22	-	0.55	D	0.19	-	0.384	4	-
  307	S	0.07	-	0.53	D	0.18	-	0.434	2	-
  308	I	0.07	-	0.41	-	0.23	-	0.424	3	-
  309	R	0.07	-	0.37	-	0.21	-	0.414	3	-
  310	C	0.11	-	0.60	D	0.19	-	0.414	3	-
  311	C	0.14	-	0.67	D	0.14	-	0.394	4	-
  312	L	0.15	-	0.72	D	0.13	-	0.424	3	-
  313	H	0.44	-	0.80	D	0.13	-	0.485	1	-


Key for output
----------------
Number - residue number
Residue - amino-acid type
NORSnet - raw score by NORSnet (prediction of unstructured loops)
NORS2st - two-state prediction by NORSnet; D=disordered
PROFbval - raw score by PROFbval (prediction of residue flexibility from sequence)
bval2st - two-state prediction by PROFbval
Ucon - raw score by Ucon (prediction of protein disorder using predicted internal contacts)
Ucon2st - two-state prediction by Ucon
MD_raw - raw score by MD (prediction of protein disorder using orthogonal sources)
MD_rel - reliability of the prediction by MD; values range from 0-9. 9=strong prediction
MD2st - two-state prediction by MD


The last column (MD2st) indicates whether disorder was predicted at a given position. Meta-Disorder predicts only four disordered positions (residues 1-3 and 78), all with low reliability (MD_rel of 0 or 1), so they are not significant. This agrees with the predictions of the programs employed previously, which is not altogether surprising, since Meta-Disorder draws its predictions from two of them.
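Reading the disordered positions off the table can be automated. The sketch below assumes the whitespace-separated layout shown above; the function names `parse_md_rows` and `disordered_positions` are our own and not part of Meta-Disorder.

```python
def parse_md_rows(text):
    """Parse whitespace-separated Meta-Disorder rows into dictionaries."""
    columns = ["Number", "Residue", "NORSnet", "NORS2st", "PROFbval",
               "bval2st", "Ucon", "Ucon2st", "MD_raw", "MD_rel", "MD2st"]
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header line and anything malformed.
        if len(fields) != len(columns) or not fields[0].isdigit():
            continue
        row = dict(zip(columns, fields))
        row["Number"] = int(row["Number"])
        row["MD_raw"] = float(row["MD_raw"])
        row["MD_rel"] = int(row["MD_rel"])
        rows.append(row)
    return rows


def disordered_positions(rows):
    """Return (number, residue, reliability) for rows with MD2st == 'D'."""
    return [(r["Number"], r["Residue"], r["MD_rel"])
            for r in rows if r["MD2st"] == "D"]


# Five rows copied from the table above.
SAMPLE = """
    1  M  0.33  -  0.99  D  0.17  -  0.551  1  D
    2  T  0.26  -  0.78  D  0.25  -  0.531  0  D
    3  S  0.16  -  0.72  D  0.35  -  0.535  0  D
    4  C  0.23  -  0.65  D  0.33  -  0.505  0  -
   78  L  0.38  -  0.66  D  0.65  D  0.531  0  D
"""
print(disordered_positions(parse_md_rows(SAMPLE)))
# prints [(1, 'M', 1), (2, 'T', 0), (3, 'S', 0), (78, 'L', 0)]
```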

Prediction of transmembrane alpha-helices and signal peptides

The results of this task are unequivocal: Aspartoacylase does not contain any transmembrane regions. From a biological point of view this was to be expected, as Aspartoacylase is known to be located in the cytosol.

TMHMM

Since the VM version could not be made to work, we used the server at http://www.cbs.dtu.dk/services/TMHMM/.

TMHMM uses a hidden Markov model to predict transmembrane helices in proteins. It was published in 1998 by E. L. L. Sonnhammer, G. von Heijne, and A. Krogh.

Reference: Original paper

The hidden Markov model used by TMHMM mirrors the biological structure, with states for helix turns, helix caps and loops on either side of the membrane; these states are also designed to model membrane insertion. The HMM probabilities were estimated using both a maximum likelihood method and a discriminative method.

The results for Aspartoacylase very clearly show the absence of any transmembrane structure, which is biologically sound.

Sp P45381 ACY2 HUMAN.gif

# sp_P45381_ACY2_HUMAN Length: 313
# sp_P45381_ACY2_HUMAN Number of predicted TMHs:  0
# sp_P45381_ACY2_HUMAN Exp number of AAs in TMHs: 0.2005
# sp_P45381_ACY2_HUMAN Exp number, first 60 AAs:  0.01618
# sp_P45381_ACY2_HUMAN Total prob of N-in:        0.03827
sp_P45381_ACY2_HUMAN	TMHMM2.0	outside	     1   313

http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output
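For further processing, output in this format can be read with a few lines of Python. This is a sketch based only on the layout of the output shown on this page; the function name `parse_tmhmm` is our own.

```python
def parse_tmhmm(text):
    """Parse TMHMM 2.0 output into header statistics and topology segments."""
    stats = {}
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like "# <id> <label>: <value>".
            # Remarks without a colon (e.g. signal sequence hints) are skipped.
            if ":" in line:
                label, value = line.rsplit(":", 1)
                # Drop the leading "#" and the sequence identifier.
                label = " ".join(label.split()[2:])
                stats[label] = float(value)
            continue
        # Topology lines: <id> <version> <location> <start> <end>
        fields = line.split()
        if len(fields) == 5:
            segments.append((fields[2], int(fields[3]), int(fields[4])))
    return stats, segments


def count_helices(segments):
    """Count the segments labelled as transmembrane helices."""
    return sum(1 for location, _, _ in segments if location == "TMhelix")


SAMPLE = """
# sp_P45381_ACY2_HUMAN Length: 313
# sp_P45381_ACY2_HUMAN Number of predicted TMHs:  0
# sp_P45381_ACY2_HUMAN Exp number of AAs in TMHs: 0.2005
# sp_P45381_ACY2_HUMAN Exp number, first 60 AAs:  0.01618
# sp_P45381_ACY2_HUMAN Total prob of N-in:        0.03827
sp_P45381_ACY2_HUMAN TMHMM2.0 outside 1 313
"""
stats, segments = parse_tmhmm(SAMPLE)
print(segments)                 # prints [('outside', 1, 313)]
print(count_helices(segments))  # prints 0
```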


BACR_HALSA

# BACR_HALSA Length: 262
# BACR_HALSA Number of predicted TMHs:  6
# BACR_HALSA Exp number of AAs in TMHs: 140.4032
# BACR_HALSA Exp number, first 60 AAs:  26.1196
# BACR_HALSA Total prob of N-in:        0.01887
# BACR_HALSA POSSIBLE N-term signal sequence
BACR_HALSA	TMHMM2.0	outside	     1    22
BACR_HALSA	TMHMM2.0	TMhelix	    23    42
BACR_HALSA	TMHMM2.0	inside	    43    54
BACR_HALSA	TMHMM2.0	TMhelix	    55    77
BACR_HALSA	TMHMM2.0	outside	    78    91
BACR_HALSA	TMHMM2.0	TMhelix	    92   114
BACR_HALSA	TMHMM2.0	inside	   115   120
BACR_HALSA	TMHMM2.0	TMhelix	   121   143
BACR_HALSA	TMHMM2.0	outside	   144   147
BACR_HALSA	TMHMM2.0	TMhelix	   148   170
BACR_HALSA	TMHMM2.0	inside	   171   189
BACR_HALSA	TMHMM2.0	TMhelix	   190   212
BACR_HALSA	TMHMM2.0	outside	   213   262


RET4_HUMAN

# RET4_HUMAN Length: 201
# RET4_HUMAN Number of predicted TMHs:  0
# RET4_HUMAN Exp number of AAs in TMHs: 0.01196
# RET4_HUMAN Exp number, first 60 AAs:  0.01179
# RET4_HUMAN Total prob of N-in:        0.01909
RET4_HUMAN	TMHMM2.0	outside	     1   201


INSL5_HUMAN

# INSL5_HUMAN Length: 135
# INSL5_HUMAN Number of predicted TMHs:  0
# INSL5_HUMAN Exp number of AAs in TMHs: 0.50415
# INSL5_HUMAN Exp number, first 60 AAs:  0.50415
# INSL5_HUMAN Total prob of N-in:        0.03772
INSL5_HUMAN	TMHMM2.0	outside	     1   135

LAMP1_HUMAN

# LAMP1_HUMAN Length: 417
# LAMP1_HUMAN Number of predicted TMHs:  2
# LAMP1_HUMAN Exp number of AAs in TMHs: 44.89582
# LAMP1_HUMAN Exp number, first 60 AAs:  22.24286
# LAMP1_HUMAN Total prob of N-in:        0.99287
# LAMP1_HUMAN POSSIBLE N-term signal sequence
LAMP1_HUMAN	TMHMM2.0	inside	     1    10
LAMP1_HUMAN	TMHMM2.0	TMhelix	    11    33
LAMP1_HUMAN	TMHMM2.0	outside	    34   383
LAMP1_HUMAN	TMHMM2.0	TMhelix	   384   406
LAMP1_HUMAN	TMHMM2.0	inside	   407   417

A4_HUMAN

# A4_HUMAN Length: 770
# A4_HUMAN Number of predicted TMHs:  1
# A4_HUMAN Exp number of AAs in TMHs: 22.72525
# A4_HUMAN Exp number, first 60 AAs:  0.0027
# A4_HUMAN Total prob of N-in:        0.00015
A4_HUMAN	TMHMM2.0	outside	     1   700
A4_HUMAN	TMHMM2.0	TMhelix	   701   723
A4_HUMAN	TMHMM2.0	inside	   724   770

Phobius & PolyPhobius

Phobius is a program for the prediction of transmembrane regions with special emphasis on reducing confusion with signal peptides. It was published in 2004 by Käll L, Krogh A, Sonnhammer EL.

Reference: Paper

Signal peptides and transmembrane helices are very similar and are often confused by predictors for either class; Phobius aims to predict both and to discriminate between them. To do this, it employs a hidden Markov model that models the different sequence regions of each class.

Input: An amino acid sequence.

Again, neither signal peptides nor transmembrane regions were detected in Aspartoacylase.

Aspa phobius.png


BACR_HALSA

ID   
FT   TOPO_DOM      1     22       NON CYTOPLASMIC.
FT   TRANSMEM     23     42       
FT   TOPO_DOM     43     53       CYTOPLASMIC.
FT   TRANSMEM     54     76       
FT   TOPO_DOM     77     95       NON CYTOPLASMIC.
FT   TRANSMEM     96    114       
FT   TOPO_DOM    115    120       CYTOPLASMIC.
FT   TRANSMEM    121    142       
FT   TOPO_DOM    143    147       NON CYTOPLASMIC.
FT   TRANSMEM    148    169       
FT   TOPO_DOM    170    189       CYTOPLASMIC.
FT   TRANSMEM    190    212       
FT   TOPO_DOM    213    217       NON CYTOPLASMIC.
FT   TRANSMEM    218    237       
FT   TOPO_DOM    238    262       CYTOPLASMIC.
//

With PolyPhobius:

ID   BACR_HALSA
FT   TOPO_DOM      1     21       NON CYTOPLASMIC.
FT   TRANSMEM     22     43       
FT   TOPO_DOM     44     54       CYTOPLASMIC.
FT   TRANSMEM     55     77       
FT   TOPO_DOM     78     94       NON CYTOPLASMIC.
FT   TRANSMEM     95    114       
FT   TOPO_DOM    115    120       CYTOPLASMIC.
FT   TRANSMEM    121    141       
FT   TOPO_DOM    142    147       NON CYTOPLASMIC.
FT   TRANSMEM    148    166       
FT   TOPO_DOM    167    186       CYTOPLASMIC.
FT   TRANSMEM    187    205       
FT   TOPO_DOM    206    215       NON CYTOPLASMIC.
FT   TRANSMEM    216    237       
FT   TOPO_DOM    238    262       CYTOPLASMIC.
//

RET4_HUMAN

ID   RET4_HUMAN
FT   SIGNAL        1     18       
FT   REGION        1      2       N-REGION.
FT   REGION        3     13       H-REGION.
FT   REGION       14     18       C-REGION.
FT   TOPO_DOM     19    201       NON CYTOPLASMIC.
//

With PolyPhobius:

ID   RET4_HUMAN
FT   SIGNAL        1     18       
FT   REGION        1      3       N-REGION.
FT   REGION        4     13       H-REGION.
FT   REGION       14     18       C-REGION.
FT   TOPO_DOM     19    201       NON CYTOPLASMIC.
//

INSL5_HUMAN

ID   
FT   SIGNAL        1     22       
FT   REGION        1      5       N-REGION.
FT   REGION        6     17       H-REGION.
FT   REGION       18     22       C-REGION.
FT   TOPO_DOM     23    135       NON CYTOPLASMIC.
//

With PolyPhobius:

ID   INSL5_HUMAN
FT   SIGNAL        1     22       
FT   REGION        1      4       N-REGION.
FT   REGION        5     16       H-REGION.
FT   REGION       17     22       C-REGION.
FT   TOPO_DOM     23    135       NON CYTOPLASMIC.
//

LAMP1_HUMAN

ID   
FT   SIGNAL        1     28       
FT   REGION        1     10       N-REGION.
FT   REGION       11     22       H-REGION.
FT   REGION       23     28       C-REGION.
FT   TOPO_DOM     29    381       NON CYTOPLASMIC.
FT   TRANSMEM    382    405       
FT   TOPO_DOM    406    417       CYTOPLASMIC.
//

With PolyPhobius:

ID   LAMP1_HUMAN
FT   SIGNAL        1     28       
FT   REGION        1      9       N-REGION.
FT   REGION       10     22       H-REGION.
FT   REGION       23     28       C-REGION.
FT   TOPO_DOM     29    381       NON CYTOPLASMIC.
FT   TRANSMEM    382    405       
FT   TOPO_DOM    406    417       CYTOPLASMIC.
//

A4_HUMAN

ID   A4_HUMAN
FT   SIGNAL        1     17       
FT   REGION        1      1       N-REGION.
FT   REGION        2     12       H-REGION.
FT   REGION       13     17       C-REGION.
FT   TOPO_DOM     18    700       NON CYTOPLASMIC.
FT   TRANSMEM    701    723       
FT   TOPO_DOM    724    770       CYTOPLASMIC.
//
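
The feature tables above can be compared programmatically. The following is a minimal Python sketch and not part of Phobius itself: the helper names `parse_ft`, `helices` and `boundary_shifts` are hypothetical, and the two sample strings are only the first four feature lines of the BACR_HALSA tables shown above. It extracts the TRANSMEM records and reports how far PolyPhobius moves each helix boundary relative to Phobius.

```python
# First feature lines of the BACR_HALSA tables above (excerpt only).
PHOBIUS = """\
FT   TOPO_DOM      1     22       NON CYTOPLASMIC.
FT   TRANSMEM     23     42
FT   TOPO_DOM     43     53       CYTOPLASMIC.
FT   TRANSMEM     54     76
"""

POLYPHOBIUS = """\
FT   TOPO_DOM      1     21       NON CYTOPLASMIC.
FT   TRANSMEM     22     43
FT   TOPO_DOM     44     54       CYTOPLASMIC.
FT   TRANSMEM     55     77
"""


def parse_ft(text):
    """Return (feature, start, end, note) tuples for every FT line."""
    features = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "FT":
            note = " ".join(fields[4:]).rstrip(".")
            features.append((fields[1], int(fields[2]), int(fields[3]), note))
    return features


def helices(features):
    """Keep only the (start, end) pairs of transmembrane segments."""
    return [(start, end) for kind, start, end, _ in features if kind == "TRANSMEM"]


def boundary_shifts(first, second):
    """Per-helix (start shift, end shift) of `second` relative to `first`."""
    return [(s2 - s1, e2 - e1) for (s1, e1), (s2, e2) in zip(first, second)]


shifts = boundary_shifts(helices(parse_ft(PHOBIUS)), helices(parse_ft(POLYPHOBIUS)))
print(shifts)
```

For this excerpt the boundaries differ by a single residue, which matches the impression from the full tables that the two methods disagree only marginally on BACR_HALSA.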

OCTOPUS & SPOCTOPUS

OCTOPUS uses a combination of hidden Markov models and neural networks to predict transmembrane regions. It was published in 2008 by Håkan Viklund and Arne Elofsson.

Reference: Original paper

OCTOPUS first creates a sequence profile by running BLAST with the input sequence. Neural networks then predict, for each residue, the propensity to be located in a transmembrane region or in certain structural patterns on either side of the membrane. The resulting propensities are fed to a hidden Markov model, which calculates the most likely topology.
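
The last step, turning per-residue propensities into a single most likely topology, can be illustrated with a toy Viterbi decoder. This is only a sketch under strong assumptions: the three states (i, M, o), the transition probabilities in `TRANS` and the propensity values are invented for illustration, whereas the real OCTOPUS model uses a much richer grammar and trained parameters.

```python
import math

STATES = ("i", "M", "o")  # inside, membrane, outside

# Hypothetical transition probabilities (not OCTOPUS's real parameters).
TRANS = {
    "i": {"i": 0.9, "M": 0.1},
    "M": {"M": 0.8, "i": 0.1, "o": 0.1},
    "o": {"o": 0.9, "M": 0.1},
}
START = {"i": 0.5, "M": 0.0, "o": 0.5}


def log(p):
    """Natural log that maps probability 0 to -infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(propensities):
    """Return the most likely state string.

    propensities: list of dicts mapping each state to a score in (0, 1],
    standing in for the neural network output per residue.
    """
    score = {s: log(START[s]) + log(propensities[0][s]) for s in STATES}
    back = []
    for column in propensities[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + log(TRANS[p].get(s, 0.0)))
            new_score[s] = score[prev] + log(TRANS[prev].get(s, 0.0)) + log(column[s])
            pointers[s] = prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


def column(best, p=0.8):
    """Hypothetical propensity column that favours one state."""
    rest = (1.0 - p) / 2
    return {s: (p if s == best else rest) for s in STATES}


profile = [column("o")] * 3 + [column("M")] * 4 + [column("i")] * 3
print(viterbi(profile))
```

The point of the HMM layer is that it enforces a consistent grammar: a single residue with a weak membrane propensity inside an otherwise "outside" stretch is not called as a helix, because entering and leaving the membrane state costs more than the residue gains.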

SPOCTOPUS extends OCTOPUS with a preprocessor that uses a neural network to assess the probability that the first 70 residues of the input sequence contain a signal peptide. If this probability is high enough, a hidden Markov model is used to determine the exact extent of the signal region.

No transmembrane or signal regions were predicted for Aspartoacylase.


BACR_HALSA

OCTOPUS predicted topology:
oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM
MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiii
MMMMMMMMMMMMMMMMMMMMMoooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii
iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii
iiiiiiiiiiiiiiiiiiiiii
SPOCTOPUS predicted topology:
oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM
MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiM
MMMMMMMMMMMMMMMMMMMMooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii
iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii
iiiiiiiiiiiiiiiiiiiiii

RET4_HUMAN

OCTOPUS predicted topology:
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooo
SPOCTOPUS predicted topology:
nnnnSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooo

INSL5_HUMAN

OCTOPUS predicted topology:
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooo
SPOCTOPUS predicted topology:
nnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooo

LAMP1_HUMAN

OCTOPUS predicted topology:
iiiiiiiiiMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii
SPOCTOPUS predicted topology:
nnnnnnnnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii

A4_HUMAN

OCTOPUS predicted topology:
ooooRRRRRRoooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM
Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
SPOCTOPUS predicted topology:
nnnSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM
Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
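
Topology strings like the ones above are hard to read at a glance. A minimal sketch for converting them into residue ranges follows; the function name `segments` is hypothetical, and the example string is built with assumed run lengths (4 n, 14 S, 183 o) to resemble the SPOCTOPUS result for RET4_HUMAN rather than copied from it. In the outputs above, M marks membrane residues, i and o the two sides of the membrane, S and n the signal peptide, and R the reentrant region mentioned in the analysis.

```python
from itertools import groupby


def segments(topology):
    """Run-length encode a topology string into (label, start, end), 1-based."""
    result = []
    position = 1
    for label, run in groupby(topology):
        length = len(list(run))
        result.append((label, position, position + length - 1))
        position += length
    return result


# Assumed run lengths, chosen to resemble the RET4_HUMAN SPOCTOPUS output.
example = "n" * 4 + "S" * 14 + "o" * 183
for label, start, end in segments(example):
    print(f"{label}\t{start}\t{end}")
```

For multi-line output, the lines have to be joined into one string before calling `segments`, since the 60-character line breaks carry no meaning.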

SignalP

SignalP is a method for the detection of signal peptides. It was first published in 1997 by Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne.

Reference: Original paper, current version

SignalP comes in two flavours: one using a neural network, the other using a hidden Markov model. It can discriminate between cleaved and uncleaved signal peptides and supports both prokaryotic and eukaryotic input.

Input: A protein sequence.

Neither flavour detected any signal sequence in Aspartoacylase.

Aspartoacylase: HMM

ASPA Plot.hmm.1.gif

Aspartoacylase: Neural Network

ASPA Plot.nn.1.gif


BACR_HALSA

Neural Network:

BACR_HALSA            length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    16       0.220   0.32   NO
  max. Y    39       0.196   0.33   NO
  max. S    31       0.970   0.87   YES
  mean S     1-38    0.426   0.48   NO
       D     1-38    0.311   0.43   NO
# Most likely cleavage site between pos. 38 and 39: GTL-YF

HMM:

Prediction: Signal anchor
Signal peptide probability: 0.017
Signal anchor probability: 0.859
Max cleavage site probability: 0.004 between pos. 15 and 16

RET4_HUMAN

Neural Network:

RET4_HUMAN            length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    19       0.929   0.32   YES
  max. Y    19       0.901   0.33   YES
  max. S     1       0.994   0.87   YES
  mean S     1-18    0.938   0.48   YES
       D     1-18    0.920   0.43   YES
# Most likely cleavage site between pos. 18 and 19: GRA-ER

HMM:

RET4_HUMAN
Prediction: Signal peptide
Signal peptide probability: 1.000
Signal anchor probability: 0.000
Max cleavage site probability: 0.979 between pos. 18 and 19


INSL5_HUMAN

Neural Network:

INSL5_HUMAN           length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    23       0.855   0.32   YES
  max. Y    23       0.778   0.33   YES
  max. S    13       0.987   0.87   YES
  mean S     1-22    0.852   0.48   YES
       D     1-22    0.815   0.43   YES
# Most likely cleavage site between pos. 22 and 23: VRS-KE

HMM:

INSL5_HUMAN
Prediction: Signal peptide
Signal peptide probability: 0.999
Signal anchor probability: 0.000
Max cleavage site probability: 0.911 between pos. 22 and 23


LAMP1_HUMAN

Neural Network:

LAMP1_HUMAN           length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    29       0.978   0.32   YES
  max. Y    29       0.903   0.33   YES
  max. S    19       0.999   0.87   YES
  mean S     1-28    0.960   0.48   YES
       D     1-28    0.932   0.43   YES
# Most likely cleavage site between pos. 28 and 29: ASA-AM

HMM:

LAMP1_HUMAN
Prediction: Signal peptide
Signal peptide probability: 1.000
Signal anchor probability: 0.000
Max cleavage site probability: 0.847 between pos. 28 and 29


A4_HUMAN

Neural Network:

A4_HUMAN              length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    18       0.891   0.32   YES
  max. Y    18       0.850   0.33   YES
  max. S     2       0.992   0.87   YES
  mean S     1-17    0.967   0.48   YES
       D     1-17    0.909   0.43   YES
# Most likely cleavage site between pos. 17 and 18: ARA-LE

HMM:

A4_HUMAN
Prediction: Signal peptide
Signal peptide probability: 1.000
Signal anchor probability: 0.000
Max cleavage site probability: 0.993 between pos. 17 and 18
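
The neural network tables above list five measures, each with its own cutoff. The sketch below parses such a table and reports which measures exceed their cutoff; it is an illustration only, with a hypothetical helper name `parse_nn_table`, and it assumes the column layout shown above (measure, position, value, cutoff, verdict). The sample is the BACR_HALSA table from this section.

```python
BACR_NN = """\
BACR_HALSA            length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    16       0.220   0.32   NO
  max. Y    39       0.196   0.33   NO
  max. S    31       0.970   0.87   YES
  mean S     1-38    0.426   0.48   NO
       D     1-38    0.311   0.43   NO
# Most likely cleavage site between pos. 38 and 39: GTL-YF
"""


def parse_nn_table(text):
    """Map each measure name to (position, value, cutoff, exceeds_cutoff)."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows end with the YES/NO verdict; headers and comments do not.
        if len(fields) >= 5 and fields[-1] in ("YES", "NO"):
            value = float(fields[-3])
            cutoff = float(fields[-2])
            measure = " ".join(fields[:-4])
            rows[measure] = (fields[-4], value, cutoff, value > cutoff)
    return rows


table = parse_nn_table(BACR_NN)
passed = [name for name, row in table.items() if row[3]]
print(passed)
```

For BACR_HALSA only the max. S score exceeds its cutoff, whereas all five measures do so for the four proteins with a genuine signal peptide.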

TargetP

TargetP is a tool for predicting the subcellular location of proteins, based on targeting signals in their sequence. It was published in 2000 by Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne.

Reference: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne. J. Mol. Biol., 300: 1005-1016, 2000.

TargetP confines its analysis to the N-terminal part of the sequence. It can discriminate between proteins destined for the mitochondrion, the chloroplast (plants only), the secretory pathway or another location.

The prediction for Aspartoacylase was "other location", which is plausible, as the enzyme is known to reside in the cytosol.

### targetp v1.1 prediction results ##################################
Number of query sequences:  1
Cleavage site predictions not included.
Using NON-PLANT networks.

Name                  Len            mTP     SP  other  Loc  RC
----------------------------------------------------------------------
sp_P45381_ACY2_HUMAN  313          0.073  0.109  0.898   _    2
----------------------------------------------------------------------
cutoff                             0.000  0.000  0.000

http://www.cbs.dtu.dk/services/TargetP-1.1/output.php


BACR_HALSA

Number of query sequences:  1
Cleavage site predictions not included.
Using NON-PLANT networks.

Name                  Len            mTP     SP  other  Loc  RC
----------------------------------------------------------------------
BACR_HALSA            262          0.019  0.897  0.562   S    4
----------------------------------------------------------------------
cutoff                             0.000  0.000  0.000

RET4_HUMAN

Number of query sequences:  1
Cleavage site predictions not included.
Using NON-PLANT networks.

Name                  Len            mTP     SP  other  Loc  RC
----------------------------------------------------------------------
RET4_HUMAN            201          0.242  0.928  0.020   S    2
----------------------------------------------------------------------
cutoff                             0.000  0.000  0.000

INSL5_HUMAN

Number of query sequences:  1
Cleavage site predictions not included.
Using NON-PLANT networks.

Name                  Len            mTP     SP  other  Loc  RC
----------------------------------------------------------------------
INSL5_HUMAN           135          0.074  0.899  0.037   S    1
----------------------------------------------------------------------
cutoff                             0.000  0.000  0.000

LAMP1_HUMAN

Number of query sequences:  1
Cleavage site predictions not included.
Using NON-PLANT networks.

Name                  Len            mTP     SP  other  Loc  RC
----------------------------------------------------------------------
LAMP1_HUMAN           417          0.043  0.953  0.017   S    1
----------------------------------------------------------------------
cutoff                             0.000  0.000  0.000

A4_HUMAN

Number of query sequences:  1
Cleavage site predictions not included.
Using NON-PLANT networks.

Name                  Len            mTP     SP  other  Loc  RC
----------------------------------------------------------------------
A4_HUMAN              770          0.035  0.937  0.084   S    1
----------------------------------------------------------------------
cutoff                             0.000  0.000  0.000
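
The Loc and RC columns in the tables above can be reproduced from the three scores. The sketch below assumes that Loc is simply the highest-scoring class and that the reliability class RC is derived from the gap between the two highest scores, with thresholds 0.8, 0.6, 0.4 and 0.2; these thresholds are an assumption taken from memory of the TargetP output description, not something stated on this page, although they do match all six results listed here. The function names are hypothetical.

```python
# Letter codes used in the Loc column of the non-plant output above.
LOC_CODES = {"mTP": "M", "SP": "S", "other": "_"}


def reliability_class(scores):
    """RC from the gap between the best and second-best score (assumed rule)."""
    ordered = sorted(scores.values(), reverse=True)
    gap = ordered[0] - ordered[1]
    if gap > 0.8:
        return 1
    if gap > 0.6:
        return 2
    if gap > 0.4:
        return 3
    if gap > 0.2:
        return 4
    return 5


def predict(scores):
    """Return (Loc code, RC) for a dict with mTP, SP and other scores."""
    best = max(scores, key=scores.get)
    return LOC_CODES[best], reliability_class(scores)


# Scores copied from the tables above.
results = {
    "ACY2_HUMAN": {"mTP": 0.073, "SP": 0.109, "other": 0.898},
    "BACR_HALSA": {"mTP": 0.019, "SP": 0.897, "other": 0.562},
    "RET4_HUMAN": {"mTP": 0.242, "SP": 0.928, "other": 0.020},
}
for name, scores in results.items():
    print(name, *predict(scores))
```

Read this way, the RC of 4 for BACR_HALSA signals that the "secretory pathway" call is far less certain than the calls for the other proteins, since the "other" score of 0.562 is not far behind.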


Analysis

BACR_HALSA

TM Prediction: TMHMM predicts one helix fewer than the other tools (it misses the one at ca. 216-237); other than that, all methods agree on 7 TM helices with insignificant differences. The PDB structure shows that this is correct.

Signal peptide prediction: SignalP predicts a signal peptide in NN mode and a signal anchor in HMM mode; according to the information we found on the protein, both predictions are faulty.

Target prediction: TargetP assigned this protein to the secretory pathway, which implies a signal peptide; this is not correct.

RET4_HUMAN

TM Prediction: Only OCTOPUS predicts a TM helix; this is a mis-identified signal sequence. The other programs predict no TM helices, which is correct.

Signal peptide prediction: Phobius, SPOCTOPUS and SignalP all predict a signal peptide with the cleavage site at position 18. According to UniProt, this is correct.

Target prediction: TargetP predicted that this protein carries a signal peptide and enters the secretory pathway; this is correct.

INSL5_HUMAN

TM Prediction: Only OCTOPUS predicts a transmembrane element, which is a mis-identified signal sequence.

Signal peptide prediction: Phobius predicts a signal sequence with the cleavage site at 22; SPOCTOPUS predicts the cleavage site at 23; SignalP predicts it to be between 22 and 23.

Target prediction: TargetP predicted that this protein carries a signal peptide and enters the secretory pathway; this is correct.

LAMP1_HUMAN

TM Prediction: TMHMM and OCTOPUS each detect two TM helices. The N-terminal one of these is identified as a signal sequence by SPOCTOPUS and Phobius.

Signal peptide prediction: Phobius, SPOCTOPUS and SignalP find a signal sequence with the cleavage site at 28-29. This is correct, according to UniProt.

Target prediction: TargetP predicted that this protein carries a signal peptide and enters the secretory pathway; since it is membrane-located, this is not correct.

A4_HUMAN

TM Prediction: All programs predict one TM helix from 701 to 723 (ending at 722 in the case of OCTOPUS and SPOCTOPUS). OCTOPUS additionally predicts one short reentrant region, which is a mis-identified signal sequence.

Signal peptide prediction: SPOCTOPUS, SignalP and Phobius all report a signal sequence with the cleavage site at 17-19. According to UniProt, this is correct.

Target prediction: TargetP predicted that this protein carries a signal peptide and enters the secretory pathway; this is wrong, as it is membrane-associated.

Prediction of GO terms

GOPET

GOPET is a tool aimed at automatically assigning Gene Ontology terms to proteins. It was published in 2006 by Arunachalam Vinayagam, Coral del Val, Falk Schubert, Roland Eils, Karl-Heinz Glatting, Sándor Suhai and Rainer König.

Reference: Paper

The input sequence is first BLASTed against a database of proteins with known GO terms; a support vector machine is then used to discriminate between correct and false terms.

Results for Aspartoacylase, all consistent with current knowledge of the enzyme:

GOid Aspect Confidence GO Term
GO:0016787 F 96% hydrolase activity
GO:0004046 F 82% aminoacylase activity
GO:0019807 F 82% aspartoacylase activity
GO:0016788 F 81% hydrolase activity, acting on ester bonds


Other proteins:

Aspa other goped.gif

Pfam

Pfam is a large database of protein families. It was established in 1998 at the Wellcome Trust Sanger Institute.

It comprises two databases: Pfam-A, a manually curated, high-quality database with a limited number of entries, and the much larger, automatically generated Pfam-B.

Reference: The Pfam protein families database: R.D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J.E. Pollington, O.L. Gavin, P. Gunesekaran, G. Ceric, K. Forslund, L. Holm, E.L. Sonnhammer, S.R. Eddy, A. Bateman

The result for Aspartoacylase is spot-on:

Aspa pfam significant.png


BACR_HALSA

Aspa bacr.png

RET4_HUMAN

Aspa ret4.png

INSL5_HUMAN

Aspa insl.png

LAMP1_HUMAN

Aspa lamp1.png

A4_HUMAN

Aspa a4.png

ProtFun 2.2

ProtFun is a program for ab initio protein function prediction. It was published in 2002 by Lars Juhl Jensen et al.

Reference: Paper Abstract

The software queries a number of existing prediction servers for a wide range of features, from isoelectric point to posttranslational modifications, and deduces the protein's function from this data.

Results for Aspartoacylase:

############## ProtFun 2.2 predictions ##############

>sp_P45381_A

# Functional category                  Prob     Odds
  Amino_acid_biosynthesis              0.071    3.233
  Biosynthesis_of_cofactors            0.144    2.003
  Cell_envelope                        0.033    0.535
  Cellular_processes                   0.137    1.875
  Central_intermediary_metabolism   => 0.334    5.309
  Energy_metabolism                    0.226    2.511
  Fatty_acid_metabolism                0.022    1.663
  Purines_and_pyrimidines              0.367    1.512
  Regulatory_functions                 0.021    0.128
  Replication_and_transcription        0.167    0.625
  Translation                          0.113    2.559
  Transport_and_binding                0.017    0.042

# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                            => 0.703    2.454
  Nonenzyme                            0.297    0.416

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.111    0.534
  Transferase    (EC 2.-.-.-)          0.202    0.585
  Hydrolase      (EC 3.-.-.-)          0.115    0.363
  Lyase          (EC 4.-.-.-)          0.031    0.662
  Isomerase      (EC 5.-.-.-)       => 0.084    2.637
  Ligase         (EC 6.-.-.-)          0.074    1.460

# Gene Ontology category               Prob     Odds
  Signal_transducer                    0.053    0.246
  Receptor                             0.004    0.024
  Hormone                              0.001    0.206
  Structural_protein                   0.001    0.041
  Transporter                          0.025    0.230
  Ion_channel                          0.015    0.257
  Voltage-gated_ion_channel            0.004    0.173
  Cation_channel                       0.011    0.234
  Transcription                        0.100    0.785
  Transcription_regulation             0.039    0.313
  Stress_response                      0.010    0.117
  Immune_response                      0.061    0.720
  Growth_factor                        0.006    0.450
  Metal_ion_transport                  0.009    0.020


Other proteins:

############## ProtFun 2.2 predictions ##############

>LAMP1_HUMAN

# Functional category                  Prob     Odds
  Amino_acid_biosynthesis              0.011    0.484
  Biosynthesis_of_cofactors            0.053    0.735
  Cell_envelope                     => 0.804   13.186
  Cellular_processes                   0.027    0.373
  Central_intermediary_metabolism      0.138    2.188
  Energy_metabolism                    0.037    0.411
  Fatty_acid_metabolism                0.016    1.265
  Purines_and_pyrimidines              0.533    2.195
  Regulatory_functions                 0.015    0.090
  Replication_and_transcription        0.019    0.073
  Translation                          0.027    0.613
  Transport_and_binding                0.834    2.033

# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                               0.276    0.965
  Nonenzyme                         => 0.724    1.014

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.039    0.187
  Transferase    (EC 2.-.-.-)          0.046    0.134
  Hydrolase      (EC 3.-.-.-)          0.058    0.184
  Lyase          (EC 4.-.-.-)          0.020    0.430
  Isomerase      (EC 5.-.-.-)          0.010    0.321
  Ligase         (EC 6.-.-.-)          0.017    0.326

# Gene Ontology category               Prob     Odds
  Signal_transducer                    0.396    1.849
  Receptor                             0.282    1.659
  Hormone                              0.001    0.206
  Structural_protein                   0.011    0.408
  Transporter                          0.024    0.222
  Ion_channel                          0.008    0.147
  Voltage-gated_ion_channel            0.002    0.111
  Cation_channel                       0.010    0.215
  Transcription                        0.032    0.247
  Transcription_regulation             0.018    0.142
  Stress_response                      0.246    2.795
  Immune_response                   => 0.371    4.368
  Growth_factor                        0.013    0.956
  Metal_ion_transport                  0.009    0.020

//


>RET4_HUMAN

# Functional category                  Prob     Odds
  Amino_acid_biosynthesis              0.017    0.751
  Biosynthesis_of_cofactors            0.044    0.610
  Cell_envelope                     => 0.804   13.186
  Cellular_processes                   0.075    1.021
  Central_intermediary_metabolism      0.197    3.128
  Energy_metabolism                    0.043    0.475
  Fatty_acid_metabolism                0.016    1.265
  Purines_and_pyrimidines              0.275    1.131
  Regulatory_functions                 0.013    0.080
  Replication_and_transcription        0.022    0.084
  Translation                          0.032    0.721
  Transport_and_binding                0.800    1.951

# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                            => 0.544    1.900
  Nonenzyme                            0.456    0.639

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.095    0.458
  Transferase    (EC 2.-.-.-)          0.038    0.109
  Hydrolase      (EC 3.-.-.-)          0.235    0.742
  Lyase          (EC 4.-.-.-)       => 0.059    1.264
  Isomerase      (EC 5.-.-.-)          0.010    0.321
  Ligase         (EC 6.-.-.-)          0.017    0.326

# Gene Ontology category               Prob     Odds
  Signal_transducer                    0.202    0.942
  Receptor                             0.147    0.862
  Hormone                              0.004    0.667
  Structural_protein                   0.002    0.058
  Transporter                          0.025    0.232
  Ion_channel                          0.016    0.288
  Voltage-gated_ion_channel            0.003    0.148
  Cation_channel                       0.010    0.215
  Transcription                        0.027    0.207
  Transcription_regulation             0.025    0.196
  Stress_response                      0.161    1.829
  Immune_response                   => 0.239    2.813
  Growth_factor                        0.023    1.617
  Metal_ion_transport                  0.009    0.020

//


>BACR_HALSA

# Functional category                  Prob     Odds
  Amino_acid_biosynthesis              0.033    1.495
  Biosynthesis_of_cofactors            0.186    2.589
  Cell_envelope                        0.029    0.483
  Cellular_processes                   0.051    0.694
  Central_intermediary_metabolism      0.045    0.711
  Energy_metabolism                    0.138    1.537
  Fatty_acid_metabolism                0.016    1.265
  Purines_and_pyrimidines              0.302    1.244
  Regulatory_functions                 0.013    0.080
  Replication_and_transcription        0.019    0.073
  Translation                          0.059    1.339
  Transport_and_binding             => 0.791    1.929

# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                               0.199    0.696
  Nonenzyme                         => 0.801    1.122

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.114    0.549
  Transferase    (EC 2.-.-.-)          0.031    0.091
  Hydrolase      (EC 3.-.-.-)          0.057    0.180
  Lyase          (EC 4.-.-.-)          0.020    0.430
  Isomerase      (EC 5.-.-.-)          0.010    0.321
  Ligase         (EC 6.-.-.-)          0.017    0.326

# Gene Ontology category               Prob     Odds
  Signal_transducer                    0.258    1.205
  Receptor                             0.355    2.087
  Hormone                              0.001    0.206
  Structural_protein                   0.006    0.200
  Transporter                       => 0.440    4.036
  Ion_channel                          0.010    0.169
  Voltage-gated_ion_channel            0.004    0.172
  Cation_channel                       0.078    1.689
  Transcription                        0.026    0.205
  Transcription_regulation             0.028    0.226
  Stress_response                      0.012    0.139
  Immune_response                      0.011    0.128
  Growth_factor                        0.010    0.727
  Metal_ion_transport                  0.049    0.106

//


>INSL5_HUMAN

# Functional category                  Prob     Odds
  Amino_acid_biosynthesis              0.011    0.484
  Biosynthesis_of_cofactors            0.040    0.558
  Cell_envelope                     => 0.756   12.393
  Cellular_processes                   0.033    0.448
  Central_intermediary_metabolism      0.048    0.755
  Energy_metabolism                    0.036    0.397
  Fatty_acid_metabolism                0.016    1.265
  Purines_and_pyrimidines              0.144    0.592
  Regulatory_functions                 0.014    0.087
  Replication_and_transcription        0.020    0.075
  Translation                          0.032    0.735
  Transport_and_binding                0.834    2.033

# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                               0.209    0.729
  Nonenzyme                         => 0.791    1.109

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.056    0.268
  Transferase    (EC 2.-.-.-)          0.031    0.091
  Hydrolase      (EC 3.-.-.-)          0.062    0.195
  Lyase          (EC 4.-.-.-)          0.020    0.430
  Isomerase      (EC 5.-.-.-)          0.010    0.321
  Ligase         (EC 6.-.-.-)          0.017    0.327

# Gene Ontology category               Prob     Odds
  Signal_transducer                    0.374    1.746
  Receptor                             0.128    0.750
  Hormone                           => 0.247   37.936
  Structural_protein                   0.001    0.041
  Transporter                          0.025    0.228
  Ion_channel                          0.010    0.168
  Voltage-gated_ion_channel            0.003    0.131
  Cation_channel                       0.010    0.215
  Transcription                        0.054    0.425
  Transcription_regulation             0.091    0.724
  Stress_response                      0.099    1.128
  Immune_response                      0.178    2.090
  Growth_factor                        0.061    4.379
  Metal_ion_transport                  0.009    0.020

//


>A4_HUMAN

# Functional category                  Prob     Odds
  Amino_acid_biosynthesis              0.020    0.921
  Biosynthesis_of_cofactors            0.261    3.623
  Cell_envelope                     => 0.804   13.186
  Cellular_processes                   0.053    0.730
  Central_intermediary_metabolism      0.184    2.920
  Energy_metabolism                    0.023    0.259
  Fatty_acid_metabolism                0.016    1.265
  Purines_and_pyrimidines              0.417    1.716
  Regulatory_functions                 0.013    0.084
  Replication_and_transcription        0.029    0.109
  Translation                          0.027    0.613
  Transport_and_binding                0.827    2.016

# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                            => 0.392    1.368
  Nonenzyme                            0.608    0.852

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.024    0.114
  Transferase    (EC 2.-.-.-)          0.208    0.603
  Hydrolase      (EC 3.-.-.-)          0.190    0.600
  Lyase          (EC 4.-.-.-)          0.020    0.430
  Isomerase      (EC 5.-.-.-)          0.010    0.324
  Ligase         (EC 6.-.-.-)          0.048    0.946

# Gene Ontology category               Prob     Odds
  Signal_transducer                    0.126    0.586
  Receptor                             0.036    0.211
  Hormone                              0.001    0.206
  Structural_protein                => 0.034    1.205
  Transporter                          0.024    0.222
  Ion_channel                          0.009    0.162
  Voltage-gated_ion_channel            0.002    0.108
  Cation_channel                       0.010    0.215
  Transcription                        0.043    0.335
  Transcription_regulation             0.018    0.143
  Stress_response                      0.076    0.862
  Immune_response                      0.016    0.183
  Growth_factor                        0.005    0.372
  Metal_ion_transport                  0.009    0.020

//
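
To work with ProtFun results like those above, the text output can be parsed into records. The following is a minimal sketch with a hypothetical function name `parse_protfun`; it assumes the layout shown above (section headers starting with "# ", entries indented by two spaces, and "=>" marking the entry ProtFun highlights). The sample is an excerpt of the Aspartoacylase result. Dividing Prob by Odds gives roughly the same value for a category across all proteins listed here (about 0.29 for Enzyme), which suggests that the odds are expressed relative to a fixed prior; this is an inference from the numbers on this page, not a documented definition.

```python
SAMPLE = """\
# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                            => 0.703    2.454
  Nonenzyme                            0.297    0.416

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.111    0.534
  Transferase    (EC 2.-.-.-)          0.202    0.585
  Hydrolase      (EC 3.-.-.-)          0.115    0.363
  Lyase          (EC 4.-.-.-)          0.031    0.662
  Isomerase      (EC 5.-.-.-)       => 0.084    2.637
  Ligase         (EC 6.-.-.-)          0.074    1.460
"""


def parse_protfun(text):
    """Map section name to a list of (name, prob, odds, flagged) records."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("# "):
            # Header such as "# Enzyme class      Prob     Odds"
            current = line[2:].split("  ")[0].strip()
            sections[current] = []
        elif line.startswith("  ") and current is not None:
            flagged = "=>" in line
            fields = line.replace("=>", " ").split()
            prob, odds = float(fields[-2]), float(fields[-1])
            sections[current].append((" ".join(fields[:-2]), prob, odds, flagged))
    return sections


parsed = parse_protfun(SAMPLE)
flagged = [name for rows in parsed.values() for name, _, _, mark in rows if mark]
print(flagged)

# Implied prior for "Enzyme" (Prob / Odds), under the assumption stated above.
_, prob, odds, _ = parsed["Enzyme/nonenzyme"][0]
print(round(prob / odds, 3))
```

The excerpt also shows why the flagged entry should be read with care: Isomerase is highlighted with a probability of only 0.084, while the known hydrolase activity of the enzyme receives 0.115 and an odds value well below 1.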