Difference between revisions of "ASPA Sequence Based Predictions"
(→OCTOPUS & SPOCTOPUS) |
(→OCTOPUS & SPOCTOPUS) |
||
Line 1,199: | Line 1,199: | ||
MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiii |
MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiii |
||
MMMMMMMMMMMMMMMMMMMMMoooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii |
MMMMMMMMMMMMMMMMMMMMMoooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii |
||
+ | iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii |
||
+ | iiiiiiiiiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM |
||
+ | MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiM |
||
+ | MMMMMMMMMMMMMMMMMMMMooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii |
||
iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii |
iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii |
||
iiiiiiiiiiiiiiiiiiiiii</pre> |
iiiiiiiiiiiiiiiiiiiiii</pre> |
||
Line 1,205: | Line 1,212: | ||
<pre>OCTOPUS predicted topology: |
<pre>OCTOPUS predicted topology: |
||
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo |
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooo</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnnSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooo |
||
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
Line 1,212: | Line 1,225: | ||
<pre>OCTOPUS predicted topology: |
<pre>OCTOPUS predicted topology: |
||
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo |
iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooo</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooo |
||
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
ooooooooooooooo</pre> |
ooooooooooooooo</pre> |
||
Line 1,218: | Line 1,236: | ||
<pre>OCTOPUS predicted topology: |
<pre>OCTOPUS predicted topology: |
||
iiiiiiiiiMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooo |
iiiiiiiiiMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnnnnnnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooo |
||
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
Line 1,240: | Line 1,267: | ||
ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM |
ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM |
||
Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii</pre> |
Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii</pre> |
||
+ | |||
+ | <pre>SPOCTOPUS predicted topology: |
||
+ | nnnSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo |
||
+ | ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM |
||
+ | Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii |
||
+ | </pre> |
||
===SignalP=== |
===SignalP=== |
Revision as of 18:00, 13 June 2011
Prediction of Secondary Structure Elements
PsiPred
For a description of PsiPred, see Psipred.
# PSIPRED HFORMAT (PSIPRED V3.0)

Conf: 987522213466199993246776008999999984450000587389976339987971
Pred: CCCCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCCCCEEEEEECCHHHHHH
  AA: MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK
              10        20        30        40        50        60

Conf: 998788998878786647999999984999999999988199999997428994187898
Pred: CCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCEEEECCCCCC
  AA: CTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTS
              70        80        90       100       110       120

Conf: 999505864599448999999998762999737862048886301220027861499667
Pred: CCCCEEEEECCCCHHHHHHHHHHHHHCCCCCEEEEECCCCCCCCHHHHCCCCCCEEEEEC
  AA: NMGCTLILEDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVG
             130       140       150       160       170       180

Conf: 877898808999999999999998976406998899973479998113515579877700
Pred: CCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEEEEEECCCCCCCCCE
  AA: PQPQGVLRADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIA
             190       200       210       220       230       240

Conf: 552467669998546888832213699778518622057770372000011102000100
Pred: EEECCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCCEEEEECCCCHHCCCCHHHEECCE
  AA: AIIHPNLQDQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK
             250       260       270       280       290       300

Conf: 3544256113309
Pred: EEEEECCCEEECC
  AA: LTLNAKSIRCCLH
             310
JPred3
The original JPred server dates from 1998; JPred3, the version used here, was published in 2008 by Christian Cole, Jonathan D. Barber and Geoffrey J. Barton.
Reference: Original paper, current version
JPred3 uses the JNet 2.0 algorithm to make its predictions. This algorithm generates profiles using PSI-Blast (which is used to build a position-specific scoring matrix) and HMMer (which is used to construct HMM profiles). Both the position-specific scoring matrix and the HMM profiles are used to predict secondary structure and solvent accessibility.
Input: A protein sequence or a pre-made MSA; a PDB database is also needed, but it is provided by the JPred3 server.
MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYID
------------EEEEEEEE------HHHHHHHHHH---------EEEEEEEE-HHHHHH-----H

CDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLILEDSR
---------------------HHHHHHHHHHHHHH-------EEEEEE-----------EEEE---

NNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLRADILDQMRKM
-HHHHHHHHHHHH------EEEEEE---------HHEE----EEEEE---------HHHHHHHHHH

IKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQDQDWKPLHPGDPMFLT
HHHHHHHHHHH----------EEEEEEEEEE----------EEEE----------------HHE--

LDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSIRCCLH
----EEEE----EEEEEEE-----HHH-HHHHHHHHEEE-----EEEE-
DSSP
DSSP (Define Secondary Structure of Proteins) is a program for secondary structure assignment and was published in 1983 by Wolfgang Kabsch and Chris Sander. Reference: Original paper
DSSP does not predict secondary structure from the amino acid sequence; instead, it assigns the secondary structure from a known 3D structure (a PDB file). To this end, DSSP examines the phi and psi angles and the C alpha positions in the protein backbone as well as the H-bonds present in the structure; the H-bonds are used to define "n-turns", which are H-bonds between the CO and NH groups of residues with a sequence separation of 3-5 residues, and "bridges" with greater sequence separations. Repeating 4-turns identify helices, and repeating bridges identify beta sheets.
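The turn-to-helix rule described above can be illustrated with a minimal sketch. It assumes that the backbone H-bonds have already been determined and are passed in as a set of index pairs `(i, j)`, meaning the CO group of residue `i` is bonded to the NH group of residue `j`. The function name `assign_helices` and this input format are illustrative assumptions, not part of the DSSP program; the real algorithm also handles 3- and 5-turns, bridges and bends.

```python
def assign_helices(n_residues, hbonds, n=4):
    """Mark helix residues from consecutive n-turns (simplified DSSP-style rule).

    hbonds: set of (i, j) pairs, 0-based, CO(i) -> NH(j)  (assumed input format).
    A residue i starts an n-turn if an H-bond (i, i + n) exists.
    Two consecutive n-turns at i-1 and i define a minimal helix
    covering residues i .. i+n-1.
    """
    turns = [(i, i + n) in hbonds for i in range(n_residues)]
    state = ["-"] * n_residues
    for i in range(1, n_residues - n):
        if turns[i - 1] and turns[i]:
            for k in range(i, i + n):
                state[k] = "H"
    return "".join(state)


# Toy example: three consecutive 4-turns starting at residues 1, 2 and 3.
print(assign_helices(12, {(1, 5), (2, 6), (3, 7)}))
```

Overlapping minimal helices merge into one longer helix, which is why three consecutive turns yield a single five-residue stretch in the toy example.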
Input: A 3D structure (a PDB file, ID 2o53 in our case)
Output: (from [1])
H = alpha helix
B = residue in isolated beta-bridge
E = extended strand, participates in beta ladder
G = 3-helix (3/10 helix)
I = 5 helix (pi helix)
T = hydrogen bonded turn
S = bend
The results differ from those of the two secondary structure predictors, as the PDB file contains a dimer, whereas the Uniprot sequence only contains one chain (which is sensible, since both chains are essentially identical).
The assignment shows slight differences between the two chains; we assume that these stem from slight differences in the actual 3D structure of the two chains as well as from H-bonds between them.
10 20 30 40 50 60 | | | | | | 1 - 60 EHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYIDCD 1 - 60 SSSSSS TTTT HHHHHHHHHHTT 333 TT SSSSSST HHHHHTTTT TTT 1 - 60 1 - 60 AA AA A AA AAA AA AAAA A A AA AAAA AAAA 70 80 90 100 110 120 | | | | | | 61 - 120 LNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLIL 61 - 120 333 THHHHTT TTT HHHHHHHHHHHHH TTTTTT TSSSSSSS TTT SSSSSS 61 - 120 ** * * ** ** * * 61 - 120 A AAA AAAAAAAAA A A AA AAA AA A 130 140 150 160 170 180 | | | | | | 121 - 180 EDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLR 121 - 180 T TT HHHHHHHHHHHHHHTTT SSSSS TT 3333TTSSSSSSSST TT 121 - 180 * ** ** 121 - 180 A A A AA AAA AAAA A AAAAAA AA A A A 190 200 210 220 230 240 | | | | | | 181 - 240 ADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQ 181 - 240 HHHHHHHHHHHHHHHHHHHHHHTT SSSSSSSSSSSS TTT TSS TTTT 181 - 240 181 - 240 A AA AA AA AA AAAA AA A A A AAA A AAAAA A AA 250 260 270 280 290 300 | | | | | | 241 - 300 DQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSI 241 - 300 T TTT TTTSSSS TT SSS TTT SSSTTT 333TTTT TSSSSSSSSSSS 241 - 300 241 - 300 AA AA AAAAA A AAAAAA AAAAA A AAA A AAA A A 301 - 302 RC 301 - 302 301 - 302 301 - 302 AA 310 320 330 340 350 360 | | | | | | 303 - 362 EHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKKCTRYIDCD 303 - 362 SSSSSS TTTT HHHHHHHHHHHH 3333 TT SSSSSST HHHHHTT T TTT 303 - 362 303 - 362 AA AA AA AAA AA AAAA A A A AA AAAA AAAA 370 380 390 400 410 420 | | | | | | 363 - 422 LNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTSNMGCTLIL 363 - 422 333 THHHHTT TTT HHHHHHHHHHHHH TTTTTT TSSSSSSS TTT SSSSSS 363 - 422 ** *** * ** **** 363 - 422 A AAA AAAAAAAAA A A AA AAA AA A 430 440 450 460 470 480 | | | | | | 423 - 482 EDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLR 423 - 482 T TT HHHHHHHHHHHHHHTTT SSSSS TTTT 3333TTSSSSSSSS TT 423 - 482 423 - 482 A A A AA AA AAAA A AAAAAA AA A A 490 500 510 520 530 540 | | | | | | 483 - 542 ADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQ 
483 - 542 HHHHHHHHHHHHHHHHHHHHHHTT SSSSSSSSSSSS TTT TSS TTTT 483 - 542 *** * 483 - 542 A AA AA A AA A AA AA A A A AAA A AAAAA A A AA 550 560 570 580 590 600 | | | | | | 543 - 602 DQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTKLTLNAKSI 543 - 602 T TTT TTTSSSS TT SSS TTT SSSTTT THHHHTT TSSSSSSSSSSS 543 - 602 * 543 - 602 AA AA AAAAA A AAAAAA AAAAA A AAA A AAAA A AA 603 - 604 RC 603 - 604 603 - 604 603 - 604 AA Clearly solvent accessible: A; involved in symmetry contacts: *
All in all, the two prediction methods Psipred and JPred3 did a good job; they managed to predict most of the main secondary structure elements, with only minor variations in length and position of the individual helices/sheets and very minor variations between each other. A somewhat more detailed result from DSSP is to be expected, as it has considerably better information to work with and merely assigns the secondary structure instead of actually predicting it.
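To put a number on "very minor variations between each other", the two prediction strings can be compared position by position. The sketch below is only an illustration: the function names are hypothetical, and it assumes that JPred3's `-` and Psipred's `C` both denote coil. The example uses the first 20 residues of the two outputs shown above.

```python
def normalize(pred):
    """Map JPred-style '-' to Psipred-style 'C' so both strings use H/E/C."""
    return pred.replace("-", "C")


def agreement(pred_a, pred_b):
    """Fraction of positions at which two secondary structure strings agree."""
    a, b = normalize(pred_a), normalize(pred_b)
    if len(a) != len(b):
        raise ValueError("predictions must have the same length")
    if not a:
        raise ValueError("predictions must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


# First 20 residues: Psipred (12 C, 6 E, 2 C) versus JPred3 (12 '-', 8 E).
psipred_start = "C" * 12 + "E" * 6 + "C" * 2
jpred_start = "-" * 12 + "E" * 8
print(agreement(psipred_start, jpred_start))
```

The same function could be applied to the full 313-residue strings; the DSSP assignment would first need to be mapped onto the same three-state alphabet.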
Prediction of disordered regions
DISOPRED
DISOPRED predicts native disorder in proteins. It was published in 2004 by Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF and Jones DT. Reference: [2]
DISOPRED uses linear support vector machines to predict disorder in a given protein sequence. A set of 750 proteins with high-quality structures was used as training data; to this end, PSI-Blast profiles were generated by aligning the training structures against a filtered database of protein structures. The resulting profiles were used to train the SVMs.
DISOPRED predictions for a false positive rate threshold of: 2%

conf: 999999999877640000000000000000000000000000000000000000000000
pred: **********..................................................
  AA: MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK
              10        20        30        40        50        60

conf: 000000000000000356777788777654200000000000000000000000000000
pred: ......................**....................................
  AA: CTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTS
              70        80        90       100       110       120

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: NMGCTLILEDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVG
             130       140       150       160       170       180

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: PQPQGVLRADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIA
             190       200       210       220       230       240

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: AIIHPNLQDQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK
             250       260       270       280       290       300

conf: 0000000000002
pred: .............
  AA: LTLNAKSIRCCLH
             310

Asterisks (*) represent disorder predictions and dots (.) prediction of order. The confidence estimates give a rough indication of the probability that each residue is disordered.
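The asterisk/dot notation can be turned into residue ranges with a few lines of code. This is a minimal sketch; the function name `disordered_regions` and the `offset` parameter (the sequence position of the first character in the string) are illustrative assumptions, not part of DISOPRED.

```python
def disordered_regions(pred, offset=1):
    """Return (start, end) residue ranges for runs of '*' in a DISOPRED pred line.

    offset is the sequence position of the first character of `pred`.
    """
    regions = []
    start = None
    for idx, symbol in enumerate(pred):
        if symbol == "*" and start is None:
            start = idx                      # a disordered run begins
        elif symbol != "*" and start is not None:
            regions.append((start + offset, idx - 1 + offset))
            start = None                     # the run ended at idx - 1
    if start is not None:                    # run extends to the end of the line
        regions.append((start + offset, len(pred) - 1 + offset))
    return regions


# First two pred lines of the output above (residues 1-60 and 61-120).
print(disordered_regions("*" * 10 + "." * 50))
print(disordered_regions("." * 22 + "**" + "." * 36, offset=61))
```

Applied to the output above, this gives residues 1-10 and 83-84 as the only predicted disordered stretches.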
POODLE
POODLE (Prediction Of Order and Disorder by machine LEarning) is a series of programs published between 2005 and 2008. We used the latest variant, POODLE-I, which was published in 2008 by S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi.
Reference: S.Hirose, K.Shimizu, N.Inoue, S.Kanai and T.Noguchi, "Disordered region prediction by integrating POODLE series", CASP8 Proceedings 2008, 14-15.
Input: Protein amino acid sequence
POODLE-I is an integrated variant of other flavors of POODLE (-S and -L for short/long regions of disorder and -W for proteins that are mostly disordered) and several other tools like Psipred, JNet etc. It employs a rather involved workflow.
Custom-formatted output for Aspartoacylase:
POS 1 M T S C H I A E E H I -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.461 0.444 0.413 0.401 0.418 0.461 0.537 0.644 0.693 0.62 0.468 POS 12 Q K V A I F G G T H G -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.321 0.238 0.177 0.146 0.128 0.116 0.106 0.104 0.111 0.126 0.132 POS 23 N E L T G V F L V K H -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.131 0.118 0.098 0.073 0.053 0.041 0.036 0.035 0.035 0.036 0.036 POS 34 W L E N G A E I Q R T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.038 0.045 0.06 0.081 0.099 0.119 0.133 0.146 0.147 0.143 0.129 POS 45 G L E V K P F I T N P -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.111 0.09 0.073 0.062 0.054 0.047 0.039 0.033 0.033 0.037 0.041 POS 56 R A V K K C T R Y I D -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.043 0.047 0.054 0.062 0.068 0.071 0.073 0.07 0.067 0.069 0.075 POS 67 C D L N R I F D L E N -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.08 0.081 0.078 0.075 0.073 0.072 0.076 0.094 0.127 0.176 0.249 POS 78 L G K K M S E D L P Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.403 0.554 0.737 0.766 0.804 0.755 0.682 0.65 0.632 0.636 0.583 POS 89 E V R R A Q E I N H L -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.505 0.448 0.348 0.262 0.201 0.16 0.131 0.11 0.103 0.104 0.111 POS 100 F G P K D S E D S Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.116 0.117 0.108 0.089 0.067 0.049 0.039 0.034 0.033 0.035 POS 110 D I I F D L H N T T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.038 0.041 0.043 0.043 0.042 0.041 0.042 0.045 0.052 0.06 POS 120 S N M G C T L I L E -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.07 0.081 0.092 0.101 0.109 0.111 0.107 0.096 0.085 0.072 POS 130 D S R N N F L I Q M -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.06 0.051 0.046 0.04 0.036 0.033 0.032 0.031 0.031 0.031 POS 140 F H Y I K T S L A P -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.033 0.036 0.04 0.043 0.046 0.049 0.05 0.053 0.055 0.059 POS 150 L P C Y V Y L I E H -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.065 0.073 0.088 0.103 0.115 0.119 0.118 0.111 0.104 0.104 POS 160 P S L K Y A T T R S -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.121 0.147 0.19 0.229 0.264 0.263 0.245 0.196 0.149 0.098 POS 170 I A K Y P V G I E V -1 -1 -1 
-1 -1 -1 -1 -1 -1 -1 0.069 0.053 0.051 0.057 0.066 0.08 0.093 0.102 0.103 0.099 POS 180 G P Q P Q G V L R A -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.096 0.094 0.095 0.095 0.099 0.098 0.099 0.095 0.094 0.086 POS 190 D I L D Q M R K M I -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.074 0.059 0.048 0.04 0.038 0.038 0.038 0.039 0.04 0.043 POS 200 K H A L D F I H H F -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.046 0.049 0.051 0.052 0.056 0.064 0.077 0.092 0.112 0.142 POS 210 N E G K E F P P C A -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.17 0.198 0.21 0.311 0.281 0.248 0.105 0.084 0.072 0.071 POS 220 I E V Y K I I E K V -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.069 0.065 0.06 0.056 0.052 0.054 0.062 0.076 0.105 0.141 POS 230 D Y P R D E N G E I -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.176 0.203 0.224 0.227 0.217 0.209 0.228 0.248 0.271 0.282 POS 240 A A I I H P N L Q D -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.289 0.269 0.24 0.208 0.188 0.169 0.155 0.152 0.167 0.193 POS 250 Q D W K P L H P G D -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.222 0.236 0.235 0.21 0.175 0.136 0.11 0.097 0.099 0.104 POS 260 P M F L T L D G K T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.107 0.108 0.104 0.095 0.084 0.077 0.073 0.082 0.102 0.125 POS 270 I P L G G D C T V Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.144 0.162 0.169 0.166 0.156 0.149 0.133 0.117 0.099 0.089 POS 280 P V F V N E A A Y Y -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.077 0.072 0.067 0.064 0.066 0.082 0.122 0.184 0.241 0.279 POS 290 E K K E A F A K T T -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.282 0.264 0.236 0.229 0.238 0.257 0.263 0.252 0.231 0.222 POS 300 K L T L N A K S I R -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0.246 0.278 0.305 0.31 0.303 0.277 0.263 0.371 0.382 0.38 POS 310 C C L H -1 -1 -1 -1 0.348 0.51 0.496 0.489
IUPRED
IUPRED is a program for the prediction of intrinsically unstructured regions in proteins. It was published in 2005 by Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon.
Reference: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon, Bioinformatics (2005) 21, 3433-3434.
IUPRED predicts disordered regions by estimating the capacity of the amino acid chain to form stabilizing contacts. The underlying assumption is that proteins intrinsically unable to do so have distinct sequences that can be identified via their unfavorable energy values. To this end a 20x20 predictor matrix was calculated from a set of globular proteins with known structure. IUPRED uses this matrix to derive a tendency to be intrinsically unstructured from the amino acid composition alone.
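The idea of estimating an energy from composition alone can be sketched as follows. This is only an illustration under strong assumptions: the two-letter alphabet and the values in `TOY_MATRIX` are invented for the example and are not the 20x20 matrix that IUPRED uses, and the fixed symmetric window is a simplification. For each residue, the sketch computes the composition of its sequence neighbourhood and weights it with the matrix row of the residue's own type.

```python
from collections import Counter

# Hypothetical 2x2 stand-in for the 20x20 predictor matrix (made-up values).
TOY_MATRIX = {
    "A": {"A": -1.0, "P": 0.5},
    "P": {"A": 0.5, "P": 1.0},
}


def estimated_energies(sequence, matrix, window=1):
    """Per-residue energy estimate from the composition of neighbouring residues.

    For residue i of type a, the estimate is sum_b matrix[a][b] * f_b, where
    f_b is the fraction of type b among the residues within `window`
    positions on either side of i (residue i itself is excluded).
    """
    energies = []
    for i, residue in enumerate(sequence):
        left = sequence[max(0, i - window):i]
        right = sequence[i + 1:i + 1 + window]
        neighbours = left + right
        if not neighbours:
            energies.append(0.0)  # no neighbourhood to evaluate
            continue
        counts = Counter(neighbours)
        energy = sum(matrix[residue][b] * n / len(neighbours)
                     for b, n in counts.items())
        energies.append(energy)
    return energies


print(estimated_energies("AAPA", TOY_MATRIX))
```

In this toy setting, more positive values stand for less favourable estimated contacts, which corresponds to a higher tendency to be unstructured.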
Input: An amino acid sequence.
IUPRED comes in three flavors: Long Disorder, which specializes in finding long stretches of disorder, Short Disorder, which does the same for short stretches of disorder, and structured regions, which predicts regions lacking disorder.
Long Disorder
POS 1 M T S C H I A E E H I 0.3215 0.3426 0.2817 0.2783 0.2064 0.1275 0.1554 0.1823 0.2094 0.2364 0.2575 POS 12 Q K V A I F G G T H G 0.2988 0.3087 0.2364 0.3215 0.3149 0.3321 0.2609 0.1823 0.1275 0.1206 0.1759 POS 23 N E L T G V F L V K H 0.1028 0.0676 0.1070 0.1298 0.1881 0.2575 0.2715 0.1969 0.2034 0.2034 0.2064 POS 34 W L E N G A E I Q R T 0.1942 0.1206 0.1914 0.1399 0.1373 0.2064 0.2002 0.1969 0.2541 0.2715 0.2951 POS 45 G L E V K P F I T N P 0.3840 0.4256 0.3460 0.3321 0.3286 0.2609 0.2503 0.3249 0.2292 0.1583 0.1611 POS 56 R A V K K C T R Y I D 0.0985 0.1554 0.0929 0.1373 0.1424 0.0765 0.0749 0.1229 0.0749 0.0780 0.0719 POS 67 C D L N R I F D L E N 0.0424 0.0506 0.0734 0.0734 0.0605 0.1048 0.1115 0.1184 0.1229 0.2064 0.1323 POS 78 L G K K M S E D L P Y 0.2258 0.1643 0.2364 0.2292 0.2002 0.2884 0.4087 0.3215 0.4119 0.3948 0.3053 POS 89 E V R R A Q E I N H L 0.2849 0.2849 0.3149 0.3182 0.3631 0.3667 0.3667 0.3631 0.4441 0.3286 0.4220 POS 100 F G P K D S E D S Y 0.3215 0.3053 0.1969 0.2034 0.1611 0.1501 0.1476 0.2224 0.2164 0.2164 POS 110 D I I F D L H N T T 0.3053 0.3704 0.3704 0.2609 0.2680 0.1823 0.1184 0.0662 0.0690 0.0734 POS 120 S N M G C T L I L E 0.1229 0.1184 0.1914 0.2817 0.2849 0.2034 0.2064 0.2193 0.1759 0.0985 POS 130 D S R N N F L I Q M 0.0948 0.0581 0.0327 0.0398 0.0414 0.0719 0.0414 0.0581 0.1092 0.1092 POS 140 F H Y I K T S L A P 0.1162 0.0662 0.0398 0.0269 0.0163 0.0105 0.0115 0.0184 0.0275 0.0300 POS 150 L P C Y V Y L I E H 0.0372 0.0433 0.0424 0.0405 0.0581 0.0618 0.0618 0.0618 0.1007 0.0749 POS 160 P S L K Y A T T R S 0.0543 0.0870 0.0443 0.0888 0.1007 0.1424 0.1449 0.2292 0.2470 0.2328 POS 170 I A K Y P V G I E V 0.2575 0.2503 0.2752 0.3667 0.3704 0.3948 0.3426 0.3356 0.3019 0.3149 POS 180 G P Q P Q G V L R A 0.2328 0.2328 0.2752 0.2951 0.3321 0.3087 0.3631 0.3182 0.3182 0.2918 POS 190 D I L D Q M R K M I 0.3494 0.3182 0.2164 0.2129 0.1115 0.0605 0.0592 0.0870 0.0734 0.0780 POS 200 K H A L D F I H H F 0.1048 0.0967 0.1501 0.2364 0.1349 
0.1399 0.1942 0.1206 0.1048 0.0817 POS 210 N E G K E F P P C A 0.1449 0.1048 0.0618 0.0734 0.0704 0.0389 0.0835 0.1349 0.0948 0.1028 POS 220 I E V Y K I I E K V 0.1115 0.1184 0.1092 0.1184 0.1323 0.1275 0.2129 0.2094 0.1229 0.1731 POS 230 D Y P R D E N G E I 0.1731 0.1759 0.1028 0.1476 0.2470 0.2609 0.2680 0.3631 0.3566 0.3740 POS 240 A A I I H P N L Q D 0.4476 0.3392 0.4256 0.4256 0.3460 0.3356 0.3392 0.3249 0.3392 0.3460 POS 250 Q D W K P L H P G D 0.3910 0.3215 0.2783 0.3631 0.3667 0.3774 0.3566 0.3392 0.4220 0.3321 POS 260 P M F L T L D G K T 0.3426 0.2541 0.2436 0.3426 0.3566 0.2470 0.3286 0.2680 0.1643 0.1852 POS 270 I P L G G D C T V Y 0.1298 0.0631 0.0543 0.1048 0.1731 0.1449 0.1881 0.1115 0.0646 0.0734 POS 280 P V F V N E A A Y Y 0.0690 0.1092 0.1048 0.1399 0.0765 0.0646 0.0581 0.1028 0.1007 0.1373 POS 290 E K K E A F A K T T 0.1449 0.1298 0.1184 0.2034 0.2364 0.2164 0.2002 0.1583 0.1823 0.1852 POS 300 K L T L N A K S I R 0.1881 0.1184 0.1184 0.1184 0.0888 0.0851 0.1349 0.1349 0.1137 0.0870 POS 310 C C L H 0.0631 0.0473 0.0734 0.0483
Short Disorder
POS 1 M T S C H I A E E H I 0.8886 0.7772 0.7418 0.6984 0.5992 0.5296 0.4149 0.2748 0.2333 0.1921 0.1566 POS 12 Q K V A I F G G T H G 0.1805 0.1844 0.1732 0.2700 0.1766 0.2531 0.2913 0.2080 0.1292 0.0832 0.0965 POS 23 N E L T G V F L V K H 0.0909 0.0991 0.0935 0.0660 0.1088 0.1766 0.1766 0.1495 0.1566 0.0991 0.1041 POS 34 W L E N G A E I Q R T 0.1041 0.0909 0.1456 0.0935 0.0935 0.0909 0.1416 0.2385 0.2080 0.1416 0.1322 POS 45 G L E V K P F I T N P 0.1844 0.2963 0.2820 0.2558 0.1921 0.2167 0.2041 0.1998 0.1921 0.1921 0.1380 POS 56 R A V K K C T R Y I D 0.0935 0.1495 0.0935 0.1416 0.0935 0.0771 0.1322 0.1416 0.0935 0.0965 0.0464 POS 67 C D L N R I F D L E N 0.0490 0.0567 0.0542 0.0554 0.0567 0.0935 0.0813 0.0858 0.1292 0.1958 0.1322 POS 78 L G K K M S E D L P Y 0.2385 0.1732 0.2558 0.1958 0.1878 0.2432 0.2963 0.3184 0.4149 0.3359 0.3399 POS 89 E V R R A Q E I N H L 0.4116 0.3491 0.2820 0.2913 0.3535 0.3399 0.3456 0.3399 0.4333 0.4078 0.4825 POS 100 F G P K D S E D S Y 0.3992 0.4651 0.4149 0.3578 0.3005 0.2820 0.1878 0.2483 0.2385 0.2385 POS 110 D I I F D L H N T T 0.2963 0.3668 0.3630 0.2865 0.2963 0.2209 0.2122 0.1292 0.0789 0.0441 POS 120 S N M G C T L I L E 0.0771 0.0771 0.1117 0.1635 0.2333 0.2209 0.1878 0.1292 0.0771 0.0858 POS 130 D S R N N F L I Q M 0.0660 0.0336 0.0316 0.0226 0.0102 0.0179 0.0179 0.0327 0.0279 0.0363 POS 140 F H Y I K T S L A P 0.0387 0.0173 0.0218 0.0128 0.0078 0.0044 0.0055 0.0059 0.0055 0.0055 POS 150 L P C Y V Y L I E H 0.0070 0.0167 0.0194 0.0200 0.0387 0.0212 0.0160 0.0157 0.0327 0.0455 POS 160 P S L K Y A T T R S 0.0387 0.0414 0.0279 0.0441 0.0455 0.0884 0.0991 0.1602 0.1635 0.1602 POS 170 I A K Y P V G I E V 0.1088 0.0965 0.1150 0.1878 0.2167 0.3146 0.3535 0.2865 0.2080 0.2080 POS 180 G P Q P Q G V L R A 0.1766 0.2748 0.2333 0.1667 0.2531 0.2385 0.2748 0.2748 0.3630 0.3184 POS 190 D I L D Q M R K M I 0.3146 0.3096 0.2748 0.2292 0.1292 0.1292 0.0744 0.0701 0.1150 0.1117 POS 200 K H A L D F I H H F 0.0723 0.0701 0.1150 0.1732 0.1602 
0.1602 0.1205 0.1456 0.1766 0.1532 POS 210 N E G K E F P P C A 0.1958 0.1322 0.1416 0.1178 0.1205 0.1088 0.1205 0.1240 0.1380 0.1322 POS 220 I E V Y K I I E K V 0.1566 0.1698 0.1060 0.1266 0.1178 0.1240 0.2080 0.1766 0.1566 0.2292 POS 230 D Y P R D E N G E I 0.1878 0.2432 0.2041 0.2041 0.2122 0.2122 0.3225 0.3992 0.3005 0.3359 POS 240 A A I I H P N L Q D 0.4149 0.4245 0.5173 0.4078 0.4116 0.4245 0.3359 0.3263 0.3578 0.3399 POS 250 Q D W K P L H P G D 0.4282 0.4825 0.4703 0.4651 0.4600 0.4600 0.3399 0.3578 0.4333 0.4078 POS 260 P M F L T L D G K T 0.3885 0.2913 0.3053 0.3096 0.3184 0.2820 0.3630 0.3005 0.2657 0.1998 POS 270 I P L G G D C T V Y 0.1205 0.1205 0.1041 0.1018 0.1117 0.1150 0.1844 0.1380 0.1117 0.0701 POS 280 P V F V N E A A Y Y 0.0425 0.0744 0.0567 0.0909 0.0965 0.0744 0.0387 0.0464 0.0441 0.0607 POS 290 E K K E A F A K T T 0.0991 0.0771 0.0660 0.1041 0.0935 0.0935 0.0660 0.0832 0.1041 0.1041 POS 300 K L T L N A K S I R 0.1698 0.1041 0.0660 0.0441 0.0405 0.0701 0.1602 0.2333 0.2865 0.3456 POS 310 C C L H 0.4037 0.4556 0.5802 0.6334
Structured Regions
IUPRED predicts one structured region comprising the whole input sequence.
Results
IUPRED predicts no significant disorder in Aspartoacylase. In long disorder mode, the disorder tendency stays below 0.5 for every residue. In short disorder mode, it exceeds 0.5 only for the first six and the last two residues of the sequence and, marginally, at residue 242 (0.5173); these stretches are negligible. The structured regions mode predicts one continuous structured region spanning the whole protein sequence. This makes sense when looking at the 3D structure: Aspartoacylase is a rather densely packed globular protein, which, according to the assumptions that IUPRED makes, has a strong tendency to form many inter-residue contacts and thereby to stabilize itself, markedly reducing the tendency for disorder.
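The 0.5 cut-off used in this reading of the results is easy to apply programmatically. The helper below is a minimal sketch (the function name is hypothetical); the example values are the short disorder scores for residues 1-11 listed above.

```python
def residues_above(scores, threshold=0.5, first_position=1):
    """Return the sequence positions whose disorder score exceeds the threshold."""
    return [first_position + i for i, score in enumerate(scores) if score > threshold]


# Short disorder scores for residues 1-11 (M T S C H I A E E H I).
short_disorder_start = [0.8886, 0.7772, 0.7418, 0.6984, 0.5992, 0.5296,
                        0.4149, 0.2748, 0.2333, 0.1921, 0.1566]
print(residues_above(short_disorder_start))
```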
Meta-Disorder
Meta-Disorder, as the name implies, employs a set of so-called orthogonal disorder predictors in order to combine their strengths and mitigate their weak points. It was published in 2009 by Avner Schlessinger, Marco Punta, Guy Yachdav, Laszlo Kajan and Burkhard Rost.
Reference: Paper
As with the previous methods, Meta-Disorder predicts disorder from the amino acid sequence alone; results from the predictors IUPRED, DISOPRED, NORSnet and Ucon are molded into one final result using a neural network.
Results for Aspartoacylase:
Number Residue NORSnet NORS2st PROFbval bval2st Ucon Ucon2st MD_raw MD_rel MD2st 1 M 0.33 - 0.99 D 0.17 - 0.551 1 D 2 T 0.26 - 0.78 D 0.25 - 0.531 0 D 3 S 0.16 - 0.72 D 0.35 - 0.535 0 D 4 C 0.23 - 0.65 D 0.33 - 0.505 0 - 5 H 0.20 - 0.48 D 0.25 - 0.475 1 - 6 I 0.16 - 0.55 D 0.30 - 0.465 1 - 7 A 0.34 - 0.56 D 0.40 - 0.444 2 - 8 E 0.28 - 0.67 D 0.30 - 0.424 3 - 9 E 0.21 - 0.73 D 0.38 - 0.404 3 - 10 H 0.15 - 0.70 D 0.30 - 0.374 4 - 11 I 0.15 - 0.59 D 0.29 - 0.354 5 - 12 Q 0.15 - 0.60 D 0.28 - 0.313 6 - 13 K 0.14 - 0.51 D 0.23 - 0.263 8 - 14 V 0.14 - 0.30 - 0.19 - 0.253 8 - 15 A 0.16 - 0.24 - 0.19 - 0.250 9 - 16 I 0.13 - 0.20 - 0.24 - 0.242 9 - 17 F 0.10 - 0.13 - 0.23 - 0.250 9 - 18 G 0.13 - 0.18 - 0.21 - 0.242 9 - 19 G 0.10 - 0.24 - 0.20 - 0.253 8 - 20 T 0.07 - 0.34 - 0.20 - 0.253 8 - 21 H 0.06 - 0.26 - 0.26 - 0.260 8 - 22 G 0.06 - 0.39 - 0.29 - 0.253 8 - 23 N 0.06 - 0.48 D 0.22 - 0.250 9 - 24 E 0.06 - 0.47 D 0.18 - 0.242 9 - 25 L 0.11 - 0.43 - 0.16 - 0.242 9 - 26 T 0.12 - 0.39 - 0.20 - 0.253 8 - 27 G 0.10 - 0.32 - 0.20 - 0.242 9 - 28 V 0.08 - 0.28 - 0.15 - 0.242 9 - 29 F 0.12 - 0.35 - 0.13 - 0.242 9 - 30 L 0.14 - 0.28 - 0.15 - 0.242 9 - 31 V 0.09 - 0.30 - 0.16 - 0.253 8 - 32 K 0.07 - 0.40 - 0.16 - 0.263 8 - 33 H 0.06 - 0.40 - 0.18 - 0.293 7 - 34 W 0.08 - 0.38 - 0.29 - 0.273 8 - 35 L 0.09 - 0.45 - 0.30 - 0.283 7 - 36 E 0.09 - 0.56 D 0.41 - 0.313 6 - 37 N 0.12 - 0.62 D 0.32 - 0.313 6 - 38 G 0.16 - 0.62 D 0.35 - 0.330 6 - 39 A 0.11 - 0.64 D 0.46 - 0.313 6 - 40 E 0.10 - 0.66 D 0.47 - 0.323 6 - 41 I 0.09 - 0.65 D 0.47 - 0.323 6 - 42 Q 0.10 - 0.64 D 0.36 - 0.293 7 - 43 R 0.09 - 0.61 D 0.50 - 0.273 8 - 44 T 0.08 - 0.61 D 0.56 - 0.273 8 - 45 G 0.08 - 0.53 D 0.34 - 0.263 8 - 46 L 0.09 - 0.43 - 0.35 - 0.260 8 - 47 E 0.10 - 0.33 - 0.32 - 0.253 8 - 48 V 0.07 - 0.23 - 0.32 - 0.250 9 - 49 K 0.06 - 0.17 - 0.34 - 0.253 8 - 50 P 0.08 - 0.18 - 0.37 - 0.263 8 - 51 F 0.08 - 0.17 - 0.49 - 0.273 8 - 52 I 0.07 - 0.21 - 0.33 - 0.273 8 - 53 T 0.06 - 0.28 - 0.53 - 0.303 7 - 54 N 0.07 - 0.28 - 
0.53 - 0.303 7 - 55 P 0.09 - 0.36 - 0.37 - 0.313 6 - 56 R 0.08 - 0.41 - 0.51 - 0.313 6 - 57 A 0.10 - 0.40 - 0.66 D 0.280 7 - 58 V 0.13 - 0.40 - 0.51 - 0.263 8 - 59 K 0.16 - 0.48 D 0.37 - 0.263 8 - 60 K 0.19 - 0.47 D 0.40 - 0.263 8 - 61 C 0.18 - 0.47 D 0.29 - 0.253 8 - 62 T 0.16 - 0.55 D 0.35 - 0.263 8 - 63 R 0.18 - 0.51 D 0.31 - 0.253 8 - 64 Y 0.22 - 0.47 D 0.25 - 0.273 8 - 65 I 0.23 - 0.47 D 0.20 - 0.260 8 - 66 D 0.23 - 0.56 D 0.21 - 0.263 8 - 67 C 0.25 - 0.57 D 0.16 - 0.263 8 - 68 D 0.30 - 0.43 - 0.18 - 0.263 8 - 69 L 0.29 - 0.40 - 0.18 - 0.260 8 - 70 N 0.28 - 0.40 - 0.25 - 0.263 8 - 71 R 0.40 - 0.39 - 0.23 - 0.273 8 - 72 I 0.46 - 0.43 - 0.22 - 0.280 7 - 73 F 0.46 - 0.37 - 0.19 - 0.273 8 - 74 D 0.37 - 0.46 - 0.32 - 0.310 6 - 75 L 0.33 - 0.57 D 0.40 - 0.390 4 - 76 E 0.36 - 0.61 D 0.30 - 0.444 2 - 77 N 0.44 - 0.62 D 0.41 - 0.465 1 - 78 L 0.38 - 0.66 D 0.65 D 0.531 0 D 79 G 0.30 - 0.70 D 0.64 D 0.485 1 - 80 K 0.35 - 0.69 D 0.64 D 0.515 0 - 81 K 0.23 - 0.69 D 0.59 D 0.475 1 - 82 M 0.23 - 0.66 D 0.42 - 0.444 2 - 83 S 0.28 - 0.69 D 0.64 D 0.449 2 - 84 E 0.34 - 0.72 D 0.56 - 0.485 1 - 85 D 0.29 - 0.74 D 0.45 - 0.424 3 - 86 L 0.20 - 0.64 D 0.35 - 0.424 3 - 87 P 0.20 - 0.64 D 0.45 - 0.404 3 - 88 Y 0.17 - 0.55 D 0.46 - 0.384 4 - 89 E 0.14 - 0.50 D 0.46 - 0.364 5 - 90 V 0.13 - 0.45 - 0.30 - 0.333 6 - 91 R 0.12 - 0.43 - 0.43 - 0.320 6 - 92 R 0.11 - 0.40 - 0.36 - 0.293 7 - 93 A 0.11 - 0.34 - 0.36 - 0.283 7 - 94 Q 0.10 - 0.45 - 0.22 - 0.290 7 - 95 E 0.12 - 0.41 - 0.25 - 0.303 7 - 96 I 0.09 - 0.34 - 0.26 - 0.283 7 - 97 N 0.11 - 0.40 - 0.33 - 0.313 6 - 98 H 0.10 - 0.49 D 0.39 - 0.313 6 - 99 L 0.10 - 0.47 D 0.38 - 0.313 6 - 100 F 0.13 - 0.47 D 0.38 - 0.293 7 - 101 G 0.14 - 0.54 D 0.58 D 0.323 6 - 102 P 0.13 - 0.61 D 0.58 D 0.333 6 - 103 K 0.13 - 0.60 D 0.47 - 0.323 6 - 104 D 0.11 - 0.61 D 0.71 D 0.323 6 - 105 S 0.10 - 0.65 D 0.73 D 0.283 7 - 106 E 0.10 - 0.70 D 0.62 D 0.283 7 - 107 D 0.12 - 0.70 D 0.42 - 0.273 8 - 108 S 0.11 - 0.64 D 0.37 - 0.270 8 - 109 Y 0.12 - 0.50 D 0.23 - 
0.253 8 - 110 D 0.13 - 0.39 - 0.20 - 0.242 9 - 111 I 0.16 - 0.29 - 0.18 - 0.240 9 - 112 I 0.15 - 0.20 - 0.16 - 0.240 9 - 113 F 0.14 - 0.20 - 0.16 - 0.240 9 - 114 D 0.17 - 0.21 - 0.20 - 0.242 9 - 115 L 0.21 - 0.20 - 0.19 - 0.253 8 - 116 H 0.17 - 0.28 - 0.19 - 0.273 8 - 117 N 0.11 - 0.48 D 0.23 - 0.283 7 - 118 T 0.13 - 0.39 - 0.24 - 0.283 7 - 119 T 0.13 - 0.41 - 0.21 - 0.273 8 - 120 S 0.15 - 0.46 - 0.21 - 0.273 8 - 121 N 0.22 - 0.54 D 0.18 - 0.263 8 - 122 M 0.25 - 0.51 D 0.14 - 0.260 8 - 123 G 0.30 - 0.51 D 0.16 - 0.253 8 - 124 C 0.26 - 0.42 - 0.18 - 0.250 9 - 125 T 0.29 - 0.40 - 0.18 - 0.242 9 - 126 L 0.24 - 0.34 - 0.18 - 0.253 8 - 127 I 0.17 - 0.28 - 0.23 - 0.260 8 - 128 L 0.13 - 0.28 - 0.25 - 0.263 8 - 129 E 0.14 - 0.41 - 0.24 - 0.253 8 - 130 D 0.14 - 0.54 D 0.18 - 0.253 8 - 131 S 0.10 - 0.59 D 0.19 - 0.273 8 - 132 R 0.07 - 0.68 D 0.27 - 0.280 7 - 133 N 0.05 - 0.64 D 0.28 - 0.273 8 - 134 N 0.06 - 0.61 D 0.18 - 0.273 8 - 135 F 0.07 - 0.53 D 0.15 - 0.260 8 - 136 L 0.08 - 0.47 D 0.13 - 0.242 9 - 137 I 0.10 - 0.47 D 0.13 - 0.242 9 - 138 Q 0.16 - 0.42 - 0.13 - 0.242 9 - 139 M 0.15 - 0.34 - 0.13 - 0.242 9 - 140 F 0.11 - 0.32 - 0.14 - 0.250 9 - 141 H 0.13 - 0.41 - 0.14 - 0.263 8 - 142 Y 0.16 - 0.36 - 0.16 - 0.263 8 - 143 I 0.12 - 0.34 - 0.16 - 0.263 8 - 144 K 0.11 - 0.46 - 0.15 - 0.283 7 - 145 T 0.07 - 0.54 D 0.20 - 0.273 8 - 146 S 0.07 - 0.55 D 0.17 - 0.253 8 - 147 L 0.09 - 0.56 D 0.17 - 0.242 9 - 148 A 0.10 - 0.57 D 0.17 - 0.250 9 - 149 P 0.09 - 0.60 D 0.13 - 0.253 8 - 150 L 0.11 - 0.51 D 0.13 - 0.242 9 - 151 P 0.13 - 0.44 - 0.14 - 0.242 9 - 152 C 0.12 - 0.38 - 0.13 - 0.240 9 - 153 Y 0.13 - 0.31 - 0.13 - 0.240 9 - 154 V 0.18 - 0.28 - 0.13 - 0.242 9 - 155 Y 0.17 - 0.33 - 0.14 - 0.250 9 - 156 L 0.21 - 0.47 D 0.13 - 0.253 8 - 157 I 0.22 - 0.54 D 0.15 - 0.263 8 - 158 E 0.17 - 0.58 D 0.15 - 0.283 7 - 159 H 0.16 - 0.62 D 0.17 - 0.303 7 - 160 P 0.13 - 0.65 D 0.21 - 0.323 6 - 161 S 0.14 - 0.59 D 0.29 - 0.303 7 - 162 L 0.17 - 0.58 D 0.41 - 0.303 7 - 163 K 0.21 - 0.56 D 0.36 - 
0.293 7 - 164 Y 0.32 - 0.51 D 0.29 - 0.273 8 - 165 A 0.31 - 0.47 D 0.26 - 0.273 8 - 166 T 0.28 - 0.45 - 0.32 - 0.273 8 - 167 T 0.22 - 0.41 - 0.33 - 0.273 8 - 168 R 0.15 - 0.47 D 0.26 - 0.273 8 - 169 S 0.14 - 0.47 D 0.28 - 0.280 7 - 170 I 0.12 - 0.46 - 0.29 - 0.273 8 - 171 A 0.12 - 0.47 D 0.22 - 0.283 7 - 172 K 0.11 - 0.58 D 0.27 - 0.290 7 - 173 Y 0.13 - 0.47 D 0.20 - 0.263 8 - 174 P 0.11 - 0.38 - 0.19 - 0.253 8 - 175 V 0.10 - 0.26 - 0.26 - 0.250 9 - 176 G 0.09 - 0.24 - 0.31 - 0.250 9 - 177 I 0.13 - 0.33 - 0.25 - 0.250 9 - 178 E 0.20 - 0.28 - 0.37 - 0.253 8 - 179 V 0.26 - 0.41 - 0.33 - 0.253 8 - 180 G 0.20 - 0.45 - 0.33 - 0.270 8 - 181 P 0.17 - 0.59 D 0.25 - 0.283 7 - 182 Q 0.12 - 0.49 D 0.35 - 0.283 7 - 183 P 0.12 - 0.51 D 0.28 - 0.273 8 - 184 Q 0.13 - 0.54 D 0.42 - 0.273 8 - 185 G 0.10 - 0.51 D 0.33 - 0.263 8 - 186 V 0.12 - 0.55 D 0.22 - 0.253 8 - 187 L 0.17 - 0.54 D 0.24 - 0.253 8 - 188 R 0.14 - 0.48 D 0.24 - 0.263 8 - 189 A 0.11 - 0.53 D 0.18 - 0.273 8 - 190 D 0.11 - 0.51 D 0.19 - 0.273 8 - 191 I 0.08 - 0.41 - 0.31 - 0.283 7 - 192 L 0.07 - 0.42 - 0.33 - 0.263 8 - 193 D 0.06 - 0.47 D 0.24 - 0.273 8 - 194 Q 0.08 - 0.45 - 0.33 - 0.270 8 - 195 M 0.04 - 0.34 - 0.26 - 0.263 8 - 196 R 0.04 - 0.43 - 0.34 - 0.273 8 - 197 K 0.05 - 0.44 - 0.34 - 0.263 8 - 198 M 0.06 - 0.29 - 0.34 - 0.263 8 - 199 I 0.06 - 0.28 - 0.22 - 0.253 8 - 200 K 0.07 - 0.34 - 0.22 - 0.263 8 - 201 H 0.07 - 0.32 - 0.20 - 0.253 8 - 202 A 0.08 - 0.28 - 0.15 - 0.250 9 - 203 L 0.08 - 0.35 - 0.15 - 0.253 8 - 204 D 0.09 - 0.43 - 0.19 - 0.263 8 - 205 F 0.11 - 0.41 - 0.18 - 0.263 8 - 206 I 0.12 - 0.45 - 0.18 - 0.253 8 - 207 H 0.15 - 0.59 D 0.23 - 0.270 8 - 208 H 0.18 - 0.59 D 0.40 - 0.290 7 - 209 F 0.22 - 0.58 D 0.24 - 0.283 7 - 210 N 0.27 - 0.63 D 0.37 - 0.293 7 - 211 E 0.27 - 0.66 D 0.53 - 0.313 6 - 212 G 0.28 - 0.68 D 0.44 - 0.313 6 - 213 K 0.26 - 0.70 D 0.46 - 0.323 6 - 214 E 0.26 - 0.71 D 0.50 - 0.323 6 - 215 F 0.20 - 0.70 D 0.56 - 0.303 7 - 216 P 0.21 - 0.69 D 0.37 - 0.293 7 - 217 P 0.24 - 0.69 D 0.28 - 
0.280 7 - 218 C 0.14 - 0.66 D 0.28 - 0.263 8 - 219 A 0.14 - 0.58 D 0.19 - 0.263 8 - 220 I 0.15 - 0.52 D 0.19 - 0.263 8 - 221 E 0.11 - 0.47 D 0.22 - 0.270 8 - 222 V 0.11 - 0.34 - 0.26 - 0.273 8 - 223 Y 0.12 - 0.30 - 0.28 - 0.280 7 - 224 K 0.08 - 0.37 - 0.34 - 0.280 7 - 225 I 0.09 - 0.32 - 0.33 - 0.273 8 - 226 I 0.07 - 0.35 - 0.29 - 0.283 7 - 227 E 0.09 - 0.43 - 0.38 - 0.313 6 - 228 K 0.09 - 0.49 D 0.61 D 0.333 6 - 229 V 0.12 - 0.49 D 0.58 D 0.337 6 - 230 D 0.16 - 0.53 D 0.75 D 0.354 5 - 231 Y 0.14 - 0.52 D 0.84 D 0.343 5 - 232 P 0.12 - 0.57 D 0.84 D 0.313 6 - 233 R 0.13 - 0.66 D 0.59 D 0.303 7 - 234 D 0.15 - 0.69 D 0.70 D 0.310 6 - 235 E 0.10 - 0.71 D 0.59 D 0.293 7 - 236 N 0.12 - 0.71 D 0.62 D 0.303 7 - 237 G 0.17 - 0.67 D 0.44 - 0.293 7 - 238 E 0.22 - 0.60 D 0.40 - 0.283 7 - 239 I 0.17 - 0.53 D 0.36 - 0.270 8 - 240 A 0.16 - 0.38 - 0.30 - 0.260 8 - 241 A 0.19 - 0.29 - 0.22 - 0.253 8 - 242 I 0.16 - 0.28 - 0.24 - 0.263 8 - 243 I 0.22 - 0.33 - 0.24 - 0.263 8 - 244 H 0.25 - 0.34 - 0.34 - 0.293 7 - 245 P 0.14 - 0.48 D 0.41 - 0.323 6 - 246 N 0.16 - 0.53 D 0.30 - 0.343 5 - 247 L 0.16 - 0.58 D 0.53 - 0.343 5 - 248 Q 0.16 - 0.61 D 0.71 D 0.374 4 - 249 D 0.22 - 0.64 D 0.59 D 0.354 5 - 250 Q 0.30 - 0.64 D 0.51 - 0.364 5 - 251 D 0.34 - 0.62 D 0.52 - 0.333 6 - 252 W 0.33 - 0.52 D 0.65 D 0.313 6 - 253 K 0.22 - 0.58 D 0.68 D 0.283 7 - 254 P 0.21 - 0.58 D 0.63 D 0.283 7 - 255 L 0.18 - 0.54 D 0.45 - 0.263 8 - 256 H 0.16 - 0.68 D 0.27 - 0.263 8 - 257 P 0.18 - 0.69 D 0.28 - 0.273 8 - 258 G 0.19 - 0.57 D 0.21 - 0.270 8 - 259 D 0.28 - 0.54 D 0.16 - 0.290 7 - 260 P 0.25 - 0.54 D 0.23 - 0.270 8 - 261 M 0.19 - 0.40 - 0.24 - 0.273 8 - 262 F 0.16 - 0.34 - 0.29 - 0.253 8 - 263 L 0.13 - 0.37 - 0.30 - 0.253 8 - 264 T 0.10 - 0.46 - 0.20 - 0.242 9 - 265 L 0.14 - 0.56 D 0.20 - 0.253 8 - 266 D 0.13 - 0.61 D 0.20 - 0.263 8 - 267 G 0.11 - 0.62 D 0.26 - 0.280 7 - 268 K 0.10 - 0.60 D 0.34 - 0.283 7 - 269 T 0.10 - 0.60 D 0.40 - 0.273 8 - 270 I 0.08 - 0.46 - 0.41 - 0.250 9 - 271 P 0.07 - 0.43 - 0.35 - 
0.250 9 - 272 L 0.12 - 0.46 - 0.29 - 0.242 9 - 273 G 0.10 - 0.53 D 0.18 - 0.242 9 - 274 G 0.08 - 0.52 D 0.19 - 0.250 9 - 275 D 0.05 - 0.62 D 0.19 - 0.253 8 - 276 C 0.06 - 0.68 D 0.15 - 0.263 8 - 277 T 0.10 - 0.59 D 0.16 - 0.263 8 - 278 V 0.08 - 0.52 D 0.17 - 0.263 8 - 279 Y 0.09 - 0.33 - 0.20 - 0.242 9 - 280 P 0.10 - 0.27 - 0.17 - 0.242 9 - 281 V 0.12 - 0.23 - 0.21 - 0.242 9 - 282 F 0.12 - 0.18 - 0.17 - 0.253 8 - 283 V 0.09 - 0.24 - 0.16 - 0.263 8 - 284 N 0.05 - 0.28 - 0.23 - 0.303 7 - 285 E 0.06 - 0.39 - 0.28 - 0.384 4 - 286 A 0.09 - 0.35 - 0.46 - 0.404 3 - 287 A 0.08 - 0.43 - 0.72 D 0.418 3 - 288 Y 0.08 - 0.37 - 0.79 D 0.374 4 - 289 Y 0.09 - 0.55 D 0.61 D 0.354 5 - 290 E 0.10 - 0.41 - 0.49 - 0.333 6 - 291 K 0.10 - 0.50 D 0.65 D 0.323 6 - 292 K 0.07 - 0.47 D 0.66 D 0.323 6 - 293 E 0.07 - 0.36 - 0.79 D 0.333 6 - 294 A 0.06 - 0.29 - 0.95 D 0.354 5 - 295 F 0.08 - 0.27 - 0.82 D 0.333 6 - 296 A 0.09 - 0.32 - 0.70 D 0.323 6 - 297 K 0.09 - 0.42 - 0.41 - 0.343 5 - 298 T 0.08 - 0.42 - 0.36 - 0.343 5 - 299 T 0.10 - 0.51 D 0.36 - 0.414 3 - 300 K 0.09 - 0.54 D 0.64 D 0.455 2 - 301 L 0.09 - 0.48 D 0.70 D 0.394 4 - 302 T 0.15 - 0.55 D 0.43 - 0.404 3 - 303 L 0.15 - 0.48 D 0.46 - 0.374 4 - 304 N 0.12 - 0.48 D 0.34 - 0.374 4 - 305 A 0.13 - 0.47 D 0.19 - 0.374 4 - 306 K 0.22 - 0.55 D 0.19 - 0.384 4 - 307 S 0.07 - 0.53 D 0.18 - 0.434 2 - 308 I 0.07 - 0.41 - 0.23 - 0.424 3 - 309 R 0.07 - 0.37 - 0.21 - 0.414 3 - 310 C 0.11 - 0.60 D 0.19 - 0.414 3 - 311 C 0.14 - 0.67 D 0.14 - 0.394 4 - 312 L 0.15 - 0.72 D 0.13 - 0.424 3 - 313 H 0.44 - 0.80 D 0.13 - 0.485 1 -

Key for output
----------------
Number - residue number
Residue - amino-acid type
NORSnet - raw score by NORSnet (prediction of unstructured loops)
NORS2st - two-state prediction by NORSnet; D=disordered
PROFbval - raw score by PROFbval (prediction of residue flexibility from sequence)
Bval2st - two-state prediction by PROFbval
Ucon - raw score by Ucon (prediction of protein disorder using predicted internal contacts)
Ucon2st - two-state prediction by Ucon
MD - raw score by MD (prediction of protein disorder using orthogonal sources)
MD_rel - reliability of the prediction by MD; values range from 0-9. 9=strong prediction
MD2st - two-state prediction by MD
The last column (MD2st) indicates whether disorder was predicted at that position. Meta-Disorder predicts only four disordered positions in total, too few to indicate a significant disordered region. This agrees with the predictions of the programs employed previously, which is not altogether surprising, since Meta-Disorder draws its predictions from two of them.
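The tally of disordered positions can be reproduced from the raw output. The following is a minimal sketch, assuming the table is available as whitespace-separated text that starts at a row boundary and has eleven fields per row, in the order given in the key. The names `MDRow`, `parse_md_rows` and `disordered_positions` are illustrative and not part of Meta-Disorder.

```python
from typing import List, NamedTuple


class MDRow(NamedTuple):
    number: int          # residue number
    residue: str         # amino-acid type
    md_score: float      # raw MD score
    md_reliability: int  # MD_rel, 0-9
    md_disordered: bool  # MD2st == "D"


def parse_md_rows(text: str) -> List[MDRow]:
    """Split flattened Meta-Disorder output into rows of eleven fields."""
    tokens = text.split()
    if len(tokens) % 11 != 0:
        raise ValueError("expected a multiple of 11 fields")
    rows = []
    for i in range(0, len(tokens), 11):
        f = tokens[i:i + 11]
        # Fields 8-10 are MD, MD_rel and MD2st (see the key for output).
        rows.append(MDRow(int(f[0]), f[1], float(f[8]), int(f[9]), f[10] == "D"))
    return rows


def disordered_positions(rows: List[MDRow]) -> List[int]:
    """Residue numbers for which the MD two-state call is 'D'."""
    return [row.number for row in rows if row.md_disordered]


# Three rows copied from the table above (residues 77-79).
sample = (
    "77 N 0.44 - 0.62 D 0.41 - 0.465 1 - "
    "78 L 0.38 - 0.66 D 0.65 D 0.531 0 D "
    "79 G 0.30 - 0.70 D 0.64 D 0.485 1 -"
)
print(disordered_positions(parse_md_rows(sample)))
```

Run over the full table, the same function lists every position that Meta-Disorder calls disordered.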
Prediction of transmembrane alpha-helices and signal peptides
The results of this task are unequivocal: Aspartoacylase does not contain any transmembrane regions. From a biological point of view this was to be expected, as Aspartoacylase is known to be located in the cytosol.
TMHMM
Since the VM version could not be made to work, we used the server at http://www.cbs.dtu.dk/services/TMHMM/.
TMHMM uses a hidden Markov model to predict transmembrane helices in proteins. It was published in 1998 by E. L. L. Sonnhammer, G. von Heijne, and A. Krogh.
Reference: Original paper
The hidden Markov model used by TMHMM reflects the biological structure, with states for helix turns, helix caps and loops on either side of the membrane; these states are also designed to model membrane insertion. The HMM probabilities were estimated using both a maximum likelihood method and a discriminative method.
Results for Aspartoacylase very clearly show the absence of any transmembrane structure, which is biologically sound.
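To illustrate how a hidden Markov model assigns a membrane or non-membrane state to each residue, here is a deliberately simplified sketch of Viterbi decoding. It is not the TMHMM architecture: it has only two states (`o` for loop, `M` for membrane), reduces the amino acids to hydrophobic versus polar, and uses probabilities invented for the example.

```python
import math

# Assumption for this toy model: a crude hydrophobic residue set.
HYDROPHOBIC = set("AILMFVWC")

STATES = ("o", "M")  # o = loop, M = membrane helix
START = {"o": 0.9, "M": 0.1}
TRANS = {"o": {"o": 0.9, "M": 0.1}, "M": {"o": 0.1, "M": 0.9}}
EMIT = {"o": {"h": 0.3, "p": 0.7}, "M": {"h": 0.8, "p": 0.2}}


def viterbi(sequence: str) -> str:
    """Return the most probable state path for the toy two-state HMM."""
    if not sequence:
        return ""
    observed = ["h" if aa in HYDROPHOBIC else "p" for aa in sequence]

    # Log-probability of the best path ending in each state.
    score = {s: math.log(START[s]) + math.log(EMIT[s][observed[0]]) for s in STATES}
    backpointers = []
    for symbol in observed[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[best_prev] + math.log(TRANS[best_prev][s])
                            + math.log(EMIT[s][symbol]))
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


print(viterbi("DDDDDDLLLLLLLLLLLLLLLLLLLLDDDDDD"))
```

The high self-transition probabilities are what make the decoder prefer long contiguous segments over state changes at single residues. Real predictors obtain the same effect with many more states and with parameters trained on known structures.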
    # sp_P45381_ACY2_HUMAN Length: 313
    # sp_P45381_ACY2_HUMAN Number of predicted TMHs:  0
    # sp_P45381_ACY2_HUMAN Exp number of AAs in TMHs: 0.2005
    # sp_P45381_ACY2_HUMAN Exp number, first 60 AAs:  0.01618
    # sp_P45381_ACY2_HUMAN Total prob of N-in:        0.03827
    sp_P45381_ACY2_HUMAN  TMHMM2.0  outside    1   313
http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output
BACR_HALSA
    # BACR_HALSA Length: 262
    # BACR_HALSA Number of predicted TMHs:  6
    # BACR_HALSA Exp number of AAs in TMHs: 140.4032
    # BACR_HALSA Exp number, first 60 AAs:  26.1196
    # BACR_HALSA Total prob of N-in:        0.01887
    # BACR_HALSA POSSIBLE N-term signal sequence
    BACR_HALSA  TMHMM2.0  outside    1    22
    BACR_HALSA  TMHMM2.0  TMhelix   23    42
    BACR_HALSA  TMHMM2.0  inside    43    54
    BACR_HALSA  TMHMM2.0  TMhelix   55    77
    BACR_HALSA  TMHMM2.0  outside   78    91
    BACR_HALSA  TMHMM2.0  TMhelix   92   114
    BACR_HALSA  TMHMM2.0  inside   115   120
    BACR_HALSA  TMHMM2.0  TMhelix  121   143
    BACR_HALSA  TMHMM2.0  outside  144   147
    BACR_HALSA  TMHMM2.0  TMhelix  148   170
    BACR_HALSA  TMHMM2.0  inside   171   189
    BACR_HALSA  TMHMM2.0  TMhelix  190   212
    BACR_HALSA  TMHMM2.0  outside  213   262
RET4_HUMAN
    # RET4_HUMAN Length: 201
    # RET4_HUMAN Number of predicted TMHs:  0
    # RET4_HUMAN Exp number of AAs in TMHs: 0.01196
    # RET4_HUMAN Exp number, first 60 AAs:  0.01179
    # RET4_HUMAN Total prob of N-in:        0.01909
    RET4_HUMAN  TMHMM2.0  outside    1   201
INSL5_HUMAN
    # INSL5_HUMAN Length: 135
    # INSL5_HUMAN Number of predicted TMHs:  0
    # INSL5_HUMAN Exp number of AAs in TMHs: 0.50415
    # INSL5_HUMAN Exp number, first 60 AAs:  0.50415
    # INSL5_HUMAN Total prob of N-in:        0.03772
    INSL5_HUMAN  TMHMM2.0  outside    1   135
LAMP1_HUMAN
    # LAMP1_HUMAN Length: 417
    # LAMP1_HUMAN Number of predicted TMHs:  2
    # LAMP1_HUMAN Exp number of AAs in TMHs: 44.89582
    # LAMP1_HUMAN Exp number, first 60 AAs:  22.24286
    # LAMP1_HUMAN Total prob of N-in:        0.99287
    # LAMP1_HUMAN POSSIBLE N-term signal sequence
    LAMP1_HUMAN  TMHMM2.0  inside     1    10
    LAMP1_HUMAN  TMHMM2.0  TMhelix   11    33
    LAMP1_HUMAN  TMHMM2.0  outside   34   383
    LAMP1_HUMAN  TMHMM2.0  TMhelix  384   406
    LAMP1_HUMAN  TMHMM2.0  inside   407   417
A4_HUMAN
    # A4_HUMAN Length: 770
    # A4_HUMAN Number of predicted TMHs:  1
    # A4_HUMAN Exp number of AAs in TMHs: 22.72525
    # A4_HUMAN Exp number, first 60 AAs:  0.0027
    # A4_HUMAN Total prob of N-in:        0.00015
    A4_HUMAN  TMHMM2.0  outside    1   700
    A4_HUMAN  TMHMM2.0  TMhelix  701   723
    A4_HUMAN  TMHMM2.0  inside   724   770
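For comparing TMHMM with the other predictors it helps to have the segment lines in a structured form. The sketch below assumes the report is available one record per line, with five whitespace-separated fields on each segment line (name, program version, location, start, end), as in the listings above. The function names are illustrative and not part of TMHMM.

```python
from typing import List, Tuple

Segment = Tuple[str, int, int]  # (location, start, end), 1-based and inclusive


def parse_tmhmm_segments(report: str) -> List[Segment]:
    """Extract the segment lines from a TMHMM report, skipping '#' header lines."""
    segments = []
    for line in report.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) != 5:
            continue  # not a segment line
        _name, _version, location, start, end = fields
        segments.append((location, int(start), int(end)))
    return segments


def helices(segments: List[Segment]) -> List[Tuple[int, int]]:
    """Start and end of every predicted transmembrane helix."""
    return [(start, end) for location, start, end in segments if location == "TMhelix"]


# Excerpt of the A4_HUMAN report shown above.
A4_REPORT = """\
# A4_HUMAN Length: 770
# A4_HUMAN Number of predicted TMHs:  1
A4_HUMAN TMHMM2.0 outside 1 700
A4_HUMAN TMHMM2.0 TMhelix 701 723
A4_HUMAN TMHMM2.0 inside 724 770
"""

print(helices(parse_tmhmm_segments(A4_REPORT)))
```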
Phobius & PolyPhobius
Phobius is a program for the prediction of transmembrane regions, with special emphasis on reducing confusion with signal peptides. It was published in 2004 by Käll L, Krogh A and Sonnhammer EL; PolyPhobius, which additionally uses homology information, followed in 2005.
Reference: Paper
Signal peptides and transmembrane helices share a great deal of similarity, since both contain a hydrophobic stretch, and are often confused by predictors built for only one of the two classes; Phobius aims to predict both and to discriminate between them. It does this with a hidden Markov model that models the sequence regions characteristic of each class.
Input: An amino acid sequence.
Again, neither signal peptides nor transmembrane regions were detected in Aspartoacylase.
BACR_HALSA
    ID
    FT   TOPO_DOM      1    22  NON CYTOPLASMIC.
    FT   TRANSMEM     23    42
    FT   TOPO_DOM     43    53  CYTOPLASMIC.
    FT   TRANSMEM     54    76
    FT   TOPO_DOM     77    95  NON CYTOPLASMIC.
    FT   TRANSMEM     96   114
    FT   TOPO_DOM    115   120  CYTOPLASMIC.
    FT   TRANSMEM    121   142
    FT   TOPO_DOM    143   147  NON CYTOPLASMIC.
    FT   TRANSMEM    148   169
    FT   TOPO_DOM    170   189  CYTOPLASMIC.
    FT   TRANSMEM    190   212
    FT   TOPO_DOM    213   217  NON CYTOPLASMIC.
    FT   TRANSMEM    218   237
    FT   TOPO_DOM    238   262  CYTOPLASMIC.
    //
RET4_HUMAN
    ID   RET4_HUMAN
    FT   SIGNAL        1    18
    FT   REGION        1     2  N-REGION.
    FT   REGION        3    13  H-REGION.
    FT   REGION       14    18  C-REGION.
    FT   TOPO_DOM     19   201  NON CYTOPLASMIC.
    //
INSL5_HUMAN
    ID
    FT   SIGNAL        1    22
    FT   REGION        1     5  N-REGION.
    FT   REGION        6    17  H-REGION.
    FT   REGION       18    22  C-REGION.
    FT   TOPO_DOM     23   135  NON CYTOPLASMIC.
    //
LAMP1_HUMAN
    ID
    FT   SIGNAL        1    28
    FT   REGION        1    10  N-REGION.
    FT   REGION       11    22  H-REGION.
    FT   REGION       23    28  C-REGION.
    FT   TOPO_DOM     29   381  NON CYTOPLASMIC.
    FT   TRANSMEM    382   405
    FT   TOPO_DOM    406   417  CYTOPLASMIC.
    //
A4_HUMAN
    ID   A4_HUMAN
    FT   SIGNAL        1    17
    FT   REGION        1     1  N-REGION.
    FT   REGION        2    12  H-REGION.
    FT   REGION       13    17  C-REGION.
    FT   TOPO_DOM     18   700  NON CYTOPLASMIC.
    FT   TRANSMEM    701   723
    FT   TOPO_DOM    724   770  CYTOPLASMIC.
    //
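The LAMP1_HUMAN results show the kind of confusion Phobius is meant to resolve: TMHMM reports a helix at residues 11-33, whereas Phobius assigns residues 1-28 to a signal peptide. The sketch below flags TMHMM helices that overlap a Phobius signal peptide. It assumes the Phobius record is available one `FT` line per row; the helper names are illustrative and not part of either program.

```python
from typing import List, Tuple

Feature = Tuple[str, int, int]  # (feature key, start, end)
Span = Tuple[int, int]


def parse_phobius_ft(record: str) -> List[Feature]:
    """Collect SIGNAL and TRANSMEM features from a Phobius FT record."""
    features = []
    for line in record.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "FT" and fields[1] in {"SIGNAL", "TRANSMEM"}:
            features.append((fields[1], int(fields[2]), int(fields[3])))
    return features


def overlaps(a: Span, b: Span) -> bool:
    """True if two inclusive residue ranges share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def helices_overlapping_signal(tm_helices: List[Span], features: List[Feature]) -> List[Span]:
    """TMHMM helices that coincide with a Phobius signal peptide."""
    signals = [(start, end) for key, start, end in features if key == "SIGNAL"]
    return [h for h in tm_helices if any(overlaps(h, s) for s in signals)]


# Values copied from the LAMP1_HUMAN results above.
LAMP1_PHOBIUS = """\
ID
FT   SIGNAL        1    28
FT   REGION        1    10  N-REGION.
FT   REGION       11    22  H-REGION.
FT   REGION       23    28  C-REGION.
FT   TOPO_DOM     29   381  NON CYTOPLASMIC.
FT   TRANSMEM    382   405
FT   TOPO_DOM    406   417  CYTOPLASMIC.
//
"""
LAMP1_TMHMM_HELICES = [(11, 33), (384, 406)]

print(helices_overlapping_signal(LAMP1_TMHMM_HELICES, parse_phobius_ft(LAMP1_PHOBIUS)))
```

Only the N-terminal helix is flagged; the C-terminal helix (384-406 in TMHMM, 382-405 in Phobius) is predicted by both programs.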
OCTOPUS & SPOCTOPUS
OCTOPUS uses a combination of hidden Markov models and neural networks to predict transmembrane regions. It was published in 2008 by Viklund H and Elofsson A.
Reference: Original paper
OCTOPUS first creates a sequence profile by running BLAST with the input sequence. Neural networks then predict each residue's propensity to be located in a transmembrane region or in certain structural patterns on either side of the membrane. The resulting propensities are fed to a hidden Markov model, which calculates the most likely topology.
SPOCTOPUS extends OCTOPUS with a preprocessor that uses a neural network to assess the probability that the first 70 residues of the input sequence contain a signal peptide. If the score is high enough, a hidden Markov model is used to determine the exact position of the signal region.
No transmembrane or signal regions were predicted for Aspartoacylase.
BACR_HALSA
OCTOPUS predicted topology: oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiii MMMMMMMMMMMMMMMMMMMMMoooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii iiiiiiiiiiiiiiiiiiiiii
SPOCTOPUS predicted topology: oooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiMMMMMMM MMMMMMMMMMMMMMooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiM MMMMMMMMMMMMMMMMMMMMooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiii iiiiMMMMMMMMMMMMMMMMMMMMMooooooooooMMMMMMMMMMMMMMMMMMMMMiiii iiiiiiiiiiiiiiiiiiiiii
RET4_HUMAN
OCTOPUS predicted topology: iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooo
SPOCTOPUS predicted topology: nnnnSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooo
INSL5_HUMAN
OCTOPUS predicted topology: iMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooo
SPOCTOPUS predicted topology: nnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooo
LAMP1_HUMAN
OCTOPUS predicted topology: iiiiiiiiiMMMMMMMMMMMMMMMMMMMMMoooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii
SPOCTOPUS predicted topology: nnnnnnnnnnSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMMMiiiiiiiiiiiiii
A4_HUMAN
OCTOPUS predicted topology: ooooRRRRRRoooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
SPOCTOPUS predicted topology: nnnSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooooooMMMMMMMMMMMMMMMMMMMM Miiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
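The per-residue topology strings are easier to compare with the TMHMM and Phobius output once they are collapsed into segments with start and end positions. The following is a minimal sketch using run-length encoding. The `LABELS` legend is an assumption based on the usual reading of these strings and should be checked against the OCTOPUS documentation; the short input string is made up for the example.

```python
from itertools import groupby
from typing import List, Tuple

# Assumed meaning of the state letters (not taken from the output itself).
LABELS = {
    "i": "inside",
    "o": "outside",
    "M": "membrane",
    "R": "reentrant region",
    "S": "signal peptide",
    "n": "n-region of signal peptide",
}


def topology_segments(topology: str) -> List[Tuple[str, int, int]]:
    """Collapse a per-residue topology string into (symbol, start, end) runs."""
    topology = "".join(topology.split())  # drop line wraps and spaces
    segments = []
    position = 1
    for symbol, run in groupby(topology):
        length = sum(1 for _ in run)
        segments.append((symbol, position, position + length - 1))
        position += length
    return segments


# Hypothetical toy string, wrapped over two "lines" as in the listings above.
for symbol, start, end in topology_segments("iiMMMM ooo"):
    print(f"{LABELS.get(symbol, symbol):<10} {start:>4} {end:>4}")
```

Positions are 1-based and inclusive, matching the numbering used by TMHMM and Phobius.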
SignalP
SignalP is a method for the detection of signal peptides. It was first published in 1997 by Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne.
Reference: Original paper, current version
SignalP comes in two flavours: one using a neural network, the other a hidden Markov model. It can discriminate between cleaved and uncleaved signal peptides and supports both prokaryotic and eukaryotic input.
Input: A protein sequence.
Neither flavour detected any signal sequence in Aspartoacylase.
HMM
Neural Network
TargetP
TargetP is a program for predicting the subcellular location of proteins, based on targeting signals in their sequence. It was published in 2000 by Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne.
Reference: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne. J. Mol. Biol., 300: 1005-1016, 2000.
TargetP confines its analysis to the N-terminal part of the sequence. It can discriminate between proteins destined for the mitochondrion, the chloroplast (plant sequences only), the secretory pathway, or another location.
The prediction for Aspartoacylase was "other location", which is plausible, as the enzyme is known to reside in the cytosol.
    ### targetp v1.1 prediction results ##################################
    Number of query sequences: 1
    Cleavage site predictions not included.
    Using NON-PLANT networks.

    Name                  Len    mTP     SP  other  Loc  RC
    ----------------------------------------------------------------------
    sp_P45381_ACY2_HUMAN  313  0.073  0.109  0.898   _    2
    ----------------------------------------------------------------------
    cutoff                     0.000  0.000  0.000
http://www.cbs.dtu.dk/services/TargetP-1.1/output.php
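The `Loc` and `RC` columns follow from the three scores. The sketch below assumes the definition given in the TargetP output description: the location is the highest-scoring class, and the reliability class (1 = most reliable, 5 = least) depends on the difference between the highest and second-highest score, in steps of 0.2. The function names are illustrative.

```python
from typing import Dict


def predicted_location(scores: Dict[str, float]) -> str:
    """Class with the highest network score."""
    return max(scores, key=scores.get)


def reliability_class(scores: Dict[str, float]) -> int:
    """Reliability class from the gap between the two best scores.

    Assumed thresholds: diff > 0.8 -> 1, > 0.6 -> 2, > 0.4 -> 3, > 0.2 -> 4, else 5.
    """
    ranked = sorted(scores.values(), reverse=True)
    diff = ranked[0] - ranked[1]
    for rc, threshold in enumerate((0.8, 0.6, 0.4, 0.2), start=1):
        if diff > threshold:
            return rc
    return 5


# Scores for Aspartoacylase from the TargetP output above.
aspa = {"mTP": 0.073, "SP": 0.109, "other": 0.898}
print(predicted_location(aspa), reliability_class(aspa))  # other 2
```

For Aspartoacylase the gap is 0.898 - 0.109 = 0.789, which falls just short of the first class and so gives the reliability class 2 seen in the output.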
Prediction of GO terms
GOPET
GOPET is a tool aimed at automatically assigning Gene Ontology terms to proteins. It was published in 2006 by Arunachalam Vinayagam, Coral del Val, Falk Schubert, Roland Eils, Karl-Heinz Glatting, Sándor Suhai and Rainer König.
Reference: Paper
The input sequence is first BLASTed against a database of proteins with known GO terms; a support vector machine is then used to discriminate between correct and false terms.
Results for Aspartoacylase, all consistent with current knowledge of the enzyme:
GOid | Aspect | Confidence | GO Term |
---|---|---|---|
GO:0016787 | F | 96% | hydrolase activity |
GO:0004046 | F | 82% | aminoacylase activity |
GO:0019807 | F | 82% | aspartoacylase activity |
GO:0016788 | F | 81% | hydrolase activity, acting on ester bonds |
Pfam
Pfam is a large database of protein families. It was established in 1998 at the Wellcome Trust Sanger Institute.
It comprises two databases: Pfam-A, a manually curated, high-quality database with a limited number of entries, and the much larger, automatically generated Pfam-B.
Reference: The Pfam protein families database: R.D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J.E. Pollington, O.L. Gavin, P. Gunesekaran, G. Ceric, K. Forslund, L. Holm, E.L. Sonnhammer, S.R. Eddy, A. Bateman
The result for Aspartoacylase is spot-on:
ProtFun 2.2
ProtFun is a program for ab-initio protein function prediction. It was published in 2002 by Juhl Jensen et al.
Reference: Paper Abstract
The software queries a number of existing prediction servers for a wide range of features, from isoelectric point to posttranslational modifications, and deduces the protein's function from this data.
Results for Aspartoacylase:
    ############## ProtFun 2.2 predictions ##############

    >sp_P45381_A

    # Functional category                  Prob     Odds
      Amino_acid_biosynthesis              0.071    3.233
      Biosynthesis_of_cofactors            0.144    2.003
      Cell_envelope                        0.033    0.535
      Cellular_processes                   0.137    1.875
      Central_intermediary_metabolism   => 0.334    5.309
      Energy_metabolism                    0.226    2.511
      Fatty_acid_metabolism                0.022    1.663
      Purines_and_pyrimidines              0.367    1.512
      Regulatory_functions                 0.021    0.128
      Replication_and_transcription        0.167    0.625
      Translation                          0.113    2.559
      Transport_and_binding                0.017    0.042

    # Enzyme/nonenzyme                     Prob     Odds
      Enzyme                            => 0.703    2.454
      Nonenzyme                            0.297    0.416

    # Enzyme class                         Prob     Odds
      Oxidoreductase (EC 1.-.-.-)          0.111    0.534
      Transferase    (EC 2.-.-.-)          0.202    0.585
      Hydrolase      (EC 3.-.-.-)          0.115    0.363
      Lyase          (EC 4.-.-.-)          0.031    0.662
      Isomerase      (EC 5.-.-.-)       => 0.084    2.637
      Ligase         (EC 6.-.-.-)          0.074    1.460

    # Gene Ontology category               Prob     Odds
      Signal_transducer                    0.053    0.246
      Receptor                             0.004    0.024
      Hormone                              0.001    0.206
      Structural_protein                   0.001    0.041
      Transporter                          0.025    0.230
      Ion_channel                          0.015    0.257
      Voltage-gated_ion_channel            0.004    0.173
      Cation_channel                       0.011    0.234
      Transcription                        0.100    0.785
      Transcription_regulation             0.039    0.313
      Stress_response                      0.010    0.117
      Immune_response                      0.061    0.720
      Growth_factor                        0.006    0.450
      Metal_ion_transport                  0.009    0.020