Difference between revisions of "Secondary Structure Prediction BCKDHA"

From Bioinformatikpedia
Revision as of 21:34, 4 June 2011

1. Secondary structure prediction

General Information

The secondary structure of a protein is based on its primary structure and consists of alpha-helices, beta-sheets and coils.

alpha-helices

alpha-helix

Alpha-helices are formed by hydrogen bonds between the NH group of an amino acid and the CO group of the amino acid located four residues earlier in the sequence (an i to i+4 bonding pattern). This form of the alpha-helix is the most common one. Two other helix types are much rarer. One is the 3₁₀-helix, in which the hydrogen bond connects the NH group with the CO group three residues earlier (i to i+3). The other is the pi-helix, in which the hydrogen bond connects the NH group with the CO group five residues earlier (i to i+5). The different positions of the CO group influence the width and the height of the helices.

beta-sheets

beta-sheet

The CO and NH groups whose hydrogen bonds form a beta-sheet can be located far away from each other in the sequence.
There are two different kinds of beta-sheets: parallel sheets, in which all strands point in the same direction, and anti-parallel sheets, in which neighbouring strands point alternately in opposite directions.

coils

Coils are irregularly formed elements such as turns.

PSIPRED

Basic information

author: David T. Jones (University College London)
year: 1998
version: 2

PSIPRED uses neural networks with a single hidden layer and a feed-forward back-propagation architecture to predict the secondary structure. To run PSIPRED locally, the output of PSI-BLAST (Position-Specific Iterated BLAST) is required as input data.
For the online prediction on the server it is enough to enter an amino acid sequence. Evaluated with a very stringent cross-validation method, PSIPRED reaches an average Q3 score of 80.7%.
The prediction is split into three steps. In the first step, sequence profiles are generated: a position-specific scoring matrix from PSI-BLAST serves as input for the neural network. In the next step the secondary structure is predicted. In the last step the output of the secondary structure prediction is filtered.
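
The Q3 score mentioned above is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal sketch of that calculation follows; the function name `q3_score` and the two toy strings are hypothetical and are not taken from PSIPRED or from the BCKDHA results.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of positions whose 3-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not predicted:
        raise ValueError("empty input")
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * correct / len(predicted)


# Toy example (hypothetical labels): 6 of 8 positions agree.
pred = "CHHHHEEC"
obs = "CHHHCEEE"
print(f"Q3 = {q3_score(pred, obs):.1f}%")  # prints "Q3 = 75.0%"
```

To score a real prediction, the reference string would first have to be reduced to the same three states and aligned position by position with the prediction.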

There are three different options:
- Mask low complexity regions
- Mask transmembrane helices
- Mask coiled-coil regions

References

[PSIPRED Server]
[Overview of prediction methods]
[History of the PSIPRED]

Prediction

Prediction of PSIPRED
Seq       MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDD
Pred      CHHHHHHHHHHHHHHHCHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCC
UniProt                                                     

Seq       KPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKE
Pred      CCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCHH
UniProt             EEEE                          HHH     HH

Seq       KVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDN
Pred      HHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHCCCC
UniProt   HHHHHHHHHHHHHHHHHHHHHHHH  EEE        HHHHHHHH     

Seq       TDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKER
Pred      CCEEECCCCHHHHHHHCCCCHHHHHHHHCCCCCCCCCCCCCCCCCCCCCC
UniProt    EEE      HHHHHH    HHHHHHHHH     CCCC         CCC

Seq       HFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGF
Pred      CCCCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEEECCCCCCHHHHHHHH
UniProt   C       CCCHHHHHHHHHHHHHHH     EEEEEE  HHH HHHHHHH

Seq       NFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG
Pred      HHHHHHCCCEEEEEECCCCCCCCCCCHHCCCCHHHHHCCCCCCCCCEECC
UniProt   HHHHH    EEEEEEE EEE    HHH  EEE  HHH HHH  EEEEEE 

Seq       NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDE
Pred      HHHHHHHHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCHHHH
UniProt     EEEEEEEEEEEEEEEEEE   EEEEEE                     

Seq       VNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERK
Pred      HHHHHHCCCCHHHHHHHHHHCCCCC HHHHHHHHHHHHHHHHHHHHHHHC
UniProt            HHHHHHHHHCCCC   HHHHHHHHHHHHHHHHHHHHHHHH 

Seq       PKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
Pred      CCCCHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHCCCCCCCCCCC
UniProt       HHHH   EEEE  HHHHHHHHHHHHHHHHHHHH  HHH   


PSIPRED has predicted 23 coils, 16 alpha helices and 6 beta sheets.
The alpha-helices are predicted quite well by PSIPRED. It also predicts many beta-sheets, but most of them are false positives.

Jpred3

Basic information

author: Cole C, Barber JD & Barton GJ (Bioinformatics and Computational Biology Research, University of Dundee)
year: 1998
version: 3


Jpred uses a neural network to make its predictions. The Jnet algorithm predicts the secondary structure of a protein sequence or of a multiple alignment of protein sequences. The prediction accuracy for secondary structures lies above 81%. Additionally, Jpred predicts the solvent accessibility.
Jpred3 needs a protein sequence or a multiple alignment of protein sequences as input.
It is important that the target sequence is the first sequence in the multiple alignment, since the alignment is modified so that the first sequence does not contain any gaps. The alignment has to be in MSF or BLC format.

References

Jpred3 Server
About Jpred
FAQ


Prediction

When predicting the secondary structure of BCKDHA, Jpred found many hits with very good E-values among other proteins.

e-value=0.0
2bew, 2bev, 2beu, 1x80, 1wci, 1u5b, 1olx, 1ols, 1dtw, 1x7y, 1x7z, 1x7x, 1x7w, 2j9f, 2bff, 1v1r, 1olu, 1v16, 1v11, 2bfc, 2bfb, 1v1m, 2bfd, 2bfe

e-value=6e-58
1umd, 1umc, 1umb, 1um9

e-value=1e-57
2bp7, 1qs0, 1w85, 3dva, 1w88


With these hits Jpred ran the prediction:

Seq       MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDD
Pred        HHHHHHHHHHHHHH                 EEE              
Conf      10090009999980000000323546777770000303566666777777
UniProt                                                     

Seq       KPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKE
Pred                                 EEEEE                HH
Conf      77777777777777654567777777308885377740467787776368
UniProt             EEEE                          HHH     HH

Seq       KVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDN
Pred      HHHHHHHHHHHHHHHHHHHHHHHH     E      HHHHHHHHHHH
Conf      99999999999999999999875045000001677517899999885278
UniProt   HHHHHHHHHHHHHHHHHHHHHHHH  EEE        HHHHHHHH     

Seq       TDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKER
Pred        EEEE    HHHHHHHH  HHHHHHHHH
Conf      84465157745788885065689988740677754577777545677777
UniProt    EEE      HHHHHH    HHHHHHHHH     CCCC         CCC

Seq       HFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGF
Pred                   HHHHHHHHHHHH     EEEEEE      HHHHHHHH
Conf      64132147888770367889998750688558887407887468999999
UniProt   C       CCCHHHHHHHHHHHHHHH     EEEEEE  HHH HHHHHHH

Seq       NFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG
Pred      HHHH     EEEEEEE                 HHHHHHH   EEEEE
Conf      87500888606888703677777777777764067777005725774078
UniProt   HHHHH    EEEEEEE EEE    HHH  EEE  HHH HHH  EEEEEE 

Seq       NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDE
Pred        HHHHHHHHHHHHHHHHH    EEEEEEEEEE              HHH
Conf      74689999999999988507985588886354067777777765553688
UniProt     EEEEEEEEEEEEEEEEEE   EEEEEE                     

Seq       VNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERK
Pred      HHHHHH   HHHHHHHHHHH     HHHHHHHHHHHHHHHHHHHHHHHH
Conf      99998468758999999986068866899999999999999999988606
UniProt            HHHHHHHHHCCCC   HHHHHHHHHHHHHHHHHHHHHHHH 

Seq       PKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
Pred          HHHHHHH      HHHHHHHHHHHHHHHH
Conf      887368777523688756899999999999875267777777888
UniProt       HHHH   EEEE  HHHHHHHHHHHHHHHHHHHH  HHH   


Comparing the secondary structure predicted by Jpred with the secondary structure of BCKDHA annotated in UniProt shows that the prediction differs considerably from UniProt at the beginning of the sequence but agrees much better in the middle and at the end. Jpred predicts more helices and fewer beta-sheets than the UniProt secondary structure contains.




0 129 small.png

130 257 small.png

257 385 small.png

386 Ende small.png

The first line shows the secondary structure prediction. The red parts stand for alpha-helices and the green parts for beta-sheets. Below this line, black bars symbolize the confidence of the prediction: the higher the bar, the higher the confidence at that position. The last line shows the confidence again as numbers ranging from 0 to 9, where 0 means that the prediction is very uncertain and 9 means that the prediction is very reliable.
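
The per-residue confidence digits can be used to keep only the reliable part of a prediction. The following is a minimal sketch under stated assumptions: the function name `confident_positions`, the cutoff of 7 and the toy strings are hypothetical, and a blank in the prediction line is treated as coil and written as "-".

```python
def confident_positions(pred: str, conf: str, min_conf: int = 7):
    """Return (position, state) pairs whose confidence digit is >= min_conf.

    Positions are 1-based. A blank prediction character is reported as "-".
    """
    # Copied output may lose trailing blanks, so pad the prediction line
    # to the length of the confidence line before pairing them up.
    pred = pred.ljust(len(conf))
    return [
        (i + 1, state if state != " " else "-")
        for i, (state, digit) in enumerate(zip(pred, conf))
        if int(digit) >= min_conf
    ]


# Toy example (hypothetical strings in the Pred/Conf layout used above).
toy_pred = "  HHHH  EE"
toy_conf = "1009998078"
print(confident_positions(toy_pred, toy_conf))
# prints [(4, 'H'), (5, 'H'), (6, 'H'), (7, '-'), (9, 'E'), (10, 'E')]
```

The cutoff is a free choice: a higher value keeps fewer but more reliable positions.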

DSSP

Basic information

author: Wolfgang Kabsch and Chris Sander (Max-Planck-Institut für medizinische Forschung, Heidelberg)
year: 1983
whole name: Define Secondary Structure of Proteins

Based on atomic coordinates in Protein Data Bank format, DSSP defines the secondary structure of a protein.
With this method the secondary structure is not predicted but determined from the 3D coordinates.


References

[Introduction]
[Explanation ]


Prediction

The first part of the plot, up to position 391, is a subpart of the whole sequence. The first 50 amino acids are not shown.
Seq     KPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMT
Pred        TT       T        TT T    T  TTT  T 333     HHHHHHHHHHHH
UniProt                            EEEEE                HH

Seq     LLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYP
Pred    HHHHHHHHHHHHHHTTTTT     TT HHHHHHHHHTT TTTSSS  TT HHHHHHTT
UniProt HHHHHHHHHHHHHH     E      HHHHHHHHHHH     EEEE    HHHHHHHH  

Seq     LELFMAQCYGNISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANR
Pred    HHHHHHHHHT TT TTTT T TT    TTTT     TTTTTHHHHHHHHHHHHHHTT
UniProt HHHHHHHHH                                  HHHHHHHHHHHH

Seq     VVICYFGEGAASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPG
Pred     SSSSSSTT333THHHHHHHHHHHHTT  SSSSSSS TSSTTSS333T TTTTT333T33
UniProt EEEEEE      HHHHHHHHHHHH     EEEEEEE                 HHHHHHH

Seq     YGIMSIRVDGNDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYR
Pred    3T SSSSSSTT HHHHHHHHHHHHHHHHHHT  SSSSSS    T TTTT  333T
UniProt    EEEEE    HHHHHHHHHHHHHHHHH    EEEEEEEEEE             

Seq     VNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFS
Pred     HHHHHHT HHHHHHHHHHHHTT  HHHHHHHHHHHHHHHHHHHHHHHHT    3333TT
UniProt HHHHHH   HHHHHHHHHHH     HHHHHHHHHHHHHHHHHHHHHHHH     HHHHHH
 
Seq     DVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
Pred    TTTTT  HHHHHHHHHHHHHHHHH333T 333
UniProt H      HHHHHHHHHHHHHHHH


1. line: sequence
2. line: structural elements
3. line: if a residue is involved in symmetry contacts, it is labeled with a star
4. line: if a residue is solvent accessible, it is labeled with an "A"

Letter code for the secondary structure elements:

- H (blue): alpha-helix
- 3 (yellow): residue in isolated beta-bridge
- T (red): hydrogen bonded turn
- S (green): bend



2. Prediction of disordered regions

General information

Disordered regions are long regions which do not have a regular secondary structure. They are dynamically flexible and adopt a regular structure only when they bind to a substrate or to another protein. In these regions polar and charged amino acids, and especially proline, are overrepresented. Disordered regions are conserved and occur mainly in regions which have a regulatory function. Since disordered regions have no stable secondary structure, they also have no stable tertiary structure.


DISOPRED

Basic information

author: Jonathan J. Ward, Liam J. McGuffin, Kevin Bryson, Bernard F. Buxton and David T. Jones (University College London)
year: 2004
version: 2

DISOPRED2 identifies disordered regions by searching for residues which appear in the sequence records but have no coordinates in the electron density map. This is a very simple definition of disorder, and it has a drawback: the absence of coordinates can also be caused by artifacts of the crystallization process.

References

Publication
DISOPRED server
Information

Prediction

DisopredOutseq.png
Disopredplot.png

The first line gives the confidence of the prediction, which is shown in the second line. A residue predicted to be disordered is marked with an asterisk (*). All of the disordered regions are predicted with very high confidence.
DISOPRED predicts disordered regions at the beginning and at the end of BCKDHA, as the red fields in the left picture show.
The plot on the right side confirms this, since the highest peaks lie at the two ends of the sequence.

POODLE

Basic information

author:
- POODLE-L S. Hirose, K. Shimizu, S. Kanai, Y. Kuroda and T. Noguchi
- POODLE-S K. Shimizu, Y. Muraoka, S. Hirose, and T. Noguchi
- POODLE-W K. Shimizu, Y. Muraoka, S. Hirose, K. Tomii and T. Noguchi
- POODLE-I S. Hirose, K. Shimizu, N. Inoue, S. Kanai and T. Noguchi

year:
- POODLE-L 2007
- POODLE-S 2007
- POODLE-W 2007
- POODLE-I 2008

POODLE uses machine learning approaches to predict the disordered regions of an amino acid sequence.
There are several different options which can be chosen:

POODLE-L: This tool searches for disordered regions which are longer than 40 consecutive amino acids.
POODLE-S: Here the focus lies on predicting short disordered regions. There are two different subtools: "Missing residues" and "High B-factor residues"
POODLE-W: With this option the proteins which are mostly disordered can be found.
POODLE-I: This tool combines the other three tools. POODLE-I also uses structural information to predict disordered regions. It is based on a workflow approach.


References

[POODLE-L]
[POODLE-S]
[POODLE-W]
[POODLE-I]
[POODLE server]
[Help]


Prediction

POODLE-S

Plot: POODLE-S (Missing residues)
Plot: POODLE-S (High B-factor residues)


POODLE-S (Missing residues):


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56
M A V A I A A A R V W R L N R G L S Q A A L L L L R Q P G A R G L A R S H P P R Q Q Q Q F S S L D D K P Q F P G
0.585 0.601 0.624 0.69 0.753 0.809 0.798 0.748 0.679 0.595 0.526 0.55 0.59 0.604 0.679 0.754 0.783 0.817 0.849 0.826 0.799 0.779 0.782 0.763 0.748 0.722 0.714 0.668 0.661 0.691 0.724 0.754 0.799 0.841 0.862 0.88 0.885 0.892 0.89 0.892 0.897 0.892 0.91 0.926 0.913 0.908 0.888 0.829 0.77 0.715 0.691 0.652 0.616 0.586 0.577 0.512

341 342 343 344 345
D S S A Y
0.562 0.6 0.615 0.597 0.501

420 421 422 423
L R K Q
0.565 0.594 0.557 0.525


POODLE-S (which predicts short disordered regions) with the option "Missing residues" predicted the disordered regions between the positions 1-56, 341-345 and 420-423. This is also shown in the plot above.
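
The region boundaries reported here can be derived from the per-residue scores by grouping consecutive residues whose score reaches a cutoff. A minimal sketch follows. The cutoff of 0.5 is an assumption (nearly all scores listed for the predicted regions are above 0.5), and the function name `disordered_regions` and the toy scores are hypothetical.

```python
def disordered_regions(scores, threshold=0.5):
    """Group 1-based positions with score >= threshold into (start, end) runs."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos  # a new run begins here
        elif start is not None:
            regions.append((start, pos - 1))  # the run ended one position earlier
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # run reaches the end of the sequence
    return regions


# Toy scores (hypothetical), one value per residue.
toy_scores = [0.6, 0.7, 0.2, 0.1, 0.55, 0.4, 0.9]
print(disordered_regions(toy_scores))  # prints [(1, 2), (5, 5), (7, 7)]
```

Applied to a complete per-residue score list, the same grouping yields both single-residue regions and longer stretches.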

POODLE-S (High B-Factor residues):
6 7 8 9
A A A R
0.618 0.664 0.634 0.609

15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57
R G L S Q A A L L L L R Q P G A R G L A R S H P P R Q Q Q Q F S S L D D K P Q F P G A
0.544 0.647 0.669 0.716 0.762 0.791 0.777 0.801 0.8 0.799 0.786 0.782 0.744 0.738 0.753 0.797 0.812 0.875 0.898 0.907 0.907 0.889 0.865 0.849 0.816 0.811 0.843 0.867 0.889 0.916 0.909 0.894 0.858 0.805 0.745 0.689 0.634 0.619 0.583 0.594 0.588 0.552 0.525

93
E
0.529

95 96
P H
0.542 0.549

340 341 342 343 344 345 346 347 348 349 350 351 352 353 354
D D S S A Y R S V D E V N Y W
0.501 0.607 0.663 0.73 0.764 0.746 0.763 0.768 0.769 0.746 0.731 0.711 0.66 0.594 0.549

379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402
E K A W R K Q S R R K V M E A F E Q A E R K P K
0.546 0.577 0.559 0.571 0.63 0.601 0.502 0.517 0.536 0.518 0.504 0.577 0.572 0.568 0.574 0.607 0.622 0.658 0.719 0.74 0.706 0.668 0.642 0.548


POODLE-S (which predicts short disordered regions) with the option "High B-Factor residues" predicted the disordered regions between the positions 6-9, 15-57, 93, 95-96, 340-354 and 379-402. This is also shown in the plot above.


POODLE-L

POODLE L BCKDHA.png

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
M A V A I A A A R V W R L N R G L S Q A A L L L L R Q P G A R G L A R S H P P R Q Q Q Q F S S L
0.516 0.518 0.517 0.521 0.526 0.538 0.543 0.55 0.562 0.574 0.58 0.587 0.594 0.606 0.613 0.618 0.622 0.626 0.632 0.642 0.652 0.666 0.674 0.68 0.682 0.684 0.685 0.683 0.679 0.675 0.672 0.668 0.663 0.657 0.648 0.642 0.637 0.634 0.628 0.619 0.61 0.601 0.588 0.575 0.558 0.542 0.521 0.598

369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428
L S Q G W W D E E Q E K A W R K Q S R R K V M E A F E Q A E R K P K P N P N L L F S D V Y Q E M P A Q L R K Q Q E S L A
0.365 0.549 0.572 0.591 0.615 0.637 0.656 0.671 0.685 0.698 0.711 0.725 0.737 0.746 0.753 0.756 0.757 0.76 0.763 0.764 0.764 0.763 0.761 0.761 0.762 0.763 0.762 0.759 0.754 0.75 0.747 0.745 0.742 0.738 0.733 0.723 0.712 0.698 0.687 0.676 0.67 0.666 0.669 0.672 0.67 0.665 0.656 0.65 0.64 0.63 0.619 0.614 0.61 0.605 0.592 0.576 0.558 0.54 0.521 0.436


POODLE-L predicts two disordered regions which are longer than 40 amino acids. They are located at positions 1-48 and 369-428.

POODLE-W


Regions which might be disordered, but for which POODLE is not sure, are bordered by blue squares. Regions predicted as disordered are bordered by red squares.

0=ordered regions
5=perhaps disordered regions
9=disordered regions


POODLE-I

POODLE I BCKDHA.png


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56
M A V A I A A A R V W R L N R G L S Q A A L L L L R Q P G A R G L A R S H P P R Q Q Q Q F S S L D D K P Q F P G
0.516 0.518 0.517 0.521 0.526 0.538 0.543 0.55 0.562 0.574 0.58 0.587 0.594 0.606 0.613 0.618 0.622 0.626 0.632 0.642 0.652 0.666 0.674 0.68 0.682 0.684 0.685 0.683 0.679 0.675 0.672 0.668 0.663 0.657 0.648 0.642 0.637 0.634 0.628 0.619 0.61 0.601 0.588 0.575 0.558 0.542 0.521 0.598 0.661 0.725 0.686 0.637 0.602 0.577 0.57 0.534

341 342 343 344 345
D S S A Y
0.544 0.592 0.604 0.571 0.503

370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427
S Q G W W D E E Q E K A W R K Q S R R K V M E A F E Q A E R K P K P N P N L L F S D V Y Q E M P A Q L R K Q Q E S L
0.549 0.572 0.591 0.615 0.637 0.656 0.671 0.685 0.698 0.711 0.725 0.737 0.746 0.753 0.756 0.757 0.76 0.763 0.764 0.764 0.763 0.761 0.761 0.762 0.763 0.762 0.759 0.754 0.75 0.747 0.745 0.742 0.738 0.733 0.723 0.712 0.698 0.687 0.676 0.67 0.666 0.669 0.672 0.67 0.665 0.656 0.65 0.64 0.63 0.619 0.614 0.61 0.605 0.592 0.576 0.558 0.54 0.521

443 444 445
F D K
0.606 0.742 0.881


POODLE-I predicted the disordered regions between the positions 1-56, 341-345, 370-427 and 443-445.


Comparison

POODLE-S (Missing residues): 1-56, 341-345, 420-423
POODLE-S (High B-factor residues): 6-9, 15-57, 93, 95-96, 340-354, 379-402
POODLE-L: 1-48, 369-428
POODLE-W: 325-345
POODLE-I: 1-56, 341-345, 370-427, 443-445




IUPred

Basic information

author: Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon
year: 2005


IUPred predicts disordered regions by estimating the capacity of polypeptides to form stabilizing contacts. The potential to form these contacts depends on the surrounding sequence and on the chemical properties. This approach is based on the idea that disordered regions have no capacity to form sufficient interresidue interactions so that there is no stabilizing energy.


There are three different prediction types which can be chosen:
- long disorder
- short disorder
- structured regions


References

[IUPred server]
[Theory]


Prediction

Prediction type: long disorder

Long.png

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
L A R S H P P R Q Q Q Q F S S L D D
0.6043 0.5758 0.6851 0.7881 0.6851 0.6906 0.6661 0.6661 0.7415 0.7505 0.6136 0.7629 0.7982 0.7595 0.7595 0.7163 0.6948 0.5211

89 90 91 92 93
I N P S E
0.5254 0.6427 0.5493 0.5382 0.5951

385 386 387 388
Q S R R
0.5456 0.5176 0.5176 0.5017

390 391 392 393 394 395 396 397
V M E A F E Q A
0.5017 0.5017 0.5533 0.7209 0.7547 0.7755 0.6851 0.5992

399 400 401
R K P
0.5017 0.5176 0.5211

404 405 406 407 408 409 410 411 412 413
N P N L L F S D V Y
0.5055 0.5807 0.6089 0.5707 0.6136 0.5176 0.5176 0.5176 0.5017 0.5176

420 421 422
L R K
0.5098 0.5254 0.5176

424 425 426 427 428
Q E S L A
0.5951 0.5854 0.5807 0.5296 0.5296

431
L
0.5533


With the long disorder option, IUPred predicts several disordered regions. They are located at positions 33-50, 89-93, 385-388, 390-397, 399-401, 404-413, 420-422, 424-428 and 431.


Detailed sequence with disordered region probability: File:LongSeqOut.pdf

Prediction type: short disorder

Short.png

1
M
0.5623

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
L A R S H P P R Q Q Q Q F S S L D D K P Q F P
0.5846 0.6756 0.7605 0.7688 0.7688 0.7688 0.6756 0.6827 0.7275 0.7232 0.7501 0.8311 0.7869 0.8158 0.8200 0.7817 0.7458 0.6789 0.6827 0.6035 0.5173 0.5253 0.5008

92 93
S E
0.5711 0.5473

393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411
A F E Q A E R K P K P N P N L L F S D
0.5514 0.5900 0.5992 0.6174 0.6293 0.5941 0.5667 0.5084 0.6124 0.5549 0.5008 0.5667 0.5802 0.5296 0.5802 0.5623 0.5846 0.5253

415
E
0.5008

420 421
L R
0.5296 0.5253

423 424 425
Q Q E
0.5126 0.5711 0.5008

427 428
L A
0.5374 0.5126

433
T
0.5084

438 439 440 441 442 443 444 445
Y P L D H F D K
0.5374 0.6035 0.6442 0.6827 0.7951 0.8158 0.8556 0.9257


With the short disorder option, IUPred also predicts several disordered regions. They are located at positions 1, 33-55, 92-93, 393-411, 415, 420-421, 423-425, 427-428, 433 and 438-445.


Detailed sequence with disordered region probability: File:ShortSeqOut.pdf

Prediction type: structured regions

Structural.png


With the option "structured regions", no disordered regions were predicted.
Only the message "Unkown globular domains: 1-445" appeared.


3. Prediction of transmembrane alpha-helices and signal peptides

General

Transmembrane Topology

The prediction of the membrane topology of a protein aims to discover which portions of the protein lie within the lipid bilayer of a membrane and which portions protrude from the membrane into the aqueous environment. Membrane-spanning polypeptides usually form helices about 20 amino acids long. As the interior of the membrane is hydrophobic, the membrane-spanning part of the protein consists mainly of hydrophobic amino acids as well. This information can be used to predict transmembrane helices, which in turn enables the prediction of the membrane topology. <ref> http://en.wikipedia.org/wiki/Membrane_topology</ref><ref>http://en.wikipedia.org/wiki/Transmembrane_domain</ref>
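
The idea that a run of about 20 hydrophobic residues hints at a transmembrane helix can be illustrated with a sliding-window hydropathy scan. This is a minimal sketch of the classic Kyte-Doolittle approach, not the method used by the tools tested below. The window of 19 residues and the cutoff of 1.6 follow the values commonly quoted for that scale, and the function name and toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(seq: str, window: int = 19, cutoff: float = 1.6):
    """Return (start, end, mean) for windows whose mean hydropathy >= cutoff.

    Positions are 1-based and inclusive; the mean is rounded to 2 decimals.
    """
    hits = []
    for i in range(len(seq) - window + 1):
        segment = seq[i:i + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in segment) / window
        if mean >= cutoff:
            hits.append((i + 1, i + window, round(mean, 2)))
    return hits


# Toy sequence (hypothetical): a poly-leucine stretch flanked by charged residues.
toy = "DDDD" + "L" * 19 + "KKKK"
for start, end, mean in hydrophobic_windows(toy):
    print(start, end, mean)
```

Overlapping hits would normally be merged into one candidate helix. A scan like this cannot tell a transmembrane helix from the hydrophobic core of a signal peptide.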

Prediction tools: TMHMM, OCTOPUS and SPOCTOPUS

Signal Peptides

Signal peptides are N-terminal sequence motifs that direct proteins to their cellular destination, such as the secretory pathway, mitochondria or chloroplasts. One example is the secretory signal peptide (SP), an N-terminal peptide that is typically 15-30 amino acids long. A signal peptide has three regions: an N-terminal region (n-region), which often contains positively charged residues, a hydrophobic region (h-region) of at least six residues in the middle, and a C-terminal region (c-region) of polar uncharged residues. In eukaryotes the SP targets proteins across the membrane of the endoplasmic reticulum, in prokaryotes across the plasma membrane. The SP is cleaved off when the protein crosses the membrane.
Furthermore, there are chloroplast transit peptides (cTP), which are also N-terminal and are cleaved off when the protein enters the chloroplast. The most conserved site in cTPs is an alanine directly after the N-terminal methionine... <ref>O. Emanuelsson, S. Brunak, G. von Heijne, H. Nielsen, "Locating proteins in the cell using TargetP, SignalP and related tools", Nature Protocols, 2007</ref>

Prediction tools: SignalP, TargetP

Combined transmembrane and signal peptide prediction

The hydrophobic regions of a transmembrane helix and of a signal peptide are highly similar, so a predictor for one of them can easily mistake it for the other. This leads to cross-reactions between these two types of prediction. <ref>http://www.ebi.ac.uk/Tools/phobius/help.html</ref>

Prediction tools: Phobius and Polyphobius

In the following section different tools for predicting transmembrane helices and signal peptides are tested. As the BCKDHA protein isn't a transmembrane protein, additional proteins were used for the transmembrane and signal peptide analysis:

name | organism | location | transmembrane protein | function | reference (UniProt accession)
A4_HUMAN | Human | Cell membrane | yes | Protease Inhibitor | P05067
BACR_HALSA | Halobacterium salinarium | Cell membrane | yes | ion transport | P02945
INSL5_HUMAN | Human | extracellular region | no | hormone | Q9Y5Q6
LAMP1_HUMAN | Human | Cell membrane, Lysosome membrane, Endosome membrane | yes | Presents carbohydrate ligands to selectins | P11279
RET4_HUMAN | Human | extracellular space | no | Transport | P02753

TMHMM

Method

  • TMHMM was developed by Sonnhammer, von Heijne and Krogh in 1998 <ref> E.L. Sonnhammer, G. von Heijne and A. Krogh, A hidden Markov model for predicting transmembrane helices in protein sequences, Proc Int Conf Intell Syst Mol Biol. (1998)</ref>
  • TMHMM predicts transmembrane helices in proteins.
  • TMHMM is a membrane topology prediction method based on a hidden Markov model.

Execution

Before we could execute TMHMM we had to change all occurrences of "/usr/local/bin/" to "/usr/bin" in the following files: tmhmm, tmhmm.ORIG and tmhmmformat.pl

To execute the program we used these commands:

  • tmhmm P05067.fasta > tmhmm_out_P05067.txt
  • tmhmm P02945.fasta > tmhmm_out_P02945.txt
  • tmhmm Q9Y5Q6.fasta > tmhmm_out_Q9Y5Q6.txt
  • tmhmm P11279.fasta > tmhmm_out_P11279.txt
  • tmhmm P02753.fasta > tmhmm_out_P02753.txt
  • tmhmm P12694.fasta > tmhmm_out_P12694.txt
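
Since the same command is repeated for every accession, the runs can be generated in a loop. The sketch below assumes that the `tmhmm` executable is on the PATH, that each FASTA file is named `<accession>.fasta`, and that the output files are named `tmhmm_out_<accession>.txt`. The helper names are hypothetical.

```python
import subprocess

# UniProt accessions of the six proteins analysed in this section.
ACCESSIONS = ["P05067", "P02945", "Q9Y5Q6", "P11279", "P02753", "P12694"]


def tmhmm_command(accession: str) -> str:
    """Build the shell command line for one accession."""
    return f"tmhmm {accession}.fasta > tmhmm_out_{accession}.txt"


def run_tmhmm(accession: str) -> None:
    """Run tmhmm for one accession and write its standard output to a file."""
    with open(f"tmhmm_out_{accession}.txt", "w") as out:
        subprocess.run(["tmhmm", f"{accession}.fasta"], stdout=out, check=True)


if __name__ == "__main__":
    for acc in ACCESSIONS:
        # Only prints the commands; call run_tmhmm(acc) to execute them.
        print(tmhmm_command(acc))
```

With `check=True`, a failing run raises an exception instead of silently leaving an incomplete output file.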

Results

BCKDHA

Position Membrane topology
1-445 outside

TMHMM predicted no membrane spanning region for the BCKDHA protein, which corresponds to the information provided in Uniprot.

Membrane topology of A4_HUMAN (source: Uniprot)

A4_HUMAN

Position Membrane topology
1-700 outside
701-723 TMhelix
724-770 inside

TMHMM predicted one transmembrane helix for A4_HUMAN, which agrees with the UniProt annotation. The predicted transmembrane helix begins at position 701, whereas UniProt reports the transmembrane region as positions 700-723. The extracellular region reported by UniProt begins only at position 18, because the preceding residues form a signal peptide. TMHMM does not predict signal peptides and therefore assigned positions 1-700 to the extracellular region.

Membrane topology of BACR_HALSA (source: Uniprot)

BACR_HALSA

Position Membrane topology
1-22 outside
23-42 TMhelix
43-54 inside
55-77 TMhelix
78-91 outside
92-114 TMhelix
115-120 inside
121-143 TMhelix
144-147 outside
148-170 TMhelix
171-189 inside
190-212 TMhelix
213-262 outside

The TMHMM prediction differs slightly from the information provided in UniProt. TMHMM predicted only 13 topological domains with six transmembrane helices, so that the end of the protein is predicted to lie in the extracellular space. UniProt reports 15 domains with seven transmembrane helices, and the protein ends in the cytoplasm.
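
Counts like these can be checked by parsing the topology table. A minimal sketch follows, with the BACR_HALSA rows copied from the table above. The function name `parse_topology` is hypothetical, and the parser assumes one "start-end label" pair per line.

```python
def parse_topology(table: str):
    """Parse lines such as '23-42 TMhelix' into (start, end, label) tuples."""
    segments = []
    for line in table.strip().splitlines():
        span, label = line.split()
        start, end = (int(x) for x in span.split("-"))
        segments.append((start, end, label))
    return segments


# TMHMM result for BACR_HALSA, copied from the table above.
BACR_HALSA = """
1-22 outside
23-42 TMhelix
43-54 inside
55-77 TMhelix
78-91 outside
92-114 TMhelix
115-120 inside
121-143 TMhelix
144-147 outside
148-170 TMhelix
171-189 inside
190-212 TMhelix
213-262 outside
"""

segments = parse_topology(BACR_HALSA)
helices = [s for s in segments if s[2] == "TMhelix"]
print(len(segments), "segments,", len(helices), "TM helices")
# prints "13 segments, 6 TM helices"
```

The same parser can be applied to the other TMHMM tables in this section.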

INSL5_HUMAN

Membrane topology of INSL5_HUMAN (source: Uniprot)
Position Membrane topology
1-135 outside

The TMHMM prediction agrees with the fact that INSL5_HUMAN is a hormone and is therefore secreted into the extracellular region.

LAMP1_HUMAN

Membrane topology of LAMP1_HUMAN (source: Uniprot)
Position Membrane topology
1-10 inside
11-33 TMhelix
34-383 outside
384-406 TMhelix
407-417 inside

The TMHMM prediction for LAMP1_HUMAN agrees only partially with the UniProt annotation. The sequence parts which form the signal peptide and the lumenal domain are predicted as an additional transmembrane helix and an extracellular domain. The second transmembrane helix is predicted correctly.

RET4_HUMAN

Position Membrane topology
1-201 outside

The TMHMM prediction for RET4_HUMAN is correct, as RET4_HUMAN is a secreted protein and does not span any membrane.

Phobius and Polyphobius

Methods

  • Phobius was developed by Käll et al <ref>Käll et al., "A Combined Transmembrane Topology and Signal Peptide Prediction Method", Journal of Mol. Biology,338(5):1027-1036, 2004 </ref>
  • Phobius combines the prediction of transmembrane regions and signal peptides
  • Required input information: only the sequence in FASTA format (the 20 amino acids and B, Z, X are recognized)
  • As transmembrane topology and signal peptides are likely to be conserved during evolution, Polyphobius was established <ref>Käll et al., "An HMM posterior decoder for sequence feature prediction that includes homology information", Bioinformatics, 21 (Suppl 1):i251-i257, 2005</ref>, which includes information from homologous sequences to the query.
  • Required input, two options: either a query sequence in FASTA format, which is then searched with BLAST against uniprot_trembl, or an uploaded alignment in FASTA format which provides information about homologs.
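
Because only the 20 standard amino acids plus B, Z and X are recognized, it can be useful to check a FASTA sequence before submitting it. The sketch below is a hypothetical pre-check and not part of Phobius. The helper names are made up for illustration, and the check assumes a single-record FASTA string.

```python
# Symbols listed above as recognized: 20 standard amino acids plus B, Z, X.
ACCEPTED = set("ACDEFGHIKLMNPQRSTVWY") | set("BZX")


def read_fasta_sequence(text: str) -> str:
    """Return the sequence of a single-record FASTA string in upper case."""
    return "".join(
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.startswith(">")
    ).upper()


def unsupported_residues(sequence: str):
    """List (position, character) for symbols outside the accepted set."""
    return [(i, c) for i, c in enumerate(sequence, start=1) if c not in ACCEPTED]


# Toy record (hypothetical): U and * are not in the accepted set.
toy_fasta = ">toy\nMAVAIU\nXBZ*\n"
seq = read_fasta_sequence(toy_fasta)
print(seq)                        # prints "MAVAIUXBZ*"
print(unsupported_residues(seq))  # prints [(6, 'U'), (10, '*')]
```

An empty result means that every symbol in the sequence is in the accepted set.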

Results

A4_HUMAN
Phobius Polyphobius
BCKDHA Phobius A4 HUMAN.png
sp|P05067|A4_HUMAN
SIGNAL 1 17
REGION 1 1 N-REGION
REGION 2 12 H-REGION
REGION 13 17 C-REGION
TOPO_DOM 18 700 NON CYTOPLASMIC
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC

sp|P05067|A4_HUMAN
SIGNAL 1 17
REGION 1 3 N-REGION
REGION 4 12 H-REGION
REGION 13 17 C-REGION
TOPO_DOM 18 700 NON CYTOPLASMIC
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC

BCKDHA Polyphobius A4 HUMAN.png


BACR_HALSA
Phobius Polyphobius
BCKDHA Phobius BACR HALSA.png
sp|P02945|BACR_HALSA
TOPO_DOM 1 22 NON CYTOPLASMIC.
TRANSMEM 23 42
TOPO_DOM 43 53 CYTOPLASMIC.
TRANSMEM 54 76
TOPO_DOM 77 95 NON CYTOPLASMIC.
TRANSMEM 96 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 142
TOPO_DOM 143 147 NON CYTOPLASMIC.
TRANSMEM 148 169
TOPO_DOM 170 189 CYTOPLASMIC.
TRANSMEM 190 212
TOPO_DOM 213 217 NON CYTOPLASMIC.
TRANSMEM 218 237
TOPO_DOM 238 262 CYTOPLASMIC.

sp|P02945|BACR_HALSA
TOPO_DOM 1 21 NON CYTOPLASMIC.
TRANSMEM 22 43
TOPO_DOM 44 54 CYTOPLASMIC.
TRANSMEM 55 77
TOPO_DOM 78 94 NON CYTOPLASMIC.
TRANSMEM 95 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 141
TOPO_DOM 142 147 NON CYTOPLASMIC.
TRANSMEM 148 166
TOPO_DOM 167 186 CYTOPLASMIC.
TRANSMEM 187 205
TOPO_DOM 206 215 NON CYTOPLASMIC.
TRANSMEM 216 237
TOPO_DOM 238 262 CYTOPLASMIC.

BCKDHA Polyphobius BACR HALSA.png


INSL5_HUMAN
Phobius Polyphobius
BCKDHA Phobius INSL5 HUMAN.png
sp|Q9Y5Q6|INSL5_HUMAN
SIGNAL 1 22
REGION 1 5 N-REGION
REGION 6 17 H-REGION
REGION 18 22 C-REGION
TOPO_DOM 23 135 NON CYTOPLASMIC

sp|Q9Y5Q6|INSL5_HUMAN
SIGNAL 1 22
REGION 1 4 N-REGION
REGION 5 16 H-REGION
REGION 17 22 C-REGION
TOPO_DOM 23 135 NON CYTOPLASMIC

BCKDHA Polyphobius INSL5 HUMAN.png
LAMP1_HUMAN
Phobius Polyphobius
BCKDHA Phobius LAMP1 HUMAN.png
sp|P11279|LAMP1_HUMAN
SIGNAL 1 28
REGION 1 10 N-REGION
REGION 11 22 H-REGION
REGION 23 28 C-REGION
TOPO_DOM 29 381 NON CYTOPLASMIC
TRANSMEM 382 405
TOPO_DOM 405 417 CYTOPLASMIC

sp|P11279|LAMP1_HUMAN
SIGNAL 1 28
REGION 1 9 N-REGION
REGION 10 22 H-REGION
REGION 23 28 C-REGION
TOPO_DOM 29 381 NON CYTOPLASMIC
TRANSMEM 382 405
TOPO_DOM 405 417 CYTOPLASMIC

BCKDHA Polyphobius LAMP1 HUMAN.png



RET4_HUMAN
Phobius Polyphobius
BCKDHA Phobius RET4 HUMAN.png
sp|P02753|RET4_HUMAN
SIGNAL 1 18
REGION 1 2 N-REGION
REGION 3 13 H-REGION
REGION 14 18 C-REGION
TOPO_DOM 19 201 NON CYTOPLASMIC

sp|P02753|RET4_HUMAN
SIGNAL 1 18
REGION 1 3 N-REGION
REGION 4 13 H-REGION
REGION 14 18 C-REGION
TOPO_DOM 19 201 NON CYTOPLASMIC

BCKDHA Polyphobius RET4 HUMAN.png

For the BCKDHA protein, Phobius predicted a signal peptide with about 90% probability at the beginning of the sequence. The predicted signal peptide is 34 amino acids long. This is broadly consistent with the information given on Uniprot, which annotates a 45-residue N-terminal targeting peptide for import into the mitochondrion. The rest of the sequence is predicted to be non-cytoplasmic. No part of the protein is predicted to span a membrane, which agrees with Uniprot, where BCKDHA is located in the mitochondrial matrix.

BCKDHA
Phobius Polyphobius
Phobius BCKDHA.png
sp|P12694|ODBA_HUMAN (BCKDHA)
Signal 1 34
Region 1 16 N-Region
Region 17 25 H-Region
Region 26 34 C-Region
TOPO_DOM 35 445 non cytoplasmic

ODBA_HUMAN (BCKDHA)
TOPO_DOM 1 445 Non cytoplasmic

BCKDHA Polyphobius BCKDHA.png

Judged against the Uniprot annotation, Polyphobius performed worse than Phobius on the BCKDHA sequence: it predicted no signal sequence at the beginning of the protein. There is a low probability for positions 1-45 to be a signal sequence, but overall the whole sequence is predicted to be non-cytoplasmic.
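
The Phobius and Polyphobius output listed in the tables above is line-oriented and easy to post-process. The sketch below is one possible parser, assuming whitespace-separated lines of the form `KEY start end [description]` as printed on this page (the raw server output may carry additional prefixes). The helper name `parse_phobius` is hypothetical. It is applied to the Phobius prediction for A4_HUMAN.

```python
def parse_phobius(text):
    """Parse lines like 'TOPO_DOM 18 700 NON CYTOPLASMIC' into feature tuples."""
    features = []
    for line in text.splitlines():
        parts = line.split()
        # Keep only feature lines: a key followed by two integer coordinates.
        if len(parts) >= 3 and parts[1].isdigit() and parts[2].isdigit():
            key, start, end = parts[0].upper(), int(parts[1]), int(parts[2])
            note = " ".join(parts[3:]).rstrip(".")
            features.append((key, start, end, note))
    return features

# Phobius prediction for A4_HUMAN, copied from the table above.
a4_phobius = """sp|P05067|A4_HUMAN
SIGNAL 1 17
REGION 1 1 N-REGION
REGION 2 12 H-REGION
REGION 13 17 C-REGION
TOPO_DOM 18 700 NON CYTOPLASMIC
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC
"""

features = parse_phobius(a4_phobius)
signal = [(s, e) for key, s, e, _ in features if key == "SIGNAL"]
helices = [(s, e) for key, s, e, _ in features if key == "TRANSMEM"]
print("signal peptide:", signal)
print("transmembrane helices:", helices)
```

Once the features are available as tuples, predictions from different tools can be compared position by position instead of by eye.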

OCTOPUS and SPOCTOPUS

Methods

  • OCTOPUS was developed by Viklund and Elofsson in 2008 <ref>Håkan Viklund and Arne Elofsson, "Improving topology prediction by two-track ANN-based preference scores and an extended topological grammar", Bioinformatics (2008)</ref>
  • OCTOPUS (obtainer of correct topologies for uncharacterized sequences) uses a combination of hidden Markov models and artificial neural networks.
  • It creates a sequence profile by running a BLAST search to obtain homologous sequences. The profile is used as input for a neural network that predicts the probability for each residue to be located in a transmembrane (M), interface (I), close loop (L), or globular loop (G) environment, as well as its preference to be inside (i) or outside (o) of the membrane. A hidden Markov model is then used to calculate the most likely protein topology.
  • Required input: Protein Sequence in FASTA-Format
  • SPOCTOPUS (Viklund et al., 2008<ref>Viklund et al., "A combined predictor of signal peptides and membrane protein topology", Bioinformatics (2008)</ref>) is an extension of OCTOPUS which also predicts signal peptides. A neural network is used to predict a signal peptide preference score. The signal peptide's location is determined by a hidden Markov model. The output contains the information provided by OCTOPUS as well as the probability that a residue is located N-terminal of a signal peptide (n) or in a signal peptide (S).
  • Required input information: Protein sequence in FASTA-Format
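
The decoding step described above can be illustrated with a toy Viterbi algorithm. The sketch below is not the OCTOPUS model: it assumes only three states (inside `i`, membrane `M`, outside `o`), made-up per-residue preference scores standing in for the neural network output, and a tiny grammar that forbids moving between inside and outside without crossing the membrane. Allowed transitions are treated as cost-free, which is a further simplification.

```python
import math

STATES = ["i", "M", "o"]
# Toy grammar (an assumption): inside and outside can only be left through the membrane.
ALLOWED = {"i": {"i", "M"}, "M": {"M", "i", "o"}, "o": {"o", "M"}}

def viterbi(scores):
    """Return the highest-scoring state path that respects ALLOWED.

    scores: one dict per residue, mapping each state to a positive preference.
    """
    best = {s: math.log(scores[0][s]) for s in STATES}
    backpointers = []
    for residue in scores[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            # Best predecessor among the states that may transition into s.
            prev = max((p for p in STATES if s in ALLOWED[p]), key=lambda p: best[p])
            new_best[s] = best[prev] + math.log(residue[s])
            pointers[s] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))

def prefer(state):
    """Made-up preference scores that favour one state."""
    return {s: (0.9 if s == state else 0.05) for s in STATES}

toy_scores = [prefer(s) for s in "oooMMMMiii"]
# One noisy residue that looks 'inside' in the middle of the outside stretch.
toy_scores[1] = {"i": 0.6, "o": 0.3, "M": 0.1}

per_residue = "".join(max(STATES, key=lambda s: r[s]) for r in toy_scores)
print("per-residue argmax:", per_residue)
print("grammar-constrained:", viterbi(toy_scores))
```

The per-residue argmax places a single inside residue within the outside stretch, which is topologically impossible; the constrained decoding removes it. This is the general reason for combining per-residue scores with a hidden Markov model.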

Results

A4_HUMAN
OCTOPUS BCKDHA Octopus A4 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus A4 HUMAN small.png
BACR_HALSA
OCTOPUS BCKDHA Octopus BACR HALSA small.png
SPOCTOPUS BCKDHA Spoctopus BACR HALSA small.png
INSL5_HUMAN
OCTOPUS BCKDHA Octopus INSL5 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus INSL5 HUMAN small.png
LAMP1_HUMAN
OCTOPUS BCKDHA Octopus LAMP1 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus LAMP1 HUMAN small.png
RET4_HUMAN
OCTOPUS BCKDHA Octopus RET4 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus RET4 HUMAN small.png
BCKDHA
OCTOPUS BCKDHA Octopus BCKDHA small.png
SPOCTOPUS BCKDHA Spoctopus BCKDHA small.png

SignalP

Method

  • SignalP was established by Nielsen et al. in 1997<ref>Nielsen et al., "Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites", Protein Engineering, 10:1-6, 1997</ref>
  • SignalP is neural network based. It identifies signal peptides and cleavage sites.

Execution

To run the command line SignalP tool, the path in the SignalP file had to be adapted to /apps/signalp-3.0

The following commands were used to execute SignalP:

  • signalp -t euk P05067.fasta > signalp_out_P05067.txt
  • signalp -t gram- P02945.fasta > signalp_out_P02945.txt
  • signalp -t euk Q9Y5Q6.fasta > signalp_out_Q9Y5Q6.txt
  • signalp -t euk P11279.fasta > signalp_out_P11279.txt
  • signalp -t euk P02753.fasta > signalp_out_P02753.txt
  • signalp -t euk P12694.fasta > signalp_out_P12694.txt
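
The six calls above differ only in the organism type and the accession, so they can be generated from a small table. The following is a minimal sketch, assuming `signalp` is on the `PATH` and the FASTA files are in the working directory. The `-t` values and file names are those listed above, while the helper names `build_jobs` and `run_all` are illustrative. Execution is left as an explicit step so that the commands can be inspected first.

```python
import subprocess

# Organism type per UniProt accession, as used in the commands above.
JOBS = [
    ("P05067", "euk"),    # A4_HUMAN
    ("P02945", "gram-"),  # BACR_HALSA
    ("Q9Y5Q6", "euk"),    # INSL5_HUMAN
    ("P11279", "euk"),    # LAMP1_HUMAN
    ("P02753", "euk"),    # RET4_HUMAN
    ("P12694", "euk"),    # BCKDHA
]

def build_jobs(jobs=JOBS):
    """Return (argument list, output file name) pairs mirroring the commands above."""
    return [(["signalp", "-t", org, f"{acc}.fasta"], f"signalp_out_{acc}.txt")
            for acc, org in jobs]

def run_all(jobs=JOBS):
    """Run each SignalP call and redirect its standard output to a file."""
    for args, out_name in build_jobs(jobs):
        with open(out_name, "w") as out:
            subprocess.run(args, stdout=out, check=True)

# Print the commands for inspection; call run_all() once they look right.
for args, out_name in build_jobs():
    print(" ".join(args), ">", out_name)
```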


Results

BCKDHA

Both methods (NN and HMM) predicted the most likely cleavage site between positions 32 and 33 (ARG_LA).

A4_HUMAN

SignalP predicted with both methods a cleavage site between positions 17 and 18 with a high probability for a signal peptide.

BACR_HALSA

Both methods (NN and HMM) predicted no cleavage site, and therefore no signal peptide, in the BACR_HALSA sequence.

INSL5_HUMAN

For the INSL5_HUMAN protein, SignalP detected a cleavage site between positions 22 and 23, following a predicted signal peptide at the beginning of the sequence.

LAMP1_HUMAN

SignalP predicted with both methods a cleavage site between positions 28 and 29, as there is a signal peptide detected.

RET4_HUMAN

SignalP predicted a cleavage site with high probability between positions 18 and 19 in both the NN and the HMM method. This cleavage site is predicted to be after a signal peptide.
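
The cleavage sites reported here can be set against the signal peptide ends predicted by Phobius earlier on this page. The following is a minimal sketch with the values transcribed by hand from both sections. A cleavage between positions n and n+1 is stored as n, and BACR_HALSA is `None` because neither method reported a signal peptide for it.

```python
# Last residue of the predicted signal peptide (None = no signal peptide predicted).
signalp = {"BCKDHA": 32, "A4_HUMAN": 17, "BACR_HALSA": None,
           "INSL5_HUMAN": 22, "LAMP1_HUMAN": 28, "RET4_HUMAN": 18}
phobius = {"BCKDHA": 34, "A4_HUMAN": 17, "BACR_HALSA": None,
           "INSL5_HUMAN": 22, "LAMP1_HUMAN": 28, "RET4_HUMAN": 18}

# Collect the proteins for which the two methods report different end positions.
disagreements = {name: (signalp[name], phobius[name])
                 for name in signalp if signalp[name] != phobius[name]}

for name in signalp:
    status = "differ" if name in disagreements else "agree"
    print(f"{name}: SignalP={signalp[name]} Phobius={phobius[name]} -> {status}")
```

With these values the two methods agree on every protein except BCKDHA, where the predicted ends differ by two residues.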

TargetP

Method

  • TargetP was developed by Emanuelsson et al. in 2000 <ref>Emanuelsson et al., "Predicting subcellular localization of proteins based on their N-terminal amino acid sequence", J. Mol. Biol., 300: 1005-1016, 2000</ref>
  • TargetP predicts the subcellular location of eukaryotic proteins. It additionally predicts cleavage sites.
  • This method is neural network based. The prediction is based on the N-terminal presequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP)
  • Required input information: Sequence(s) in FASTA format, organism group

Results

The TargetP prediction results can be seen in the following table:
BCKDHA TargetP.PNG

ODBA_HUMAN (BCKDHA) is predicted to be located in the mitochondrion, which agrees with Uniprot.


back to Maple syrup urine disease main page

4. Prediction of GO terms

GOPET

Method

  • GOPET (Gene Ontology Term Prediction and Evaluation Tool) was described by Vinayagam et al.<ref> Arunachalam Vinayagam, Coral Del Val, Falk Schubert, Roland Eils, Karl-Heinz Glatting, Sándor Suhai, Rainer König, "GOPET: A tool for automated predictions of Gene Ontology terms", BMC Bioinformatics (2006), Volume: 7, Issue: 161, Publisher: BioMed Central, Pages: 161</ref>
  • GOPET is a fully automated tool for assigning molecular function terms to a given sequence.
  • Required input information: cDNA or protein sequence
  • Gene Ontology is used for annotation terms, GO-mapped protein databases for performing homology searches and Support Vector Machines for the prediction and the assignment of confidence values.
  • The prediction is organism independent.

Results

BCKDHA

GOid Aspect Confidence GOTerm
GO:0003824 F 97% catalytic activity
GO:0016491 F 96% oxidoreductase activity
GO:0016624 F 95% oxidoreductase activity, acting on the aldehyde or oxo group of donors, disulfide as acceptor
GO:0003863 F 90% 3-methyl-2-oxobutanoate dehydrogenase (2-methylpropanoyl-transferring) activity
GO:0004739 F 89% pyruvate dehydrogenase (acetyl-transferring) activity
GO:0004738 F 78% pyruvate dehydrogenase activity
GO:0003826 F 77% alpha-ketoacid dehydrogenase activity
GO:0047101 F 75% 2-oxoisovalerate dehydrogenase (acylating) activity
GO:0008677 F 65% 2-dehydropantoate 2-reductase activity
GO:0019152 F 63% acetoin dehydrogenase activity
GO:0030955 F 63% potassium ion binding
GO:0016616 F 62% oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor
GO:0046872 F 62% metal ion binding


A4_HUMAN

GOid Aspect Confidence GOTerm
GO:0004866 F 87% endopeptidase inhibitor activity
GO:0004867 F 86% serine-type endopeptidase inhibitor activity
GO:0030568 F 83% plasmin inhibitor activity
GO:0030304 F 83% trypsin inhibitor activity
GO:0030414 F 82% peptidase inhibitor activity
GO:0005488 F 79% binding
GO:0005515 F 74% protein binding
GO:0046872 F 73% metal ion binding
GO:0003677 F 71% DNA binding
GO:0008201 F 70% heparin binding
GO:0008270 F 69% zinc ion binding
GO:0005507 F 69% copper ion binding
GO:0005506 F 67% iron ion binding


BACR_HALSA

GOid Aspect Confidence GOterm
GO:0005216 F 77% ion channel activity
GO:0008020 F 75% G-protein coupled photoreceptor activity
GO:0015078 F 60% hydrogen ion transmembrane transporter activity


INSL5_HUMAN

GOid Aspect Confidence GOterm
GO:0005179 F 80% hormone activity


LAMP1_HUMAN

GOid Aspect Confidence GOterm
GO:0004812 F 60% aminoacyl-tRNA ligase activity
GO:0005524 F 60% ATP binding


RET4_HUMAN

GOid Aspect Confidence GOterm
GO:0005488 F 90% binding
GO:0005501 F 81% retinoid binding
GO:0008289 F 80% lipid binding
GO:0019841 F 78% retinol binding
GO:0005215 F 78% transporter activity
GO:0016918 F 78% retinal binding
GO:0005319 F 69% lipid transporter activity
GO:0008035 F 60% high-density lipoprotein particle binding

Pfam

Method

  • The Pfam protein families database is described by Finn et al. (2008) <ref>Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, Bateman A (2008). "The Pfam protein families database." Nucleic Acids Res 36 (Database issue): D281–8</ref>

Results

Query | Cellular Component | Molecular function | Biological Process
BCKDHA | - | GO:0016624 (oxidoreductase activity, acting on the aldehyde or oxo group of donors, disulfide as acceptor) | GO:0008152 (metabolic process)
A4_HUMAN | GO:0016021 (integral to membrane) | GO:0005488 (binding) | -
BACR_HALSA | GO:0016020 (membrane) | GO:0005216 (ion channel activity) | GO:0006811 (ion transport)
INSL5_HUMAN | GO:0005576 (extracellular region) | GO:0005179 (hormone activity) | -
LAMP1_HUMAN | GO:0016020 (membrane) | - | -
RET4_HUMAN | - | GO:0005488 (binding) | -
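
To see how far the two annotation sources agree, the molecular function terms from the Pfam table can be looked up in the GOPET predictions above. The following is a minimal sketch with the GO identifiers copied by hand from both tables. Only exact identifier matches are counted, so parent-child relations in the ontology are ignored, which is a deliberate simplification.

```python
# GOPET molecular function predictions per query (identifiers from the GOPET tables).
gopet = {
    "BCKDHA": {"GO:0003824", "GO:0016491", "GO:0016624", "GO:0003863", "GO:0004739",
               "GO:0004738", "GO:0003826", "GO:0047101", "GO:0008677", "GO:0019152",
               "GO:0030955", "GO:0016616", "GO:0046872"},
    "A4_HUMAN": {"GO:0004866", "GO:0004867", "GO:0030568", "GO:0030304", "GO:0030414",
                 "GO:0005488", "GO:0005515", "GO:0046872", "GO:0003677", "GO:0008201",
                 "GO:0008270", "GO:0005507", "GO:0005506"},
    "BACR_HALSA": {"GO:0005216", "GO:0008020", "GO:0015078"},
    "INSL5_HUMAN": {"GO:0005179"},
    "LAMP1_HUMAN": {"GO:0004812", "GO:0005524"},
    "RET4_HUMAN": {"GO:0005488", "GO:0005501", "GO:0008289", "GO:0019841",
                   "GO:0005215", "GO:0016918", "GO:0005319", "GO:0008035"},
}

# Pfam molecular function term per query (None where the Pfam table lists none).
pfam_mf = {"BCKDHA": "GO:0016624", "A4_HUMAN": "GO:0005488", "BACR_HALSA": "GO:0005216",
           "INSL5_HUMAN": "GO:0005179", "LAMP1_HUMAN": None, "RET4_HUMAN": "GO:0005488"}

# A Pfam term counts as confirmed if GOPET predicted the identical GO identifier.
confirmed = {query: term for query, term in pfam_mf.items() if term in gopet[query]}

for query in pfam_mf:
    print(query, "->", confirmed.get(query, "no shared molecular function term"))
```

For LAMP1_HUMAN there is nothing to compare, because the Pfam table lists only a cellular component term for this protein.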

ProtFun 2.2

Method

  • ProtFun is described by Jensen et al.<ref>L. Juhl Jensen, R. Gupta, N. Blom, D. Devos, J. Tamames, C. Kesmir, H. Nielsen, H. H. Stærfeldt, K. Rapacki, C. Workman, C. A. F. Andersen, S. Knudsen, A. Krogh, A. Valencia and S. Brunak, "Prediction of human protein function from post-translational modifications and localization features", J. Mol. Biol., 319:1257-1265, 2002</ref>

  • ProtFun is an ab initio prediction server of protein function from sequence. Various servers are queried and the information they provide is integrated into the final prediction.

Results

BCKDHA BCKDHA ProtFun BCKDHA.png

A4_HUMAN BCKDHA ProtFun A4 Human.png

BACR_HALSA BCKDHA ProtFun BACR HALSA.png

INSL5_HUMAN BCKDHA ProtFun INSL5 Human.png


LAMP1_HUMAN BCKDHA ProtFun LAMP1 Human.png


RET4_HUMAN BCKDHA ProtFun RET4 Human.png

References

<references />

