Sequence-based predictions GLA
by Benjamin Drexler and Fabian Grandke
Secondary structure prediction
PSIPRED
http://bioinf.cs.ucl.ac.uk/psipred/
Jpred3
http://www.compbio.dundee.ac.uk/www-jpred/index.html
EBI | Chain | Description | E-value |
---|---|---|---|
3hg5 | B | Alpha-galactosidase A | 0.0 |
3hg5 | A | Alpha-galactosidase A | 0.0 |
3hg4 | B | Alpha-galactosidase A | 0.0 |
3hg4 | A | Alpha-galactosidase A | 0.0 |
3hg2 | B | Alpha-galactosidase A | 0.0 |
3hg2 | A | Alpha-galactosidase A | 0.0 |
3gxt | B | Alpha-galactosidase A | 0.0 |
3gxt | A | Alpha-galactosidase A | 0.0 |
3gxp | B | Alpha-galactosidase A | 0.0 |
3gxp | A | Alpha-galactosidase A | 0.0 |
3gxn | B | Alpha-galactosidase A | 0.0 |
3gxn | A | Alpha-galactosidase A | 0.0 |
1r47 | B | Alpha-galactosidase A | 0.0 |
1r47 | A | Alpha-galactosidase A | 0.0 |
1r46 | B | Alpha-galactosidase A | 0.0 |
1r46 | A | Alpha-galactosidase A | 0.0 |
3hg3 | B | Alpha-galactosidase A | 0.0 |
3hg3 | A | Alpha-galactosidase A | 0.0 |
3lxc | B | Alpha-galactosidase A | 0.0 |
3lxc | A | Alpha-galactosidase A | 0.0 |
3lxb | B | Alpha-galactosidase A | 0.0 |
3lxb | A | Alpha-galactosidase A | 0.0 |
3lxa | B | Alpha-galactosidase A | 0.0 |
3lxa | A | Alpha-galactosidase A | 0.0 |
3lx9 | B | Alpha-galactosidase A | 0.0 |
3lx9 | A | Alpha-galactosidase A | 0.0 |
1ktc | A | alpha-N-acetylgalactosaminidase | e-113 |
1ktb | A | alpha-N-acetylgalactosaminidase | e-113 |
3igu | B | Alpha-N-acetylgalactosaminidase | e-100 |
3igu | A | Alpha-N-acetylgalactosaminidase | e-100 |
3h55 | B | Alpha-N-acetylgalactosaminidase | e-100 |
3h55 | A | Alpha-N-acetylgalactosaminidase | e-100 |
3h54 | B | Alpha-N-acetylgalactosaminidase | e-100 |
3h54 | A | Alpha-N-acetylgalactosaminidase | e-100 |
3h53 | B | Alpha-N-acetylgalactosaminidase | e-100 |
3h53 | A | Alpha-N-acetylgalactosaminidase | e-100 |
The protein highlighted in light blue is the one that was used as the query sequence.
Comparison with DSSP
http://swift.cmbi.ru.nl/servers/html/
A PDF version of this image is available here: File:GLA DSSP Comp.pdf
Prediction of disordered regions
DISOPRED
http://bioinf.cs.ucl.ac.uk/disopred/
POODLE
http://mbs.cbrc.jp/poodle/poodle.html
POODLE-S: Missing residues
POODLE-S: High B-Factor residues
IUPRED
http://iupred.enzim.hu/index.html
Short Disorder
Long Disorder
META-Disorder
http://www.predictprotein.org/
Hint: Registration is required. It is free of charge, but you can submit at most 3 sequences within the following 12 months.
https://www.rostlab.org/owiki/index.php/Metadisorder
PROFbval
https://rostlab.org/owiki/index.php/Profbval
NORSnet
https://www.rostlab.org/owiki/index.php/Norsnet
Ucon
https://www.rostlab.org/owiki/index.php/UCON
Prediction of transmembrane alpha-helices and signal peptides
Additional Proteins
TMHMM
GLA
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
Phobius and PolyPhobius
GLA
Phobius
SIGNAL 1 31
REGION 1 9 N-REGION.
REGION 10 22 H-REGION.
REGION 23 31 C-REGION.
TOPO_DOM 32 429 NON CYTOPLASMIC.
PolyPhobius
SIGNAL 1 31
REGION 1 12 N-REGION.
REGION 13 26 H-REGION.
REGION 27 31 C-REGION.
TOPO_DOM 32 429 NON CYTOPLASMIC.
BACR_HALSA
Phobius
TOPO_DOM 1 22 NON CYTOPLASMIC.
TRANSMEM 23 42
TOPO_DOM 43 53 CYTOPLASMIC.
TRANSMEM 54 76
TOPO_DOM 77 95 NON CYTOPLASMIC.
TRANSMEM 96 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 142
TOPO_DOM 143 147 NON CYTOPLASMIC.
TRANSMEM 148 169
TOPO_DOM 170 189 CYTOPLASMIC.
TRANSMEM 190 212
TOPO_DOM 213 217 NON CYTOPLASMIC.
TRANSMEM 218 237
TOPO_DOM 238 262 CYTOPLASMIC.
PolyPhobius
TOPO_DOM 1 21 NON CYTOPLASMIC.
TRANSMEM 22 43
TOPO_DOM 44 54 CYTOPLASMIC.
TRANSMEM 55 77
TOPO_DOM 78 94 NON CYTOPLASMIC.
TRANSMEM 95 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 141
TOPO_DOM 142 147 NON CYTOPLASMIC.
TRANSMEM 148 166
TOPO_DOM 167 186 CYTOPLASMIC.
TRANSMEM 187 205
TOPO_DOM 206 215 NON CYTOPLASMIC.
TRANSMEM 216 237
TOPO_DOM 238 262 CYTOPLASMIC.
RET4_HUMAN
Phobius
SIGNAL 1 18
REGION 1 2 N-REGION.
REGION 3 13 H-REGION.
REGION 14 18 C-REGION.
TOPO_DOM 19 201 NON CYTOPLASMIC.
PolyPhobius
SIGNAL 1 18
REGION 1 3 N-REGION.
REGION 4 13 H-REGION.
REGION 14 18 C-REGION.
TOPO_DOM 19 201 NON CYTOPLASMIC.
INSL5_HUMAN
Phobius
SIGNAL 1 22
REGION 1 5 N-REGION.
REGION 6 17 H-REGION.
REGION 18 22 C-REGION.
TOPO_DOM 23 135 NON CYTOPLASMIC.
PolyPhobius
SIGNAL 1 22
REGION 1 4 N-REGION.
REGION 5 16 H-REGION.
REGION 17 22 C-REGION.
TOPO_DOM 23 135 NON CYTOPLASMIC.
LAMP1_HUMAN
Phobius
SIGNAL 1 28
REGION 1 10 N-REGION.
REGION 11 22 H-REGION.
REGION 23 28 C-REGION.
TOPO_DOM 29 381 NON CYTOPLASMIC.
TRANSMEM 382 405
TOPO_DOM 406 417 CYTOPLASMIC.
PolyPhobius
SIGNAL 1 28
REGION 1 9 N-REGION.
REGION 10 22 H-REGION.
REGION 23 28 C-REGION.
TOPO_DOM 29 381 NON CYTOPLASMIC.
TRANSMEM 382 405
TOPO_DOM 406 417 CYTOPLASMIC.
A4_HUMAN
Phobius
SIGNAL 1 17
REGION 1 1 N-REGION.
REGION 2 12 H-REGION.
REGION 13 17 C-REGION.
TOPO_DOM 18 700 NON CYTOPLASMIC.
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC.
PolyPhobius
SIGNAL 1 17
REGION 1 3 N-REGION.
REGION 4 12 H-REGION.
REGION 13 17 C-REGION.
TOPO_DOM 18 700 NON CYTOPLASMIC.
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC.
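The Phobius and PolyPhobius listings above use one feature per line: a feature key, the start and end residue, and an optional note. The following Python sketch, which assumes exactly this layout, shows how such a listing could be turned into records and summarized. The names `Feature`, `parse_features` and `summarize` are hypothetical helpers and not part of Phobius; the sample text is the Phobius result for LAMP1_HUMAN copied from this page.

```python
from typing import Dict, List, NamedTuple


class Feature(NamedTuple):
    kind: str   # feature key, e.g. SIGNAL, REGION, TOPO_DOM, TRANSMEM
    start: int  # first residue (1-based, as printed)
    end: int    # last residue (inclusive)
    note: str   # optional description, e.g. "NON CYTOPLASMIC"


def parse_features(text: str) -> List[Feature]:
    """Parse lines of the form 'KEY START END [NOTE.]' into Feature records."""
    features = []
    for line in text.strip().splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        note = parts[3].strip().rstrip(".") if len(parts) > 3 else ""
        features.append(Feature(parts[0], int(parts[1]), int(parts[2]), note))
    return features


def summarize(features: List[Feature]) -> Dict[str, object]:
    """Report whether a signal peptide was predicted and list the TM helices."""
    return {
        "signal_peptide": any(f.kind == "SIGNAL" for f in features),
        "tm_helices": [(f.start, f.end) for f in features if f.kind == "TRANSMEM"],
    }


# Phobius prediction for LAMP1_HUMAN, copied from the listing above.
LAMP1_PHOBIUS = """
SIGNAL 1 28
REGION 1 10 N-REGION.
REGION 11 22 H-REGION.
REGION 23 28 C-REGION.
TOPO_DOM 29 381 NON CYTOPLASMIC.
TRANSMEM 382 405
TOPO_DOM 406 417 CYTOPLASMIC.
"""

if __name__ == "__main__":
    print(summarize(parse_features(LAMP1_PHOBIUS)))
```

Running the same two functions on the Phobius and the PolyPhobius listing of one protein makes it easy to compare helix counts and boundaries side by side.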
OCTOPUS and SPOCTOPUS
http://octopus.cbr.su.se/index.php
GLA
Octopus
Spoctopus
BACR_HALSA
Octopus
Spoctopus
RET4_HUMAN
Octopus
Spoctopus
INSL5_HUMAN
Octopus
Spoctopus
LAMP1_HUMAN
Octopus
Spoctopus
A4_HUMAN
Octopus
Spoctopus
SignalP
GLA
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
TargetP
http://www.cbs.dtu.dk/services/TargetP/
Name | Length | mTP | SP | other | Loc | RC |
---|---|---|---|---|---|---|
GLA | 429 | 0.041 | 0.860 | 0.141 | S | 2 |
BACR_HALSA | 262 | 0.019 | 0.897 | 0.562 | S | 4 |
RET4_HUMAN | 201 | 0.242 | 0.928 | 0.020 | S | 2 |
INSL5_HUMA | 135 | 0.074 | 0.899 | 0.037 | S | 1 |
LAMP1_HUMA | 417 | 0.043 | 0.953 | 0.017 | S | 1 |
A4_HUMAN | 770 | 0.035 | 0.937 | 0.084 | S | 1 |
http://www.cbs.dtu.dk/services/TargetP-1.1/output.php
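The Loc and RC columns can be derived from the three scores. As we understand the TargetP 1.1 output description, Loc is the class with the highest score (M: mitochondrion, S: secretory pathway, _: other) and the reliability class RC is based on the difference between the highest and the second-highest score (1: above 0.8, 2: above 0.6, 3: above 0.4, 4: above 0.2, 5: otherwise). The Python sketch below applies this rule to the rows of the table; the function name `reliability_class` is a hypothetical helper, and the sketch assumes the non-plant setting without user-defined cutoffs.

```python
from typing import Tuple


def reliability_class(mtp: float, sp: float, other: float) -> Tuple[str, int]:
    """Return (Loc, RC) for one non-plant TargetP score triple.

    Assumption: no cutoffs are applied, so the winning class is simply
    the one with the highest score.
    """
    scores = {"M": mtp, "S": sp, "_": other}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (loc, best), (_, second) = ranked[0], ranked[1]
    diff = best - second
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return loc, rc


# Rows copied from the table above: name -> (mTP, SP, other, Loc, RC).
TARGETP_ROWS = {
    "GLA": (0.041, 0.860, 0.141, "S", 2),
    "BACR_HALSA": (0.019, 0.897, 0.562, "S", 4),
    "RET4_HUMAN": (0.242, 0.928, 0.020, "S", 2),
    "INSL5_HUMA": (0.074, 0.899, 0.037, "S", 1),
    "LAMP1_HUMA": (0.043, 0.953, 0.017, "S", 1),
    "A4_HUMAN": (0.035, 0.937, 0.084, "S", 1),
}

if __name__ == "__main__":
    for name, (mtp, sp, other, loc, rc) in TARGETP_ROWS.items():
        print(name, reliability_class(mtp, sp, other), "table:", (loc, rc))
```

For BACR_HALSA, for example, the difference between SP (0.897) and other (0.562) is 0.335, which falls into reliability class 4 and marks this as the least certain prediction in the table.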
Prediction of GO terms
Programs
GOPET
http://genius.embnet.dkfz-heidelberg.de/menu/biounit/open-husar
We used the default settings (GO aspect: molecular function; maximum number of predictions: 20; confidence threshold: 60; GOPET model: June 2007, version 2.0; GOPET database: 2007). The results contain only GO IDs of the GO aspect "molecular function", since the other two GO aspects (cellular component and biological process) were not available.
Pfam
Pfam is a database of protein domain families that is built with profile hidden Markov models (HMMs). Each protein domain family is represented by a multiple sequence alignment and an HMM. A protein sequence can be searched against Pfam to obtain all domains that the query sequence might contain.
The Pfam database consists of two parts, Pfam-A and Pfam-B, which hold protein domain families of different quality levels. In release 1.0 of Pfam, the protein entries in Pfam-A and Pfam-B came from Swissprot (a few initial members of the seed alignments in Pfam-A came from several sources: Swissprot, Prosite, ProDom etc.). In the current release of Pfam, the entries in Pfam-A and Pfam-B come from Pfamseq (UniProtKB) and ADDA, respectively.
Pfam-A contains the well-characterized, annotated entries. Each family starts from a seed alignment that is built from a few selected representative sequences under manual quality control. An HMM is then applied automatically to build the full alignment and to detect all possible members of the family. The families and domains in Pfam-A are of high quality and can serve as reliable annotation or classification evidence for the query sequence.
Pfam-B is created from sequence alignments of the ADDA entries using HMMs. Entries that already exist in Pfam-A are excluded. The families in Pfam-B have no confirmed annotation and undergo no manual quality checking, so they may contain errors (e.g. the members of a family could be aligned merely by chance) and the overall quality is relatively low. Pfam-B can still be useful when no domain evidence for the query sequence is found in Pfam-A.
We used the "sequence search" feature of Pfam to determine potential domains or domain families of each protein. Afterwards we checked the corresponding page of each domain (family) for a GO annotation. The search was performed with the default settings (cut-off: E-value, threshold 1.0), but we also included Pfam-B. Only one Pfam-B hit was found, and it has no GO annotation, so including Pfam-B yielded no gain. Whether a hit counts as significant was decided by the Pfam search algorithm. The results are listed in the tables below.
ProtFun
http://www.cbs.dtu.dk/services/ProtFun/
The results of the Gene Ontology category assignment by ProtFun are listed below. 'Prob' is the probability calculated by ProtFun that the query belongs to the category; it depends on the prior probability of the category. 'Odds' gives the odds that the query belongs to the category and is not influenced by the prior probability.<ref name=ProtFun>Explanation of the ProtFun 2.2 output.</ref> The class with the highest information content and the class with the highest probability are marked in bold. Additionally, for each query we provide a table that lists the categories with the highest information content or the highest probability, together with their associated GO IDs. We looked up these GO IDs with the search feature of the Gene Ontology website.
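To make the difference between 'Prob' and 'Odds' concrete, the following Python sketch parses one ProtFun listing and reports the category with the highest probability and the one with the highest odds. Two points are assumptions drawn only from the numbers on this page, not from the ProtFun documentation: the `=>` marker coincides with the highest odds in the listings that carry it, and the ratio Prob/Odds is nearly constant per category, which would be consistent with the odds being the probability divided by a category prior. The function names are hypothetical; the sample is the INSL5_HUMAN listing from this page.

```python
from typing import Dict, Tuple

Row = Tuple[float, float, bool]  # (prob, odds, flagged with "=>")


def parse_protfun_go(text: str) -> Dict[str, Row]:
    """Parse 'Category [=>] Prob Odds' lines; the header line is skipped."""
    rows = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            prob, odds = float(parts[-2]), float(parts[-1])
        except ValueError:
            continue  # header or other non-numeric line
        rows[parts[0]] = (prob, odds, "=>" in parts)
    return rows


def top_by_prob(rows: Dict[str, Row]) -> str:
    return max(rows, key=lambda name: rows[name][0])


def top_by_odds(rows: Dict[str, Row]) -> str:
    return max(rows, key=lambda name: rows[name][1])


def implied_prior(rows: Dict[str, Row], name: str) -> float:
    """Prob / Odds; an inferred quantity, see the assumptions above."""
    prob, odds, _ = rows[name]
    return prob / odds


# ProtFun listing for INSL5_HUMAN, copied from this page.
INSL5_PROTFUN = """
Gene Ontology category Prob Odds
Signal_transducer 0.374 1.746
Receptor 0.128 0.750
Hormone => 0.247 37.936
Structural_protein 0.001 0.041
Transporter 0.025 0.228
Ion_channel 0.010 0.168
Voltage-gated_ion_channel 0.003 0.131
Cation_channel 0.010 0.215
Transcription 0.054 0.425
Transcription_regulation 0.091 0.724
Stress_response 0.099 1.128
Immune_response 0.178 2.090
Growth_factor 0.061 4.379
Metal_ion_transport 0.009 0.020
"""

if __name__ == "__main__":
    rows = parse_protfun_go(INSL5_PROTFUN)
    print("highest probability:", top_by_prob(rows))
    print("highest odds:", top_by_odds(rows))
    print("implied prior (Hormone):", round(implied_prior(rows, "Hormone"), 4))
```

In this listing, Signal_transducer has the highest probability, while Hormone has by far the highest odds because hormones are a rare category. This is why the two criteria can point to different GO categories for the same query.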
Proteins
GLA
GOPET
GOid | Confidence | GO term |
---|---|---|
GO:0016798 | 98% | hydrolase activity acting on glycosyl bonds |
GO:0004553 | 98% | hydrolase activity hydrolyzing O-glycosyl compounds |
GO:0016787 | 97% | hydrolase activity |
GO:0004557 | 96% | alpha-galactosidase activity |
GO:0008456 | 89% | alpha-N-acetylgalactosaminidase activity |
Pfam
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Melibiase | Family | x | Molecular function | hydrolase activity, hydrolyzing O-glycosyl compounds | GO:0004553 |
Pfam-A | Melibiase | Family | x | Biological process | carbohydrate metabolic process | GO:0005975 |
ProtFun
Gene Ontology category Prob Odds
Signal_transducer 0.090 0.419
Receptor 0.014 0.083
Hormone 0.002 0.318
Structural_protein 0.004 0.127
Transporter 0.024 0.222
Ion_channel 0.010 0.169
Voltage-gated_ion_channel 0.003 0.127
Cation_channel 0.010 0.215
Transcription 0.047 0.367
Transcription_regulation 0.026 0.204
Stress_response 0.049 0.552
Immune_response 0.012 0.136
Growth_factor 0.006 0.412
Metal_ion_transport 0.009 0.020
Type | GO category | GO aspect | GO id |
---|---|---|---|
Highest probability | Signal transducer | Molecular function | GO:0004871 |
BACR_HALSA
GOPET
GOid | Confidence | GO term |
---|---|---|
GO:0005216 | 77% | ion channel activity |
GO:0008020 | 75% | G-protein coupled photoreceptor activity |
GO:0015078 | 60% | hydrogen ion transmembrane transporter activity |
Pfam
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Bacteriorhodopsin-like protein | Domain | x | Cellular component | membrane | GO:0016020 |
Pfam-A | Bacteriorhodopsin-like protein | Domain | x | Molecular function | ion channel activity | GO:0005216 |
Pfam-A | Bacteriorhodopsin-like protein | Domain | x | Biological process | ion transport | GO:0006811 |
Pfam-A | Domain of unknown function DUF21 | Family | - | - | - | - |
ProtFun
Gene Ontology category Prob Odds
Signal_transducer 0.258 1.205
Receptor 0.355 2.087
Hormone 0.001 0.206
Structural_protein 0.006 0.200
Transporter => 0.440 4.036
Ion_channel 0.010 0.169
Voltage-gated_ion_channel 0.004 0.172
Cation_channel 0.078 1.689
Transcription 0.026 0.205
Transcription_regulation 0.028 0.226
Stress_response 0.012 0.139
Immune_response 0.011 0.128
Growth_factor 0.010 0.727
Metal_ion_transport 0.049 0.106
Type | GO category | GO aspect | GO id |
---|---|---|---|
Highest information content / highest probability | Transporter | Molecular function | GO:0005215 |
RET4_HUMAN
GOPET
GOid | Confidence | GO term |
---|---|---|
GO:0005488 | 90% | binding |
GO:0005501 | 81% | retinoid binding |
GO:0008289 | 80% | lipid binding |
GO:0019841 | 78% | retinol binding |
GO:0005215 | 78% | transporter activity |
GO:0016918 | 78% | retinal binding |
GO:0005319 | 69% | lipid transporter activity |
GO:0008035 | 60% | high-density lipoprotein particle binding |
Pfam
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Lipocalin / cytosolic fatty-acid binding protein family | Domain | x | Molecular function | binding | GO:0005488 |
Pfam-A | DspF/AvrF protein | Family | - | - | - | - |
Pfam-B | PB008544 | - | - | - | - | - |
ProtFun
Gene Ontology category Prob Odds
Signal_transducer 0.202 0.942
Receptor 0.147 0.862
Hormone 0.004 0.667
Structural_protein 0.002 0.058
Transporter 0.025 0.232
Ion_channel 0.016 0.288
Voltage-gated_ion_channel 0.003 0.148
Cation_channel 0.010 0.215
Transcription 0.027 0.207
Transcription_regulation 0.025 0.196
Stress_response 0.161 1.829
Immune_response => 0.239 2.813
Growth_factor 0.023 1.617
Metal_ion_transport 0.009 0.020
Type | GO category | GO aspect | GO id |
---|---|---|---|
Highest information content / highest probability | Immune response | Biological process | GO:0006955 |
INSL5_HUMAN
GOPET
GOid | Confidence | GO term |
---|---|---|
GO:0005179 | 80% | hormone activity |
Pfam
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Insulin/IGF/Relaxin family | Domain | x | Cellular component | extracellular region | GO:0005576 |
Pfam-A | Insulin/IGF/Relaxin family | Domain | x | Molecular function | hormone activity | GO:0005179 |
ProtFun
Gene Ontology category Prob Odds
Signal_transducer 0.374 1.746
Receptor 0.128 0.750
Hormone => 0.247 37.936
Structural_protein 0.001 0.041
Transporter 0.025 0.228
Ion_channel 0.010 0.168
Voltage-gated_ion_channel 0.003 0.131
Cation_channel 0.010 0.215
Transcription 0.054 0.425
Transcription_regulation 0.091 0.724
Stress_response 0.099 1.128
Immune_response 0.178 2.090
Growth_factor 0.061 4.379
Metal_ion_transport 0.009 0.020
Type | GO category | GO aspect | GO id |
---|---|---|---|
Highest information content | Hormone | Molecular function | GO:0005179 |
Highest probability | Signal transducer | Molecular function | GO:0004871 |
LAMP1_HUMAN
GOPET
GOid | Confidence | GO term |
---|---|---|
GO:0004812 | 60% | aminoacyl-tRNA ligase activity |
GO:0005524 | 60% | ATP binding |
Pfam
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Lysosome-associated membrane glycoprotein | Family | x | Cellular component | membrane | GO:0016020 |
Pfam-A | Protein of unknown function DUF1180 | Family | - | - | - | - |
ProtFun
Gene Ontology category Prob Odds
Signal_transducer 0.396 1.849
Receptor 0.282 1.659
Hormone 0.001 0.206
Structural_protein 0.011 0.408
Transporter 0.024 0.222
Ion_channel 0.008 0.147
Voltage-gated_ion_channel 0.002 0.111
Cation_channel 0.010 0.215
Transcription 0.032 0.247
Transcription_regulation 0.018 0.142
Stress_response 0.246 2.795
Immune_response => 0.371 4.368
Growth_factor 0.013 0.956
Metal_ion_transport 0.009 0.020
Type | GO category | GO aspect | GO id |
---|---|---|---|
Highest information content | Immune response | Biological process | GO:0006955 |
Highest probability | Signal transducer | Molecular function | GO:0004871 |
A4_HUMAN
GOPET
GOid | Confidence | GO term |
---|---|---|
GO:0004866 | 87% | endopeptidase inhibitor activity |
GO:0004867 | 86% | serine-type endopeptidase inhibitor activity |
GO:0030568 | 83% | plasmin inhibitor activity |
GO:0030304 | 83% | trypsin inhibitor activity |
GO:0030414 | 82% | peptidase inhibitor activity |
GO:0005488 | 79% | binding |
GO:0005515 | 74% | protein binding |
GO:0046872 | 73% | metal ion binding |
GO:0003677 | 71% | DNA binding |
GO:0008201 | 70% | heparin binding |
GO:0008270 | 69% | zinc ion binding |
GO:0005507 | 69% | copper ion binding |
GO:0005506 | 67% | iron ion binding |
Pfam
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Amyloid A4 N-terminal heparin-binding | Domain | x | Cellular component | integral to membrane | GO:0016021 |
Pfam-A | Amyloid A4 N-terminal heparin-binding | Domain | x | Molecular function | binding | GO:0005488 |
Pfam-A | Copper-binding of amyloid precursor, CuBD | Domain | x | - | - | - |
Pfam-A | Kunitz/Bovine pancreatic trypsin inhibitor domain | Domain | x | Molecular function | serine-type endopeptidase inhibitor activity | GO:0004867 |
Pfam-A | E2 domain of amyloid precursor protein | Domain | x | - | - | - |
Pfam-A | Beta-amyloid peptide | Family | x | Cellular component | integral to membrane | GO:0016021 |
Pfam-A | Beta-amyloid peptide | Family | x | Molecular function | binding | GO:0005488 |
Pfam-A | beta-amyloid precursor protein C-terminus | Family | x | - | - | - |
Pfam-A | Exonuclease VII, large subunit | Family | - | - | - | - |
Pfam-A | Transcriptional activator TraM | Family | - | - | - | - |
ProtFun
Gene Ontology category Prob Odds
Signal_transducer 0.126 0.586
Receptor 0.036 0.211
Hormone 0.001 0.206
Structural_protein => 0.034 1.205
Transporter 0.024 0.222
Ion_channel 0.009 0.162
Voltage-gated_ion_channel 0.002 0.108
Cation_channel 0.010 0.215
Transcription 0.043 0.335
Transcription_regulation 0.018 0.143
Stress_response 0.076 0.862
Immune_response 0.016 0.183
Growth_factor 0.005 0.372
Metal_ion_transport 0.009 0.020
Type | GO category | GO aspect | GO id |
---|---|---|---|
Highest information content | Structural protein | Molecular function | GO:0005198 |
Highest probability | Signal transducer | Molecular function | GO:0004871 |
References
<references />