Difference between revisions of "Sequence-based predictions GLA"
Revision as of 17:14, 28 May 2011
by Benjamin Drexler and Fabian Grandke
Secondary structure prediction
PSIPRED
http://bioinf.cs.ucl.ac.uk/psipred/
Jpred3
http://www.compbio.dundee.ac.uk/www-jpred/index.html
EBI | Chain | Description | E-value |
---|---|---|---|
3hg5 | B | Alpha-galactosidase A | 0.0 |
3hg5 | A | Alpha-galactosidase A | 0.0 |
3hg4 | B | Alpha-galactosidase A | 0.0 |
3hg4 | A | Alpha-galactosidase A | 0.0 |
3hg2 | B | Alpha-galactosidase A | 0.0 |
3hg2 | A | Alpha-galactosidase A | 0.0 |
3gxt | B | Alpha-galactosidase A | 0.0 |
3gxt | A | Alpha-galactosidase A | 0.0 |
3gxp | B | Alpha-galactosidase A | 0.0 |
3gxp | A | Alpha-galactosidase A | 0.0 |
3gxn | B | Alpha-galactosidase A | 0.0 |
3gxn | A | Alpha-galactosidase A | 0.0 |
1r47 | B | Alpha-galactosidase A | 0.0 |
1r47 | A | Alpha-galactosidase A | 0.0 |
1r46 | B | Alpha-galactosidase A | 0.0 |
1r46 | A | Alpha-galactosidase A | 0.0 |
3hg3 | B | Alpha-galactosidase A | 0.0 |
3hg3 | A | Alpha-galactosidase A | 0.0 |
3lxc | B | Alpha-galactosidase A | 0.0 |
3lxc | A | Alpha-galactosidase A | 0.0 |
3lxb | B | Alpha-galactosidase A | 0.0 |
3lxb | A | Alpha-galactosidase A | 0.0 |
3lxa | B | Alpha-galactosidase A | 0.0 |
3lxa | A | Alpha-galactosidase A | 0.0 |
3lx9 | B | Alpha-galactosidase A | 0.0 |
3lx9 | A | Alpha-galactosidase A | 0.0 |
1ktc | A | alpha-N-acetylgalactosaminidase | e-113 |
1ktb | A | alpha-N-acetylgalactosaminidase | e-113 |
3igu | B | Alpha-N-acetylgalactosaminidase | e-100 |
3igu | A | Alpha-N-acetylgalactosaminidase | e-100 |
3h55 | B | Alpha-N-acetylgalactosaminidase | e-100 |
3h55 | A | Alpha-N-acetylgalactosaminidase | e-100 |
3h54 | B | Alpha-N-acetylgalactosaminidase | e-100 |
3h54 | A | Alpha-N-acetylgalactosaminidase | e-100 |
3h53 | B | Alpha-N-acetylgalactosaminidase | e-100 |
3h53 | A | Alpha-N-acetylgalactosaminidase | e-100 |
The protein highlighted in light blue is the one used as the query sequence.
Comparison with DSSP
http://swift.cmbi.ru.nl/servers/html/
A PDF version of this image is available here: File:GLA DSSP Comp.pdf
Prediction of disordered regions
DISOPRED
http://bioinf.cs.ucl.ac.uk/disopred/
POODLE
http://mbs.cbrc.jp/poodle/poodle.html
POODLE-S: Missing residues
POODLE-S: High B-Factor residues
IUPRED
http://iupred.enzim.hu/index.html
Short Disorder
Long Disorder
META-Disorder
http://www.predictprotein.org/
Hint: Registration is required. It is free of charge, but you can submit at most 3 sequences within 12 months.
https://www.rostlab.org/owiki/index.php/Metadisorder
PROFbval
https://rostlab.org/owiki/index.php/Profbval
NORSnet
https://www.rostlab.org/owiki/index.php/Norsnet
Ucon
https://www.rostlab.org/owiki/index.php/UCON
Prediction of transmembrane alpha-helices and signal peptides
Additional Proteins
TMHMM
GLA
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
Phobius and PolyPhobius
GLA
Phobius
SIGNAL 1 31
REGION 1 9 N-REGION.
REGION 10 22 H-REGION.
REGION 23 31 C-REGION.
TOPO_DOM 32 429 NON CYTOPLASMIC.
PolyPhobius
SIGNAL 1 31
REGION 1 12 N-REGION.
REGION 13 26 H-REGION.
REGION 27 31 C-REGION.
TOPO_DOM 32 429 NON CYTOPLASMIC.
BACR_HALSA
Phobius
TOPO_DOM 1 22 NON CYTOPLASMIC.
TRANSMEM 23 42
TOPO_DOM 43 53 CYTOPLASMIC.
TRANSMEM 54 76
TOPO_DOM 77 95 NON CYTOPLASMIC.
TRANSMEM 96 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 142
TOPO_DOM 143 147 NON CYTOPLASMIC.
TRANSMEM 148 169
TOPO_DOM 170 189 CYTOPLASMIC.
TRANSMEM 190 212
TOPO_DOM 213 217 NON CYTOPLASMIC.
TRANSMEM 218 237
TOPO_DOM 238 262 CYTOPLASMIC.
PolyPhobius
TOPO_DOM 1 21 NON CYTOPLASMIC.
TRANSMEM 22 43
TOPO_DOM 44 54 CYTOPLASMIC.
TRANSMEM 55 77
TOPO_DOM 78 94 NON CYTOPLASMIC.
TRANSMEM 95 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 141
TOPO_DOM 142 147 NON CYTOPLASMIC.
TRANSMEM 148 166
TOPO_DOM 167 186 CYTOPLASMIC.
TRANSMEM 187 205
TOPO_DOM 206 215 NON CYTOPLASMIC.
TRANSMEM 216 237
TOPO_DOM 238 262 CYTOPLASMIC.
RET4_HUMAN
Phobius
SIGNAL 1 18
REGION 1 2 N-REGION.
REGION 3 13 H-REGION.
REGION 14 18 C-REGION.
TOPO_DOM 19 201 NON CYTOPLASMIC.
PolyPhobius
SIGNAL 1 18
REGION 1 3 N-REGION.
REGION 4 13 H-REGION.
REGION 14 18 C-REGION.
TOPO_DOM 19 201 NON CYTOPLASMIC.
INSL5_HUMAN
Phobius
SIGNAL 1 22
REGION 1 5 N-REGION.
REGION 6 17 H-REGION.
REGION 18 22 C-REGION.
TOPO_DOM 23 135 NON CYTOPLASMIC.
PolyPhobius
SIGNAL 1 22
REGION 1 4 N-REGION.
REGION 5 16 H-REGION.
REGION 17 22 C-REGION.
TOPO_DOM 23 135 NON CYTOPLASMIC.
LAMP1_HUMAN
Phobius
SIGNAL 1 28
REGION 1 10 N-REGION.
REGION 11 22 H-REGION.
REGION 23 28 C-REGION.
TOPO_DOM 29 381 NON CYTOPLASMIC.
TRANSMEM 382 405
TOPO_DOM 406 417 CYTOPLASMIC.
PolyPhobius
SIGNAL 1 28
REGION 1 9 N-REGION.
REGION 10 22 H-REGION.
REGION 23 28 C-REGION.
TOPO_DOM 29 381 NON CYTOPLASMIC.
TRANSMEM 382 405
TOPO_DOM 406 417 CYTOPLASMIC.
A4_HUMAN
Phobius
SIGNAL 1 17
REGION 1 1 N-REGION.
REGION 2 12 H-REGION.
REGION 13 17 C-REGION.
TOPO_DOM 18 700 NON CYTOPLASMIC.
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC.
PolyPhobius
SIGNAL 1 17
REGION 1 3 N-REGION.
REGION 4 12 H-REGION.
REGION 13 17 C-REGION.
TOPO_DOM 18 700 NON CYTOPLASMIC.
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC.
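The Phobius and PolyPhobius listings above share one line format (feature type, start, end, optional note), so they can be summarized programmatically. The following is a minimal sketch, not an official Phobius parser: the names `Feature`, `parse_features` and `summarize` and the assumption of whitespace-separated fields are for illustration only.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    kind: str   # SIGNAL, REGION, TOPO_DOM or TRANSMEM
    start: int  # first residue (1-based, inclusive)
    end: int    # last residue (1-based, inclusive)
    note: str   # e.g. "H-REGION." or "NON CYTOPLASMIC."; may be empty


def parse_features(text: str) -> List[Feature]:
    """Parse lines such as 'TOPO_DOM 32 429 NON CYTOPLASMIC.' into Feature tuples."""
    features = []
    for line in text.splitlines():
        parts = line.split()
        # Skip headings and anything that is not 'KIND start end [note]'.
        if len(parts) < 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        features.append(
            Feature(parts[0], int(parts[1]), int(parts[2]), " ".join(parts[3:]))
        )
    return features


def summarize(features: List[Feature]) -> dict:
    """Return the signal-peptide length (0 if none) and the transmembrane segments."""
    signal = [f for f in features if f.kind == "SIGNAL"]
    helices = [(f.start, f.end) for f in features if f.kind == "TRANSMEM"]
    length = signal[0].end - signal[0].start + 1 if signal else 0
    return {"signal_peptide_length": length, "tm_helices": helices}


# Phobius prediction for LAMP1_HUMAN, copied from the listing above.
LAMP1_PHOBIUS = """\
SIGNAL 1 28
REGION 1 10 N-REGION.
REGION 11 22 H-REGION.
REGION 23 28 C-REGION.
TOPO_DOM 29 381 NON CYTOPLASMIC.
TRANSMEM 382 405
TOPO_DOM 406 417 CYTOPLASMIC.
"""

if __name__ == "__main__":
    print(summarize(parse_features(LAMP1_PHOBIUS)))
```

Running the same two functions on the Phobius and PolyPhobius blocks of one protein gives a quick check of whether both predictors agree on the number and position of helices.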
OCTOPUS and SPOCTOPUS
http://octopus.cbr.su.se/index.php
GLA
Octopus
Spoctopus
BACR_HALSA
Octopus
Spoctopus
RET4_HUMAN
Octopus
Spoctopus
INSL5_HUMAN
Octopus
Spoctopus
LAMP1_HUMAN
Octopus
Spoctopus
A4_HUMAN
Octopus
Spoctopus
SignalP
GLA
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
TargetP
http://www.cbs.dtu.dk/services/TargetP/
Name | Length | mTP | SP | other | Loc | RC |
---|---|---|---|---|---|---|
GLA | 429 | 0.041 | 0.860 | 0.141 | S | 2 |
BACR_HALSA | 262 | 0.019 | 0.897 | 0.562 | S | 4 |
RET4_HUMAN | 201 | 0.242 | 0.928 | 0.020 | S | 2 |
INSL5_HUMAN | 135 | 0.074 | 0.899 | 0.037 | S | 1 |
LAMP1_HUMAN | 417 | 0.043 | 0.953 | 0.017 | S | 1 |
A4_HUMAN | 770 | 0.035 | 0.937 | 0.084 | S | 1 |
http://www.cbs.dtu.dk/services/TargetP-1.1/output.php
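The RC column is TargetP's reliability class. As we understand the TargetP 1.1 output description, it is derived from the difference between the highest and the second-highest score (1: above 0.8; 2: 0.6 to 0.8; 3: 0.4 to 0.6; 4: 0.2 to 0.4; 5: below 0.2), and Loc is the label of the highest score. The sketch below recomputes both columns from the scores in the table; the function name and the treatment of exact boundary values are assumptions.

```python
from typing import Tuple


def targetp_call(mtp: float, sp: float, other: float) -> Tuple[str, int]:
    """Return (Loc, RC) from the three non-plant TargetP scores."""
    scores = {"M": mtp, "S": sp, "_": other}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (loc, best), (_, runner_up) = ranked[0], ranked[1]
    diff = best - runner_up
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return loc, rc


# (mTP, SP, other) as listed in the table above.
TABLE = {
    "GLA": (0.041, 0.860, 0.141),
    "BACR_HALSA": (0.019, 0.897, 0.562),
    "RET4_HUMAN": (0.242, 0.928, 0.020),
    "INSL5_HUMAN": (0.074, 0.899, 0.037),
    "LAMP1_HUMAN": (0.043, 0.953, 0.017),
    "A4_HUMAN": (0.035, 0.937, 0.084),
}

if __name__ == "__main__":
    for name, scores in TABLE.items():
        print(name, *targetp_call(*scores))
```

The low reliability for BACR_HALSA (RC 4) follows directly from its high "other" score of 0.562, which leaves a gap of only 0.335 to the SP score.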
Prediction of GO terms
GOPET
http://genius.embnet.dkfz-heidelberg.de/menu/biounit/open-husar
We used the default settings (GO aspect: molecular function; maximum number of predictions: 20; confidence threshold: 60; GOPET model June 2007, version 2.0; GOPET database 2007). The results contain only GO ids of the aspect "molecular function", since the other two aspects (cellular component and biological process) were not available.
GLA
GOid | Confidence | GO term |
---|---|---|
GO:0016798 | 98% | hydrolase activity, acting on glycosyl bonds |
GO:0004553 | 98% | hydrolase activity, hydrolyzing O-glycosyl compounds |
GO:0016787 | 97% | hydrolase activity |
GO:0004557 | 96% | alpha-galactosidase activity |
GO:0008456 | 89% | alpha-N-acetylgalactosaminidase activity |
BACR_HALSA
GOid | Confidence | GO term |
---|---|---|
GO:0005216 | 77% | ion channel activity |
GO:0008020 | 75% | G-protein coupled photoreceptor activity |
GO:0015078 | 60% | hydrogen ion transmembrane transporter activity |
RET4_HUMAN
GOid | Confidence | GO term |
---|---|---|
GO:0005488 | 90% | binding |
GO:0005501 | 81% | retinoid binding |
GO:0008289 | 80% | lipid binding |
GO:0019841 | 78% | retinol binding |
GO:0005215 | 78% | transporter activity |
GO:0016918 | 78% | retinal binding |
GO:0005319 | 69% | lipid transporter activity |
GO:0008035 | 60% | high-density lipoprotein particle binding |
INSL5_HUMAN
GOid | Confidence | GO term |
---|---|---|
GO:0005179 | 80% | hormone activity |
LAMP1_HUMAN
GOid | Confidence | GO term |
---|---|---|
GO:0004812 | 60% | aminoacyl-tRNA ligase activity |
GO:0005524 | 60% | ATP binding |
A4_HUMAN
GOid | Confidence | GO term |
---|---|---|
GO:0004866 | 87% | endopeptidase inhibitor activity |
GO:0004867 | 86% | serine-type endopeptidase inhibitor activity |
GO:0030568 | 83% | plasmin inhibitor activity |
GO:0030304 | 83% | trypsin inhibitor activity |
GO:0030414 | 82% | peptidase inhibitor activity |
GO:0005488 | 79% | binding |
GO:0005515 | 74% | protein binding |
GO:0046872 | 73% | metal ion binding |
GO:0003677 | 71% | DNA binding |
GO:0008201 | 70% | heparin binding |
GO:0008270 | 69% | zinc ion binding |
GO:0005507 | 69% | copper ion binding |
GO:0005506 | 67% | iron ion binding |
Pfam
Pfam is a database of protein domain families built with profile hidden Markov models (HMMs). Each family is represented by a multiple sequence alignment and an HMM. A protein sequence can be searched against Pfam to obtain all domains that the query sequence might contain.
The Pfam database has two parts, Pfam-A and Pfam-B, whose protein domain families differ in quality. In release 1.0 of Pfam, the protein entries in Pfam-A and Pfam-B came from Swissprot (a few initial members of the Pfam-A seed alignments came from several sources: Swissprot, Prosite, ProDom, etc.). In the current release, the entries in Pfam-A and Pfam-B come from Pfamseq (UniProtKB) and ADDA, respectively.
Pfam-A contains the well-characterized, annotated entries. Each family starts from a seed alignment of a few selected representative sequences, which is checked manually. The HMM is then applied automatically to produce the full alignment and to detect all possible members of the family. Pfam-A families and domains are of high quality and can serve as reliable annotation or classification evidence for a query sequence.
Pfam-B is created from sequence alignments of ADDA entries using HMMs; entries that already exist in Pfam-A are excluded. Pfam-B families have no confirmed annotation and undergo no manual quality checking, so they may contain errors (e.g. the members of a family may be aligned essentially at random) and their overall quality is relatively low. Pfam-B can still be useful when no domain evidence for the query sequence is found in Pfam-A.
We used the "sequence search" feature of Pfam to determine potential domains or domain families of each protein, and then checked the corresponding domain (family) page for GO annotations. The search was performed with the default settings (cut-off: use E-value, threshold 1.0), except that we also included Pfam-B. Only one Pfam-B hit was found, and it has no GO annotation, so including Pfam-B yielded no additional information. Whether a hit counts as significant was decided by the Pfam search algorithm. The results are listed in the tables below.
GLA
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Melibiase | Family | x | Molecular function | hydrolase activity, hydrolyzing O-glycosyl compounds | GO:0004553 |
Pfam-A | Melibiase | Family | x | Biological process | carbohydrate metabolic process | GO:0005975 |
BACR_HALSA
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Bacteriorhodopsin-like protein | Domain | x | Cellular component | membrane | GO:0016020 |
Pfam-A | Bacteriorhodopsin-like protein | Domain | x | Molecular function | ion channel activity | GO:0005216 |
Pfam-A | Bacteriorhodopsin-like protein | Domain | x | Biological process | ion transport | GO:0006811 |
Pfam-A | Domain of unknown function DUF21 | Family | - | - | - | - |
RET4_HUMAN
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Lipocalin / cytosolic fatty-acid binding protein family | Domain | x | Molecular function | binding | GO:0005488 |
Pfam-A | DspF/AvrF protein | Family | - | - | - | - |
Pfam-B | PB008544 | - | - | - | - | - |
INSL5_HUMAN
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Insulin/IGF/Relaxin family | Domain | x | Cellular component | extracellular region | GO:0005576 |
Pfam-A | Insulin/IGF/Relaxin family | Domain | x | Molecular function | hormone activity | GO:0005179 |
LAMP1_HUMAN
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Lysosome-associated membrane glycoprotein | Family | x | Cellular component | membrane | GO:0016020 |
Pfam-A | Protein of unknown function DUF1180 | Family | - | - | - | - |
A4_HUMAN
Source | Description | Entry type | Significant | GO aspect | GO description | GO id |
---|---|---|---|---|---|---|
Pfam-A | Amyloid A4 N-terminal heparin-binding | Domain | x | Cellular component | integral to membrane | GO:0016021 |
Pfam-A | Amyloid A4 N-terminal heparin-binding | Domain | x | Molecular function | binding | GO:0005488 |
Pfam-A | Copper-binding of amyloid precursor, CuBD | Domain | x | - | - | - |
Pfam-A | Kunitz/Bovine pancreatic trypsin inhibitor domain | Domain | x | Molecular function | serine-type endopeptidase inhibitor activity | GO:0004867 |
Pfam-A | E2 domain of amyloid precursor protein | Domain | x | - | - | - |
Pfam-A | Beta-amyloid peptide | Family | x | Cellular component | integral to membrane | GO:0016021 |
Pfam-A | Beta-amyloid peptide | Family | x | Molecular function | binding | GO:0005488 |
Pfam-A | beta-amyloid precursor protein C-terminus | Family | x | - | - | - |
Pfam-A | Exonuclease VII, large subunit | Family | - | - | - | - |
Pfam-A | Transcriptional activator TraM | Family | - | - | - | - |
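To see where the two annotation sources agree, the GO ids from the GOPET and Pfam tables can be intersected. The sketch below does this for GLA and INSL5_HUMAN using the ids listed on this page; the dictionary layout and the function name `compare` are illustrative assumptions. Since GOPET only returned molecular-function terms, Pfam terms from the other two aspects cannot match.

```python
from typing import Dict, List, Set

# GO ids copied from the GOPET tables on this page.
GOPET: Dict[str, Set[str]] = {
    "GLA": {"GO:0016798", "GO:0004553", "GO:0016787", "GO:0004557", "GO:0008456"},
    "INSL5_HUMAN": {"GO:0005179"},
}

# GO ids copied from the Pfam tables on this page (significant hits only).
PFAM: Dict[str, Set[str]] = {
    "GLA": {"GO:0004553", "GO:0005975"},
    "INSL5_HUMAN": {"GO:0005576", "GO:0005179"},
}


def compare(gopet: Set[str], pfam: Set[str]) -> Dict[str, List[str]]:
    """Split GO ids into those found by both methods and those found by one only."""
    return {
        "shared": sorted(gopet & pfam),
        "gopet_only": sorted(gopet - pfam),
        "pfam_only": sorted(pfam - gopet),
    }


if __name__ == "__main__":
    for protein in GOPET:
        print(protein, compare(GOPET[protein], PFAM[protein]))
```

For GLA the only shared id is GO:0004553 (hydrolase activity, hydrolyzing O-glycosyl compounds); the more specific GOPET terms have no counterpart in the Pfam annotation.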
ProtFun 2.2
http://www.cbs.dtu.dk/services/ProtFun/