Difference between revisions of "Fabry:Sequence-based analyses"

From Bioinformatikpedia
 
== Signal peptides ==
 
 
Prediction of the presence and location of signal peptide cleavage sites in amino acid sequences.
 
 
 
   
 
== GO terms ==
 
=== QuickGO ===
 
Since we already used [http://www.ebi.ac.uk/QuickGO QuickGO] in [[Fabry:Sequence_alignments_(sequence_searches_and_multiple_alignments) | Task2]] to download the GO terms, we now refine our [[Fabry:Sequence_alignments_(sequence_searches_and_multiple_alignments)#Sequence_searches | GO analysis]]. The QuickGO search reveals 28 distinct GO terms (see <xr id="tab:QuickGO"/>).
 
 
<figtable id="tab:QuickGO">
 
<caption>Results of the QuickGO search</caption>
 
{| style="border-collapse: collapse; border-width: 1px; border-style: solid; border-color: #000"
 
! style="border-style: solid; border-width: 1px;"|Code
 
! style="border-style: solid; border-width: 1px;"|Name
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0052692
 
| style="border-style: solid; border-width: 1px"| raffinose alpha-galactosidase activity
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0051001
 
| style="border-style: solid; border-width: 1px"| negative regulation of nitric-oxide synthase activity
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0046479
 
| style="border-style: solid; border-width: 1px"| glycosphingolipid catabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0046477
 
| style="border-style: solid; border-width: 1px"| glycosylceramide catabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0045019
 
| style="border-style: solid; border-width: 1px"| negative regulation of nitric oxide biosynthetic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0044281
 
| style="border-style: solid; border-width: 1px"| small molecule metabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0043202
 
| style="border-style: solid; border-width: 1px"| lysosomal lumen
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0043169
 
| style="border-style: solid; border-width: 1px"| cation binding
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0042803
 
| style="border-style: solid; border-width: 1px"| protein homodimerization activity
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0016936
 
| style="border-style: solid; border-width: 1px"| galactoside binding
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0016798
 
| style="border-style: solid; border-width: 1px"| hydrolase activity, acting on glycosyl bonds
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0016787
 
| style="border-style: solid; border-width: 1px"| hydrolase activity
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0016139
 
| style="border-style: solid; border-width: 1px"| glycoside catabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0009311
 
| style="border-style: solid; border-width: 1px"| oligosaccharide metabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0008152
 
| style="border-style: solid; border-width: 1px"| metabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0006687
 
| style="border-style: solid; border-width: 1px"| glycosphingolipid metabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0006665
 
| style="border-style: solid; border-width: 1px"| sphingolipid metabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005975
 
| style="border-style: solid; border-width: 1px"| carbohydrate metabolic process
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005794
 
| style="border-style: solid; border-width: 1px"| Golgi apparatus
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005764
 
| style="border-style: solid; border-width: 1px"| lysosome
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005737
 
| style="border-style: solid; border-width: 1px"| cytoplasm
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005625
 
| style="border-style: solid; border-width: 1px"| soluble fraction
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005576
 
| style="border-style: solid; border-width: 1px"| extracellular region
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005515
 
| style="border-style: solid; border-width: 1px"| protein binding
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0005102
 
| style="border-style: solid; border-width: 1px"| receptor binding
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0004557
 
| style="border-style: solid; border-width: 1px"| alpha-galactosidase activity
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0004553
 
| style="border-style: solid; border-width: 1px"| hydrolase activity, hydrolyzing O-glycosyl compounds
 
|-
 
| style="border-style: solid; border-width: 1px"| GO:0003824
 
| style="border-style: solid; border-width: 1px"| catalytic activity
 
|-
 
|}
 
</figtable>
 
</div>
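The number of distinct GO terms can be counted directly from an annotation download. The following is only a minimal sketch: the tab-separated layout, the column names <code>GO ID</code> and <code>GO Name</code>, and the sample rows are assumptions made for illustration and are not the actual QuickGO export format.

```python
import csv
import io

# Hypothetical excerpt of a tab-separated annotation export.
# The column names ("GO ID", "GO Name") are assumptions for this sketch.
SAMPLE_EXPORT = (
    "GO ID\tGO Name\n"
    "GO:0004557\talpha-galactosidase activity\n"
    "GO:0005764\tlysosome\n"
    "GO:0004557\talpha-galactosidase activity\n"
    "GO:0016787\thydrolase activity\n"
    "GO:0005764\tlysosome\n"
)


def distinct_go_terms(tsv_text):
    """Return {GO ID: GO name} with one entry per distinct GO ID."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    terms = {}
    for row in reader:
        # The same term is usually annotated several times (different
        # evidence codes or sources); keep only the first occurrence.
        terms.setdefault(row["GO ID"], row["GO Name"])
    return terms


terms = distinct_go_terms(SAMPLE_EXPORT)
for go_id in sorted(terms, reverse=True):
    print(go_id, terms[go_id])
print(len(terms), "distinct GO terms")
```

Applied to the full download instead of the sample rows, the same function should give the number of distinct terms listed in the table.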
 
 
=== GOPET ===
 
Searching the GOPET annotation tool with the AGAL_HUMAN [[Alpha-galactosidase_sequence|sequence]] revealed 5 GO IDs, which are displayed in <xr id="tab:GOPET"/>. At first glance, since we already know the name and function of the protein, it is a bit surprising that alpha-galactosidase activity is only the fourth entry, with 96% confidence. From our earlier information gathering we know that α-galactosidase A is a hydrolase, so the first three entries are not surprising. Considering that our enzyme is mainly a glycosidase, the two entries at the top of the list make perfect sense.<br>
The last entry was again a bit surprising. α-N-Acetylgalactosaminidase is actually used for enzyme replacement therapy, which we mention on our [[Fabry_Disease | main page]]. The structures of the two enzymes are similar, but this still does not explain why this GO term is associated with the AGAL protein.
   
<div style="float:right; border:thin solid lightgrey; margin: 20px;">
<figtable id="tab:GOPET">
<caption>Results of the GOPET search</caption>
{| style="border-collapse: collapse; border-width: 1px; border-style: solid; border-color: #000"
! style="border-style: solid; border-width: 1px;" colspan="4"|Result for GOPET search
|- align="center"
! style="border-style: solid; border-width: 1px"| GOid
! style="border-style: solid; border-width: 1px"| Aspect
! style="border-style: solid; border-width: 1px"| Confidence
! style="border-style: solid; border-width: 1px"| GO term
|- align="center"
| style="border-style: solid; border-width: 1px"| [http://www.dkfz.de/menu/cgi-bin/srs/wgetz?-newId+-e+%5bGO-ID:0016798%5d GO:0016798]
| style="border-style: solid; border-width: 1px"| Molecular Function Ontology (F)
| style="border-style: solid; border-width: 1px"| 98%
| style="border-style: solid; border-width: 1px"| hydrolase activity, acting on glycosyl bonds
|- align="center"
| style="border-style: solid; border-width: 1px"| [http://www.dkfz.de/menu/cgi-bin/srs/wgetz?-newId+-e+%5bGO-ID:0004553%5d GO:0004553]
| style="border-style: solid; border-width: 1px"| Molecular Function Ontology (F)
| style="border-style: solid; border-width: 1px"| 98%
| style="border-style: solid; border-width: 1px"| hydrolase activity, hydrolyzing O-glycosyl compounds
|- align="center"
| style="border-style: solid; border-width: 1px"| [http://www.dkfz.de/menu/cgi-bin/srs/wgetz?-newId+-e+%5bGO-ID:0016787%5d GO:0016787]
| style="border-style: solid; border-width: 1px"| Molecular Function Ontology (F)
| style="border-style: solid; border-width: 1px"| 97%
| style="border-style: solid; border-width: 1px"| hydrolase activity
|- align="center"
| style="border-style: solid; border-width: 1px"| [http://www.dkfz.de/menu/cgi-bin/srs/wgetz?-newId+-e+%5bGO-ID:0004557%5d GO:0004557]
| style="border-style: solid; border-width: 1px"| Molecular Function Ontology (F)
| style="border-style: solid; border-width: 1px"| 96%
| style="border-style: solid; border-width: 1px"| alpha-galactosidase activity
|- align="center"
| style="border-style: solid; border-width: 1px"| [http://www.dkfz.de/menu/cgi-bin/srs/wgetz?-newId+-e+%5bGO-ID:0008456%5d GO:0008456]
| style="border-style: solid; border-width: 1px"| Molecular Function Ontology (F)
| style="border-style: solid; border-width: 1px"| 89%
| style="border-style: solid; border-width: 1px"| alpha-N-acetylgalactosaminidase activity
|}
</figtable>
</div>
 
   
Neither raising GOPET's "maximum number of GO prediction to be displayed per sequence" setting nor lowering its "confidence threshold for prediction" yielded any additional predictions.<br>

<br style="clear:both;">

=== ProtFun2.2 ===

Known classification: EC 3.2.1.22 (EC 3.-.-.-, hydrolase)<br>
Predicted by ProtFun: EC 6.-.-.- (ligase)

<pre>
############## ProtFun 2.2 predictions ##############

>gi_4504009_

# Functional category                  Prob     Odds
  Amino_acid_biosynthesis              0.283   12.847
  Biosynthesis_of_cofactors            0.339    4.708
  Cell_envelope                     => 0.652   10.690
  Cellular_processes                   0.057    0.783
  Central_intermediary_metabolism      0.400    6.343
  Energy_metabolism                    0.151    1.678
  Fatty_acid_metabolism                0.032    2.448
  Purines_and_pyrimidines              0.506    2.082
  Regulatory_functions                 0.013    0.083
  Replication_and_transcription        0.047    0.175
  Translation                          0.211    4.807
  Transport_and_binding                0.549    1.339

# Enzyme/nonenzyme                     Prob     Odds
  Enzyme                            => 0.805    2.811
  Nonenzyme                            0.195    0.273

# Enzyme class                         Prob     Odds
  Oxidoreductase (EC 1.-.-.-)          0.176    0.845
  Transferase    (EC 2.-.-.-)          0.195    0.564
  Hydrolase      (EC 3.-.-.-)          0.244    0.769
  Lyase          (EC 4.-.-.-)          0.029    0.608
  Isomerase      (EC 5.-.-.-)          0.010    0.321
  Ligase         (EC 6.-.-.-)       => 0.141    2.776

# Gene Ontology category               Prob     Odds
  Signal_transducer                    0.090    0.419
  Receptor                             0.014    0.083
  Hormone                              0.002    0.318
  Structural_protein                   0.004    0.127
  Transporter                          0.024    0.222
  Ion_channel                          0.010    0.169
  Voltage-gated_ion_channel            0.003    0.127
  Cation_channel                       0.010    0.215
  Transcription                        0.047    0.367
  Transcription_regulation             0.026    0.204
  Stress_response                      0.049    0.552
  Immune_response                      0.012    0.136
  Growth_factor                        0.006    0.412
  Metal_ion_transport                  0.009    0.020

//
</pre>
 
In the [https://www.dropbox.com/s/sn9n8h4gojhxvkn/ProtFun_pred.txt output file of ProtFun], the "=>" indicates which subcategory of each category is predicted to be true for the submitted sequence. This prediction is made on the basis of the highest information content. In the pictures below, the probability, odds and information content (<math>probability \cdot odds</math>) are shown separately for each category. The left y-axis is assigned to the probability (blue) and the information content (red line); the right y-axis is assigned to the probability, which is multiplied by 10 for better legibility.<br>

According to the information content, the most likely functional category of the human α-galactosidase A protein is "Cell envelope". Based on our own research we would not come to this conclusion, but would rather assign a metabolic or regulatory class, since the GO terms known to us include, for example, "glycoside catabolic process", "negative regulation of nitric oxide biosynthetic process" and "small molecule metabolic process".<br>

The prediction that the submitted sequence is an enzyme is correct.<br>

In our opinion, deciding on the basis of the information content is not always the best evaluation method: the chosen enzyme class "Ligase" does have the highest information content, but not the highest probability, and according to the literature it is wrong. AGAL_HUMAN clearly is a hydrolase, which is also the enzyme class with the highest probability (0.244). The EC number assigned to the galactosidase is EC 3.2.1.22.<br>

ProtFun does not predict a Gene Ontology category, since the category with the highest information content has odds lower than 1. We expected this category to be rather unclear, since the protein has very diverse cellular functions.
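The selection rule discussed here can be reproduced with a few lines of Python. This is only a sketch under two assumptions that are not confirmed by the ProtFun documentation: that the information content is computed as probability · odds, and that no category is reported when the winning category has odds below 1. The numbers are copied from the ProtFun 2.2 output on this page; the function names are ours.

```python
# name -> (probability, odds), copied from the ProtFun 2.2 output on this page
FUNCTIONAL_CATEGORY = {
    "Amino_acid_biosynthesis": (0.283, 12.847),
    "Biosynthesis_of_cofactors": (0.339, 4.708),
    "Cell_envelope": (0.652, 10.690),
    "Cellular_processes": (0.057, 0.783),
    "Central_intermediary_metabolism": (0.400, 6.343),
    "Energy_metabolism": (0.151, 1.678),
    "Fatty_acid_metabolism": (0.032, 2.448),
    "Purines_and_pyrimidines": (0.506, 2.082),
    "Regulatory_functions": (0.013, 0.083),
    "Replication_and_transcription": (0.047, 0.175),
    "Translation": (0.211, 4.807),
    "Transport_and_binding": (0.549, 1.339),
}
ENZYME_CLASS = {
    "Oxidoreductase": (0.176, 0.845),
    "Transferase": (0.195, 0.564),
    "Hydrolase": (0.244, 0.769),
    "Lyase": (0.029, 0.608),
    "Isomerase": (0.010, 0.321),
    "Ligase": (0.141, 2.776),
}
GO_CATEGORY = {
    "Signal_transducer": (0.090, 0.419),
    "Receptor": (0.014, 0.083),
    "Hormone": (0.002, 0.318),
    "Structural_protein": (0.004, 0.127),
    "Transporter": (0.024, 0.222),
    "Ion_channel": (0.010, 0.169),
    "Voltage-gated_ion_channel": (0.003, 0.127),
    "Cation_channel": (0.010, 0.215),
    "Transcription": (0.047, 0.367),
    "Transcription_regulation": (0.026, 0.204),
    "Stress_response": (0.049, 0.552),
    "Immune_response": (0.012, 0.136),
    "Growth_factor": (0.006, 0.412),
    "Metal_ion_transport": (0.009, 0.020),
}


def information_content(prob, odds):
    """Assumed definition: information content = probability * odds."""
    return prob * odds


def pick(table, min_odds=1.0):
    """Return the entry with the highest information content,
    or None if that entry has odds below min_odds (assumed rule)."""
    best = max(table, key=lambda name: information_content(*table[name]))
    _, odds = table[best]
    return best if odds >= min_odds else None


def most_probable(table):
    """Return the entry with the highest probability, ignoring the odds."""
    return max(table, key=lambda name: table[name][0])


print("Functional category:", pick(FUNCTIONAL_CATEGORY))
print("Enzyme class by information content:", pick(ENZYME_CLASS))
print("Enzyme class by probability:", most_probable(ENZYME_CLASS))
print("Gene Ontology category:", pick(GO_CATEGORY))
```

Under these assumptions the sketch selects "Cell_envelope" and "Ligase" and reports no Gene Ontology category, matching the "=>" markers in the output, while ranking the enzyme classes by probability alone gives "Hydrolase".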
 
 
<gallery widths=350px heights=350px perrow=3 caption="Probability, Odds and Information Content of the ProtFun2.2 predictions">
 
File:FABRY_Functional_category-barplot.png | Functional category
 
File:FABRY_Enzyme_class-barplot.png | Enzyme class
 
File:FABRY_Gene_Ontology_category-barplot.png | Gene Ontology category
 
</gallery>
 
 
ProtFun uses many other resources to predict the probabilities and odds of the categories. Some of their outputs are listed in <xr id="tab:ProtFun"/>. The α-galactosidase A protein is known to have a 31-residue signal peptide at positions 1 to 31, which is later cleaved off. Thus the prediction of SignalP 3.0 is correct.<br>

The predicted propeptide cleavage site could not be confirmed for the human AGAL protein.<br>

ProtFun inferred 22 phosphorylation sites at the positions 62, 102, 201, 235, 238, 241, 276, 304, 364, 371, 405, 424, 366, 400, 86, 134, 151, 152, 173, 184, 207, 216. [http://www.phosphosite.org/proteinAction.do?id=12871&showAllSites=true Phosphosite] reports phosphorylation at S23, Y134 and H186, of which only one (position 134) was predicted by NetPhos.<br>

[http://www.uniprot.org/uniprot/P06280 UniProt] lists N-linked glycosylation sites at positions 139, 192, 215 and 408 (the last one is only a potential site). All four are predicted by NetNGlyc. In agreement with the NetOGlyc prediction, there are no known O-glycosylated sites.<br>
 
See also sections [[Fabry:Sequence-based_analyses#Transmembrane_helices | Transmembrane helices]] and [[Fabry:Sequence-based_analyses#Signal_peptides| Signal peptides]] for further information on these two topics.
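The comparison between predicted and annotated modification sites amounts to simple set operations. The sketch below uses only the positions quoted in the text above; the variable and function names are ours, chosen for illustration.

```python
# Positions quoted in the text above
NETPHOS_PREDICTED = {62, 102, 201, 235, 238, 241, 276, 304, 364, 371, 405,
                     424, 366, 400, 86, 134, 151, 152, 173, 184, 207, 216}
PHOSPHOSITE_REPORTED = {23, 134, 186}

NETNGLYC_PREDICTED = {139, 192, 215, 408}
UNIPROT_NGLYC = {139, 192, 215, 408}


def compare(predicted, annotated):
    """Split positions into confirmed, missed and unconfirmed predictions."""
    return {
        "confirmed": sorted(predicted & annotated),    # predicted and annotated
        "missed": sorted(annotated - predicted),       # annotated, not predicted
        "unconfirmed": sorted(predicted - annotated),  # predicted, not annotated
    }


phospho = compare(NETPHOS_PREDICTED, PHOSPHOSITE_REPORTED)
nglyc = compare(NETNGLYC_PREDICTED, UNIPROT_NGLYC)

print("Phosphorylation:", phospho)
print("N-glycosylation:", nglyc)
```

For phosphorylation only position 134 ends up in the confirmed list, whereas all four N-glycosylation sites are confirmed. A prediction that is not confirmed is not necessarily wrong, since the databases only contain sites that have been observed experimentally.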
 
 
<div style="float:right; border:thin solid lightgrey; margin: 20px;">
 
<figtable id="tab:ProtFun">
 
<caption>Output rendered by the individual features used by ProtFun 2.2</caption>
 
{| style="border-collapse: collapse; border-width: 1px; border-style: solid; border-color: #000"
 
! style="border-style: solid; border-width: 1px"| Feature
 
! style="border-style: solid; border-width: 1px"| Output summary
 
! style="border-style: solid; border-width: 1px"| Details
 
|-
 
| style="border-style: solid; border-width: 1px"| SignalP 3.0
 
| style="border-style: solid; border-width: 1px"| Most likely cleavage site between pos. 31 and 32: ARA-LD
 
| style="border-style: solid; border-width: 1px"| Using neural networks (NN) and hidden Markov models (HMM) trained on eukaryotes
 
|-
 
| style="border-style: solid; border-width: 1px"| ProP 1.0
 
| style="border-style: solid; border-width: 1px"| 1 propeptide cleavage site predicted at position: 196
 
| style="border-style: solid; border-width: 1px"| Furin-type cleavage site prediction (Arginine/Lysine residues)
 
|-
 
| style="border-style: solid; border-width: 1px"| TargetP 1.1
 
| style="border-style: solid; border-width: 1px"| No high-confidence targeting prediction
 
| style="border-style: solid; border-width: 1px"| -
 
|-
 
| style="border-style: solid; border-width: 1px"| NetPhos 2.0
 
| style="border-style: solid; border-width: 1px"| 22 putative phosphorylation sites
 
| style="border-style: solid; border-width: 1px"| phosphorylation site prediction
 
|-
 
| style="border-style: solid; border-width: 1px"| NetOGlyc 3.1
 
| style="border-style: solid; border-width: 1px"| No O-glycosylated sites predicted
 
| style="border-style: solid; border-width: 1px"| -
 
|-
 
| style="border-style: solid; border-width: 1px"| NetNGlyc 1.0
 
| style="border-style: solid; border-width: 1px"| 4 putative N-glycosylated sites at positions 139 192 215 408
 
| style="border-style: solid; border-width: 1px"| -
 
|-
 
| style="border-style: solid; border-width: 1px"| TMHMM 2.0
 
| style="border-style: solid; border-width: 1px"| No TM helices predicted
 
| style="border-style: solid; border-width: 1px"| -
 
|-
 
|}
 
</figtable>
 
</div>
 
 
<br style="clear:both;">
 
 
=== Pfam ===
 
 
The Pfam sequence search revealed one significant Pfam-A match, which is shown in <xr id="fig:pfam_fam"/>.
 
 
<div style="float:right; border:thin solid lightgrey; margin: 20px;">
 
<figtable id="fig:pfam_fam">
 
<caption>Pfam-A match</caption>
 
{| style="border-collapse: collapse; border-width: 1px; border-style: solid; border-color: #000"
 
! style="border-style: solid; border-width: 1px"| Family
 
! style="border-style: solid; border-width: 1px"| Description
 
! style="border-style: solid; border-width: 1px"| Entry type
 
! style="border-style: solid; border-width: 1px"| Clan
 
! style="border-style: solid; border-width: 1px"| Envelope Start
 
! style="border-style: solid; border-width: 1px"| Envelope End
 
! style="border-style: solid; border-width: 1px"| Alignment Start
 
! style="border-style: solid; border-width: 1px"| Alignment End
 
! style="border-style: solid; border-width: 1px"| HMM From
 
! style="border-style: solid; border-width: 1px"| HMM To
 
! style="border-style: solid; border-width: 1px"| Bit score
 
! style="border-style: solid; border-width: 1px"| E-value
 
|-
 
| style="border-style: solid; border-width: 1px"| Melibiase
 
| style="border-style: solid; border-width: 1px"| Melibiase
 
| style="border-style: solid; border-width: 1px"| Family
 
| style="border-style: solid; border-width: 1px"| CL0058
 
| style="border-style: solid; border-width: 1px"| 33
 
| style="border-style: solid; border-width: 1px"| 149
 
| style="border-style: solid; border-width: 1px"| 40
 
| style="border-style: solid; border-width: 1px"| 146
 
| style="border-style: solid; border-width: 1px"| 41
 
| style="border-style: solid; border-width: 1px"| 140
 
| style="border-style: solid; border-width: 1px"| 50.0
 
| style="border-style: solid; border-width: 1px"| 1.5e-13
 
|-
 
|}
 
</figtable>
 
</div>
 
 
<br style="clear:both;">
 
<figure id="fig:AGAL">[[File:Alpha_GAL_A.png|200px|thumb|right|<caption>The protein encoded by the Fabry-associated GLA gene: [[Alpha-galactosidase|α-Galactosidase A]]</caption>]]</figure>
 
 
 
 
According to Pfam, the alpha-galactosidase A protein belongs to the family of the [http://pfam.sanger.ac.uk/family/PF02065 Melibiases] (see <xr id="tab:Pfam_Melibiose"/>).

The Pfam family PF02065 is itself referred to as "Glycoside hydrolase family 27" and "Glycoside hydrolase family 36"; AGAL_HUMAN falls into the first category. The common characteristic of the members of this family is that they are glycoside hydrolases (EC 3.2.1.-). The AGAL enzyme catalyzes the hydrolysis of the disaccharide melibiose (D-Gal-α(1→6)-D-Glc) into its two components, galactose and glucose.<br>
 
<div style="float:left; border:thin solid lightgrey; margin: 20px;">
 
<figtable id="tab:Pfam_Melibiose">
 
<caption>Melibiase Identifiers</caption>
 
{| style="border-collapse: collapse; border-width: 1px; border-style: solid; border-color: #000"
 
! style="border-style: solid; border-width: 1px;" colspan="6"|Melibiase Identifiers
 
|-
 
! style="border-style: solid; border-width: 1px"| Symbol
 
| style="border-style: solid; border-width: 1px"| Melibiase
 
|-
 
! style="border-style: solid; border-width: 1px"| Pfam
 
| style="border-style: solid; border-width: 1px"| [http://pfam.sanger.ac.uk/family?acc=PF02065 PF02065]
 
|-
 
! style="border-style: solid; border-width: 1px"| Pfam clan
 
| style="border-style: solid; border-width: 1px"| [http://pfam.sanger.ac.uk/clan/CL0058 CL0058]
 
|-
 
! style="border-style: solid; border-width: 1px"| InterPro
 
| style="border-style: solid; border-width: 1px"| [http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR000111 IPR000111]
 
|-
 
! style="border-style: solid; border-width: 1px"| SCOP
 
| style="border-style: solid; border-width: 1px"| [http://scop.mrc-lmb.cam.ac.uk/scop/search.cgi?tlev=fa;&amp;pdb=1ktc 1ktc]
 
|-
 
! style="border-style: solid; border-width: 1px"| SUPERFAMILY
 
| style="border-style: solid; border-width: 1px"| [http://supfam.org/SUPERFAMILY/cgi-bin/search.cgi?search_field=1ktc 1ktc]
 
|-
 
! style="border-style: solid; border-width: 1px"| CAZy
 
| style="border-style: solid; border-width: 1px"| [http://www.cazy.org/GH27.html GH27]
 
|-
 
|}
 
</figtable>
 
</div>
 
 
The ''alignment'' of positions 40 - 146 of the AGAL_HUMAN protein sequence with the matching HMM used in this prediction is shown in <xr id="fig:pfam_ali" />. According to the color code, which indicates the degree of confidence of each aligned position, the Hidden Markov Model and the sequence agree very well overall, except for the residues 17-46. Consulting our background knowledge and the [http://www.uniprot.org/uniprot/P06280 UniProt database], we found nothing unusual in this region apart from the signal peptide cleavage site and a beta strand at positions 42-46.<br>

Our query protein belongs to a rather large clan, the Glyco_hydro_tim clan ([http://pfam.sanger.ac.uk/clan/CL0058 CL0058]), which includes 4 CAZy clans (GH-A, GH-D, GH-H and GH-K). The main attribute shared by all the included glycosyl hydrolase enzymes is a TIM barrel fold (eight α-helices and eight parallel β-strands that alternate along the peptide backbone, [http://en.wikipedia.org/wiki/TIM_barrel source]). This fold (residues 31 - 324) can be seen clearly in <xr id="fig:AGAL"/>, which depicts the 3D structure of α-Galactosidase A.<br>

The InterPro protein sequence analysis and classification assigns the enzyme to the "IPR000111 Glycoside hydrolase, clan GH-D" family. As mentioned before, GH-D is part of the Glyco_hydro_tim clan, so it is not surprising that the description is almost the same. According to InterPro, there are 6 IPR000111 family members in the human body.<br>

The Structural Classification of Proteins (SCOP) finds two domains: an "Amylase, catalytic domain" ([http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.d.b.j.b.html c.1.8.1], residues 32 - 323) and an "alpha-Amylases, C-terminal beta-sheet domain" ([http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.c.bcb.b.b.html b.71.1.1], residues 324 - 421). The catalytic domain is an alpha and beta protein (a/b) with a TIM beta/alpha-barrel fold, which is consistent with its membership in the Glyco_hydro_tim clan. The C-terminal beta-sheet domain is an all-beta protein whose fold and superfamily are both "Glycosyl hydrolase domain". This is also not surprising to us.<br>

AGAL is in the CAZy GH27 ([http://www.cazy.org/GH27.html Glycoside Hydrolase Family 27]) family, together with α-N-acetylgalactosaminidase (EC 3.2.1.49; as mentioned before, used for enzyme replacement therapy), isomalto-dextranase (EC 3.2.1.94) and β-L-arabinopyranosidase (EC 3.2.1.88). See [http://www.cazypedia.org/index.php/Glycoside_Hydrolase_Family_27 CAZypedia: Glycoside Hydrolase Family 27]. TODO
 
 
<figure id="fig:pfam_ali">[[File:pfam_alignment.png|772px|left|thumb|<caption>Melibiase - alignment region residues 40 - 146</caption>]]</figure>
 
 
<br style="clear:both;">
 
   
 
== Other programs and resources ==

Revision as of 16:22, 14 May 2012




The following analyses were performed on the basis of the α-Galactosidase A sequence. Please consult the journal for the commands used to generate the results.

Secondary structure

Disorder

Transmembrane helices

Voltage-gated potassium channel (Q9YDF8)

