Fabry:Sequence-based analyses
The following analyses were performed on the basis of the α-Galactosidase A sequence. Please consult the journal for the commands used to generate the results.
Secondary structure
Disorder
Transmembrane helices
Voltage-gated potassium channel (Q9YDF8)
Signal peptides
Prediction of the presence and location of signal peptide cleavage sites in amino acid sequences.
GO terms
QuickGO
Since we already used QuickGO in Task 2 to download the GO terms, we refine our GO analysis here. The QuickGO search reveals 28 distinct GO terms (see <xr id="tab:QuickGO"/>).
<figtable id="tab:QuickGO"> Results of the QuickGO search
Code | Name |
---|---|
GO:0052692 | raffinose alpha-galactosidase activity |
GO:0051001 | negative regulation of nitric-oxide synthase activity |
GO:0046479 | glycosphingolipid catabolic process |
GO:0046477 | glycosylceramide catabolic process |
GO:0045019 | negative regulation of nitric oxide biosynthetic process |
GO:0044281 | small molecule metabolic process |
GO:0043202 | lysosomal lumen |
GO:0043169 | cation binding |
GO:0042803 | protein homodimerization activity |
GO:0016936 | galactoside binding |
GO:0016798 | hydrolase activity, acting on glycosyl bonds |
GO:0016787 | hydrolase activity |
GO:0016139 | glycoside catabolic process |
GO:0009311 | oligosaccharide metabolic process |
GO:0008152 | metabolic process |
GO:0006687 | glycosphingolipid metabolic process |
GO:0006665 | sphingolipid metabolic process |
GO:0005975 | carbohydrate metabolic process |
GO:0005794 | Golgi apparatus |
GO:0005764 | lysosome |
GO:0005737 | cytoplasm |
GO:0005625 | soluble fraction |
GO:0005576 | extracellular region |
GO:0005515 | protein binding |
GO:0005102 | receptor binding |
GO:0004557 | alpha-galactosidase activity |
GO:0004553 | hydrolase activity, hydrolyzing O-glycosyl compounds |
GO:0003824 | catalytic activity |
</figtable>
GOPET
<figtable id="tab:GOPET"> Results of the GOPET search
GOid | Aspect | Confidence | GO term |
---|---|---|---|
GO:0016798 | Molecular Function Ontology (F) | 98% | hydrolase activity, acting on glycosyl bonds |
GO:0004553 | Molecular Function Ontology (F) | 98% | hydrolase activity, hydrolyzing O-glycosyl compounds |
GO:0016787 | Molecular Function Ontology (F) | 97% | hydrolase activity |
GO:0004557 | Molecular Function Ontology (F) | 96% | alpha-galactosidase activity |
GO:0008456 | Molecular Function Ontology (F) | 89% | alpha-N-acetylgalactosaminidase activity |
</figtable>
Searching the GOPET annotation tool with the AGAL_HUMAN sequence returned 5 GO IDs, which are displayed in <xr id="tab:GOPET"/>. Even maximizing the "maximum number of GO prediction to be displayed per sequence" and minimizing the "confidence threshold for prediction" did not yield any additional predictions.
At first glance, since we already know the name and function of the protein, it is somewhat surprising that alpha-galactosidase activity is only the fourth entry, with 96% confidence. From our earlier information gathering we know that α-galactosidase A is a hydrolase, so the first three entries are not surprising. Considering that our enzyme is primarily a glycosidase, the two entries at the top of the list make perfect sense.
The last entry was again somewhat surprising. α-N-Acetylgalactosaminidase is actually used for enzyme replacement therapy, which we mention on our main page. The two enzymes are structurally similar, but this alone does not explain the association of this GO term with the AGAL protein.
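The comparison between the two GO sources can also be made mechanically. The following is a minimal sketch that only uses the identifiers copied from the QuickGO and GOPET tables on this page; the variable names are our own and nothing is downloaded from either service.

```python
# GO identifiers copied from the QuickGO table on this page (28 terms).
quickgo_ids = {
    "GO:0052692", "GO:0051001", "GO:0046479", "GO:0046477", "GO:0045019",
    "GO:0044281", "GO:0043202", "GO:0043169", "GO:0042803", "GO:0016936",
    "GO:0016798", "GO:0016787", "GO:0016139", "GO:0009311", "GO:0008152",
    "GO:0006687", "GO:0006665", "GO:0005975", "GO:0005794", "GO:0005764",
    "GO:0005737", "GO:0005625", "GO:0005576", "GO:0005515", "GO:0005102",
    "GO:0004557", "GO:0004553", "GO:0003824",
}

# GOPET predictions with their confidence in percent (from the GOPET table).
gopet = {
    "GO:0016798": 98,
    "GO:0004553": 98,
    "GO:0016787": 97,
    "GO:0004557": 96,
    "GO:0008456": 89,
}

# Terms predicted by GOPET that QuickGO also lists, and those it does not.
shared = sorted(set(gopet) & quickgo_ids)
gopet_only = sorted(set(gopet) - quickgo_ids)

print("also in QuickGO:", shared)
print("GOPET only:", gopet_only)
```

With the tables as given, four of the five GOPET terms also appear in the QuickGO list, and only GO:0008456 (alpha-N-acetylgalactosaminidase activity) is unique to GOPET.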
ProtFun2.2
In the output file of ProtFun, the "=>" indicates which of the subcategories of each category is predicted to be true for the submitted sequence. This prediction is made on the basis of the highest information content. In the pictures below, the probability, odds and information content (<math>probability \cdot odds</math>) are shown separately for each category. The left y-axis is assigned to the probability (blue) and the information content (red line); the right y-axis is assigned to the probability, which is multiplied by 10 for better readability.
According to the information content, the most likely functional category of the human α-galactosidase A protein is "Cell envelope". Based on our own research we would not come to this conclusion, but would rather assign a metabolic or regulatory class, since the GO terms known to us include, for example, "glycoside catabolic process", "negative regulation of nitric oxide biosynthetic process" and "small molecule metabolic process".
The prediction that the submitted sequence belongs to an enzyme is right.
In our opinion, deciding on the basis of the information content is not always the best evaluation method: the chosen enzyme class "Ligase" does have the highest information content, but not the highest probability, and according to the literature it is wrong. AGAL_HUMAN clearly is a hydrolase, which is also indicated by the high probability. The EC number assigned to the galactosidase is EC 3.2.1.22.
ProtFun does not predict a Gene Ontology category, since the category with the highest information content has odds lower than 1. We actually expected this category to be very unclear since the protein has very diverse cellular functions.
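The two selection rules discussed above (highest information content versus highest probability, with no call when the odds are below 1) can be contrasted in a small sketch. Everything here is an assumption for illustration: the `Row` type, the `pick` helper and all numbers are hypothetical placeholders and are not ProtFun output for AGAL_HUMAN. The information-content score is taken as a given column rather than derived.

```python
from typing import NamedTuple, Optional, Sequence


class Row(NamedTuple):
    name: str
    prob: float
    odds: float
    info: float  # information-content score, taken as given (placeholder values)


def pick(rows: Sequence[Row], min_odds: Optional[float] = None) -> Optional[Row]:
    """Return the row with the highest information content.

    If min_odds is set and the winning row's odds fall below it,
    return None, i.e. make no prediction for this category.
    """
    best = max(rows, key=lambda r: r.info)
    if min_odds is not None and best.odds < min_odds:
        return None
    return best


# Hypothetical enzyme-class rows: the two rules disagree here.
enzyme_rows = [
    Row("class A", prob=0.30, odds=0.9, info=0.10),
    Row("class B", prob=0.08, odds=1.6, info=0.25),
    Row("class C", prob=0.12, odds=0.5, info=0.05),
]
by_info = pick(enzyme_rows)
by_prob = max(enzyme_rows, key=lambda r: r.prob)
print(by_info.name, by_prob.name)

# Hypothetical GO-category rows: the best row has odds below 1, so no call.
go_rows = [
    Row("term X", prob=0.20, odds=0.8, info=0.30),
    Row("term Y", prob=0.10, odds=0.4, info=0.10),
]
print(pick(go_rows, min_odds=1.0))
```

In this toy setting the information-content rule selects "class B" while the probability rule selects "class A", which mirrors the Ligase versus Hydrolase disagreement described above.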
ProtFun uses many other resources to predict the probabilities and odds of the categories. Some of their outputs are listed in <xr id="tab:ProtFun"/>. The α-galactosidase A protein is known to have a 31-residue signal peptide at positions 1 to 31, which is later cleaved off. Thus the prediction of SignalP 3.0 is correct.
The prediction of a propeptide cleavage site could not be confirmed in the human AGAL protein.
ProtFun inferred 22 phosphorylation sites at positions 62, 102, 201, 235, 238, 241, 276, 304, 364, 371, 405, 424, 366, 400, 86, 134, 151, 152, 173, 184, 207 and 216. PhosphoSite confirms phosphorylation modifications at S23, Y134 and H186, of which only one (Y134) was predicted by NetPhos.
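The overlap between predicted and confirmed phosphorylation sites can be checked with a few lines. This is a minimal sketch using only the positions quoted above; the variable names are ours.

```python
# The 22 positions predicted by NetPhos, as listed in the text.
predicted = {62, 102, 201, 235, 238, 241, 276, 304, 364, 371, 405, 424,
             366, 400, 86, 134, 151, 152, 173, 184, 207, 216}

# Sites reported by PhosphoSite, keyed by residue label.
phosphosite = {"S23": 23, "Y134": 134, "H186": 186}

confirmed = sorted(label for label, pos in phosphosite.items() if pos in predicted)
missed = sorted(label for label, pos in phosphosite.items() if pos not in predicted)

print("predicted and confirmed:", confirmed)
print("confirmed but not predicted:", missed)
```

Note that NetPhos 2.0 scores only serine, threonine and tyrosine residues, so a histidine site such as H186 lies outside its scope. S23 falls within the signal peptide (residues 1 to 31).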
UniProt lists N-linked glycosylation sites at positions 139, 192, 215 and 408 (the last one is only a potential site). All four are predicted by NetNGlyc. In agreement with the NetOGlyc prediction, there are no known O-glycosylated sites.
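N-linked glycosylation occurs at the sequon Asn-X-Ser/Thr, where X is any residue except proline. NetNGlyc goes further and scores each sequon with neural networks, so a plain motif scan is only a crude pre-filter and not a replacement for it. The sketch below shows such a scan. The function name and the toy sequence are our own assumptions; the toy sequence is not AGAL_HUMAN.

```python
import re

# N-X-[ST] with X != P; the lookahead makes overlapping sequons visible.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of the Asn in every N-X-S/T sequon (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical toy sequence for illustration only.
toy = "MANGSAANPTLLNNTS"
print(find_sequons(toy))  # -> [3, 13, 14]
```

N8 in the toy sequence is followed by proline and is therefore skipped. Run on the real AGAL_HUMAN sequence, the reported positions could be compared with the four sites listed above.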
See also sections Transmembrane helices and Signal peptides for further information on these two topics.
<figtable id="tab:ProtFun"> Output rendered by the individual features used by ProtFun 2.2
Feature | Output summary | Details |
---|---|---|
SignalP 3.0 | Most likely cleavage site between pos. 31 and 32: ARA-LD | Using neural networks (NN) and hidden Markov models (HMM) trained on eukaryotes |
ProP 1.0 | 1 propeptide cleavage site predicted at position: 196 | Furin-type cleavage site prediction (Arginine/Lysine residues) |
TargetP 1.1 | No high confidence targeting prediction | - |
NetPhos 2.0 | 22 putative phosphorylation sites | phosphorylation site prediction |
NetOGlyc 3.1 | No O-glycosylated sites predicted | - |
NetNGlyc 1.0 | 4 putative N-glycosylated sites at positions 139 192 215 408 | - |
TMHMM 2.0 | No TM helices predicted | - |
</figtable>
Pfam
The Pfam sequence search revealed one significant Pfam-A match, which is shown in <xr id="fig:pfam_fam"/>.
<figtable id="fig:pfam_fam"> Pfam-A match
Family | Description | Entry type | Clan | Envelope Start | Envelope End | Alignment Start | Alignment End | HMM From | HMM To | Bit score | E-value |
---|---|---|---|---|---|---|---|---|---|---|---|
Melibiase | Melibiase | Family | CL0058 | 33 | 149 | 40 | 146 | 41 | 140 | 50.0 | 1.5e-13 |
</figtable>
<figure id="fig:AGAL">
</figure>
According to Pfam, the alpha-galactosidase A protein belongs to the Melibiase family (see <xr id="tab:Pfam_Melibiose"/>).
The Pfam family PF02065 itself is referred to as "Glycoside hydrolase family 27" and "Glycoside hydrolase family 36"; AGAL_HUMAN falls into the first category. The common characteristic of the family members is that they are glycoside hydrolases (EC 3.2.1.-). The AGAL enzyme catalyzes the hydrolysis of the disaccharide melibiose (D-Gal-α(1→6)-D-Glc) into its two components, galactose and glucose.
<figtable id="tab:Pfam_Melibiose"> Melibiase Identifiers
Identifier | Value |
---|---|
Symbol | Melibiase |
Pfam | PF02065 |
Pfam clan | CL0058 |
InterPro | IPR000111 |
SCOP | 1ktc |
SUPERFAMILY | 1ktc |
CAZy | GH27 |
</figtable>
The alignment of positions 40-146 of the AGAL_HUMAN protein sequence with the matching HMM used in this prediction is shown in <xr id="fig:pfam_ali" />. According to the color code, which indicates the degree of confidence of each aligned position, the Hidden Markov Model and the sequence agree very well overall, except for residues 17-46. Checking our background knowledge and the UniProt database, we could not find anything particularly interesting or abnormal in this region apart from the signal peptide cleavage site and a beta strand at positions 42-46.
Our query protein belongs to a rather large clan, the Glyco_hydro_tim clan (CL0058), which includes 4 CAZy clans (GH-A, GH-D, GH-H and GH-K). The main attribute shared by all the included glycosyl hydrolase enzymes is a TIM barrel fold (eight α-helices and eight parallel β-strands that alternate along the peptide backbone). This fold (residues 31-324) can be seen clearly in <xr id="fig:AGAL"/>, which depicts the 3D structure of α-Galactosidase A.
The InterPro protein sequence analysis and classification assigns the enzyme to the "IPR000111 Glycoside hydrolase, clan GH-D" family. As mentioned before, GH-D is part of the Glyco_hydro_tim clan, so it is not surprising that the description is almost the same. According to InterPro, there are 6 IPR000111 family members in humans.
The Structural Classification of Proteins (SCOP) finds two domains: an "Amylase, catalytic domain" (c.1.8.1, residues 32-323) and an "alpha-Amylases, C-terminal beta-sheet domain" (b.71.1.1, residues 324-421). The catalytic domain is an alpha and beta protein (a/b) with a TIM beta/alpha-barrel fold, which is consistent with its membership in the Glyco_hydro_tim clan. The C-terminal beta-sheet domain is an all-beta protein; its fold and superfamily are both "Glycosyl hydrolase domain". This is also not surprising to us.
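To relate the Pfam match to the SCOP domains, the residue ranges quoted on this page can be intersected directly. The following is a minimal sketch; the `overlap` helper and the dictionary layout are our own choices, and the coordinates are the ones given in the Pfam-A match table and the SCOP paragraph.

```python
# Residue ranges (1-based, inclusive) as quoted on this page.
pfam_alignment = (40, 146)
scop_domains = {
    "catalytic domain (c.1.8.1)": (32, 323),
    "C-terminal beta-sheet domain (b.71.1.1)": (324, 421),
}


def overlap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


aligned_length = pfam_alignment[1] - pfam_alignment[0] + 1
for name, span in scop_domains.items():
    shared = overlap(pfam_alignment, span)
    print(f"{name}: {shared} of {aligned_length} aligned residues")
```

With these coordinates, all 107 aligned residues of the Melibiase match fall inside the catalytic TIM-barrel domain and none in the C-terminal beta-sheet domain.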
AGAL belongs to the CAZy family GH27 (Glycoside Hydrolase Family 27), together with α-N-acetylgalactosaminidase (EC 3.2.1.49; as mentioned before, used for enzyme replacement therapy), isomalto-dextranase (EC 3.2.1.94) and β-L-arabinopyranosidase (EC 3.2.1.88). See http://www.cazypedia.org/index.php/Glycoside_Hydrolase_Family_27
<figure id="fig:pfam_ali">
</figure>