Difference between revisions of "Sequence based mutation analysis of GBA"
Revision as of 18:18, 23 June 2011
TODO: Change images for mutation 8
Subset of SNPs
The ten SNPs shown in the table below and highlighted in Figure 1 were chosen for the analysis in this task. We tried to include SNPs from all over the protein in order to investigate the influence of mutations in different parts of the structure. Mutations of residues that form hydrogen bonds with the active site of glucocerebrosidase are included, as these should result in a malfunctioning or nonfunctional protein [TODO: Insert mutation at position 311 (His)]. It was not easy to find missense mutations listed only in dbSNP, as most of them were also listed in HGMD.
Nr. | SNP ID/Accession Number | Database | Position including SP | Position without SP | Amino Acid Change | Codon Change | Remarks |
---|---|---|---|---|---|---|---|
1 | CM081634 | HGMD | 49 | 10 | Gly - Ser | cGGC-AGC | |
2 | rs74953658, CM050263 | dbSNP, HGMD | 63 | 24 | Asp - Asn | tGAC-AAC | |
3 | rs1141820 | dbSNP | 99 | 60 | His - Arg | CAC - CGC | suspected, status not validated |
4 | CM880035 | HGMD | 159 | 120 | Arg - Gln | CGG-CAG | synonymous mutation at this position listed in dbSNP; forming hydrogen bond with active site |
5 | rs80205046, CM041347 | dbSNP, HGMD | 221 | 182 | Pro - Leu | CCC - CTC | |
6 | rs74731340, CM970620 | dbSNP, HGMD | 310 | 271 | Ser - Asn | AGT - AAT | |
7 | CM880036 | HGMD | 409 | 370 | Asn - Ser | AAC-AGC | most common mutation found in Gaucher disease type 1 patients |
8 | CM993703 | HGMD | 350 | 311 | His - Arg | CAT-CGT | severe form of Gaucher disease type 2; forming hydrogen bond with active site |
9 | rs80020805, CM052245 | dbSNP, HGMD | 455 | 416 | Met - Val | cATG-GTG | |
10 | rs113825752 | dbSNP | 509 | 470 | Leu - Pro | CTT - CCT | |
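The two position columns differ by a constant offset: in every row, the position including the signal peptide (SP) minus the position without it is 39. The following is a minimal sketch of that conversion. The helper name `mature_position` is hypothetical, and the SP length of 39 is inferred from the table rather than stated in the text.

```python
# Length of the signal peptide, inferred from the table above
# (position including SP minus position without SP is 39 in every row).
SIGNAL_PEPTIDE_LENGTH = 39

# Mutation number -> position including SP, copied from the table.
PRECURSOR_POSITIONS = {1: 49, 2: 63, 3: 99, 4: 159, 5: 221,
                       6: 310, 7: 409, 8: 350, 9: 455, 10: 509}


def mature_position(precursor_position, sp_length=SIGNAL_PEPTIDE_LENGTH):
    """Convert precursor numbering to mature-protein numbering."""
    if precursor_position <= sp_length:
        raise ValueError("position lies inside the signal peptide")
    return precursor_position - sp_length


mature_positions = {nr: mature_position(pos)
                    for nr, pos in PRECURSOR_POSITIONS.items()}
print(mature_positions)
```

This makes it easy to switch between the numbering used by the prediction tools (precursor, e.g. N409S) and the numbering used for the mature protein (e.g. position 370 for mutation 7).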
Physicochemical Properties and Changes
Mutation 1
The wild-type amino acid, Glycine, is nonpolar, whereas the mutant amino acid, Serine, is polar. This difference in polarity could indicate that the mutation is damaging. However, the structure shows that this residue (position 10 in the mature protein) lies in a loop on the exterior of the protein, so the substitution should not affect the function of the protein much.
Mutation 2
In this mutation, the acidic amino acid Aspartic acid is replaced by its amide, Asparagine, so the side chain loses its acidic character. This could affect the folding of the protein. However, the structure shows that the residue is not in the interior of the protein, so the mutation should not affect the function much either.
Mutation 3
Histidine is mutated to Arginine. Both are basic amino acids whose side chains can carry a positive charge; Histidine contains a ring, whereas the Arginine side chain is a straight chain. The residue is also situated on the exterior of the protein, so its influence should be small. As the two residues share the same charge, their properties are similar and the mutation may be tolerated.
Mutation 4
In this mutation, the positively charged Arginine is replaced by Glutamine, whose side chain is polar but uncharged, so the positive charge is lost. The residue is situated in the interior of the protein, so the mutation could also affect the function of the protein.
Mutation 5
Proline is the amino acid whose side chain forms a ring with the backbone, so it has a great influence on the folding of the protein. Proline and Leucine both have nonpolar side chains, so their polarity is similar, but Leucine lacks the ring. As the residue is also in the interior of the protein, the mutation could influence the function of the protein.
Mutation 6
Serine, a hydroxyl-containing amino acid, is mutated to Asparagine, which carries an amide group. Both side chains are polar and uncharged, so they share chemical properties, and their structures are also similar. As the residue is positioned on the exterior, the influence of the mutation may not be that strong.
Mutation 7
Asparagine carries an amide group, whereas Serine carries a hydroxyl group. This mutation is the reverse of mutation 6, but here the residue is situated in the interior of the protein as part of a helix, so the mutation could affect the function of the protein.
Mutation 8
Histidine and Arginine are both basic amino acids. Histidine contains an imidazole ring, which can be protonated. The residue is in the interior of the protein and involved in the protein's function, so the mutation may not be tolerated.
Mutation 9
Methionine, a sulfur-containing amino acid, is mutated to Valine. Both are nonpolar, but they differ in size and shape, so the mutation might affect the folding or the function of the protein. The structure shows that the residue is located in a helix.
Mutation 10
Leucine is mutated to Proline, the amino acid whose side chain forms a ring with the backbone. Therefore the structure of the protein might change. The structure shows that the residue is part of a beta sheet, which might be disrupted by the mutation and thereby affect the protein's function.
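The comparisons above can be summarised with a coarse side-chain classification. The sketch below is an illustration only: the four classes are a common textbook grouping, the helper `class_change` is hypothetical, and such a grouping ignores size, shape and the special backbone geometry of Proline, so it cannot replace the structural reasoning given for each mutation.

```python
# Coarse side-chain classes for the residues involved in the ten mutations.
SIDE_CHAIN_CLASS = {
    "Gly": "nonpolar", "Pro": "nonpolar", "Leu": "nonpolar",
    "Met": "nonpolar", "Val": "nonpolar",
    "Ser": "polar uncharged", "Asn": "polar uncharged",
    "Gln": "polar uncharged",
    "Asp": "acidic",
    "His": "basic", "Arg": "basic",
}

# (wild type, mutant) for mutations 1 to 10, in table order.
SUBSTITUTIONS = [
    ("Gly", "Ser"), ("Asp", "Asn"), ("His", "Arg"), ("Arg", "Gln"),
    ("Pro", "Leu"), ("Ser", "Asn"), ("Asn", "Ser"), ("His", "Arg"),
    ("Met", "Val"), ("Leu", "Pro"),
]


def class_change(wild_type, mutant):
    """Return (class before, class after, whether the class changed)."""
    before = SIDE_CHAIN_CLASS[wild_type]
    after = SIDE_CHAIN_CLASS[mutant]
    return before, after, before != after


changed = []
for nr, (wt, mut) in enumerate(SUBSTITUTIONS, start=1):
    before, after, differs = class_change(wt, mut)
    if differs:
        changed.append(nr)
    print(f"{nr:2d} {wt} -> {mut}: {before} -> {after}")

print("Mutations that change the side-chain class:", changed)
```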
Mutation Analysis
Substitution Matrices
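We used the following matrices:
- BLOSUM62: http://www.uky.edu/Classes/BIO/520/BIO520WWW/blosum62.htm
- PAM1: http://www.icp.ucl.ac.be/~opperd/private/pam1.html
- PAM250: http://www.icp.ucl.ac.be/~opperd/private/pam250.html
For each substitution, the table lists the matrix score and, in parentheses, the worst (lowest) score that the matrix contains for the wild-type residue.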
Nr. | Substitution | BLOSUM62 | PAM1 | PAM250 |
---|---|---|---|---|
1 | Gly - Ser | 0 (worst: -4) | 16 (worst: 0) | 9 (worst: 1) |
2 | Asp - Asn | 1 (worst: -4) | 36 (worst: 0) | 7 (worst: 1) |
3 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) |
4 | Arg - Gln | 1 (worst: -3) | 9 (worst: 0) | 5 (worst: 1) |
5 | Pro - Leu | -3 (worst: -4) | 3 (worst: 0) | 5 (worst: 1) |
6 | Ser - Asn | 1 (worst: -3) | 20 (worst: 0) | 5 (worst: 1) |
7 | Asn - Ser | 1 (worst: -4) | 34 (worst: 0) | 8 (worst: 1) |
8 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) |
9 | Met - Val | 1 (worst: -3) | 17 (worst: 0) | 4 (worst: 1) |
10 | Leu - Pro | -3 (worst: -4) | 2 (worst: 0) | 3 (worst: 1) |
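To compare the rows more easily, the cells of the table can be parsed into (score, worst) pairs. The following is a minimal sketch that copies the rows above as text and lists the substitutions with a negative BLOSUM62 score; the names `parse_row` and `negative_blosum62` are hypothetical.

```python
import re

# Rows copied from the table above: Nr. | Substitution | BLOSUM62 | PAM1 | PAM250
TABLE = """\
1 | Gly - Ser | 0 (worst: -4) | 16 (worst: 0) | 9 (worst: 1)
2 | Asp - Asn | 1 (worst: -4) | 36 (worst: 0) | 7 (worst: 1)
3 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1)
4 | Arg - Gln | 1 (worst: -3) | 9 (worst: 0) | 5 (worst: 1)
5 | Pro - Leu | -3 (worst: -4) | 3 (worst: 0) | 5 (worst: 1)
6 | Ser - Asn | 1 (worst: -3) | 20 (worst: 0) | 5 (worst: 1)
7 | Asn - Ser | 1 (worst: -4) | 34 (worst: 0) | 8 (worst: 1)
8 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1)
9 | Met - Val | 1 (worst: -3) | 17 (worst: 0) | 4 (worst: 1)
10 | Leu - Pro | -3 (worst: -4) | 2 (worst: 0) | 3 (worst: 1)
"""

# Matches a cell such as "-3 (worst: -4)".
CELL = re.compile(r"(-?\d+)\s*\(worst:\s*(-?\d+)\)")


def parse_row(line):
    """Split one table row into number, substitution and three (score, worst) pairs."""
    fields = [field.strip() for field in line.split("|")]
    scores = [tuple(int(value) for value in CELL.fullmatch(field).groups())
              for field in fields[2:5]]
    return int(fields[0]), fields[1], scores


rows = [parse_row(line) for line in TABLE.splitlines()]

# Index 0 of the score list is the BLOSUM62 column.
negative_blosum62 = [nr for nr, _, scores in rows if scores[0][0] < 0]
print("Negative BLOSUM62 score:", negative_blosum62)
```

Only the two substitutions involving Proline (mutations 5 and 10) receive a negative BLOSUM62 score; they also have the lowest PAM1 values in the table.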
Multiple Alignment
Usage
- Protein BLAST, restricted to mammals
We found the following sequences:
Accession | Description |
---|---|
NP_000148.2 | glucosylceramidase isoform 1 precursor [Homo sapiens] |
AAA35880.1 | glucocerebrosidase [Homo sapiens] |
BAH13365.1 | unnamed protein product [Homo sapiens] |
EAW53100.1 | glucosidase, beta; acid (includes glucosylceramidase), isoform CRA_a [Homo sapiens] |
NP_001008997.1 | glucosylceramidase precursor [Pan troglodytes] |
XP_003259439.1 | PREDICTED: glucosylceramidase-like isoform 1 [Nomascus leucogenys] |
XP_003259440.1 | PREDICTED: glucosylceramidase-like isoform 2 [Nomascus leucogenys] |
NP_001127488.1 | glucosylceramidase precursor [Pongo abelii] |
NP_001128784.1 | DKFZP469B0323 protein [Pongo abelii] |
AAA35877.1 | lysosomal glucocerebrosidase precursor [Homo sapiens] |
AAA35873.1 | glucocerebrosidase precursor (5' end put.); putative [Homo sapiens] |
BAH13232.1 | unnamed protein product [Homo sapiens] |
XP_002760131.1 | PREDICTED: glucosylceramidase isoform 2 [Callithrix jacchus] |
XP_002760130.1 | PREDICTED: glucosylceramidase isoform 1 [Callithrix jacchus] |
2WKL_A | Chain A, Velaglucerase Alfa |
3KE0_A | Chain A, Crystal Structure Of N370s Glucocerebrosidase At Acidic Ph |
2V3D_A | Chain A, Acid-Beta-Glucosidase With N-Butyl-Deoxynojirimycin |
1Y7V_A | Chain A, X-Ray Structure Of Human Acid-Beta-Glucosidase Covalently Bound To Conduritol B Epoxide |
XP_001498700.1 | PREDICTED: similar to putative lysosomal glucocerebrosidase [Equus caballus] |
XP_855035.1 | PREDICTED: similar to glucocerebrosidase precursor [Canis familiaris] |
XP_002715375.1 | PREDICTED: glucocerebrosidase-like [Oryctolagus cuniculus] |
NP_001005730.1 | glucosylceramidase precursor [Sus scrofa] |
XP_002928339.1 | PREDICTED: glucosylceramidase-like [Ailuropoda melanoleuca] |
NP_001039886.1 | glucosylceramidase precursor [Bos taurus] |
DAA31806.1 | glucosylceramidase precursor [Bos taurus] |
BAH13605.1 | unnamed protein product [Homo sapiens] |
NP_001165283.1 | glucosylceramidase isoform 3 precursor [Homo sapiens] |
BAH12898.1 | unnamed protein product [Homo sapiens] |
EFB14309.1 | hypothetical protein PANDA_018263 [Ailuropoda melanoleuca] |
NP_001121111.1 | glucosidase, beta, acid [Rattus norvegicus] |
EDL15229.1 | glucosidase, beta, acid, isoform CRA_a [Mus musculus] |
NP_032120.1 | glucosylceramidase isoform 1 [Mus musculus] |
XP_002760132.1 | PREDICTED: glucosylceramidase isoform 3 [Callithrix jacchus] |
NP_001165282.1 | glucosylceramidase isoform 2 [Homo sapiens] |
BAH13357.1 | unnamed protein product [Homo sapiens] |
XP_003259441.1, XP_003259442.1 | PREDICTED: glucosylceramidase-like isoform 4 [Nomascus leucogenys] |
XP_001374060.1 | PREDICTED: glucosylceramidase-like [Monodelphis domestica] |
BAH13467.1 | unnamed protein product [Homo sapiens] |
XP_002760133.1 | PREDICTED: glucosylceramidase isoform 4 [Callithrix jacchus] |
BAG58248.1 | unnamed protein product [Homo sapiens] |
BAG60020.1 | unnamed protein product [Homo sapiens] |
CAI95088.1 | glucosidase, beta; acid, pseudogene [Homo sapiens] |
BAA02546.1 | glucocerebrosidase [Homo sapiens] |
BAG52175.1 | unnamed protein product [Homo sapiens] |
BAH13574.1 | unnamed protein product [Homo sapiens] |
XP_002810119.1 | PREDICTED: glucosylceramidase-like [Pongo abelii] |
AAA35876.1 | D-glucosyl-N-acylsphingosine glucohydrolase [Homo sapiens] |
CAE06503.1 | putative lysosomal glucocerebrosidase precursor [Sus scrofa] |
AEB01891.1 | beta-glucosidase [Platanista minor] |
ABM68996.1 | glucocerebrosidase [Sotalia fluviatilis] |
ABX59623.1 | glucocerebrosidase [Lagenorhynchus australis] |
ABX59621.1 | glucocerebrosidase [Lagenodelphis hosei] |
ABX59628.1 | glucocerebrosidase [Peponocephala electra] |
AAF63861.1 | beta-acid glucosidase [Equus caballus] |
ACY77522.1 | glucosidase beta acid [Balaenoptera bonaerensis] |
ACY77443.1 | glucosidase beta acid [Eubalaena australis] |
ACY77479.1 | glucosidase beta acid [Balaenoptera bonaerensis] |
ACY77364.1 | glucosidase beta acid [Megaptera novaeangliae] |
ACY77382.1 | glucosidase beta acid [Megaptera novaeangliae] |
ABX59629.1 | glucocerebrosidase [Peponocephala electra] |
ABX59635.1 | glucocerebrosidase [Phocoenoides dalli] |
AEB01886.1 | beta-glucosidase [Feresa attenuata] |
AEB01881.1 | beta-glucosidase [Lissodelphis borealis] |
AEB01885.1 | beta-glucosidase [Lagenorhynchus acutus] |
AEB01888.1 | beta-glucosidase [Neophocaena phocaenoides] |
AAA35874.1 | glucocerebrosidase precursor [Homo sapiens] |
ABX59637.1 | glucocerebrosidase [Phocoena phocoena] |
ABX59639.1 | glucocerebrosidase [Inia geoffrensis boliviensis] |
AEB01890.1 | beta-glucosidase [Ziphius cavirostris] |
ABX59620.1 | glucocerebrosidase [Tursiops truncatus] |
ABX59634.1 | glucocerebrosidase [Phocoenoides dalli] |
ACY77528.1 | glucosidase beta acid [Cephalorhynchus hectori] |
AEB01889.1 | beta-glucosidase [Monodon monoceros] |
ACY77530.1 | glucosidase beta acid [Globicephala macrorhynchus] |
AAA35875.1 | glucocerebrosidase (alt.) precursor [Homo sapiens] |
ACY77529.1 | glucosidase beta acid [Cephalorhynchus hectori] |
NP_001071509.1 | phospholipase D3 [Bos taurus] |
BAE88528.1 | unnamed protein product [Macaca fascicularis] |
Q4R583.1 | unnamed protein product [Macaca fascicularis] |
XP_001094274.1 | PREDICTED: phospholipase D3 isoform 4 [Macaca mulatta] |
XP_002762177.1 | PREDICTED: phospholipase D3-like [Callithrix jacchus] |
AAB16799.1 | HU-K4 [Homo sapiens] |
AAH00553.2 | PLD3 protein [Homo sapiens] |
XP_001498901.1 | PREDICTED: phospholipase D family, member 3 isoform 1 [Equus caballus] |
XP_866921.1 | PREDICTED: similar to phospholipase D3 isoform 2 isoform 3 [Canis familiaris] |
CAH93251.1 | hypothetical protein [Pongo abelii] |
NP_001026866.1 | phospholipase D3 [Homo sapiens] |
NP_001126871.1 | phospholipase D3 [Pongo abelii] |
XP_866933.1 | PREDICTED: similar to phospholipase D3 (predicted) isoform 4 [Canis familiaris] |
XP_853110.1 | PREDICTED: similar to phospholipase D3 (predicted) isoform 2 [Canis familiaris] |
XP_866956.1 | PREDICTED: similar to phospholipase D3 (predicted) isoform 6 [Canis familiaris] |
NP_001012167.1 | phospholipase D3 [Rattus norvegicus] |
SNAP
Usage
- command line:
snapfun -i gbaseq.fasta -m mutations.txt -o snapfun_out.out
Mutation Nr. | AA change | Prediction | Reliability Index | Expected Accuracy |
---|---|---|---|---|
1 | G49S | Neutral | 3 | 78% |
2 | D63N | Non-neutral | 5 | 87% |
3 | H99R | Neutral | 5 | 89% |
4 | R159Q | Non-neutral | 7 | 96% |
5 | P221L | Non-neutral | 5 | 87% |
6 | S310N | Neutral | 0 | 53% |
7 | N409S | Non-neutral | 1 | 63% |
8 | H350R | Non-neutral | 8 | 96% |
9 | M455V | Non-neutral | 3 | 78% |
10 | L509P | Non-neutral | 1 | 63% |
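The AA change column uses the one-letter amino acid code together with the position including the signal peptide. The sketch below shows how these codes can be derived from the SNP table; the helper `mutation_code` is hypothetical, and whether mutations.txt expects exactly one such code per line is an assumption that the text does not confirm.

```python
# Three-letter to one-letter codes for the residues in the SNP table.
THREE_TO_ONE = {
    "Gly": "G", "Ser": "S", "Asp": "D", "Asn": "N", "His": "H", "Arg": "R",
    "Gln": "Q", "Pro": "P", "Leu": "L", "Met": "M", "Val": "V",
}

# (position including SP, wild type, mutant) for mutations 1 to 10.
SNPS = [
    (49, "Gly", "Ser"), (63, "Asp", "Asn"), (99, "His", "Arg"),
    (159, "Arg", "Gln"), (221, "Pro", "Leu"), (310, "Ser", "Asn"),
    (409, "Asn", "Ser"), (350, "His", "Arg"), (455, "Met", "Val"),
    (509, "Leu", "Pro"),
]


def mutation_code(position, wild_type, mutant):
    """Build a code such as 'G49S' from a position and two residue names."""
    return f"{THREE_TO_ONE[wild_type]}{position}{THREE_TO_ONE[mutant]}"


codes = [mutation_code(*snp) for snp in SNPS]
# Assumed layout: one mutation code per line.
print("\n".join(codes))
```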
SIFT
Usage
- Webserver: http://sift.jcvi.org/www/SIFT_seq_submit2.html
- Input: sequence in FASTA format and the substitutions of interest (example for mutation 1: G49S)
Results
Mutation Nr. | Prediction | Score | Sequence Conservation |
---|---|---|---|
1 | tolerated | 0.51 | 3.05 |
2 | tolerated | 0.06 | 3.05 |
3 | tolerated | 0.62 | 3.04 |
4 | affect protein function | 0.03 | 3.01 |
5 | affect protein function | 0.00 | 3.01 |
6 | tolerated | 0.54 | 3.01 |
7 | affect protein function | 0.05 | 3.02 |
8 | affect protein function | 0.00 | 3.11 |
9 | tolerated | 0.12 | 3.01 |
10 | affect protein function | 0.01 | 3.09 |
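The predictions in the table are consistent with a score cutoff of 0.05, which is the threshold SIFT documents for its predictions. The following sketch reproduces the Prediction column from the Score column. Treating a score of exactly 0.05 as affecting function follows row 7 of the table; since the displayed scores are rounded, this boundary handling is an assumption. The function name `sift_prediction` is hypothetical.

```python
SIFT_CUTOFF = 0.05  # scores at or below this value are classed as affecting function

# Mutation number -> SIFT score, copied from the table above.
SIFT_SCORES = {1: 0.51, 2: 0.06, 3: 0.62, 4: 0.03, 5: 0.00,
               6: 0.54, 7: 0.05, 8: 0.00, 9: 0.12, 10: 0.01}


def sift_prediction(score, cutoff=SIFT_CUTOFF):
    """Map a SIFT score to the wording used in the results table."""
    return "affect protein function" if score <= cutoff else "tolerated"


predictions = {nr: sift_prediction(score) for nr, score in SIFT_SCORES.items()}
affecting = [nr for nr, label in predictions.items() if label != "tolerated"]
print("Predicted to affect protein function:", affecting)
```

Mutation 2 (score 0.06) lies just above the cutoff, so its "tolerated" label is a borderline call.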
Coming soon: Multiple alignment of sequence with its homologs + Conditional Probability Matrix
PolyPhen-2
Usage
- Webserver: http://genetics.bwh.harvard.edu/pph2/
- Input: sequence in FASTA format, position and substitution of the mutation
Results
The results of PolyPhen-2 (HumDiv) are listed in the table below. Two mutations (numbers 3 and 6) are predicted to be benign. Interestingly, the most common mutation found in Gaucher disease patients (number 7) is only classified as possibly damaging.
Mutation Nr. | Prediction | Score | Sensitivity | Specificity |
---|---|---|---|---|
1 | probably damaging | 0.997 | 0.40 | 0.98 |
2 | probably damaging | 1.000 | 0.00 | 1.00 |
3 | benign | 0.000 | 1.00 | 0.00 |
4 | probably damaging | 1.000 | 0.00 | 1.00 |
5 | probably damaging | 1.000 | 0.00 | 1.00 |
6 | benign | 0.100 | 0.94 | 0.85 |
7 | possibly damaging | 0.573 | 0.88 | 0.91 |
8 | probably damaging | 1.000 | 0.00 | 1.00 |
9 | probably damaging | 0.999 | 0.14 | 0.99 |
10 | probably damaging | 0.978 | 0.75 | 0.96 |
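The predictions of SNAP, SIFT and PolyPhen-2 can be compared by counting, for each mutation, how many of the three tools predict an effect. The sketch below does this with the values from the result tables. It is a simple tally, not a validated consensus method: weighting all tools equally and counting "possibly damaging" as a damaging call are both assumptions.

```python
# Mutation numbers for which each tool predicts an effect (from the tables).
DAMAGING_CALLS = {
    "SNAP": {2, 4, 5, 7, 8, 9, 10},           # "Non-neutral"
    "SIFT": {4, 5, 7, 8, 10},                 # "affect protein function"
    "PolyPhen-2": {1, 2, 4, 5, 7, 8, 9, 10},  # "probably" or "possibly damaging"
}

MUTATIONS = range(1, 11)

# Number of tools that predict an effect for each mutation.
votes = {nr: sum(nr in calls for calls in DAMAGING_CALLS.values())
         for nr in MUTATIONS}

all_damaging = [nr for nr in MUTATIONS if votes[nr] == len(DAMAGING_CALLS)]
all_neutral = [nr for nr in MUTATIONS if votes[nr] == 0]
disputed = [nr for nr in MUTATIONS if 0 < votes[nr] < len(DAMAGING_CALLS)]

print("All tools predict an effect:", all_damaging)
print("No tool predicts an effect: ", all_neutral)
print("Tools disagree:             ", disputed)
```

With these inputs, the three tools agree on an effect for mutations 4, 5, 7, 8 and 10, agree on no effect for mutations 3 and 6, and disagree on mutations 1, 2 and 9.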