Difference between revisions of "Sequence based mutation analysis of GBA"
Revision as of 10:40, 22 June 2011
TODO: Change mutation 8 to the mutation at position 311 (His), as it is one of the residues forming hydrogen bonds with the active site. Update the overview, images and results of SIFT and PolyPhen accordingly.
Subset of SNPs
The ten SNPs shown in the table below and highlighted in Figure 1 were chosen for the analysis in this task. We tried to include SNPs from all over the protein in order to investigate the influence of mutations in different regions. Mutated residues that form hydrogen bonds with the active site of glucocerebrosidase are included, as these mutations are expected to result in either a malfunctioning or a nonfunctioning protein [TODO: Insert mutation at position 311 (His)]. It was not easy to find missense mutations listed only in dbSNP, as most of them were also listed in HGMD.
Nr. | SNP ID/Accession Number | Database | Position including SP | Position without SP | Amino Acid Change | Codon Change | Remarks |
---|---|---|---|---|---|---|---|
1 | CM081634 | HGMD | 49 | 10 | Gly - Ser | cGGC - AGC | |
2 | rs74953658, CM050263 | dbSNP, HGMD | 63 | 24 | Asp - Asn | tGAC - AAC | |
3 | rs1141820 | dbSNP | 99 | 60 | His - Arg | CAC - CGC | suspected, status not validated |
4 | CM880035 | HGMD | 159 | 120 | Arg - Gln | CGG - CAG | synonymous mutation at this position listed in dbSNP; forms hydrogen bond with active site |
5 | rs80205046, CM041347 | dbSNP, HGMD | 221 | 182 | Pro - Leu | CCC - CTC | |
6 | rs74731340, CM970620 | dbSNP, HGMD | 310 | 271 | Ser - Asn | AGT - AAT | |
7 | CM880036 | HGMD | 409 | 370 | Asn - Ser | AAC - AGC | most common mutation found in Gaucher disease type 1 patients |
8 | | HGMD | 350 | 311 | His - Arg | CAT - CGT | |
9 | rs80020805, CM052245 | dbSNP, HGMD | 455 | 416 | Met - Val | cATG - GTG | |
10 | rs113825752 | dbSNP | 509 | 470 | Leu - Pro | CTT - CCT | |
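The two position columns differ by a constant offset: in every row, the position including the signal peptide (SP) is 39 residues larger than the position in the mature protein. A minimal Python sketch of this conversion follows; the offset is read off the table, and the function names are chosen only for illustration.

```python
# Offset between precursor and mature numbering, read off the SNP table
# (e.g. 49 - 10 = 39). Function names are illustrative, not from any tool.
SIGNAL_PEPTIDE_LENGTH = 39


def to_mature(pos_with_sp):
    """Convert a position including the SP to mature-protein numbering."""
    if pos_with_sp <= SIGNAL_PEPTIDE_LENGTH:
        raise ValueError("position lies inside the signal peptide")
    return pos_with_sp - SIGNAL_PEPTIDE_LENGTH


def to_precursor(pos_mature):
    """Convert a mature-protein position back to numbering including the SP."""
    if pos_mature < 1:
        raise ValueError("mature positions start at 1")
    return pos_mature + SIGNAL_PEPTIDE_LENGTH


# Positions including SP for mutations 1 to 10, in table order.
positions_with_sp = [49, 63, 99, 159, 221, 310, 409, 350, 455, 509]
mature_positions = [to_mature(p) for p in positions_with_sp]

for with_sp, mature in zip(positions_with_sp, mature_positions):
    print(with_sp, mature)
```

Tools that work on the full precursor sequence (such as the mutation labels G49S, D63N, ... used further down) need the numbering including the SP, whereas structure-based remarks refer to the mature numbering.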
Physicochemical Properties and Changes
Mutation 1
The wildtype amino acid, glycine, is nonpolar, whereas the mutated amino acid, serine, is polar. This difference in polarity could indicate that the mutation is damaging. However, the structure shows that this residue (position 10 in the mature protein) is located in a loop on the exterior of the protein, so the substitution should not affect the function of the protein much.
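The polarity comparison made for mutation 1 can be sketched as a simple lookup. The grouping of side chains into four coarse classes below is a common textbook simplification and an assumption of this sketch (some residues, such as Gly, Cys and His, are classified differently by different authors); the function name is hypothetical.

```python
# Coarse side-chain classes (ASSUMPTION: simplified textbook grouping,
# not defined on this page).
SIDE_CHAIN_CLASS = {
    "G": "nonpolar", "A": "nonpolar", "V": "nonpolar", "L": "nonpolar",
    "I": "nonpolar", "M": "nonpolar", "P": "nonpolar", "F": "nonpolar",
    "W": "nonpolar",
    "S": "polar", "T": "polar", "N": "polar", "Q": "polar",
    "C": "polar", "Y": "polar",
    "D": "acidic", "E": "acidic",
    "K": "basic", "R": "basic", "H": "basic",
}


def class_change(mutation):
    """Return (wildtype class, mutant class) for a label such as 'G49S'."""
    wildtype, mutant = mutation[0], mutation[-1]
    return SIDE_CHAIN_CLASS[wildtype], SIDE_CHAIN_CLASS[mutant]


before, after = class_change("G49S")
print("G49S:", before, "->", after)
```

A change of class is only a first hint; as noted for mutation 1, the location of the residue in the structure also has to be taken into account.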
Mutation 2
...
Mutation 3
Mutation 4
Mutation 5
Mutation 6
Mutation 7
Mutation 8
Mutation 9
Mutation 10
Mutation Analysis
SNAP
Usage
- command line: snapfun -i gbaseq.fasta -m mutations.txt -o snapfun_out.out
Mutation Nr. | Prediction | Reliability Index | Expected Accuracy |
---|---|---|---|
1 (G49S) | Neutral | 2 | 69% |
2 (D63N) | Non-neutral | 5 | 87% |
3 (H99R) | Neutral | 6 | 92% |
4 (R159Q) | Non-neutral | 6 | 93% |
5 (P221L) | Non-neutral | 5 | 87% |
6 (S310N) | Non-neutral | 1 | 63% |
7 (N409S) | Non-neutral | 0 | 58% |
8 (H350R) | Non-neutral | 6 | 93% |
9 (M455V) | Non-neutral | 3 | 78% |
10 (L509P) | Neutral | 2 | 69% |
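The mutation labels used above (G49S, D63N, ...) combine the one-letter code of the wildtype residue, the position including the SP, and the one-letter code of the mutant residue. A minimal Python sketch deriving them from the SNP table is given below. That mutations.txt holds one label per line is an assumption of this sketch and should be checked against the snapfun documentation.

```python
# Three-letter to one-letter codes for the residues occurring in the SNP table.
THREE_TO_ONE = {
    "Gly": "G", "Ser": "S", "Asp": "D", "Asn": "N", "His": "H", "Arg": "R",
    "Gln": "Q", "Pro": "P", "Leu": "L", "Met": "M", "Val": "V",
}

# (position including SP, wildtype, mutant) for mutations 1 to 10.
SNPS = [
    (49, "Gly", "Ser"), (63, "Asp", "Asn"), (99, "His", "Arg"),
    (159, "Arg", "Gln"), (221, "Pro", "Leu"), (310, "Ser", "Asn"),
    (409, "Asn", "Ser"), (350, "His", "Arg"), (455, "Met", "Val"),
    (509, "Leu", "Pro"),
]


def mutation_label(position, wildtype, mutant):
    """Build a label such as 'G49S' from three-letter residue names."""
    return f"{THREE_TO_ONE[wildtype]}{position}{THREE_TO_ONE[mutant]}"


labels = [mutation_label(*snp) for snp in SNPS]

# ASSUMPTION: one mutation label per line in the mutations file.
mutations_file_text = "\n".join(labels) + "\n"
print(mutations_file_text, end="")
```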
SIFT
Usage
- Webserver: http://sift.jcvi.org/www/SIFT_seq_submit2.html
- Input: sequence in FASTA format, substitutions of interest (example for mutation 1: G49S)
Results
Mutation Nr. | Prediction | Score | Sequence Conservation |
1 | tolerated | 0.51 | 3.05 |
2 | tolerated | 0.06 | 3.05 |
3 | tolerated | 0.62 | 3.04 |
4 | affect protein function | 0.03 | 3.01 |
5 | affect protein function | 0.00 | 3.01 |
6 | tolerated | 0.54 | 3.01 |
7 | affect protein function | 0.05 | 3.02 |
8 | |||
9 | tolerated | 0.12 | 3.01 |
10 | affect protein function | 0.01 | 3.09 |
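Every prediction in the table is consistent with a score cutoff of 0.05: scores at or below it are reported as "affect protein function", scores above it as "tolerated". A minimal Python sketch, assuming this cutoff (the function name is hypothetical, and mutation 8 is left out because no result is available yet):

```python
# ASSUMPTION: cutoff of 0.05, which reproduces every row of the table.
SIFT_CUTOFF = 0.05


def sift_call(score):
    """Translate a SIFT score into the prediction wording used in the table."""
    if score <= SIFT_CUTOFF:
        return "affect protein function"
    return "tolerated"


# Scores from the results table; mutation 8 has no result yet.
sift_scores = {1: 0.51, 2: 0.06, 3: 0.62, 4: 0.03, 5: 0.00,
               6: 0.54, 7: 0.05, 9: 0.12, 10: 0.01}

calls = {nr: sift_call(score) for nr, score in sift_scores.items()}
for nr, call in calls.items():
    print(nr, sift_scores[nr], call)
```

Mutations 2 and 7 sit directly on either side of the cutoff (0.06 and 0.05), so their classification should be read with some caution.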
Coming soon: Multiple alignment of sequence with its homologs + Conditional Probability Matrix
PolyPhen-2
Usage
- Webserver: http://genetics.bwh.harvard.edu/pph2/
- Input: sequence in fasta format, position and substitution of mutation
Results
The results of PolyPhen-2 (HumDiv) are listed in the table below. Two mutations (numbers 3 and 6) are predicted to be benign. Interestingly, the most common mutation found in Gaucher disease patients (number 7) is only classified as possibly damaging.
Mutation Nr. | Prediction | Score | Sensitivity | Specificity |
1 | probably damaging | 0.997 | 0.40 | 0.98 |
2 | probably damaging | 1.000 | 0.00 | 1.00 |
3 | benign | 0.000 | 1.00 | 0.00 |
4 | probably damaging | 1.000 | 0.00 | 1.00 |
5 | probably damaging | 1.000 | 0.00 | 1.00 |
6 | benign | 0.100 | 0.94 | 0.85 |
7 | possibly damaging | 0.573 | 0.88 | 0.91 |
8 | probably damaging | 1.0 | 0.00 | 1.00 |
9 | probably damaging | 0.999 | 0.14 | 0.99 |
10 | probably damaging | 0.978 | 0.75 | 0.96 |
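To compare the three predictors, the calls from the SNAP, SIFT and PolyPhen-2 tables can be tallied per mutation. The sketch below is one simple way to do this and not part of any of the tools. It counts "possibly damaging" as a damaging call, which is an assumption, and it skips SIFT for mutation 8 because no result is listed.

```python
PD = "probably damaging"
APF = "affect protein function"

# Predictions copied from the three result tables on this page.
snap = {1: "Neutral", 2: "Non-neutral", 3: "Neutral", 4: "Non-neutral",
        5: "Non-neutral", 6: "Non-neutral", 7: "Non-neutral",
        8: "Non-neutral", 9: "Non-neutral", 10: "Neutral"}
sift = {1: "tolerated", 2: "tolerated", 3: "tolerated", 4: APF, 5: APF,
        6: "tolerated", 7: APF, 9: "tolerated", 10: APF}  # no result for 8
polyphen = {1: PD, 2: PD, 3: "benign", 4: PD, 5: PD, 6: "benign",
            7: "possibly damaging", 8: PD, 9: PD, 10: PD}

# Calls treated as harmless; everything else counts as damaging
# (ASSUMPTION: "possibly damaging" is counted as damaging).
HARMLESS = {"Neutral", "tolerated", "benign"}


def votes(nr):
    """Return (number of damaging calls, number of available calls)."""
    calls = [tool[nr] for tool in (snap, sift, polyphen) if nr in tool]
    damaging = sum(call not in HARMLESS for call in calls)
    return damaging, len(calls)


for nr in range(1, 11):
    damaging, available = votes(nr)
    print(f"mutation {nr}: {damaging} of {available} tools predict an effect")
```

With these tables, only mutation 3 is called harmless by all three tools, while mutations 4, 5 and 7 are called damaging by all three.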