Sequence based mutation analysis of GBA
Revision as of 15:57, 26 June 2011
TODO: PSSM
Contents
- 1 Subset of SNPs
- 2 Mutation Analysis
- 3 Discussion
- 3.1 Mutation 1: Gly - Ser (Pos. 49/10)
- 3.2 Mutation 2: Asp - Asn (Pos. 63/24)
- 3.3 Mutation 3: His - Arg (Pos. 99/60)
- 3.4 Mutation 4: Arg - Gln (Pos. 159/120)
- 3.5 Mutation 5: Pro - Leu (Pos. 221/182)
- 3.6 Mutation 6: Ser - Asn (Pos. 310/271)
- 3.7 Mutation 7: Asn - Ser (Pos. 409/370)
- 3.8 Mutation 8: His - Arg (Pos. 350/311)
- 3.9 Mutation 9: Met - Val (Pos. 455/416)
- 3.10 Mutation 10: Leu - Pro (Pos. 509/470)
Subset of SNPs
The ten SNPs shown in the table below and highlighted in Figure 1 were chosen for the analysis in this task. We tried to include SNPs from all over the protein in order to investigate the influence of mutations in different parts of the protein. The mutated residues that form hydrogen bonds with the active site of glucocerebrosidase are included, as these mutations should result in either a malfunctioning or a nonfunctioning protein. It was not easy to find missense mutations listed only in dbSNP, as most of them are listed in HGMD, too.
Nr. | SNP ID/Accession Number | Database | Position including signal peptide (SP) | Position without SP | Amino Acid Change | Codon Change | Remarks |
---|---|---|---|---|---|---|---|
1 | CM081634 | HGMD | 49 | 10 | Gly - Ser | cGGC-AGC | |
2 | rs74953658, CM050263 | dbSNP, HGMD | 63 | 24 | Asp - Asn | tGAC-AAC | |
3 | rs1141820 | dbSNP | 99 | 60 | His - Arg | CAC - CGC | suspected, status not validated |
4 | CM880035 | HGMD | 159 | 120 | Arg - Gln | CGG-CAG | synonymous mutation at this position listed in dbSNP; forming hydrogen bond with active site |
5 | rs80205046, CM041347 | dbSNP, HGMD | 221 | 182 | Pro - Leu | CCC - CTC | |
6 | rs74731340, CM970620 | dbSNP, HGMD | 310 | 271 | Ser - Asn | AGT - AAT | |
7 | CM880036 | HGMD | 409 | 370 | Asn - Ser | AAC-AGC | most common mutation found in Gaucher disease type 1 patients |
8 | CM993703 | HGMD | 350 | 311 | His - Arg | CAT-CGT | severe form of Gaucher disease 2; forming hydrogen bond with active site |
9 | rs80020805, CM052245 | dbSNP, HGMD | 455 | 416 | Met - Val | cATG-GTG | |
10 | rs113825752 | dbSNP | 509 | 470 | Leu - Pro | CTT - CCT | |
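The two position columns in the table differ by a constant offset, which corresponds to the length of the signal peptide. The following minimal sketch checks this using only the positions from the table; the function names are hypothetical and chosen for illustration.

```python
# Positions copied from the SNP table: (position including SP, position without SP).
POSITIONS = {
    1: (49, 10), 2: (63, 24), 3: (99, 60), 4: (159, 120), 5: (221, 182),
    6: (310, 271), 7: (409, 370), 8: (350, 311), 9: (455, 416), 10: (509, 470),
}


def signal_peptide_length(positions):
    """Return the common offset between both numberings; raise if they disagree."""
    offsets = {with_sp - without_sp for with_sp, without_sp in positions.values()}
    if len(offsets) != 1:
        raise ValueError(f"inconsistent offsets: {sorted(offsets)}")
    return offsets.pop()


def to_mature_position(position_with_sp, sp_length):
    """Convert a precursor position to the numbering of the mature protein."""
    return position_with_sp - sp_length


SP_LENGTH = signal_peptide_length(POSITIONS)
```

With this offset, a position such as 409 in the precursor numbering maps to 370 in the mature protein, as listed for mutation 7.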
Mutation Analysis
In the following section, the different mutations/SNPs will be analyzed to determine whether they are neutral or whether they will affect the function of glucocerebrosidase. To do so, several facts were taken into account: the physicochemical properties and changes of the amino acids, the substitution values of the specific amino acids in various substitution matrices, the conservation of the specific positions in a multiple sequence alignment of homologous structures, and the predictions of different prediction tools (SIFT, SNAP and PolyPhen-2).
Physicochemical Properties and Changes
The physicochemical properties and changes of the wildtype and mutated amino acids are listed below, and the superpositions, which show the structural differences, are shown in Figure 2 to the right. Substitutions of amino acids that are structurally and chemically different are more likely to affect the function of a protein than substitutions of very similar amino acids.
Mutation 1
The wildtype amino acid, Glycine, is nonpolar, whereas the mutated amino acid, Serine, is polar. This difference in polarity could be an indication that the mutation is damaging. Looking at the structure, one can see that this residue (located at pos. 10 in the mature protein) is situated in a beta strand at the exterior of the protein. Therefore a substitution should not affect the function of the protein that much.
Mutation 2
In this mutation the Aspartic acid, an acidic amino acid, is replaced by its derivative Asparagine. Therefore the residue loses its acidic character, which could have an effect on the folding of the protein. Looking at the structure, one can see that the residue is not located in the interior of the protein, so the mutation should also not affect the function.
Mutation 3
Histidine is mutated to Arginine, both of which are basic amino acids with a positively charged functional group. Histidine forms a ring structure and is aromatic, whereas Arginine forms a straight chain. The residue is also situated at the exterior of the protein, so its influence is not so strong. As both have the same charge, the properties are almost the same and the mutation may be tolerated. Only the different structure of Arginine may influence the function.
Mutation 4
In this mutation the positively charged Arginine is replaced by the polar but uncharged amino acid Glutamine, so the positive charge of the side chain is lost. The residue is situated in the interior of the protein, so the mutation could also affect the function of the protein.
Mutation 5
Proline is the amino acid whose side chain forms a ring with the backbone. Therefore it has a great influence on the folding of the protein. Both Proline and Leucine are nonpolar and neutral, so polarity and charge stay the same. As the residue also lies in the interior of the protein, the mutation could influence the function of the protein.
Mutation 6
Serine is a hydroxylic amino acid which is mutated to Asparagine, an amino acid carrying an amide group. Both are aliphatic, polar and neutral and therefore share chemical properties. The structure is also very similar. As the residue is positioned at the exterior, the influence of the mutation may not be that strong.
Mutation 7
Asparagine carries an amide group whereas Serine is hydroxylic. This mutation is exactly the reverse of mutation six. However, this residue is situated in the interior of the protein as part of a helix, so the mutation could affect the function of the protein.
Mutation 8
Histidine and Arginine are both basic amino acids. Histidine contains an imidazole ring, which can be protonated. The residue is in the interior of the protein and involved in the protein's function, so the mutation may not be tolerated.
Mutation 9
Methionine is a sulfur-containing amino acid which is mutated to Valine. Their chemical properties differ, so the mutation might affect the folding or the function of the protein. Looking at the structure, one can see that the residue is located in a helix.
Mutation 10
Leucine is mutated to Proline, the amino acid whose side chain forms a ring with the backbone. Therefore the structure of the protein might change. Looking at the structure, one can see that the residue is part of a beta sheet, which might be disrupted by the mutation, and this could affect the protein's function.
Substitution Matrices
To analyze whether the chosen amino acid substitutions are common or not, their scores in different amino acid substitution matrices were looked up. The scores reflect how often an amino acid was substituted with another in alignments of related sequences. A high score indicates that the two amino acids have often been substituted for each other and that the substituted amino acid is compatible with protein structure and function.
Two different families of matrices are taken into account in this analysis: the Dayhoff amino acid substitution matrices (Percent Accepted Mutation or PAM matrices), which are based on the differences observed in closely related proteins and an evolutionary model of protein change, and the Blocks amino acid substitution matrices (BLOSUM matrices), which are derived from conserved, ungapped blocks in alignments of related protein sequences.
The following matrices have been used:
Nr. | Substitution | BLOSUM62 | PAM1 | PAM250 |
---|---|---|---|---|
1 | Gly - Ser | 0 (worst: -4) | 16 (worst: 0) | 9 (worst: 1) |
2 | Asp - Asn | 1 (worst: -4) | 36 (worst: 0) | 7 (worst: 1) |
3 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) |
4 | Arg - Gln | 1 (worst: -3) | 9 (worst: 0) | 5 (worst: 1) |
5 | Pro - Leu | -3 (worst: -4) | 3 (worst: 0) | 5 (worst: 1) |
6 | Ser - Asn | 1 (worst: -3) | 20 (worst: 0) | 5 (worst: 1) |
7 | Asn - Ser | 1 (worst: -4) | 34 (worst: 0) | 8 (worst: 1) |
8 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) |
9 | Met - Val | 1 (worst: -3) | 17 (worst: 0) | 4 (worst: 1) |
10 | Leu - Pro | -3 (worst: -4) | 2 (worst: 0) | 3 (worst: 1) |
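As a small illustration of how the table above can be screened for rare substitutions, the sketch below stores the listed scores in a dictionary and reports the substitutions whose BLOSUM62 score is negative. The values are copied from the table; the cutoff of 0 and the function name are assumptions made for this example, not part of the original analysis.

```python
# (BLOSUM62, PAM1, PAM250) scores copied from the table above.
SCORES = {
    ("Gly", "Ser"): (0, 16, 9),
    ("Asp", "Asn"): (1, 36, 7),
    ("His", "Arg"): (0, 10, 6),
    ("Arg", "Gln"): (1, 9, 5),
    ("Pro", "Leu"): (-3, 3, 5),
    ("Ser", "Asn"): (1, 20, 5),
    ("Asn", "Ser"): (1, 34, 8),
    ("Met", "Val"): (1, 17, 4),
    ("Leu", "Pro"): (-3, 2, 3),
}


def rare_substitutions(scores, blosum_cutoff=0):
    """Return substitutions whose BLOSUM62 score lies below the (assumed) cutoff."""
    return sorted(
        pair for pair, (blosum62, _pam1, _pam250) in scores.items()
        if blosum62 < blosum_cutoff
    )


RARE = rare_substitutions(SCORES)
```

Under this assumed cutoff only the two exchanges involving Proline (mutations 5 and 10) are flagged, which matches their low PAM values in the table.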
Multiple Alignment
The conservation in a multiple sequence alignment of homologous structures helps to decide whether a mutation at a certain position alters the function or structure of a protein: a highly conserved position indicates that this position is crucial for either the function or the structure of the protein, and one can assume that a mutation at this position is damaging. In contrast, a mutation at a very variable position in the alignment is more likely to be neutral.
To retrieve homologous sequences, protein BLAST was used with the option mammals. For reasons of clarity, the resulting sequences are listed in File:Protein blast results for GBA.pdf. A multiple sequence alignment of these sequences is shown in Figure 3. The table below shows the conservation of the positions of interest, calculated once for all homologous sequences and once for the 25 best sequences.
Mutation Nr. | Position | Amino acid change | Conservation all (Jalview) | Conservation best 25 (Jalview) |
---|---|---|---|---|
1 | 49 | Gly - Ser | 0.0 | 11.0 |
2 | 63 | Asp - Asn | 4.0 | 11.0 |
3 | 99 | His - Arg | 3.0 | 8.0 |
4 | 159 | Arg - Gln | 0.0 | 11.0 |
5 | 221 | Pro - Leu | 0.0 | 11.0 |
6 | 310 | Ser - Asn | 0.0 | 11.0 |
7 | 409 | Asn - Ser | 0.0 | 11.0 |
8 | 350 | His - Arg | 0.0 | 9.0 |
9 | 455 | Met - Val | 0.0 | 11.0 |
10 | 509 | Leu - Pro | 0.0 | 9.0 |
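To illustrate the idea of per-column conservation, here is a minimal sketch that computes the fraction of sequences sharing the most frequent symbol in one alignment column. This is a simple identity measure and not the conservation score that Jalview reports in the table above; the toy alignment and the function name are made up for illustration.

```python
from collections import Counter


def column_identity(alignment, column, ignore_gaps=False):
    """Fraction of sequences that share the most frequent symbol in a column.

    Gaps ("-") count as their own symbol unless ignore_gaps is True.
    """
    residues = [sequence[column] for sequence in alignment]
    if ignore_gaps:
        residues = [residue for residue in residues if residue != "-"]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(residues)


# Hypothetical toy alignment (not taken from the GBA alignment).
TOY_ALIGNMENT = ["MGHA", "MGRA", "M-HA", "MGHA"]
```

In the toy alignment, column 1 contains a gap: the identity is 0.75 when gaps are counted and 1.0 when they are ignored. This shows in a simplified way how gaps in a large alignment can lower the conservation of a position that is otherwise invariant.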
Secondary Structure
To investigate whether the mutations influence the secondary structure of the resulting protein, secondary structure predictions with JPred3 and PSIPRED have been carried out as described in Task 3. The comparison between the original (wildtype) sequence and the mutated sequence is shown in File:Secondary structure prediction of mutated GBA sequence.pdf. The mutations show no direct influence on the secondary structure elements: the elements are not interrupted or destroyed by the mutations chosen in this analysis. There are only some minor differences in the lengths of the elements between the predicted structures of the original and mutated sequences. As these differences are not only located next to the mutations but all over the protein, they may be due to the fact that these are only predictions, which may not be that accurate.
The table below shows the secondary structure assignments and predictions for the relevant positions chosen in this analysis.
Mutation Nr. | Position | Wildtype | Wildtype: Uniprot | Wildtype: PSIPRED | Wildtype: JPred3 | Wildtype: DSSP | Mutation | Mutation: PSIPRED | Mutation: JPred3 |
---|---|---|---|---|---|---|---|---|---|
1 | 49 | Gly | Beta sheet | Coil | - | Turn | Ser | Coil | - |
2 | 63 | Asp | - | Coil | - | - | Asn | Coil | - |
3 | 99 | His | - | Coil | - | - | Arg | Coil | - |
4 | 159 | Arg | Beta sheet | Beta sheet | Beta sheet | Bend | Gln | Beta sheet | Beta sheet |
5 | 221 | Pro | - | Coil | - | - | Leu | Coil | - |
6 | 310 | Ser | - | Coil | - | Turn | Asn | Coil | - |
7 | 409 | Asn | Helix | Helix | Helix | Helix | Ser | Helix | Helix |
8 | 350 | His | Beta sheet | Beta sheet | - | Bend | Arg | Beta sheet | - |
9 | 455 | Met | Helix | - | Helix ([? differs between Task 3 and here: -]) | Helix | Val | Helix | - |
10 | 509 | Leu | Beta sheet | Beta sheet | Beta sheet | Bend | Pro | Beta sheet | Beta sheet |
SNAP
SNAP (screening for non-acceptable polymorphisms) is a neural-network-based method that predicts the functional effects of non-synonymous SNPs. The method only needs sequence information as input, but functional and structural annotations can be included if available. SNAP was established by Rost B. and Bromberg Y. in 2007. <ref>Bromberg Y., Rost B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 2007 June; 35(11): 3823–3835.</ref>
Usage
- command line:
snapfun -i gbaseq.fasta -m mutations.txt -o snapfun_out.out
Results
The predictions made by SNAP are shown in the table below. SNAP identifies the majority of mutations as non-neutral; only three mutations are predicted to be neutral. In addition, the tool provides a reliability index (RI) that reflects the confidence of each prediction and ranges from 0 (low reliability) to 9 (high reliability). Only half of the predictions have an RI of 5 or higher, which indicates that the other half is not very reliable.
Mutation Nr. | AA change | Prediction | Reliability Index | Expected Accuracy |
---|---|---|---|---|
1 | G49S | Neutral | 3 | 78% |
2 | D63N | Non-neutral | 5 | 87% |
3 | H99R | Neutral | 5 | 89% |
4 | R159Q | Non-neutral | 7 | 96% |
5 | P221L | Non-neutral | 5 | 87% |
6 | S310N | Neutral | 0 | 53% |
7 | N409S | Non-neutral | 1 | 63% |
8 | H350R | Non-neutral | 8 | 96% |
9 | M455V | Non-neutral | 3 | 78% |
10 | L509P | Non-neutral | 1 | 63% |
SIFT
SIFT (Sorting Intolerant From Tolerant) is a method which predicts whether an amino acid substitution affects protein function. The method is based on the assumption that important amino acids are conserved in the protein family, and therefore changes at conserved positions tend to be deleterious. Substitutions with a score of less than 0.05 are predicted to be deleterious. SIFT was introduced in 2003 by Ng P. and Henikoff S. <ref>Ng P., Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003 July 1; 31(13): 3812–3814. </ref>
Usage
- Webserver: http://sift.jcvi.org/www/SIFT_seq_submit2.html
- Input: sequence in FASTA format, substitutions of interest (example for mutation 1: G49S)
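The substitutions are entered in the short notation wildtype residue, position, mutated residue (for example G49S). As a minimal sketch, the three-letter changes from the SNP table can be converted into this notation as follows; the helper name is hypothetical, and the positions used are those including the signal peptide.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_short_notation(change, position):
    """Turn e.g. ("Gly - Ser", 49) into "G49S"."""
    wildtype, mutant = (part.strip() for part in change.split("-"))
    return f"{THREE_TO_ONE[wildtype]}{position}{THREE_TO_ONE[mutant]}"


# Amino acid changes and positions (including SP) from the SNP table.
MUTATIONS = [
    ("Gly - Ser", 49), ("Asp - Asn", 63), ("His - Arg", 99), ("Arg - Gln", 159),
    ("Pro - Leu", 221), ("Ser - Asn", 310), ("Asn - Ser", 409), ("His - Arg", 350),
    ("Met - Val", 455), ("Leu - Pro", 509),
]

SHORT_NOTATION = [to_short_notation(change, position) for change, position in MUTATIONS]
```

The resulting list starts with G49S and D63N and can be pasted line by line into the substitution field.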
Results
The predictions made by SIFT are based on the multiple alignment which is shown in Figure 4 to the right. The predictions for the chosen mutations are shown in the table below. Half of the mutations are predicted to affect protein function and the other half are predicted to be tolerated.
Mutation Nr. | Prediction | Score | Sequence Conservation |
1 | tolerated | 0.51 | 3.05 |
2 | tolerated | 0.06 | 3.05 |
3 | tolerated | 0.62 | 3.04 |
4 | affect protein function | 0.03 | 3.01 |
5 | affect protein function | 0.00 | 3.01 |
6 | tolerated | 0.54 | 3.01 |
7 | affect protein function | 0.05 | 3.02 |
8 | affect protein function | 0.00 | 3.11 |
9 | tolerated | 0.12 | 3.01 |
10 | affect protein function | 0.01 | 3.09 |
PolyPhen-2
PolyPhen-2 (Polymorphism Phenotyping v2) predicts the possible structural and functional influence of an amino acid substitution in a human protein. The method uses three structure-based and eight sequence-based predictive features. Two datasets were used to train and test PolyPhen-2: HumDiv and HumVar. The method was introduced by Adzhubei et al. in 2010. <ref>Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods 7(4):248-249 (2010).</ref>
Usage
- Webserver: http://genetics.bwh.harvard.edu/pph2/
- Input: sequence in fasta format, position and substitution of mutation
Results
The results of PolyPhen-2 are listed in the table below. The predictions for mutation 7 differ depending on the training set: with HumDiv it is predicted as possibly damaging, whereas the prediction based on HumVar indicates that the mutation is benign. For the other mutations the predictions are the same, varying only slightly in score, sensitivity and specificity.
Mutation Nr. | HumDiv Prediction | HumDiv Score | HumDiv Sensitivity | HumDiv Specificity | HumVar Prediction | HumVar Score | HumVar Sensitivity | HumVar Specificity |
---|---|---|---|---|---|---|---|---|
1 | probably damaging | 0.997 | 0.40 | 0.98 | probably damaging | 0.992 | 0.44 | 0.97 |
2 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 0.999 | 0.08 | 1.00 |
3 | benign | 0.000 | 1.00 | 0.00 | benign | 0.000 | 1.00 | 0.00 |
4 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 1.000 | 0.00 | 1.00 |
5 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 1.000 | 0.00 | 1.00 |
6 | benign | 0.100 | 0.94 | 0.85 | benign | 0.120 | 0.91 | 0.67 |
7 | possibly damaging | 0.573 | 0.88 | 0.91 | benign | 0.131 | 0.90 | 0.68 |
8 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 0.999 | 0.08 | 1.00 |
9 | probably damaging | 0.999 | 0.14 | 0.99 | probably damaging | 0.980 | 0.54 | 0.95 |
10 | probably damaging | 0.978 | 0.75 | 0.96 | probably damaging | 0.966 | 0.59 | 0.94 |
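The comparison between the two training sets can also be checked programmatically. The sketch below uses the predictions and scores from the table above and lists the mutations for which HumDiv and HumVar disagree; the data layout and the function name are assumptions made for this example.

```python
# Mutation Nr. -> (HumDiv prediction, HumDiv score, HumVar prediction, HumVar score),
# copied from the PolyPhen-2 results table.
POLYPHEN2 = {
    1: ("probably damaging", 0.997, "probably damaging", 0.992),
    2: ("probably damaging", 1.000, "probably damaging", 0.999),
    3: ("benign", 0.000, "benign", 0.000),
    4: ("probably damaging", 1.000, "probably damaging", 1.000),
    5: ("probably damaging", 1.000, "probably damaging", 1.000),
    6: ("benign", 0.100, "benign", 0.120),
    7: ("possibly damaging", 0.573, "benign", 0.131),
    8: ("probably damaging", 1.000, "probably damaging", 0.999),
    9: ("probably damaging", 0.999, "probably damaging", 0.980),
    10: ("probably damaging", 0.978, "probably damaging", 0.966),
}


def discordant_mutations(results):
    """Return the mutation numbers whose HumDiv and HumVar predictions differ."""
    return [
        number
        for number, (humdiv, _div_score, humvar, _var_score) in sorted(results.items())
        if humdiv != humvar
    ]


DISCORDANT = discordant_mutations(POLYPHEN2)
```

For the data in the table, only mutation 7 is returned.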
Discussion
Mutation 1: Gly - Ser (Pos. 49/10)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
0 (worst: -4) | 16 (worst: 0) | 9 (worst: 1) | 0.0/11.0 | Beta sheet | Coil | - | Turn |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Neutral; Reliability Index: 3; Expected Accuracy: 78% | Prediction: tolerated; Score: 0.51; Sequence Conservation: 3.05 | Prediction: probably damaging; Score: 0.997; Sensitivity: 0.40; Specificity: 0.98 |
Mutation number 1 is the change from the aliphatic, nonpolar and neutral amino acid Glycine to the aliphatic, polar and neutral amino acid Serine. BLOSUM62, PAM1 and PAM250 indicate that this substitution is common, as its values are high. The PAM matrices in particular have high values, with scores of 16 and 9.
Nevertheless, the conservation is very high. The conservation of 0 over all retrieved proteins is caused by gaps in the alignment at this position. If only the 25 best homologues are aligned, the position is completely conserved and no other protein shows an exchange to Serine.
In Uniprot this position is annotated as part of a beta sheet, whereas the different tools predict it as part of a coil or turn. As it lies at the exterior of the protein, as can be seen in the visualization in Figure 2, the mutation should not affect the function of the protein that much.
SNAP classifies the mutation as neutral with an expected accuracy of 78% and SIFT as tolerated with a score of 0.51, whereas Polyphen2 predicts it as probably damaging with a score of 0.997. As the mutation is listed in HGMD, it is associated with Gaucher Disease and should therefore be damaging and affect the protein's function.
SNAP and SIFT may not classify it as damaging because the change from Glycine to Serine is common and the position of the mutation does not suggest a large influence.
Mutation 2: Asp - Asn (Pos. 63/24)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
1 (worst: -4) | 36 (worst: 0) | 7 (worst: 1) | 4.0/11.0 | - | Coil | - | - |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Non-neutral; Reliability Index: 5; Expected Accuracy: 87% | Prediction: tolerated; Score: 0.06; Sequence Conservation: 3.05 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 |
Mutation number 2 is the exchange of the aliphatic, polar and acidic Aspartic acid for the aliphatic, polar and neutral amino acid Asparagine. The residue therefore loses its acidic character, which may change the structure of the protein.
BLOSUM62, PAM1 and PAM250 show relatively high values for this substitution, so it is very common. However, the alignment shows that the conservation is also very high: no protein with a substitution to Asparagine was found.
Looking at the structure, one can see that the residue is part of a coil at the exterior of the protein. This is also the result of our secondary structure predictions. So the mutation may not influence the function of the protein that much, because it is not in the functional part of the protein.
SNAP predicts the mutation as non-neutral with an expected accuracy of 87%. Polyphen2 also classifies it as probably damaging, with a score of 1.0. Only SIFT predicts it as tolerated, but with a score of 0.06 that lies just above the threshold of 0.05. So the mutation may be damaging.
As the mutation is listed in HGMD, it affects the function of the protein. It is associated with Gaucher disease 1 and was published in 2005. <ref>Identification and functional characterization of five novel mutant alleles in 58 Italian patients with Gaucher disease type 1. Miocić S, Filocamo M, Dominissini S, Montalvo AL, Vlahovicek K, Deganuto M, Mazzotti R, Cariati R, Bembi B, Pittis MG. Hum Mutat. 2005</ref>
Mutation 3: His - Arg (Pos. 99/60)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) | 3.0/8.0 | - | Coil | - | - |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Neutral; Reliability Index: 5; Expected Accuracy: 89% | Prediction: tolerated; Score: 0.62; Sequence Conservation: 3.04 | Prediction: benign; Score: 0.000; Sensitivity: 1.00; Specificity: 0.00 |
Mutation number 3 is the change from the aromatic, polar and slightly basic Histidine to the aliphatic, polar and very basic Arginine. As both are basic and have the same charge, the exchange may not influence the function of the protein that much.
The values in the different substitution matrices are again relatively high, so the change from Histidine to Arginine is common. The interesting part here is the alignment: Histidine is not highly conserved, and many sequences have an Arginine at that position. So proteins of other mammals were mutated in this way during evolution, and the exchange does not seem to influence the function of the protein.
Histidine is part of a coil in the secondary structure and is situated at the exterior of the protein. So it is not in the functional part of the protein and the influence may be low.
SNAP, SIFT and Polyphen2 predict the mutation as neutral, tolerated and benign, respectively. Although the scores are not that high, this is exactly what we expected: on the one hand the mutation is not listed in HGMD and therefore not associated with Gaucher Disease, and on the other hand the conservation and the alignment show that the mutation should be tolerated.
Mutation 4: Arg - Gln (Pos. 159/120)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
1 (worst: -3) | 9 (worst: 0) | 5 (worst: 1) | 0.0/11.0 | Beta sheet | Beta sheet | Beta sheet | Bend |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Non-neutral; Reliability Index: 7; Expected Accuracy: 96% | Prediction: affect protein function; Score: 0.03; Sequence Conservation: 3.01 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 |
Mutation number 4 is the change from the aliphatic, polar and very basic Arginine to the aliphatic, polar and neutral Glutamine. According to the substitution matrices the exchange is not rare: the values are relatively high, so it is common.
The alignment shows a very high conservation for the best 25 hits. All but two also have an Arginine at that position; the other two show a substitution to Tryptophan, but none has Glutamine. The low conservation over all sequences is again caused by gaps in the alignment.
This Arginine is part of a beta sheet and plays an important role in forming hydrogen bonds with the active site, so a mutation should affect the function of the protein. As the amino acid change is from basic to neutral, the residue loses chemical properties that are necessary for the protein's function.
SNAP, SIFT and Polyphen2 all predict the mutation as damaging. The expected accuracy of SNAP is very high at 96%, and the prediction of Polyphen2 has a score of 1.00. In HGMD the mutation is associated with Gaucher Disease 1, which was already published in 1988. <ref>Gaucher disease type 1: cloning and characterization of a cDNA encoding acid beta-glucosidase from an Ashkenazi Jewish patient. Graves PN, Grabowski GA, Eisner R, Palese P, Smith FI. DNA. 1988 Oct;7(8):521-8.</ref>
Mutation 5: Pro - Leu (Pos. 221/182)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
-3 (worst: -4) | 3 (worst: 0) | 5 (worst: 1) | 0.0/11.0 | - | Coil | - | - |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Non-neutral; Reliability Index: 5; Expected Accuracy: 87% | Prediction: affect protein function; Score: 0.00; Sequence Conservation: 3.01 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 |
Mutation number 5 is the change from the heterocyclic, nonpolar and neutral Proline to the aliphatic, nonpolar and neutral Leucine. As Proline forms a ring structure, the mutation may change the structure of the protein. Apart from that, the polarity and thus the chemical properties remain the same.
The substitution is not as common as the mutations mentioned before. BLOSUM62 has a very low value of -3, and the PAM matrices, with values of 3 and 5, also show that the substitution is rare. So the mutation may not be tolerated. The alignment also shows a high conservation: the retrieved sequences share the Proline at this position.
The Proline is situated in a coil in the interior of the protein. As the interior is the functional part of the protein, the mutation may not be tolerated, although the residue is not part of the active site.
SNAP, SIFT and Polyphen2 predict the mutation as non-neutral, "affect protein function" and probably damaging, respectively, which is what we expected. In HGMD it is also associated with Gaucher Disease 2. The mutation was published in 2004. <ref>Functional analysis of 13 GBA mutant alleles identified in Gaucher disease patients: Pathogenic changes and "modifier" polymorphisms. Montfort M, Chabás A, Vilageliu L, Grinberg D. Hum Mutat. 2004 Jun;23(6):567-75.</ref>
Mutation 6: Ser - Asn (Pos. 310/271)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
1 (worst: -3) | 20 (worst: 0) | 5 (worst: 1) | 0.0/11.0 | - | Coil | - | Turn |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Neutral; Reliability Index: 0; Expected Accuracy: 53% | Prediction: tolerated; Score: 0.54; Sequence Conservation: 3.01 | Prediction: benign; Score: 0.100; Sensitivity: 0.94; Specificity: 0.85 |
Mutation number 6 is the change from the aliphatic, polar and neutral Serine to the aliphatic, polar and neutral Asparagine. The two amino acids share chemical properties. The substitution matrices show relatively high values, so the substitution is common.
The conservation is very high. The best 25 hits all have Serine at this position. Among the remaining hits there are also some sequences with a Glycine at this position, but there is no change to Asparagine.
This position is part of a coil at the exterior of the protein and not near the active site. Therefore the mutation might not influence the function of the protein that much.
SNAP predicts the mutation as neutral with an expected accuracy of 53%. SIFT also predicts the mutation as tolerated with a score of 0.54, and Polyphen2 predicts it as benign. However, the mutation is associated with Gaucher Disease, as published in 1997.<ref>Identification and expression of acid beta-glucosidase mutations causing severe type 1 and neurologic type 2 Gaucher disease in non-Jewish patients. Grace ME, Desnick RJ, Pastores GM. J Clin Invest. 1997 May 15;99(10):2530-7</ref> So the predictions are wrong: the mutation is not tolerated and changes the function of the protein. Although the confidence of the predictions is not high, none of the tools predicted the mutation as damaging.
Mutation 7: Asn - Ser (Pos. 409/370)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
1 (worst: -4) | 34 (worst: 0) | 8 (worst: 1) | 0.0/11.0 | Helix | Helix | Helix | Helix |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Non-neutral; Reliability Index: 1; Expected Accuracy: 63% | Prediction: affect protein function; Score: 0.05; Sequence Conservation: 3.02 | Prediction: possibly damaging; Score: 0.573; Sensitivity: 0.88; Specificity: 0.91 |
Mutation number 7 is the substitution of the aliphatic, polar and neutral Asparagine by the likewise aliphatic, polar and neutral Serine. This is the most common mutation found in Gaucher disease type 1 patients.
According to the substitution matrices the mutation is very common. PAM1 has a very high value of 34, as does PAM250 with 8. Only BLOSUM62 does not show that tendency, because it only has a value of 1.
The conservation in the alignment is very high. All sequences show an Asparagine at that position, so no mutation to Serine is observed that would indicate that it is accepted.
The mutation is situated in a helix in the interior of the protein. So it may affect the protein's function and therefore may not be tolerated.
SNAP predicts the mutation as non-neutral, SIFT as affecting the protein function and Polyphen2 as possibly damaging. It is interesting that Polyphen2 predicts it as only possibly damaging and not probably damaging. We know that the mutation is associated with Gaucher Disease 1, which was first published in 1988<ref>Genetic heterogeneity in type 1 Gaucher disease: multiple genotypes in Ashkenazic and non-Ashkenazic individuals. Tsuji S, Martin BM, Barranger JA, Stubblefield BK, LaMarca ME, Ginns EI. Proc Natl Acad Sci U S A. 1988 Apr;85(7):2349-52.</ref> and is supported by many other sources given in HGMD.
Mutation 8: His - Arg (Pos. 350/311)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) | 0.0/9.0 | Beta sheet | Beta sheet | - | Bend |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Non-neutral; Reliability Index: 8; Expected Accuracy: 96% | Prediction: affect protein function; Score: 0.00; Sequence Conservation: 3.11 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 |
Mutation number 8 is the substitution of the aromatic, polar and slightly basic Histidine by the aliphatic, polar and strongly basic Arginine. It is located in the interior of the protein and is part of a beta sheet, so the mutation may cause a change in the protein's function.
According to BLOSUM62 the amino acid replacement is not particularly common, because it has a value of only 0. The values for PAM1 and PAM250 are higher, at 10 and 6, which would indicate a common substitution. The conservation is relatively high: only two sequences show a substitution to Glutamic acid, and none shows a change to Arginine. Therefore the mutation may not be accepted.
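The BLOSUM62 values quoted here can be read as log-odds scores. The sketch below assumes the standard BLOSUM62 matrix, which is scaled in half-bit units, so a score s corresponds roughly to an observed/expected ratio of 2^(s/2). Because the matrix entries are rounded, the ratios are only approximate; the function name is hypothetical, and the conversion does not apply to the PAM values listed in the tables.

```python
def blosum62_odds_ratio(score):
    """Approximate observed/expected ratio for a BLOSUM62 score (half-bit units)."""
    return 2 ** (score / 2)

# Scores taken from the text and tables of this section.
examples = [
    ("Asn -> Ser (mutation 7)", 1),
    ("His -> Arg (mutation 8)", 0),
    ("worst value in the His row", -3),
]
for label, score in examples:
    ratio = blosum62_odds_ratio(score)
    print(f"{label}: score {score:+d}, odds ratio about {ratio:.2f}")
```

Under this reading, a value of 0 means the pair is aligned about as often as expected by chance. It is neither clearly favoured nor clearly disfavoured, whereas the worst value of -3 marks a pair seen far less often than chance would predict.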
SNAP predicts the mutation as non-neutral, SIFT as affecting protein function and Polyphen2 as probably damaging. The mutation is associated with Gaucher disease type 2, as published in 1999<ref>Is the perinatal lethal form of Gaucher disease more common than classic type 2 Gaucher disease? Stone DL, van Diggelen OP, de Klerk JB, Gaillard JL, Niermeijer MF, Willemsen R, Tayebi N, Sidransky E. Eur J Hum Genet. 1999 May-Jun;7(4):505-9.</ref>, so the predictions are right.
Mutation 9: Met - Val (Pos. 455/416)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
1 (worst: -3) | 17 (worst: 0) | 4 (worst: 1) | 0.0/11.0 | Helix | Coil/Helix | Helix | Helix |
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Non-neutral Reliability Index: 3 Expected Accuracy: 78% | Prediction: tolerated Score: 0.12 Sequence Conservation: 3.01 | Prediction: probably damaging Score: 0.999 Sensitivity: 0.14 Specificity: 0.99 |
Mutation number 9 is the substitution of the aliphatic, nonpolar and neutral Methionine by the likewise aliphatic, nonpolar and neutral Valine. The interesting thing is that Methionine contains a sulfur atom and therefore has different chemical properties than Valine. As the residue is also located in a helix in an important part of the protein, the mutation may affect the protein's function.
The values in BLOSUM62 and PAM250 are relatively low, so the substitution is rare. The conservation in the alignment is also very high, and no other amino acid occurs at this position.
SNAP and Polyphen2 predict the mutation as non-neutral and probably damaging, respectively. Only SIFT predicts it as tolerated. In HGMD it is associated with Gaucher disease type 2, which was published in 2005<ref>Novel mutations in type 2 Gaucher disease in Chinese and their functional characterization by heterologous expression. Tang NL, Zhang W, Grabowski GA, To KF, Choy FY, Ma SL, Shi HP. Hum Mutat. 2005 Jul;26(1):59-60.</ref>. So the prediction of SIFT is wrong, while SNAP and Polyphen2 predict what we expected.
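SIFT's differing call for mutation 9 follows from its score threshold. The sketch below assumes the commonly documented SIFT cutoff of 0.05, with scores at or below it reported as "affect protein function"; the function name is hypothetical. The scores are the rounded values printed in the tables, so a value sitting exactly on the cutoff, as for mutation 7, should be read with some caution.

```python
SIFT_CUTOFF = 0.05  # assumed cutoff; scores at or below it count as "affect protein function"

def sift_call(score, cutoff=SIFT_CUTOFF):
    """Map a SIFT score to the label used in the tables of this section."""
    return "affect protein function" if score <= cutoff else "tolerated"

# SIFT scores as printed in the tables for mutations 7, 8 and 9.
sift_scores = {7: 0.05, 8: 0.00, 9: 0.12}
for mutation, score in sift_scores.items():
    print(f"Mutation {mutation}: score {score:.2f} -> {sift_call(score)}")
```

With a score of 0.12, mutation 9 lies above the cutoff, which is why SIFT alone reports it as tolerated.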
Mutation 10: Leu - Pro (Pos. 509/470)
BLOSUM62 | PAM1 | PAM250 | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|
SNAP | SIFT | Polyphen2 |
---|---|---|
Prediction: Non-neutral Reliability Index: 1 Expected Accuracy: 63% | Prediction: affect protein function Score: 0.01 Sequence Conservation: 3.09 | Prediction: probably damaging Score: 0.978 Sensitivity: 0.75 Specificity: 0.96 |
Mutation number 10 is the change from the aliphatic, nonpolar and neutral Leucine to the heterocyclic, nonpolar and neutral Proline. Proline forms a ring structure, so the change might also influence the protein folding. The residue is part of a beta sheet but lies at the exterior of the protein, so the change may not affect the function of the protein.
The values in the substitution matrices are very low, in BLOSUM62 as well as in the PAM matrices, so it is not a common substitution. The conservation is relatively high, although there are some substitutions to Valine and Phenylalanine.
SNAP predicts the mutation as non-neutral, SIFT as affecting protein function and Polyphen2 as probably damaging. However, we do not know whether it really has such an influence on the protein's function, because the mutation is only found in dbSNP and not in HGMD.
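The per-tool calls for mutations 7 to 10 can be summarised with a simple majority vote. This is only an illustrative sketch and not a method used in the analysis itself: the labels are transcribed from the tables above, the helper `consensus` is hypothetical, and treating "possibly damaging" as a deleterious call is an assumption.

```python
# Predictions transcribed from the SNAP / SIFT / Polyphen2 tables of mutations 7-10.
PREDICTIONS = {
    7: {"SNAP": "Non-neutral", "SIFT": "affect protein function", "Polyphen2": "possibly damaging"},
    8: {"SNAP": "Non-neutral", "SIFT": "affect protein function", "Polyphen2": "probably damaging"},
    9: {"SNAP": "Non-neutral", "SIFT": "tolerated", "Polyphen2": "probably damaging"},
    10: {"SNAP": "Non-neutral", "SIFT": "affect protein function", "Polyphen2": "probably damaging"},
}

# Assumption: every label in this set counts as one vote for "deleterious".
DELETERIOUS_LABELS = {
    "non-neutral",
    "affect protein function",
    "possibly damaging",
    "probably damaging",
}

def consensus(calls):
    """Return (number of deleterious votes, True if a strict majority votes deleterious)."""
    votes = sum(label.lower() in DELETERIOUS_LABELS for label in calls.values())
    return votes, votes * 2 > len(calls)

for mutation, calls in PREDICTIONS.items():
    votes, majority = consensus(calls)
    verdict = "deleterious" if majority else "neutral"
    print(f"Mutation {mutation}: {votes}/{len(calls)} tools vote deleterious -> {verdict}")
```

A vote like this only summarises agreement between the tools. For mutation 10, which is not listed in HGMD, it still cannot confirm an effect on the protein's function.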