Sequence based mutation analysis of GBA
Revision as of 18:48, 27 June 2011
Contents
- 1 Introduction
- 2 Mutation Analysis
- 3 Discussion
- 3.1 Overview
- 3.2 Mutation 1: Gly - Ser (Pos. 49/10)
- 3.3 Mutation 2: Asp - Asn (Pos. 63/24)
- 3.4 Mutation 3: His - Arg (Pos. 99/60)
- 3.5 Mutation 4: Arg - Gln (Pos. 159/120)
- 3.6 Mutation 5: Pro - Leu (Pos. 221/182)
- 3.7 Mutation 6: Ser - Asn (Pos. 310/271)
- 3.8 Mutation 7: Asn - Ser (Pos. 409/370)
- 3.9 Mutation 8: His - Arg (Pos. 350/311)
- 3.10 Mutation 9: Met - Val (Pos. 455/416)
- 3.11 Mutation 10: Leu - Pro (Pos. 509/470)
- 3.12 Summary
Introduction
The ten SNPs shown in the table below and highlighted in Figure 1 were chosen for the analysis in this task. We tried to include SNPs from all over the protein in order to investigate the influence of mutations in several of its parts. The residues that form hydrogen bonds with the active site of glucocerebrosidase are included, as mutating them should result in either a malfunctioning or a nonfunctional protein. It was not easy to find missense mutations listed only in dbSNP, as most of them were also listed in HGMD.
Nr. | SNP ID/Accession Number | Database | Position including SP | Position without SP | Amino Acid Change | Codon Change | Remarks |
---|---|---|---|---|---|---|---|
1 | CM081634 | HGMD | 49 | 10 | Gly - Ser | cGGC-AGC | |
2 | rs74953658, CM050263 | dbSNP, HGMD | 63 | 24 | Asp - Asn | tGAC-AAC | |
3 | rs1141820 | dbSNP | 99 | 60 | His - Arg | CAC - CGC | suspected, status not validated |
4 | CM880035 | HGMD | 159 | 120 | Arg - Gln | CGG-CAG | synonymous mutation at this position listed in dbSNP; forming hydrogen bond with active site |
5 | rs80205046, CM041347 | dbSNP, HGMD | 221 | 182 | Pro - Leu | CCC - CTC | |
6 | rs74731340, CM970620 | dbSNP, HGMD | 310 | 271 | Ser - Asn | AGT - AAT | |
7 | CM880036 | HGMD | 409 | 370 | Asn - Ser | AAC-AGC | most common mutation found in Gaucher disease type 1 patients |
8 | CM993703 | HGMD | 350 | 311 | His - Arg | CAT-CGT | severe form of Gaucher disease 2; forming hydrogen bond with active site |
9 | rs80020805, CM052245 | dbSNP, HGMD | 455 | 416 | Met - Val | cATG-GTG | |
10 | rs113825752 | dbSNP | 509 | 470 | Leu - Pro | CTT - CCT |
Mutations listed in HGMD (Mutations 1, 4, 7 and 8) or in both HGMD and dbSNP (Mutations 2, 5, 6 and 9) are known to cause Gaucher Disease, whereas mutations listed only in dbSNP (Mutations 3 and 10) are considered neutral. In this section we assume that these facts are not known and try to predict whether the mutations are damaging or harmless.
Mutation Analysis
In the following section, the different mutations/SNPs are analyzed to determine whether they are neutral or whether they affect the function of glucocerebrosidase. To do so, several aspects were taken into account: the physicochemical properties of the amino acids and how they change, the scores of the specific substitutions in various substitution matrices, the conservation of the specific positions in a multiple sequence alignment of homologous sequences, and the predictions of different prediction tools (SIFT, SNAP and PolyPhen-2).
Physicochemical Properties and Changes
The physicochemical properties and changes of the wildtype and mutated amino acids are listed below, and the superpositions, which show the structural differences, are shown in Figure 2 to the right. Substitutions of amino acids that are structurally and chemically different are more likely to affect the function of a protein than substitutions of very similar amino acids.
Mutation 1
The wildtype amino acid, Glycine, is nonpolar, whereas the mutated amino acid, Serine, is polar. This difference in polarity could be an indication that the mutation is damaging. Looking at the structure, one can see that this residue (located at pos. 10 in the mature protein) is situated in a beta strand at the exterior of the protein. Therefore a substitution should not affect the function of the protein that much.
Mutation 2
In this mutation Aspartic acid, an acidic amino acid, is replaced by its amide derivative Asparagine. The residue therefore loses its acidic character, which could have an effect on the folding of the protein. Looking at the structure, one can see that the residue is not located in the interior of the protein, so this mutation should also not affect the function much.
Mutation 3
Histidine is mutated to Arginine; both are basic amino acids whose side chains can carry a positive charge. Histidine has an aromatic ring, whereas Arginine has a straight side chain. The residue is also situated at the exterior of the protein, so its influence is not so strong. As both have the same charge, their properties are almost the same and the mutation may be tolerated. Only the different structure of Arginine may influence the function.
Mutation 4
In this mutation the positively charged Arginine is replaced by Glutamine, whose side chain is polar but uncharged, so the positive charge at this position is lost. The residue is situated in the interior of the protein, so the mutation could also affect the function of the protein.
Mutation 5
Proline is the one standard amino acid whose side chain forms a ring with the backbone nitrogen. Therefore it has a great influence on the folding of the protein. Both Proline and Leucine are nonpolar and neutral, so their chemical properties are similar. As the residue is also in the interior of the protein, the mutation could influence the function of the protein.
Mutation 6
Serine is a hydroxylic amino acid which is mutated to Asparagine, an amino acid carrying an amide group. Both are aliphatic, polar and neutral and therefore share chemical properties. Their structures are also very similar. As the residue is positioned at the exterior, the influence of the mutation may not be that strong.
Mutation 7
Asparagine carries an amide group, whereas Serine is hydroxylic. This mutation is exactly the reverse of mutation six. However, this amino acid is situated in the interior of the protein as part of a helix, so the mutation could affect the function of the protein.
Mutation 8
Histidine and Arginine are both basic amino acids. Histidine contains an imidazole ring, which can be protonated. The residue is in the interior of the protein and involved in the protein's function, so the mutation may not be tolerated.
Mutation 9
Methionine is a sulfur-containing amino acid which is mutated to Valine. Their chemical properties differ, so the mutation might affect the folding or the function of the protein. Looking at the structure, one can see that the residue is located in a helix.
Mutation 10
Leucine is mutated to Proline, the amino acid whose side chain forms a ring. Therefore the structure of the protein might change. Looking at the structure, one can see that the residue is part of a beta sheet, which might be disrupted by the mutation and so affect the protein's function.
PSSM
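The position-specific scoring matrix (PSSM) reflects the conservation of the residues in a multiple alignment. For each position it gives a score for every amino acid: the higher the score, the more often that amino acid is observed at this position in the aligned sequences. So if the score for a substitution is high, the substitution is likely to be tolerated.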
Usage
- command line:
blastpgp -i gbaseq.fasta -j 5 -d /data/blast/nr/nr -e 10E-6 -Q psiblast.mat -o psiblast.out
Last position-specific scoring matrix computed, weighted observed percentages rounded down, information per position, and relative weight of gapless real matches to pseudocounts
A R N D C Q E G H I L K M F P S T W Y V A R N D C Q E G H I L K M F P S T W Y V
49 G 3 -2 -2 -2 -4 1 -1 3 -4 -4 1 1 1 0 0 -1 -3 -4 -4 -3 26 2 2 2 0 6 3 20 0 0 13 10 3 4 4 4 0 0 0 0 0.35 1.25
63 D 0 -4 0 5 -5 1 -1 0 -4 -1 -1 0 -4 -5 -1 -2 -3 -5 -5 0 8 0 4 39 0 6 3 7 0 4 8 6 0 0 3 2 0 0 0 8 0.58 1.51
99 H 0 -1 0 1 -5 0 0 -1 1 -1 -3 0 0 -1 3 1 1 2 0 -1 7 4 4 7 0 5 7 6 3 4 3 6 3 3 13 10 8 3 3 4 0.14 1.99
159 R -4 8 -3 -4 -5 -2 -3 -5 -3 -5 -3 2 -4 -5 -4 -3 -4 1 -4 -5 0 79 0 0 0 0 0 0 0 0 3 12 0 0 0 0 0 2 0 0 1.73 1.34
221 P -3 -5 -4 -5 -6 -3 -4 -4 -5 -3 -5 -4 0 -6 8 -3 -2 -7 -6 -3 1 0 1 0 0 1 0 1 0 1 1 0 2 0 87 1 2 0 0 2 2.52 1.86
310 S 3 2 2 0 -1 0 -1 -2 1 -4 -4 1 -4 -3 -3 2 0 -5 -1 -3 25 11 11 4 1 4 4 2 3 1 2 7 0 1 1 13 5 0 2 1 0.38 1.86
409 N 1 -3 2 4 0 -4 -2 0 -1 -2 -2 -4 1 0 -3 1 0 2 3 -3 10 1 8 24 2 0 2 6 1 2 4 0 3 4 1 9 6 2 12 2 0.40 1.84
350 H -4 -3 -2 -4 -6 1 -1 -5 10 -6 -5 -3 -4 -4 -5 -4 -2 -5 -1 -6 0 0 0 0 0 5 2 0 87 0 0 0 0 0 0 0 2 0 0 0 2.71 1.66
455 M 0 2 1 0 1 -1 1 -2 1 -1 -2 1 1 -2 2 1 -1 1 -4 -1 7 9 6 6 4 3 11 3 3 3 5 7 4 2 9 9 3 2 0 4 0.13 1.79
509 L -2 -5 -5 -5 -1 0 -5 -5 -5 3 2 -5 -1 3 -6 -4 -1 -3 3 4 3 0 0 1 1 4 0 1 0 17 17 0 1 14 0 1 3 0 9 28 0.71 1.77
The relevant score for each of our mutations is the value in the column of the mutated amino acid in the row of the mutated position. Only our sixth and seventh substitutions have values greater than zero, which may indicate that these mutations are tolerated. All others have values of zero or below, so those substitutions may not be tolerated.
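The lookup described above can be sketched in a few lines of Python. This is only an illustration, assuming that the first 20 numeric columns of each row are the log-odds scores in the column order given in the header (A R N D C Q E G H I L K M F P S T W Y V); the helper name `pssm_score` is hypothetical and not part of any BLAST tool.

```python
# Column order of the log-odds block, as printed in the PSSM header above.
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"

def pssm_score(row: str, mutant: str) -> int:
    """Return the log-odds score of `mutant` from one PSSM row (hypothetical helper)."""
    fields = row.split()
    # fields[0] is the position, fields[1] the wildtype residue,
    # fields[2:22] are the 20 log-odds scores in AA_ORDER.
    scores = [int(value) for value in fields[2:22]]
    return scores[AA_ORDER.index(mutant)]

# Rows copied from the table above, truncated to the log-odds block.
rows = {
    "S310N": "310 S 3 2 2 0 -1 0 -1 -2 1 -4 -4 1 -4 -3 -3 2 0 -5 -1 -3",
    "M455V": "455 M 0 2 1 0 1 -1 1 -2 1 -1 -2 1 1 -2 2 1 -1 1 -4 -1",
}

for name, row in rows.items():
    # The last letter of e.g. "S310N" is the mutated amino acid.
    print(name, pssm_score(row, name[-1]))
```

For these two rows the sketch returns 2 for S310N and -1 for M455V, which matches the PSSM values used in the discussion.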
Substitution Matrices
To analyze whether the chosen amino acid substitutions are common or not, their scores in different amino acid substitution matrices are looked up. The scores reflect how often an amino acid was substituted with another in alignments of related sequences. A high score indicates that the two amino acids have often been substituted for each other and that the substituted amino acid is compatible with protein structure and function.
Two different families of matrices are taken into account in this analysis: the Dayhoff Amino Acid Substitution Matrices (Percent Accepted Mutation or PAM Matrices), which are based on the differences in closely related proteins and an evolutionary model of protein change, and the Blocks Amino Acid Substitution Matrices (BLOSUM Matrices), which are derived from conserved, ungapped blocks in alignments of related protein sequences.
The following matrices have been used:
Nr. | Substitution | BLOSUM62 | PAM1 | PAM250 |
---|---|---|---|---|
1 | Gly - Ser | 0 (worst: -4) | 16 (worst: 0) | 9 (worst: 1) |
2 | Asp - Asn | 1 (worst: -4) | 36 (worst: 0) | 7 (worst: 1) |
3 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) |
4 | Arg - Gln | 1 (worst: -3) | 9 (worst: 0) | 5 (worst: 1) |
5 | Pro - Leu | -3 (worst: -4) | 3 (worst: 0) | 5 (worst: 1) |
6 | Ser - Asn | 1 (worst: -3) | 20 (worst: 0) | 5 (worst: 1) |
7 | Asn - Ser | 1 (worst: -4) | 34 (worst: 0) | 8 (worst: 1) |
8 | His - Arg | 0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) |
9 | Met - Val | 1 (worst: -3) | 17 (worst: 0) | 4 (worst: 1) |
10 | Leu - Pro | -3 (worst: -4) | 2 (worst: 0) | 3 (worst: 1) |
Multiple Alignment
The conservation in a multiple sequence alignment of homologous sequences helps to decide whether a mutation at a certain position alters the function or structure of a protein: a highly conserved position indicates that this position is crucial for either function or structure of the protein, and one can assume that a mutation at this position is damaging. A mutation at a very variable position in the alignment is, in contrast, more likely to be neutral.
To retrieve homologous sequences, protein BLAST was used with the option mammals. For reasons of clarity, the resulting sequences are listed in File:Protein blast results for GBA.pdf. A multiple sequence alignment of these sequences is shown in Figure 3. The table below shows the conservation of the positions of interest, once calculated for all homologous sequences and once for the 25 best sequences.
Mutation Nr. | Position | Amino acid change | Conservation all (Jalview) | Conservation best 25 (Jalview) |
---|---|---|---|---|
1 | 49 | Gly - Ser | 0.0 | 11.0 (= 100% conservation) |
2 | 63 | Asp - Asn | 4.0 | 11.0 |
3 | 99 | His - Arg | 3.0 | 8.0 |
4 | 159 | Arg - Gln | 0.0 | 11.0 |
5 | 221 | Pro - Leu | 0.0 | 11.0 |
6 | 310 | Ser - Asn | 0.0 | 11.0 |
7 | 409 | Asn - Ser | 0.0 | 11.0 |
8 | 350 | His - Arg | 0.0 | 9.0 |
9 | 455 | Met - Val | 0.0 | 11.0 |
10 | 509 | Leu - Pro | 0.0 | 9.0 |
Secondary Structure
To investigate whether the mutations influence the secondary structure of the resulting protein, secondary structure predictions with JPred3 and PSIPRED have been carried out as described in Task 3. The comparison between the original (wildtype) sequence and the mutated sequence is shown in File:Secondary structure prediction of mutated GBA sequence.pdf. The mutations show no direct influence on the secondary structure elements: the elements do not get interrupted or destroyed by the mutations chosen in this analysis. There are only some minor variations in the lengths of the elements between the predicted structures of the original and mutated sequences. As these differences are not only located next to the mutations but all over the protein, they are probably due to the fact that these are only predictions, which may not be that accurate.
The table below shows the secondary structure assignments and predictions for the relevant positions chosen in this analysis.
Mutation Nr. | Position | Wildtype | Uniprot (wildtype) | PSIPRED (wildtype) | JPred3 (wildtype) | DSSP (wildtype) | Mutation | PSIPRED (mutation) | JPred3 (mutation) |
---|---|---|---|---|---|---|---|---|---|
1 | 49 | Gly | Beta sheet | Coil | - | Turn | Ser | Coil | - |
2 | 63 | Asp | - | Coil | - | - | Asn | Coil | - |
3 | 99 | His | - | Coil | - | - | Arg | Coil | - |
4 | 159 | Arg | Beta sheet | Beta sheet | Beta sheet | Bend | Gln | Beta Sheet | Beta Sheet |
5 | 221 | Pro | - | Coil | - | - | Leu | Coil | - |
6 | 310 | Ser | - | Coil | - | Turn | Asn | Coil | - |
7 | 409 | Asn | Helix | Helix | Helix | Helix | Ser | Helix | Helix |
8 | 350 | His | Beta sheet | Beta sheet | - | Bend | Arg | Beta Sheet | - |
9 | 455 | Met | Helix | - | Helix ([? differs between Task 3 and here: -]) | Helix | Val | Helix | - |
10 | 509 | Leu | Beta sheet | Beta sheet | Beta sheet | Bend | Pro | Beta Sheet | Beta Sheet |
SNAP
SNAP (screening for non-acceptable polymorphisms) is a method that predicts the functional effects of non-synonymous SNPs based on neural networks. The method only needs sequence information as input, but if available, one may include functional and structural annotations. SNAP was established by Rost B. and Bromberg Y. in 2007 <ref>Bromberg Y., Rost B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 2007 June; 35(11): 3823–3835.</ref>
Usage
- command line:
snapfun -i gbaseq.fasta -m mutations.txt -o snapfun_out.out
Results
The predictions made by SNAP are shown in the table below. SNAP identifies the majority of mutations as non-neutral: only three mutations are predicted to be neutral. Additionally, the tool provides a reliability index (RI) reflecting the confidence of each prediction, ranging from 0 (low reliability) to 9 (high reliability). Only half of the predictions have an RI of 5 or higher, and only two have an RI above 5. This indicates that half of the predictions are not very reliable.
Mutation Nr. | AA change | Prediction | Reliability Index | Expected Accuracy |
---|---|---|---|---|
1 | G49S | Neutral | 3 | 78% |
2 | D63N | Non-neutral | 5 | 87% |
3 | H99R | Neutral | 5 | 89% |
4 | R159Q | Non-neutral | 7 | 96% |
5 | P221L | Non-neutral | 5 | 87% |
6 | S310N | Neutral | 0 | 53% |
7 | N409S | Non-neutral | 1 | 63% |
8 | H350R | Non-neutral | 8 | 96% |
9 | M455V | Non-neutral | 3 | 78% |
10 | L509P | Non-neutral | 1 | 63% |
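The counts quoted for the SNAP results can be re-derived from the table with a short Python sketch. The data are copied from the table above; the cutoff `RI_CUTOFF = 5` is an assumption made here for illustration and is not a threshold defined by SNAP.

```python
# (substitution, prediction, reliability index) copied from the SNAP table.
snap = [
    ("G49S", "Neutral", 3), ("D63N", "Non-neutral", 5),
    ("H99R", "Neutral", 5), ("R159Q", "Non-neutral", 7),
    ("P221L", "Non-neutral", 5), ("S310N", "Neutral", 0),
    ("N409S", "Non-neutral", 1), ("H350R", "Non-neutral", 8),
    ("M455V", "Non-neutral", 3), ("L509P", "Non-neutral", 1),
]

RI_CUTOFF = 5  # assumption: predictions with RI >= 5 are treated as reliable

neutral = [name for name, prediction, _ in snap if prediction == "Neutral"]
reliable = [name for name, _, ri in snap if ri >= RI_CUTOFF]

print("neutral:", neutral)
print("reliable:", reliable)
```

With this cutoff, three mutations (G49S, H99R, S310N) are neutral and five predictions (D63N, H99R, R159Q, P221L, H350R) count as reliable.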
SIFT
SIFT (Sorting Intolerant From Tolerant) is a method which predicts whether an amino acid substitution affects protein function or not. The method is based on the assumption that important amino acids are conserved in the protein family and that changes at conserved positions therefore tend to be deleterious. Substitutions with a score less than 0.05 are predicted to be deleterious. SIFT was introduced in 2003 by Ng P. and Henikoff S. <ref>Ng P., Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003 July 1; 31(13): 3812–3814. </ref>
Usage
- Webserver: http://sift.jcvi.org/www/SIFT_seq_submit2.html
- Input: sequence in fasta format, substitutions of interest (example for mutation 1: G49S)
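The substitution strings for all ten mutations can be generated from the SNP table rather than typed by hand. The following Python sketch assumes the format shown in the example above (one-letter wildtype residue, position including the signal peptide, one-letter mutated residue); the function name `to_substitution` is hypothetical.

```python
# Three-letter to one-letter codes for the residues used in this analysis.
THREE_TO_ONE = {
    "Gly": "G", "Ser": "S", "Asp": "D", "Asn": "N", "His": "H", "Arg": "R",
    "Gln": "Q", "Pro": "P", "Leu": "L", "Met": "M", "Val": "V",
}

# (wildtype, position including signal peptide, mutant) from the SNP table.
mutations = [
    ("Gly", 49, "Ser"), ("Asp", 63, "Asn"), ("His", 99, "Arg"),
    ("Arg", 159, "Gln"), ("Pro", 221, "Leu"), ("Ser", 310, "Asn"),
    ("Asn", 409, "Ser"), ("His", 350, "Arg"), ("Met", 455, "Val"),
    ("Leu", 509, "Pro"),
]

def to_substitution(wildtype: str, position: int, mutant: str) -> str:
    """Format one mutation as e.g. 'G49S' (hypothetical helper)."""
    return f"{THREE_TO_ONE[wildtype]}{position}{THREE_TO_ONE[mutant]}"

substitutions = [to_substitution(*mutation) for mutation in mutations]
print("\n".join(substitutions))
```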
Results
The predictions made by SIFT are based on the multiple alignment which is shown in Figure 4 to the right. The predictions made for the chosen mutations are shown in the table below. Half of the mutations are predicted to affect protein function and the other half are predicted to be tolerated.
Mutation Nr. | Prediction | Score | Sequence Conservation |
1 | tolerated | 0.51 | 3.05 |
2 | tolerated | 0.06 | 3.05 |
3 | tolerated | 0.62 | 3.04 |
4 | affect protein function | 0.03 | 3.01 |
5 | affect protein function | 0.00 | 3.01 |
6 | tolerated | 0.54 | 3.01 |
7 | affect protein function | 0.05 | 3.02 |
8 | affect protein function | 0.00 | 3.11 |
9 | tolerated | 0.12 | 3.01 |
10 | affect protein function | 0.01 | 3.09 |
PolyPhen-2
PolyPhen-2 (Polymorphism Phenotyping v2) predicts the possible structural and functional influence of an amino acid substitution in a human protein. The method uses three structure-based and eight sequence-based predictive features. Two datasets were used to train and test PolyPhen-2: HumDiv and HumVar. The method was introduced by Adzhubei et al. in 2010. <ref>Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods 7(4):248-249 (2010).</ref>
Usage
- Webserver: http://genetics.bwh.harvard.edu/pph2/
- Input: sequence in fasta format, position and substitution of mutation
Results
The results of PolyPhen-2 are listed in the table below. The predictions for mutation 7 differ depending on the training set: with HumDiv it is predicted to be possibly damaging, whereas the prediction based on HumVar indicates that the mutation is harmless. For the other mutations the predictions are the same, varying only slightly in score, sensitivity and specificity.
Mutation Nr. | HumDiv Prediction | HumDiv Score | HumDiv Sensitivity | HumDiv Specificity | HumVar Prediction | HumVar Score | HumVar Sensitivity | HumVar Specificity |
---|---|---|---|---|---|---|---|---|
1 | probably damaging | 0.997 | 0.40 | 0.98 | probably damaging | 0.992 | 0.44 | 0.97 |
2 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 0.999 | 0.08 | 1.00 |
3 | benign | 0.000 | 1.00 | 0.00 | benign | 0.000 | 1.00 | 0.00 |
4 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 1.000 | 0.00 | 1.00 |
5 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 1.000 | 0.00 | 1.00 |
6 | benign | 0.100 | 0.94 | 0.85 | benign | 0.120 | 0.91 | 0.67 |
7 | possibly damaging | 0.573 | 0.88 | 0.91 | benign | 0.131 | 0.90 | 0.68 |
8 | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 0.999 | 0.08 | 1.00 |
9 | probably damaging | 0.999 | 0.14 | 0.99 | probably damaging | 0.980 | 0.54 | 0.95 |
10 | probably damaging | 0.978 | 0.75 | 0.96 | probably damaging | 0.966 | 0.59 | 0.94 |
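The comparison between the two training sets can be checked with a small Python sketch. The labels are copied from the table above; treating both "possibly damaging" and "probably damaging" as damaging is an assumption made here for illustration.

```python
# Mutation number -> (HumDiv prediction, HumVar prediction), from the table.
polyphen = {
    1: ("probably damaging", "probably damaging"),
    2: ("probably damaging", "probably damaging"),
    3: ("benign", "benign"),
    4: ("probably damaging", "probably damaging"),
    5: ("probably damaging", "probably damaging"),
    6: ("benign", "benign"),
    7: ("possibly damaging", "benign"),
    8: ("probably damaging", "probably damaging"),
    9: ("probably damaging", "probably damaging"),
    10: ("probably damaging", "probably damaging"),
}

def is_damaging(label: str) -> bool:
    # Assumption: anything other than "benign" counts as damaging.
    return label != "benign"

disagreements = [
    number for number, (humdiv, humvar) in polyphen.items()
    if is_damaging(humdiv) != is_damaging(humvar)
]
print(disagreements)
```

Under this assumption the only disagreement between HumDiv and HumVar is mutation 7.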
Discussion
Overview
The following table indicates whether the mutation is rather neutral or non-neutral according to the different analysis steps. The different results are classified as follows:
- Amino-Acid Properties: neutral, if the properties are the same, non-neutral otherwise.
- Substitution Matrix Scores: neutral, if the scores are high, non-neutral otherwise.
- PSSM: neutral, if the score is greater than zero, non-neutral otherwise.
- Conservation: neutral, if mutated amino acid appears in alignment, non-neutral otherwise.
- Secondary Structure: neutral, if the mutation is situated in a region without an assigned secondary structure element, non-neutral otherwise.
- SNAP, SIFT, PolyPhen: according to the prediction.
Detailed descriptions of the applied analysis steps are given in the section above.
Mutation | Amino-Acid Properties | BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation | Secondary Structure | SNAP | SIFT | PolyPhen-2 HumDiv | PolyPhen-2 HumVar |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | non-neutral | neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | neutral | neutral | non-neutral | non-neutral |
2 | non-neutral | neutral | neutral | neutral | non-neutral | non-neutral | neutral | non-neutral | neutral | non-neutral | non-neutral |
3 | neutral | non-neutral | neutral | neutral | non-neutral | neutral | neutral | neutral | neutral | neutral | neutral |
4 | non-neutral | non-neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral |
5 | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral |
6 | neutral | non-neutral | neutral | neutral | neutral | non-neutral | neutral | neutral | neutral | neutral | neutral |
7 | neutral | non-neutral | neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | neutral |
8 | non-neutral | non-neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral |
9 | neutral | non-neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | neutral | non-neutral | non-neutral |
10 | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral |
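One simple way to condense the overview table is an unweighted majority vote over the eleven criteria. The Python sketch below is only an illustration and is not the procedure used in the discussion that follows, which weighs the criteria individually; each row is encoded in column order with "N" for neutral and "X" for non-neutral.

```python
# Overview table rows: one character per criterion, in column order
# (properties, BLOSUM62, PAM1, PAM250, PSSM, conservation,
#  secondary structure, SNAP, SIFT, HumDiv, HumVar).
votes = {
    1: "XNNNXXXNNXX",
    2: "XNNNXXNXNXX",
    3: "NXNNXNNNNNN",
    4: "XXNNXXXXXXX",
    5: "XXXXXXXXXXX",
    6: "NXNNNXNNNNN",
    7: "NXNNNXXXXXN",
    8: "XXNNXXXXXXX",
    9: "NXNXXXXXNXX",
    10: "XXXXXXXXXXX",
}

def majority_call(row: str) -> str:
    """Unweighted majority vote over all criteria (illustrative only)."""
    return "non-neutral" if 2 * row.count("X") > len(row) else "neutral"

calls = {number: majority_call(row) for number, row in votes.items()}
for number, call in calls.items():
    print(number, votes[number].count("X"), call)
```

Such a vote calls only mutations 3 and 6 neutral. It treats every criterion as equally informative, so its result can differ from the per-mutation classification reached in the text, for example for mutation 1.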
Mutation 1: Gly - Ser (Pos. 49/10)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
0 (worst: -4) | 16 (worst: 0) | 9 (worst: 1) | -1 | 0.0/11.0 | Beta sheet | Coil | - | Turn |
SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Neutral; Reliability Index: 3; Expected Accuracy: 78% | Prediction: tolerated; Score: 0.51; Sequence Conservation: 3.05 | Prediction: probably damaging; Score: 0.997; Sensitivity: 0.40; Specificity: 0.98 | Prediction: probably damaging; Score: 0.992; Sensitivity: 0.44; Specificity: 0.97 |
Mutation number 1 describes the change from the aliphatic, nonpolar and neutral amino acid Glycine to the aliphatic, polar and neutral amino acid Serine. The PAM1 and PAM250 substitution matrices, with high scores of 16 and 9, indicate that this substitution is very common. Only BLOSUM62 gives a value of just 0, which does not stand for a common substitution. The PSSM score of -1 is low, which means that the substitution is not common at this position.
The conservation of this position in the multiple sequence alignment is, on the other hand, very high: if the 25 best homologous sequences are aligned, there is total conservation and no other protein shows an amino acid exchange. The conservation in the multiple sequence alignment of all proteins found with BLAST is 0 because the alignment has gaps at that position.
In Uniprot this position is assigned as part of a beta sheet, whereas the different tools predicted it as part of a coil or turn. As it is at the exterior of the protein, as can be seen in the visualization in Figure 2, the mutation should not affect the function of the protein that much.
SNAP classifies the mutation as neutral with an expected accuracy of 78% and SIFT as tolerated with a score of 0.51, whereas Polyphen2 predicts it as probably damaging with both datasets.
Based on the results listed above, one tends to classify the mutation as neutral: it is located at the exterior of the protein, the PAM substitution matrices indicate that the substitution from Glycine to Serine is common, and two of the prediction tools predict the mutation as harmless. But this classification is not clear-cut: the high conservation in the multiple sequence alignment and the predictions of PolyPhen-2 indicate otherwise.
According to HGMD, this mutation is associated with Gaucher Disease and is therefore damaging and affecting the function of the protein. This shows that the prediction in this analysis was not correct.
Mutation 2: Asp - Asn (Pos. 63/24)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
1 (worst: -4) | 36 (worst: 0) | 7 (worst: 1) | 0 | 4.0/11.0 | - | Coil | - | - |
SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Non-neutral; Reliability Index: 5; Expected Accuracy: 87% | Prediction: tolerated; Score: 0.06; Sequence Conservation: 3.05 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 | Prediction: probably damaging; Score: 0.999; Sensitivity: 0.08; Specificity: 1.00 |
Mutation number 2 is the exchange from the aliphatic, polar and acidic Aspartic acid to the aliphatic, polar and neutral amino acid Asparagine. The residue thereby loses its acidic character, which may change the structure of the protein.
PAM1 and PAM250 give relatively high values for the substitution, so it is very common. BLOSUM62 has a value of only 1, which does not indicate a common substitution. The PSSM value of 0 is too low to count as neutral, but it is not negative. In the alignment of the 25 best sequences the conservation is also very high, and no protein with a substitution to Asparagine was found. So the mutation may be non-neutral.
Looking at the structure, one can see that the residue is part of a coil at the exterior of the protein. This is also the result of our secondary structure predictions. So the mutation may not influence the function of the protein that much, because it is not in the functional part of the protein.
SNAP predicts the mutation as non-neutral with an expected accuracy of 87%. Polyphen2 also classifies the mutation as probably damaging, with a score as high as 1.0. Only SIFT predicts it as tolerated, but with a score of 0.06 that is barely above the threshold of 0.05. So the mutation may be damaging.
According to these results it is hard to classify the mutation. All in all we would tend to classify it as non-neutral, because the change from an acidic to a neutral amino acid should have some effect. The high conservation also indicates the importance of that position, and the majority of the prediction tools classify the mutation as damaging.
As the mutation is listed in HGMD, it is known to affect the function of the protein. It is associated with Gaucher disease 1 and was published in 2005. <ref>Identification and functional characterization of five novel mutant alleles in 58 Italian patients with Gaucher disease type 1. Miocić S, Filocamo M, Dominissini S, Montalvo AL, Vlahovicek K, Deganuto M, Mazzotti R, Cariati R, Bembi B, Pittis MG. Hum Mutat. 2005</ref>
Mutation 3: His - Arg (Pos. 99/60)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) | -1 | 3.0/8.0 | - | Coil | - | - |
SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Neutral; Reliability Index: 5; Expected Accuracy: 89% | Prediction: tolerated; Score: 0.62; Sequence Conservation: 3.04 | Prediction: benign; Score: 0.000; Sensitivity: 1.00; Specificity: 0.00 | Prediction: benign; Score: 0.000; Sensitivity: 1.00; Specificity: 0.00 |
Mutation number 3 is the change from the aromatic, polar and slightly basic Histidine to the aliphatic, polar and very basic Arginine. As both are basic and have the same charge, the exchange may not influence the function of the protein that much.
The values in the PAM substitution matrices are again relatively high, so the change from Histidine to Arginine is common. Only in BLOSUM62 is the value lower. The interesting part here is the alignment: Histidine is not highly conserved, and many sequences have an Arginine at that position. So the proteins of other mammals acquired this change during evolution, and the exchange does not seem to influence the function of the protein. Interestingly, the PSSM value does not reflect this, because at -1 it is too low to indicate a common substitution.
Histidine is part of a coil in the secondary structure and is situated at the exterior of the protein. So it is not in the functional part of the protein and its influence may be low.
SNAP, SIFT and Polyphen2 predict the mutation as neutral, tolerated and benign, respectively. All these results together lead to the assumption that the mutation is neutral. The convincing points here are the alignment and the results of the prediction tools: several sequences in the alignment show a change from Histidine to Arginine, which means that it can be tolerated, and all three tools agree.
The source of the SNP shows exactly what we expected: the mutation is not listed in HGMD and is therefore not associated with Gaucher Disease.
Mutation 4: Arg - Gln (Pos. 159/120)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
1 (worst: -3) | 9 (worst: 0) | 5 (worst: 1) | -2 | 0.0/11.0 | Beta sheet | Beta sheet | Beta sheet | Bend |
SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Non-neutral; Reliability Index: 7; Expected Accuracy: 96% | Prediction: affect protein function; Score: 0.03; Sequence Conservation: 3.01 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 |
Mutation number 4 is the change from the aliphatic, polar and strongly basic Arginine to the aliphatic, polar and neutral Glutamine. According to the PAM substitution matrices the exchange is common, because their values are relatively high. The BLOSUM62 value of 1 is not as high and does not indicate a common change. The low PSSM value also indicates a rare change.
The alignment shows a very high conservation for the best 25 hits. All but two also have an Arginine at that position; the other two show a substitution to Tryptophan, and none has a Glutamine. The low conservation over all sequences is again caused by gaps in the alignment.
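The conservation statement can be checked by counting the residues in the alignment column. The following is a minimal sketch, assuming the column for the best 25 hits has already been extracted as a string of 23 Arginines and 2 Tryptophans, as described in the text. The names `column_profile` and `column_159` are illustrative and not part of the original analysis.

```python
from collections import Counter

def column_profile(column):
    """Return residue -> fraction for one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return {}
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}

# Assumed column for the best 25 hits: 23 x Arg (R) and 2 x Trp (W).
column_159 = "R" * 23 + "W" * 2
profile = column_profile(column_159)

for aa, fraction in sorted(profile.items(), key=lambda item: -item[1]):
    print(f"{aa}: {fraction:.2f}")

# The mutant residue Gln (Q) never occurs in this column.
print("Q observed:", "Q" in profile)
```

A column in which the mutant residue never occurs, as here, supports the reading that the exchange is not tolerated at this position.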
This Arginine is part of a beta sheet and plays an important role in forming hydrogen bonds with the active site, so a mutation should affect the function of the protein. Because the change is from a basic to a neutral residue, the position loses chemical properties that are necessary for the protein's function.
SNAP, SIFT and Polyphen2 all predict the mutation as damaging. SNAP's expected accuracy of 96% is very high, and Polyphen2 reports a score of 1.000.
All these results together indicate a damaging mutation. The chemical properties of the amino acid change, and because the residue plays an important role in forming hydrogen bonds with the active site, the mutation should influence the function of the protein. The prediction tools also classify the mutation as damaging.
In HGMD the mutation is associated with Gaucher Disease 1, which was published as early as 1988. <ref>Gaucher disease type 1: cloning and characterization of a cDNA encoding acid beta-glucosidase from an Ashkenazi Jewish patient. Graves PN, Grabowski GA, Eisner R, Palese P, Smith FI. DNA. 1988 Oct;7(8):521-8.</ref>
Mutation 5: Pro - Leu (Pos. 221/182)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
-3 (worst: -4) | 3 (worst: 0) | 5 (worst: 1) | -5 | 0.0/11.0 | - | Coil | - | - |

SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Non-neutral; Reliability Index: 5; Expected Accuracy: 87% | Prediction: affect protein function; Score: 0.00; Sequence Conservation: 3.01 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 | Prediction: probably damaging; Score: 1.000; Sensitivity: 0.00; Specificity: 1.00 |
Mutation number 5 is the change from the heterocyclic, nonpolar and neutral Proline to the aliphatic, nonpolar and neutral Leucine. As Proline forms a ring structure, the mutation may change the structure of the protein. Apart from that, the polarity and therefore the chemical properties remain the same.
The substitution is not as common as the mutations mentioned before. BLOSUM62 has a very low value of -3, and the PAM matrices also show, with values of 3 and 5, that the substitution is rare. The PSSM value is also very low. The alignment shows a high conservation as well: the sequences found share the Proline at this position. So the mutation may not be tolerated.
The Proline is situated in a coil in the interior of the protein. As the interior is the functional part of the protein, the mutation may not be tolerated, although the residue is not part of the active site.
SNAP, SIFT and Polyphen2 predict the mutation as non-neutral, "affect protein function" and probably damaging, respectively, which is what we expected. All in all we would classify the mutation as damaging.
In HGMD it is also associated with Gaucher Disease 2. The mutation was published in 2004. <ref>Functional analysis of 13 GBA mutant alleles identified in Gaucher disease patients: Pathogenic changes and "modifier" polymorphisms. Montfort M, Chabás A, Vilageliu L, Grinberg D. Hum Mutat. 2004 Jun;23(6):567-75.</ref>
Mutation 6: Ser - Asn (Pos. 310/271)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
1 (worst: -3) | 20 (worst: 0) | 5 (worst: 1) | 2 | 0.0/11.0 | - | Coil | - | Turn |

SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Neutral; Reliability Index: 0; Expected Accuracy: 53% | Prediction: tolerated; Score: 0.54; Sequence Conservation: 3.01 | Prediction: benign; Score: 0.100; Sensitivity: 0.94; Specificity: 0.85 | Prediction: benign; Score: 0.120; Sensitivity: 0.91; Specificity: 0.67 |
Mutation number 6 is the change from the aliphatic, polar and neutral Serine to the aliphatic, polar and neutral Asparagine. They are both zwitterions and share chemical properties. The PAM substitution matrices show relatively high values, so the substitution is common. BLOSUM62 has a lower value, which indicates a rare substitution. The PSSM value is greater than zero, so the mutation may be tolerated.
The conservation is very high: the best 25 hits all have Serine at this position. Among the more distant hits there are also some sequences with a Glycine at this position, but none with an Asparagine.
This position is part of a coil in the exterior of the protein and not near the active site. Therefore the mutation might not influence the function of the protein that much.
SNAP predicts the mutation as neutral, but with an expected accuracy of only 53%. SIFT predicts it as tolerated with a score of 0.54, and Polyphen2 predicts it as benign.
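The SIFT labels in the tables of this section follow directly from the scores. The following is a minimal sketch, assuming the commonly documented SIFT cutoff of 0.05, where scores at or below the cutoff are reported as affecting protein function. The function name `sift_label` is hypothetical; the scores are the ones listed for mutations 4 to 10.

```python
# Assumption: the cutoff commonly documented for SIFT (<= 0.05 is not tolerated).
SIFT_CUTOFF = 0.05

def sift_label(score, cutoff=SIFT_CUTOFF):
    """Map a SIFT score to the label used in the tables of this section."""
    return "affect protein function" if score <= cutoff else "tolerated"

# SIFT scores as listed in the tables, keyed by mutation number.
sift_scores = {4: 0.03, 5: 0.00, 6: 0.54, 7: 0.05, 8: 0.00, 9: 0.12, 10: 0.01}

labels = {mutation: sift_label(score) for mutation, score in sift_scores.items()}
for mutation, label in labels.items():
    print(f"Mutation {mutation}: score {sift_scores[mutation]:.2f} -> {label}")
```

Under this assumption mutation 6 is far from the cutoff, whereas mutation 7 with a score of 0.05 lies exactly on it.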
All these results together would indicate a tolerated mutation. The amino acids have the same properties and even the PSSM value is positive, so we would classify it as a neutral mutation.
But the mutation is associated with Gaucher Disease, as published in 1997.<ref>Identification and expression of acid beta-glucosidase mutations causing severe type 1 and neurologic type 2 Gaucher disease in non-Jewish patients. Grace ME, Desnick RJ, Pastores GM. J Clin Invest. 1997 May 15;99(10):2530-7</ref> So the prediction is wrong: the mutation is not tolerated and changes the function of the protein. Although the prediction scores were not very confident, no tool predicted the mutation as damaging.
Mutation 7: Asn - Ser (Pos. 409/370)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
1 (worst: -4) | 34 (worst: 0) | 8 (worst: 1) | 1 | 0.0/11.0 | Helix | Helix | Helix | Helix |

SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Non-neutral; Reliability Index: 1; Expected Accuracy: 63% | Prediction: affect protein function; Score: 0.05; Sequence Conservation: 3.02 | Prediction: possibly damaging; Score: 0.573; Sensitivity: 0.88; Specificity: 0.91 | Prediction: benign; Score: 0.131; Sensitivity: 0.90; Specificity: 0.68 |
Mutation number 7 is the substitution of the aliphatic, polar and neutral Asparagine by the likewise aliphatic, polar and neutral Serine. This is the most common mutation found in Gaucher disease type 1 patients.
According to the substitution matrices the mutation is very common. PAM1 has a very high value of 34, and PAM250 a high value of 8. Only BLOSUM62 does not show that tendency, with a value of only 1. The PSSM also has a high value for the mutation.
The conservation in the alignment is very high. All sequences show an Asparagine at that position, so there is no evidence that a mutation to Serine would be accepted.
The mutation is situated in a helix in the interior of the protein, so it may affect the protein's function and may therefore not be tolerated.
SNAP predicts the mutation as non-neutral and SIFT as affecting the protein function. Interestingly, Polyphen2 HumDiv predicts it as only possibly damaging and Polyphen2 HumVar even as benign.
These results are hard to interpret. The amino acids have the same properties, the substitution matrices indicate a neutral change and the value in the PSSM is also high. The only signs of a damaging mutation are the position in the interior of the protein and the results of SNAP and SIFT. As two of our prediction tools classify the mutation as damaging and the conservation is very high, we would tend to classify it as damaging. But it is hard to decide whether there are enough reasons to do so.
We know that the mutation is associated with Gaucher Disease 1, which was first published in 1988<ref>Genetic heterogeneity in type 1 Gaucher disease: multiple genotypes in Ashkenazic and non-Ashkenazic individuals. Tsuji S, Martin BM, Barranger JA, Stubblefield BK, LaMarca ME, Ginns EI. Proc Natl Acad Sci U S A. 1988 Apr;85(7):2349-52.</ref> and is supported by many other sources given in HGMD.
Mutation 8: His - Arg (Pos. 350/311)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
0 (worst: -3) | 10 (worst: 0) | 6 (worst: 1) | -3 | 0.0/9.0 | Beta sheet | Beta sheet | - | Bend |

SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Non-neutral; Reliability Index: 8; Expected Accuracy: 96% | Prediction: affect protein function; Score: 0.00; Sequence Conservation: 3.11 | Prediction: probably damaging; Score: 1.0; Sensitivity: 0.00; Specificity: 1.00 | Prediction: probably damaging; Score: 0.999; Sensitivity: 0.08; Specificity: 1.00 |
Mutation number 8 is the substitution of the aromatic, polar and slightly basic Histidine by the aliphatic, polar and strongly basic Arginine. It is situated in the interior of the protein and is part of a beta sheet, so the mutation may cause a change in the protein's function.
According to BLOSUM62 the amino acid replacement is rare, because the value is only 0. The values for PAM1 and PAM250 are higher, at 10 and 6, which would indicate a common substitution. The PSSM value is again very low, so the mutation is rare. The conservation is relatively high: only two sequences show a substitution, to Glutamic acid, and none shows a change to Arginine. Therefore the mutation may not be accepted.
SNAP predicts the mutation as non-neutral, SIFT as "affect protein function" and Polyphen2 as probably damaging.
All these results together point to a damaging mutation. All prediction tools classify it as damaging, and the change from an aromatic to an aliphatic amino acid in a beta sheet also indicates a non-neutral change. So we would classify it as damaging.
The mutation is indeed associated with Gaucher Disease 2, as published in 1999.<ref>Is the perinatal lethal form of Gaucher disease more common than classic type 2 Gaucher disease? Stone DL, van Diggelen OP, de Klerk JB, Gaillard JL, Niermeijer MF, Willemsen R, Tayebi N, Sidransky E. Eur J Hum Genet. 1999 May-Jun;7(4):505-9.</ref> So the prediction is right.
Mutation 9: Met - Val (Pos. 455/416)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
1 (worst: -3) | 17 (worst: 0) | 4 (worst: 1) | -1 | 0.0/11.0 | Helix | Coil/Helix | Helix | Helix |

SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Non-neutral; Reliability Index: 3; Expected Accuracy: 78% | Prediction: tolerated; Score: 0.12; Sequence Conservation: 3.01 | Prediction: probably damaging; Score: 0.999; Sensitivity: 0.14; Specificity: 0.99 | Prediction: probably damaging; Score: 0.980; Sensitivity: 0.54; Specificity: 0.95 |
Mutation number 9 is the substitution of the aliphatic, nonpolar and neutral Methionine by the aliphatic, nonpolar and neutral Valine. The interesting thing is that Methionine contains a sulfur atom and therefore has other chemical properties than Valine. As it is also situated in a helix in an important part of the protein, the mutation may affect the protein's function.
The values in BLOSUM62 and PAM250 are relatively low, so the substitution is rare. Only PAM1 shows a higher value. The PSSM value is also very low. The conservation in the alignment is very high and there is no other amino acid at this position. These are all signs of a non-neutral mutation.
SNAP and Polyphen2 predict the mutation as non-neutral and probably damaging. Only SIFT predicts it as tolerated.
All these results together indicate a non-neutral change. We would classify it as damaging because the substitution is rare and the chemical properties of the amino acid change. The majority of the prediction tools also classify it as damaging.
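The majority argument can be written down as a simple vote. The following is a minimal sketch, assuming each tool's label is first reduced to "damaging" or "neutral" and the two Polyphen2 models are counted as separate votes. The label mapping and the function name `majority_call` are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter

# Assumed mapping: these labels count as a vote for "damaging".
DAMAGING_LABELS = {
    "non-neutral",
    "affect protein function",
    "probably damaging",
    "possibly damaging",
}

def majority_call(predictions):
    """Return 'damaging', 'neutral' or 'undecided' for a dict tool -> label."""
    votes = Counter(
        "damaging" if label.lower() in DAMAGING_LABELS else "neutral"
        for label in predictions.values()
    )
    if votes["damaging"] == votes["neutral"]:
        return "undecided"
    return "damaging" if votes["damaging"] > votes["neutral"] else "neutral"

# Predictions for mutation 9 as listed in the table.
mutation_9 = {
    "SNAP": "Non-neutral",
    "SIFT": "tolerated",
    "Polyphen2 HumDiv": "probably damaging",
    "Polyphen2 HumVar": "probably damaging",
}
print(majority_call(mutation_9))  # damaging (3 votes against 1)
```

Such a vote ignores the reliability values that the tools report, so it is only a rough summary of the table.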
In HGMD it is associated with Gaucher Disease 2, which was published in 2005.<ref>Novel mutations in type 2 Gaucher disease in Chinese and their functional characterization by heterologous expression. Tang NL, Zhang W, Grabowski GA, To KF, Choy FY, Ma SL, Shi HP. Hum Mutat. 2005 Jul;26(1):59-60.</ref> So the prediction of SIFT is wrong, whereas SNAP and Polyphen2 predict what we expected.
Mutation 10: Leu - Pro (Pos. 509/470)
BLOSUM62 | PAM1 | PAM250 | PSSM | Conservation (all/best 25) | Secondary structure (Uniprot) | Psipred | Jpred3 | DSSP |
---|---|---|---|---|---|---|---|---|
-3 (worst: -4) | 2 (worst: 0) | 3 (worst: 1) | -6 | 0.0/9.0 | Beta sheet | Beta sheet | Beta sheet | Bend |

SNAP | SIFT | Polyphen2 HumDiv | Polyphen2 HumVar |
---|---|---|---|
Prediction: Non-neutral; Reliability Index: 1; Expected Accuracy: 63% | Prediction: affect protein function; Score: 0.01; Sequence Conservation: 3.09 | Prediction: probably damaging; Score: 0.978; Sensitivity: 0.75; Specificity: 0.96 | Prediction: probably damaging; Score: 0.966; Sensitivity: 0.59; Specificity: 0.94 |
Mutation number 10 is the change from the aliphatic, nonpolar and neutral Leucine to the heterocyclic, nonpolar and neutral Proline. Proline forms a ring structure, so the change might also influence the protein folding. The residue is part of a beta sheet but lies at the exterior of the protein, so the change may not affect the function of the protein.
The values in the substitution matrices are very low, in BLOSUM62 as well as in the PAM matrices, so it is not a common substitution. The PSSM value is also very low. The conservation is relatively high, although there are some substitutions to Valine and Phenylalanine. This all indicates a non-neutral mutation.
SNAP, SIFT and Polyphen2 predict the mutation as non-neutral and probably damaging, but we do not know whether it really has such an influence on the protein's function. The mutation is only found in dbSNP and not in HGMD.
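Looking back over mutations 4 to 9, which are all listed in HGMD as associated with Gaucher Disease, one can tally how often each tool flagged the mutation. The following is a minimal sketch built from the prediction tables of this section, assuming "possibly damaging" counts as a damaging call. The dictionary layout and variable names are illustrative. Mutation 10 is left out because it is not listed in HGMD, which does not by itself prove that it is neutral.

```python
# True = the tool called the mutation damaging (non-neutral / affect protein
# function / probably or possibly damaging), False = neutral, tolerated or benign.
# Values are transcribed from the prediction tables for mutations 4 to 9.
calls = {
    "SNAP":             {4: True, 5: True, 6: False, 7: True,  8: True, 9: True},
    "SIFT":             {4: True, 5: True, 6: False, 7: True,  8: True, 9: False},
    "Polyphen2 HumDiv": {4: True, 5: True, 6: False, 7: True,  8: True, 9: True},
    "Polyphen2 HumVar": {4: True, 5: True, 6: False, 7: False, 8: True, 9: True},
}

# Count the disease-associated mutations that each tool flagged as damaging.
detected = {tool: sum(per_mutation.values()) for tool, per_mutation in calls.items()}

for tool, hits in detected.items():
    print(f"{tool}: {hits} of {len(calls[tool])} HGMD mutations called damaging")

# Mutations missed by every tool.
missed_by_all = [m for m in range(4, 10) if not any(c[m] for c in calls.values())]
print("Missed by all tools:", missed_by_all)
```

With only six mutations this tally is anecdotal and should not be read as a benchmark of the tools.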