Sequence-based mutation analysis GLA
by Benjamin Drexler and Fabian Grandke
Introduction
Selected Mutations
We randomly selected ten annotated point mutations of the human gene GLA from a pool that consists of two subsets. The first subset contains mutations that are present in HGMD; these were already gathered in task 4, Mapping SNPs. The second subset contains mutations that are present in dbSNP but not in HGMD, which was the case for only three mutations.
Mutations at amino acid positions 1 to 31 were excluded from the selection, because these residues form the signal peptide (see UniProt entry) and are not present in the reference structure (PDB ID 1R47).
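The selection step described above can be sketched as follows. This is only an illustration under stated assumptions: the pool format (a list of dictionaries with a `position` key), the function name `select_mutations` and the placeholder pool are hypothetical and do not reproduce the actual HGMD/dbSNP data.

```python
import random

SIGNAL_PEPTIDE_END = 31  # positions 1-31 belong to the signal peptide


def select_mutations(pool, n=10, seed=None):
    """Randomly pick n mutations that lie outside the signal peptide."""
    eligible = [m for m in pool if m["position"] > SIGNAL_PEPTIDE_END]
    if len(eligible) < n:
        raise ValueError("not enough eligible mutations in the pool")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    chosen = rng.sample(eligible, n)  # sampling without replacement
    return sorted(chosen, key=lambda m: m["position"])


# Placeholder pool (NOT real data): one entry per position 1..100.
example_pool = [
    {"position": p, "source": "dbSNP" if p % 4 == 0 else "HGMD"}
    for p in range(1, 101)
]
selected = select_mutations(example_pool, n=10, seed=0)
```

Filtering before sampling guarantees that no signal-peptide position can be drawn, instead of redrawing whenever such a position comes up.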
Number | AA-Position | Codon change | Amino acid change | Visualization |
---|---|---|---|---|
1 | 42 | ATG-ACG | Met -> Thr | |
2 | 65 | AGT-ACG | Ser -> Thr | |
3 | 117 | ATT-AGT | Ile -> Ser | |
4 | 143 | cGCA-ACA | Ala -> Thr | |
5 | 186 | CAC-CGC | His -> Arg | |
6 | 205 | gCCT-ACT | Pro -> Thr | |
7 | 244 | gGAC-CAC | Asp -> His | |
8 | 283 | CAG-CCG | Gln -> Pro | |
9 | 321 | tCAG-TAG | Gln -> Glu | |
10 | 363 | TATa-TAA | Tyr -> Cys |
The visualization was done using PyMOL, and the mutagenesis of the residue was performed according to this tutorial. The wild-type residue is colored green and the mutated residue is colored red.
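The amino acid change that follows from a codon change in the table can be derived with the standard genetic code. The sketch below is a minimal example, assuming that lowercase letters in entries such as `cGCA-ACA` are flanking bases outside the codon; the function name `amino_acid_change` is hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_change(codon_change):
    """Translate 'WILD-MUTANT' into (wild-type aa, mutant aa), one-letter code."""
    wild, mutant = codon_change.split("-")
    # Assumption: lowercase letters mark flanking bases and are dropped.
    wild = "".join(base for base in wild if base.isupper())
    mutant = "".join(base for base in mutant if base.isupper())
    if len(wild) != 3 or len(mutant) != 3:
        raise ValueError(f"cannot extract two codons from {codon_change!r}")
    return CODON_TABLE[wild], CODON_TABLE[mutant]


print(amino_acid_change("ATG-ACG"))   # position 42
print(amino_acid_change("cGCA-ACA"))  # position 143
```

A stop codon is reported as `*`, so a nonsense mutation would be visible in the result.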
Mutation Analysis
Physicochemical Properties and Changes
Substitution Matrices
In this section, we use substitution matrices to evaluate whether the substitution introduced by each mutation is favorable in a biological context. For this, we use two different kinds of substitution matrices. First, Blocks of Amino Acid Substitution Matrix (BLOSUM) is an evidence-based matrix that is calculated from alignments between proteins <ref name=blosum>[1]</ref>. Second, Point Accepted Mutation or Percent Accepted Mutation (PAM) is a set of matrices that is derived from the amino acid substitutions between closely related proteins <ref name=pam>[2]</ref>. In general, a higher value in a substitution matrix indicates a more likely substitution.
Number | Substitution | BLOSUM62 Mutation | BLOSUM62 Best¹ | BLOSUM62 Worst² | PAM1 Mutation | PAM1 Best | PAM1 Worst | PAM250 Mutation | PAM250 Best | PAM250 Worst |
---|---|---|---|---|---|---|---|---|---|---|
1 | Met -> Thr | -1 | 2 | -3 | 2 | 8 | 0 | 1 | 3 | 0 |
2 | Ser -> Thr | 1 | 1 | -3 | 38 | 38 | 0 | 9 | 9 | 3 |
3 | Ile -> Ser | -2 | 2 | -4 | 1 | 33 | 0 | 3 | 9 | 1 |
4 | Ala -> Thr | -1 | 1 | -3 | 32 | 35 | 0 | 11 | 12 | 2 |
5 | His -> Arg | 0 | 2 | -3 | 8 | 20 | 0 | 5 | 7 | 2 |
6 | Pro -> Thr | 1 | 1 | -4 | 4 | 13 | 0 | 5 | 7 | 1 |
7 | Asp -> His | -1 | 2 | -4 | 4 | 53 | 0 | 6 | 10 | 1 |
8 | Gln -> Pro | -1 | 2 | -3 | 6 | 27 | 0 | 4 | 7 | 1 |
9 | Gln -> Glu | 2 | 2 | -3 | 27 | 27 | 0 | 7 | 7 | 1 |
10 | Arg -> Cys | -3 | 2 | -3 | 1 | 19 | 0 | 2 | 9 | 1 |
1 Best is the highest value in the respective column/row, excluding the self-substitution (e.g. Met -> Met).
2 Worst is the lowest value in the respective column/row.
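The Best and Worst values defined in the footnotes can be read from a matrix row as sketched below. The methionine row is transcribed from the standard BLOSUM62 matrix and should be checked against the matrix file that was actually used; the function name `mutation_best_worst` is hypothetical.

```python
AMINO_ACID_ORDER = "ARNDCQEGHILKMFPSTWYV"

# BLOSUM62 row for methionine, in the column order given above
# (transcribed by hand; verify against the matrix file in use).
BLOSUM62_MET = [-1, -1, -2, -3, -1, 0, -2, -3, -2, 1,
                2, -1, 5, 0, -2, -1, -1, -1, -1, 1]


def mutation_best_worst(row, wild_type, mutant):
    """Return (mutation score, best score, worst score) for one matrix row."""
    scores = dict(zip(AMINO_ACID_ORDER, row))
    # Best: highest value in the row, ignoring the self-substitution.
    best = max(s for aa, s in scores.items() if aa != wild_type)
    # Worst: lowest value in the row.
    worst = min(scores.values())
    return scores[mutant], best, worst


print(mutation_best_worst(BLOSUM62_MET, "M", "T"))
```

For Met -> Thr this yields the same three values as the BLOSUM62 columns of row 1 in the table.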
The following coloring scheme was applied:
- green: the substitution value of the mutation is closer to the best value than to the worst value
- red: the substitution value of the mutation is closer to the worst value than to the best value
- gray: the substitution value of the mutation has the same absolute difference to both values
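The coloring rule above amounts to comparing two absolute differences. A minimal sketch, with the hypothetical function name `color_for`, applied to a few rows of the table:

```python
def color_for(mutation, best, worst):
    """Apply the coloring scheme: compare the distance to best and to worst."""
    distance_to_best = abs(best - mutation)
    distance_to_worst = abs(mutation - worst)
    if distance_to_best < distance_to_worst:
        return "green"
    if distance_to_worst < distance_to_best:
        return "red"
    return "gray"  # same absolute difference to both values


# (mutation, best, worst) taken from the BLOSUM62 columns of the table.
for label, values in [
    ("Met -> Thr", (-1, 2, -3)),
    ("Ser -> Thr", (1, 1, -3)),
    ("Asp -> His", (-1, 2, -4)),
]:
    print(label, color_for(*values))
```

The colors printed here follow from the rule as stated and the tabulated values.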
The following substitution matrices were used:
PSSM
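Position | Amino acid | A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V | A (%) | R (%) | N (%) | D (%) | C (%) | Q (%) | E (%) | G (%) | H (%) | I (%) | L (%) | K (%) | M (%) | F (%) | P (%) | S (%) | T (%) | W (%) | Y (%) | V (%) | Information | Relative weight | |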
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
42 | M | -3 | -2 | -5 | -6 | -4 | -3 | -5 | -6 | -5 | 1 | 1 | -4 | 9 | -1 | -5 | -4 | -3 | -5 | -4 | 2 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 12 | 1 | 58 | 2 | 0 | 0 | 1 | 0 | 0 | 15 | 1.55 | 1.56 | |
65 | S | -2 | -1 | 5 | -1 | -3 | -2 | -2 | -2 | -1 | -3 | -3 | -1 | -3 | 0 | -2 | 3 | 3 | 0 | -1 | -3 | 2 | 2 | 28 | 3 | 0 | 1 | 2 | 2 | 1 | 1 | 2 | 3 | 0 | 4 | 2 | 20 | 20 | 1 | 2 | 1 | 0.59 | 1.27 | |
117 | I | -4 | -5 | -5 | -6 | -4 | -4 | -5 | -5 | -5 | 4 | 4 | -5 | 5 | 3 | -2 | -3 | -4 | 1 | -1 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 25 | 35 | 0 | 17 | 11 | 2 | 2 | 0 | 2 | 2 | 1 | 0.93 | 1.42 | |
143 | A | 1 | 1 | 0 | 2 | 1 | 0 | -1 | 0 | 2 | -1 | -2 | -1 | 2 | 0 | 1 | -1 | -1 | -5 | 0 | -1 | 14 | 7 | 4 | 11 | 3 | 5 | 5 | 6 | 5 | 3 | 4 | 3 | 5 | 5 | 6 | 4 | 4 | 0 | 3 | 3 | 0.12 | 1.86 | |
186 | H | 2 | 1 | 1 | -1 | -5 | 1 | 1 | -2 | 0 | -1 | -1 | 0 | 1 | -1 | 1 | -2 | -1 | -5 | 1 | -1 | 17 | 9 | 6 | 3 | 0 | 7 | 9 | 3 | 3 | 3 | 5 | 5 | 3 | 2 | 8 | 3 | 4 | 0 | 6 | 5 | 0.16 | 1.91 | |
205 | P | -1 | -2 | 1 | 1 | -5 | -1 | 2 | 3 | 1 | -5 | -2 | -2 | -4 | -1 | 2 | 0 | -2 | -5 | -1 | -4 | 6 | 2 | 7 | 7 | 0 | 2 | 14 | 24 | 4 | 0 | 5 | 2 | 0 | 3 | 12 | 7 | 2 | 0 | 2 | 1 | 0.38 | 1.69 | |
244 | D | -2 | -2 | 3 | 4 | 0 | -2 | 0 | 0 | 1 | 0 | -1 | -1 | 1 | -3 | 0 | -1 | -1 | -5 | 1 | -2 | 3 | 2 | 13 | 23 | 2 | 2 | 5 | 7 | 3 | 6 | 5 | 4 | 3 | 1 | 6 | 4 | 3 | 0 | 6 | 3 | 0.31 | 1.95 | |
283 | Q | -2 | 2 | -2 | -3 | -5 | 3 | -2 | -4 | 8 | -2 | -1 | -2 | 1 | -3 | -5 | -2 | -2 | 0 | -1 | -1 | 4 | 11 | 1 | 1 | 0 | 15 | 2 | 1 | 37 | 2 | 8 | 2 | 4 | 1 | 0 | 2 | 3 | 1 | 2 | 4 | 0.96 | 1.70 | |
321 | Q | -2 | 1 | -2 | -2 | -5 | 7 | 1 | -4 | -2 | -4 | -3 | 1 | -2 | -5 | -3 | -2 | -3 | 1 | -3 | -4 | 2 | 6 | 1 | 1 | 0 | 62 | 7 | 1 | 1 | 1 | 2 | 8 | 1 | 0 | 1 | 2 | 1 | 2 | 1 | 1 | 1.29 | 1.54 | |
363 | R | 0 | 2 | -2 | -3 | -1 | 3 | -1 | -4 | -1 | 1 | 0 | -1 | 3 | 0 | -3 | 0 | -1 | -4 | 2 | 1 | 7 | 13 | 1 | 1 | 1 | 12 | 4 | 1 | 1 | 8 | 7 | 4 | 6 | 3 | 1 | 6 | 4 | 0 | 7 | 11 | 0.24 | 1.6 |
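To compare a wild-type residue with its mutant in the PSSM, one row of the table can be parsed as sketched below. The column layout is an assumption based on the PSI-BLAST ASCII PSSM format (20 scores, then 20 percentages, then two summary values); the function name `parse_pssm_row` is hypothetical.

```python
AMINO_ACID_ORDER = "ARNDCQEGHILKMFPSTWYV"

# Row for position 42, copied from the table above.
ROW_42 = ("42 | M | -3 | -2 | -5 | -6 | -4 | -3 | -5 | -6 | -5 | 1 | 1 | -4 | 9 "
          "| -1 | -5 | -4 | -3 | -5 | -4 | 2 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 "
          "| 6 | 12 | 1 | 58 | 2 | 0 | 0 | 1 | 0 | 0 | 15 | 1.55 | 1.56 | |")


def parse_pssm_row(line):
    """Split a pipe-delimited PSSM row into position, residue, scores, percentages."""
    cells = [cell.strip() for cell in line.split("|")]
    cells = [cell for cell in cells if cell]  # drop empty trailing cells
    position, wild_type = int(cells[0]), cells[1]
    # Assumed layout: cells 2-21 are scores, cells 22-41 are percentages.
    scores = dict(zip(AMINO_ACID_ORDER, map(int, cells[2:22])))
    percentages = dict(zip(AMINO_ACID_ORDER, map(int, cells[22:42])))
    return position, wild_type, scores, percentages


position, wild_type, scores, percentages = parse_pssm_row(ROW_42)
mutant = "T"  # Met -> Thr at position 42
print(position, wild_type, scores[wild_type], scores[mutant])
print(percentages[wild_type], percentages[mutant])
```

A strongly negative score and a low percentage for the mutant residue, next to a high score for the wild type, suggest that the substitution is rarely observed at that position.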
Multiple Sequence Alignment
The multiple sequence alignment of 100 sequences was created by T-Coffee in Jalview.
Number | Position | Conservation all (Jalview) | Conservation best 25 (Jalview) |
---|---|---|---|
1 | 42 | 10 | 11 |
2 | 65 | 8 | 11 |
3 | 117 | 9 | 9 |
4 | 143 | 10 | 8 |
5 | 186 | 4 | 2 |
6 | 205 | 10 | 11 |
7 | 244 | 10 | 11 |
8 | 283 | 11 | 11 |
9 | 321 | 11 | 11 |
10 | 363 | 3 | 5 |
A table of the sequences is provided on this page.
Secondary Structure
Programs
SNAP
SIFT
PolyPhen
Discussion
References
<references />