Task 6: Sequence-based mutation analysis
Task description
A detailed task description can be found here.
Mutation selection
We selected the following ten mutations:
- I65T
- R71H
- R158Q
- R261Q
- T266A
- P275S
- T278N
- P281L
- G312D
- R408W
Five of these mutations are associated with our disease, phenylketonuria; the other five are not. According to pahDB, the five disease-causing mutations are the most frequent missense/nonsense mutations in people who suffer from phenylketonuria. At this point, however, we do not reveal which mutations are disease-associated and which are not. We will lift this "secret" after our sequence-based mutation analysis in order to validate our in silico predictions.
To keep the association secret, we encrypted the file that contains the disease association of each mutation with the following Linux command: "vim -x selected_mutations_and_association.txt"
The encrypted output is as follows:
The file can only be decrypted with the correct password.
Physicochemical properties and changes
To annotate the physicochemical properties of the amino acids, we used the following Venn diagram from [1]:
In addition, we used the table on this Wikipedia page as a reference.
I65T
I: aliphatic, hydrophobic, large, nonpolar, neutral
T: small, hydrophobic, polar, neutral
R71H
R: polar, aliphatic, positively charged, hydrophobic, large
H: positively charged, polar, aromatic, hydrophobic, basic
R158Q
R: polar, aliphatic, positively charged, hydrophobic, large
Q: polar, aliphatic, hydrophobic, neutral
R261Q
R: polar, aliphatic, positively charged, hydrophobic, large
Q: polar, aliphatic, hydrophobic, neutral
T266A
T: small, hydrophobic, polar, neutral
A: tiny, hydrophobic, nonpolar, neutral
P275S
P: small, proline, nonpolar, neutral
S: tiny, polar, neutral
T278N
T: small, hydrophobic, polar, neutral
N: small, polar, neutral
P281L
P: small, proline, nonpolar, neutral
L: hydrophobic, aliphatic, nonpolar, neutral
G312D
G: very small, nonpolar, neutral
D: polar, small, negatively charged, acidic
R408W
R: polar, aliphatic, positively charged, hydrophobic, large
W: polar, aromatic, large, hydrophobic, neutral
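The property change caused by a substitution can be read off as the difference between the two property sets. The following is a minimal sketch of that comparison; the dictionary covers only four of the mutations, its entries are copied from the annotations above, and the names `PROPERTIES` and `property_changes` are illustrative choices rather than part of our pipeline.

```python
# Property sets copied from the annotations above (subset of residues only).
PROPERTIES = {
    "T": {"small", "hydrophobic", "polar", "neutral"},
    "A": {"tiny", "hydrophobic", "nonpolar", "neutral"},
    "P": {"small", "proline", "nonpolar", "neutral"},
    "S": {"tiny", "polar", "neutral"},
    "N": {"small", "polar", "neutral"},
    "L": {"hydrophobic", "aliphatic", "nonpolar", "neutral"},
}


def property_changes(wild_type, mutant):
    """Return (lost, gained) property sets for a substitution."""
    wt_props = PROPERTIES[wild_type]
    mut_props = PROPERTIES[mutant]
    return wt_props - mut_props, mut_props - wt_props


for mutation in ["T266A", "P275S", "T278N", "P281L"]:
    # First character is the wild-type residue, last character the mutant.
    lost, gained = property_changes(mutation[0], mutation[-1])
    print(mutation, "lost:", sorted(lost), "gained:", sorted(gained))
```

With these annotations, T278N only loses the "hydrophobic" label, whereas P275S changes size class and polarity at once.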
Visualization of the changed Amino Acid
We used the structure 1J8U to visualize our mutations with PyMOL. This structure is only solved from residue 118 to residue 424. Hence, we could not visualize the mutations at positions 65 and 71.
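Which mutations fall inside the solved part of the structure can also be checked programmatically. The sketch below parses the mutation names and compares each position with the residue range 118 to 424 stated above; the helper names `parse_mutation` and `split_by_coverage` are hypothetical.

```python
import re

MUTATIONS = ["I65T", "R71H", "R158Q", "R261Q", "T266A",
             "P275S", "T278N", "P281L", "G312D", "R408W"]

# Residues 118..424 are solved in 1J8U (range end is exclusive).
SOLVED_RANGE = range(118, 425)


def parse_mutation(name):
    """Split a name such as 'R158Q' into ('R', 158, 'Q')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a point mutation: {name!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def split_by_coverage(mutations, solved=SOLVED_RANGE):
    """Return (covered, missing) lists of mutation names."""
    covered, missing = [], []
    for name in mutations:
        _, position, _ = parse_mutation(name)
        (covered if position in solved else missing).append(name)
    return covered, missing


covered, missing = split_by_coverage(MUTATIONS)
print("covered by structure:", covered)
print("not covered:", missing)
```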
I65T
This position is not covered by the structure used.
R71H
This position is not covered by the structure used.
R158Q
The following picture shows the mutation from the wild-type (WT) arginine (R, shown in orange) to the mutant glutamine (Q, shown in yellow) at position 158. The spheres in the background belong to the ligands FE2 and 5,6,7,8-tetrahydrobiopterin (H4B), which are bound at the catalytic site and are required for the reaction.
We observed that the WT has polar contacts to:
- E280
- Y154
- E141
We expect these polar contacts to be lost in the mutant, because the glutamine side chain is shorter than that of arginine and lacks the guanidinium group.
R261Q
Visualization of the mutated residue Q (white) versus the wildtype residue R (green) in the structure of PAH:
T266A
The following picture shows the mutation from the wild-type (WT) threonine (T, shown in orange) to the mutant alanine (A, shown in yellow) at position 266. The spheres in the background belong to the ligands FE2 and 5,6,7,8-tetrahydrobiopterin (H4B), which are bound at the catalytic site and are required for the reaction.
We observed that the WT has polar contacts to:
- E286
We expect this polar contact to be lost in the mutant.
P275S
The following picture shows the mutation from the wild-type (WT) proline (P, shown in orange) to the mutant serine (S, shown in yellow) at position 275. The spheres in the background belong to the ligands FE2 and 5,6,7,8-tetrahydrobiopterin (H4B), which are bound at the catalytic site and are required for the reaction.
We observed that the WT has polar contacts to:
- E270
We do not expect this polar contact to be lost, since it is formed by the backbone and not by the side chain.
T278N
The following picture shows the mutation from the wild-type (WT) threonine (T, shown in orange) to the mutant asparagine (N, shown in yellow) at position 278. The spheres in the background belong to the ligands FE2 and 5,6,7,8-tetrahydrobiopterin (H4B), which are bound at the catalytic site and are required for the reaction.
We observed that the WT has polar contacts to:
- E280
We do not expect this polar contact to be lost, because the mutant residue also has an uncharged polar side chain whose oxygen end points in the same direction as the WT side chain. However, this is only a guess, since we did not calculate the most probable rotamer for the mutant.
P281L
The following picture shows the mutation from the wild-type (WT) proline (P, shown in orange) to the mutant leucine (L, shown in yellow) at position 281. The spheres in the background belong to the ligands FE2 and 5,6,7,8-tetrahydrobiopterin (H4B), which are bound at the catalytic site and are required for the reaction.
We observed that the WT has polar contacts to:
- E268
We do not expect this polar contact to be lost, since it is formed by the backbone of the residue.
G312D
Visualization of the wildtype residue G (blue) in the structure of PAH.
Visualization of the mutated residue D (white) in the structure of PAH.
R408W
Visualization of the mutated residue W (white) versus the wildtype residue R (green) in the structure of PAH.
Mutations compared to BLOSUM62, PAM(1/250), PSSM and conservation of MSA with all mammalian homologs
BLOSUM 62 matrix
BLOSUM 62 Matrix |
---|
The BLOSUM62 matrix was calculated from alignment blocks in which sequences sharing at least 62% identity were clustered. A positive score is given to substitutions that are more likely than expected by chance, while a negative score is given to less likely substitutions. Source: [2] Disclaimer: This file is redistributed from Wikimedia and licensed under a Creative Commons licence. |
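As an illustration of how the matrix is read for our mutations, the following sketch looks up the score of each wild-type/mutant pair and interprets its sign as described above. The nine entries are copied by hand from the standard BLOSUM62 matrix and should be checked against the figure; `BLOSUM62_SUBSET`, `blosum62_score` and `interpret` are names made up for this example.

```python
# Hand-copied BLOSUM62 entries for the residue pairs of our ten mutations.
# BLOSUM62 is symmetric, so each pair is stored as an unordered frozenset.
BLOSUM62_SUBSET = {
    frozenset("IT"): -1,
    frozenset("RH"): 0,
    frozenset("RQ"): 1,
    frozenset("TA"): 0,
    frozenset("PS"): -1,
    frozenset("TN"): 0,
    frozenset("PL"): -3,
    frozenset("GD"): -1,
    frozenset("RW"): -3,
}

MUTATIONS = ["I65T", "R71H", "R158Q", "R261Q", "T266A",
             "P275S", "T278N", "P281L", "G312D", "R408W"]


def blosum62_score(wild_type, mutant):
    """Look up the score for a substitution (order does not matter)."""
    return BLOSUM62_SUBSET[frozenset((wild_type, mutant))]


def interpret(score):
    """Translate the sign of a score into the wording used above."""
    if score > 0:
        return "more likely substitution"
    if score < 0:
        return "less likely substitution"
    return "neutral"


for mutation in MUTATIONS:
    score = blosum62_score(mutation[0], mutation[-1])
    print(f"{mutation}: {score:+d} ({interpret(score)})")
```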
PAM 1/250 matrix
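We took the values for our PAM1 and PAM250 matrices from this page. In both tables, each column stands for the original amino acid and each row for its replacement; the PAM1 entries are multiplied by 10,000 and the PAM250 entries by 100.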
PAM1 Matrix
 | A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A | 9867 | 2 | 9 | 10 | 3 | 8 | 17 | 21 | 2 | 6 | 4 | 2 | 6 | 2 | 22 | 35 | 32 | 0 | 2 | 18 |
R | 1 | 9913 | 1 | 0 | 1 | 10 | 0 | 0 | 10 | 3 | 1 | 19 | 4 | 1 | 4 | 6 | 1 | 8 | 0 | 1 |
N | 4 | 1 | 9822 | 36 | 0 | 4 | 6 | 6 | 21 | 3 | 1 | 13 | 0 | 1 | 2 | 20 | 9 | 1 | 4 | 1 |
D | 6 | 0 | 42 | 9859 | 0 | 6 | 53 | 6 | 4 | 1 | 0 | 3 | 0 | 0 | 1 | 5 | 3 | 0 | 0 | 1 |
C | 1 | 1 | 0 | 0 | 9973 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 5 | 1 | 0 | 3 | 2 |
Q | 3 | 9 | 4 | 5 | 0 | 9876 | 27 | 1 | 23 | 1 | 3 | 6 | 4 | 0 | 6 | 2 | 2 | 0 | 0 | 1 |
E | 10 | 0 | 7 | 56 | 0 | 35 | 9865 | 4 | 2 | 3 | 1 | 4 | 1 | 0 | 3 | 4 | 2 | 0 | 1 | 2 |
G | 21 | 1 | 12 | 11 | 1 | 3 | 7 | 9935 | 1 | 0 | 1 | 2 | 1 | 1 | 3 | 21 | 3 | 0 | 0 | 5 |
H | 1 | 8 | 18 | 3 | 1 | 20 | 1 | 0 | 9912 | 0 | 1 | 1 | 0 | 2 | 3 | 1 | 1 | 1 | 4 | 1 |
I | 2 | 2 | 3 | 1 | 2 | 1 | 2 | 0 | 0 | 9872 | 9 | 2 | 12 | 7 | 0 | 1 | 7 | 0 | 1 | 33 |
L | 3 | 1 | 3 | 0 | 0 | 6 | 1 | 1 | 4 | 22 | 9947 | 2 | 45 | 13 | 3 | 1 | 3 | 4 | 2 | 15 |
K | 2 | 37 | 25 | 6 | 0 | 12 | 7 | 2 | 2 | 4 | 1 | 9926 | 20 | 0 | 3 | 8 | 11 | 0 | 1 | 1 |
M | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 5 | 8 | 4 | 9874 | 1 | 0 | 1 | 2 | 0 | 0 | 4 |
F | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 8 | 6 | 0 | 4 | 9946 | 0 | 2 | 1 | 3 | 28 | 0 |
P | 13 | 5 | 2 | 1 | 1 | 8 | 3 | 2 | 5 | 1 | 2 | 2 | 1 | 1 | 9926 | 12 | 4 | 0 | 0 | 2 |
S | 28 | 11 | 34 | 7 | 11 | 4 | 6 | 16 | 2 | 2 | 1 | 7 | 4 | 3 | 17 | 9840 | 38 | 5 | 2 | 2 |
T | 22 | 2 | 13 | 4 | 1 | 3 | 2 | 2 | 1 | 11 | 2 | 8 | 6 | 1 | 5 | 32 | 9871 | 0 | 2 | 9 |
W | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 9976 | 1 | 0 |
Y | 1 | 0 | 3 | 0 | 3 | 0 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 21 | 0 | 1 | 1 | 2 | 9945 | 1 |
V | 13 | 2 | 1 | 1 | 3 | 2 | 2 | 3 | 3 | 57 | 11 | 1 | 17 | 1 | 3 | 2 | 10 | 0 | 2 | 9901 |
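The PAM1 values for our ten mutations can be extracted from the table above with a few lines of code. This sketch assumes that columns are the original residue and rows the replacement, which is consistent with the columns adding up to about 10,000 (for example, column A sums to exactly 10,000); the function names are illustrative.

```python
ORDER = "ARNDCQEGHILKMFPSTWYV"  # column order of the table above

PAM1_TEXT = """\
A | 9867 | 2 | 9 | 10 | 3 | 8 | 17 | 21 | 2 | 6 | 4 | 2 | 6 | 2 | 22 | 35 | 32 | 0 | 2 | 18 |
R | 1 | 9913 | 1 | 0 | 1 | 10 | 0 | 0 | 10 | 3 | 1 | 19 | 4 | 1 | 4 | 6 | 1 | 8 | 0 | 1 |
N | 4 | 1 | 9822 | 36 | 0 | 4 | 6 | 6 | 21 | 3 | 1 | 13 | 0 | 1 | 2 | 20 | 9 | 1 | 4 | 1 |
D | 6 | 0 | 42 | 9859 | 0 | 6 | 53 | 6 | 4 | 1 | 0 | 3 | 0 | 0 | 1 | 5 | 3 | 0 | 0 | 1 |
C | 1 | 1 | 0 | 0 | 9973 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 5 | 1 | 0 | 3 | 2 |
Q | 3 | 9 | 4 | 5 | 0 | 9876 | 27 | 1 | 23 | 1 | 3 | 6 | 4 | 0 | 6 | 2 | 2 | 0 | 0 | 1 |
E | 10 | 0 | 7 | 56 | 0 | 35 | 9865 | 4 | 2 | 3 | 1 | 4 | 1 | 0 | 3 | 4 | 2 | 0 | 1 | 2 |
G | 21 | 1 | 12 | 11 | 1 | 3 | 7 | 9935 | 1 | 0 | 1 | 2 | 1 | 1 | 3 | 21 | 3 | 0 | 0 | 5 |
H | 1 | 8 | 18 | 3 | 1 | 20 | 1 | 0 | 9912 | 0 | 1 | 1 | 0 | 2 | 3 | 1 | 1 | 1 | 4 | 1 |
I | 2 | 2 | 3 | 1 | 2 | 1 | 2 | 0 | 0 | 9872 | 9 | 2 | 12 | 7 | 0 | 1 | 7 | 0 | 1 | 33 |
L | 3 | 1 | 3 | 0 | 0 | 6 | 1 | 1 | 4 | 22 | 9947 | 2 | 45 | 13 | 3 | 1 | 3 | 4 | 2 | 15 |
K | 2 | 37 | 25 | 6 | 0 | 12 | 7 | 2 | 2 | 4 | 1 | 9926 | 20 | 0 | 3 | 8 | 11 | 0 | 1 | 1 |
M | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 5 | 8 | 4 | 9874 | 1 | 0 | 1 | 2 | 0 | 0 | 4 |
F | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 8 | 6 | 0 | 4 | 9946 | 0 | 2 | 1 | 3 | 28 | 0 |
P | 13 | 5 | 2 | 1 | 1 | 8 | 3 | 2 | 5 | 1 | 2 | 2 | 1 | 1 | 9926 | 12 | 4 | 0 | 0 | 2 |
S | 28 | 11 | 34 | 7 | 11 | 4 | 6 | 16 | 2 | 2 | 1 | 7 | 4 | 3 | 17 | 9840 | 38 | 5 | 2 | 2 |
T | 22 | 2 | 13 | 4 | 1 | 3 | 2 | 2 | 1 | 11 | 2 | 8 | 6 | 1 | 5 | 32 | 9871 | 0 | 2 | 9 |
W | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 9976 | 1 | 0 |
Y | 1 | 0 | 3 | 0 | 3 | 0 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 21 | 0 | 1 | 1 | 2 | 9945 | 1 |
V | 13 | 2 | 1 | 1 | 3 | 2 | 2 | 3 | 3 | 57 | 11 | 1 | 17 | 1 | 3 | 2 | 10 | 0 | 2 | 9901 |
"""


def parse_matrix(text):
    """Parse pipe-separated rows into {row residue: {column residue: value}}."""
    matrix = {}
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        values = [int(value) for value in cells[1:]]
        if len(values) != len(ORDER):
            raise ValueError(f"row {cells[0]} has {len(values)} values")
        matrix[cells[0]] = dict(zip(ORDER, values))
    return matrix


PAM1 = parse_matrix(PAM1_TEXT)


def pam1(original, replacement):
    """Table entry for original -> replacement (column = original)."""
    return PAM1[replacement][original]


for mutation in ["I65T", "R71H", "R158Q", "R261Q", "T266A",
                 "P275S", "T278N", "P281L", "G312D", "R408W"]:
    print(mutation, pam1(mutation[0], mutation[-1]))
```

The same parser could be applied to the PAM250 table, provided every row has exactly 20 values.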
PAM250 Matrix
 | A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A | 13 | 6 | 9 | 9 | 5 | 8 | 9 | 12 | 6 | 8 | 6 | 7 | 7 | 4 | 11 | 11 | 11 | 2 | 4 | 9 |
R | 3 | 17 | 4 | 3 | 2 | 5 | 3 | 2 | 6 | 3 | 2 | 9 | 4 | 1 | 4 | 4 | 3 | 7 | 2 | 2 |
N | 4 | 4 | 6 | 7 | 2 | 5 | 6 | 4 | 6 | 3 | 2 | 5 | 3 | 2 | 4 | 5 | 4 | 2 | 3 | 3 |
D | 5 | 4 | 8 | 11 | 1 | 7 | 10 | 5 | 6 | 3 | 2 | 5 | 3 | 1 | 4 | 5 | 5 | 1 | 2 | 3 |
C | 2 | 1 | 1 | 1 | 52 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 2 | 3 | 2 | 1 | 4 | 2 |
Q | 3 | 5 | 5 | 6 | 1 | 10 | 7 | 3 | 7 | 2 | 3 | 5 | 3 | 1 | 4 | 3 | 3 | 1 | 2 | 3 |
E | 5 | 4 | 7 | 11 | 1 | 9 | 12 | 5 | 6 | 3 | 2 | 5 | 3 | 1 | 4 | 5 | 5 | 1 | 2 | 3 |
G | 12 | 5 | 10 | 10 | 4 | 7 | 9 | 27 | 5 | 5 | 4 | 6 | 5 | 3 | 8 | 11 | 9 | 2 | 3 | 7 |
H | 2 | 5 | 5 | 4 | 2 | 7 | 4 | 2 | 15 | 2 | 2 | 3 | 2 | 2 | 3 | 3 | 2 | 2 | 3 | 2 |
I | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 10 | 6 | 2 | 6 | 5 | 2 | 3 | 4 | 1 | 3 | 9 |
L | 6 | 4 | 4 | 3 | 2 | 6 | 4 | 3 | 5 | 15 | 34 | 4 | 20 | 13 | 5 | 4 | 6 | 6 | 7 | 13 |
K | 6 | 18 | 10 | 8 | 2 | 10 | 8 | 5 | 8 | 5 | 4 | 24 | 9 | 2 | 6 | 8 | 8 | 4 | 3 | 5 |
M | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 2 | 3 | 2 | 6 | 2 | 1 | 1 | 1 | 1 | 1 | 2 |
F | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 3 | 5 | 6 | 1 | 4 | 32 | 1 | 2 | 2 | 4 | 20 | 3 |
P | 7 | 5 | 5 | 4 | 3 | 5 | 4 | 5 | 5 | 3 | 3 | 4 | 3 | 2 | 20 | 6 | 5 | 1 | 2 | 4 |
S | 9 | 6 | 8 | 7 | 7 | 6 | 7 | 9 | 6 | 5 | 4 | 7 | 5 | 3 | 9 | 10 | 9 | 4 | 4 | 6 |
T | 8 | 5 | 6 | 6 | 4 | 5 | 5 | 6 | 4 | 6 | 4 | 6 | 5 | 3 | 6 | 8 | 11 | 2 | 3 | 6 |
W | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 55 | 1 | 0 |
Y | 1 | 1 | 2 | 1 | 3 | 1 | 1 | 1 | 3 | 2 | 2 | 1 | 2 | 15 | 1 | 2 | 2 | 3 | 31 | 2 |
V | 7 | 4 | 4 | 4 | 4 | 4 | 4 | 5 | 4 | 15 | 10 | 4 | 10 | 5 | 5 | 5 | 7 | 2 | 4 | 17 |
PSSM
PSI-BLAST writes a human-readable PSSM file when it is called with the -Q option:
blastpgp -d '/data/nr/nr' -i './reference.fasta' -o './reference_psi_e10E-6_i5.blast' -h 10E-6 -j 5 -Q reference_i5_e10E-6.txt
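To read the score of a given substitution from such a file, the rows can be parsed as sketched below. This assumes the usual ASCII layout of the -Q output, in which each data row starts with the position and the query residue, followed by 20 scores in the column order A R N D C Q E G H I L K M F P S T W Y V. The two rows in `TOY_PSSM` are invented for illustration and are not taken from our reference_i5_e10E-6.txt; the function names are hypothetical.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # assumed column order of the PSSM file

# Invented toy rows (score columns only), NOT real PSI-BLAST output.
TOY_PSSM = """\
Last position-specific scoring matrix computed
           A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
    1 M   -1 -2 -3 -4 -2 -1 -2 -3 -2  1  2 -2  6  0 -3 -2 -1 -2 -1  1
    2 S    1 -1  0  0 -1  0  0  0 -1 -3 -3  0 -2 -3 -1  4  2 -3 -2 -2
"""


def parse_pssm(lines):
    """Map position -> (query residue, {amino acid: score})."""
    pssm = {}
    for line in lines:
        fields = line.split()
        # Data rows start with a position number and a one-letter residue.
        if len(fields) < 22 or not fields[0].isdigit() or len(fields[1]) != 1:
            continue
        scores = [int(value) for value in fields[2:22]]
        pssm[int(fields[0])] = (fields[1], dict(zip(AMINO_ACIDS, scores)))
    return pssm


def mutation_score(pssm, wild_type, position, mutant):
    """Return the PSSM score of the mutant residue at a position."""
    residue, scores = pssm[position]
    if residue != wild_type:
        raise ValueError(f"position {position} is {residue}, not {wild_type}")
    return scores[mutant]


pssm = parse_pssm(TOY_PSSM.splitlines())
print(mutation_score(pssm, "S", 2, "T"))
```

For the real file, `parse_pssm(open("reference_i5_e10E-6.txt"))` would be the corresponding call, with the mutation positions taken from our list.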