Task 6: MSUD - Sequence-based mutation analysis
Sequence-based mutation analysis
Task description
For this task, the group that reviewed our page last week was supposed to choose 10 SNPs for us to work on. We assume that we do not know which of the SNPs cause MSUD or affect protein structure or function. As we had not received any message from that group by Thursday, we decided to choose the 10 SNPs ourselves: L17F, M82I, Q125E, I213T, C258Y, T310R, A328T, I361V, N404S, R429H
Comparison of wild type AA and mutant AA
Physicochemical properties<ref>http://en.wikipedia.org/wiki/Amino_acid</ref> and predicted secondary structure element (Reprof with PSSM)
Position | wt-AA | wt properties | mutant-AA | mutant properties | expected effect on protein | sec-struct-element( reprof ) |
17 | L | Hydrophobic side chain, non-polar, charge: neutral | F | Hydrophobic side chain, non-polar, charge: neutral | - | L |
82 | M | Hydrophobic side chain, non-polar, charge: neutral | I | Hydrophobic side chain, non-polar, charge: neutral | - | E |
125 | Q | polar, charge: neutral | E | polar, charge: negative | + | L |
213 | I | Hydrophobic side chain, non-polar, charge: neutral | T | polar, charge: neutral | + | H |
258 | C | polar, charge: neutral | Y | polar, charge: neutral | - | L |
310 | T | polar, charge: neutral | R | polar, charge: positive | + | H |
328 | A | non-polar, charge: neutral | T | polar, charge: neutral | - | E |
361 | I | Hydrophobic side chain, non-polar, charge: neutral | V | non-polar, charge: neutral | - | H |
404 | N | polar, charge: neutral | S | polar, charge: neutral | - | L |
429 | R | polar, charge: positive | H | polar, charge: positive (10%), neutral (90%) | + | H |
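The property comparison in the table can be sketched in a few lines of Python. This is only an illustration: the `PROPERTIES` dictionary and the function `changed_properties` are names introduced here, and the classes are simply copied from the table (histidine is stored with the "positive (10%)" label). The sketch lists which properties differ between wild type and mutant. It does not reproduce the "expected effect" column, which also reflects our judgement about side-chain size and shape.

```python
# Side-chain polarity and charge, copied from the table above.
# Names and layout are illustrative, not part of any tool's output.
PROPERTIES = {
    "L": ("non-polar", "neutral"),
    "F": ("non-polar", "neutral"),
    "M": ("non-polar", "neutral"),
    "I": ("non-polar", "neutral"),
    "V": ("non-polar", "neutral"),
    "A": ("non-polar", "neutral"),
    "Q": ("polar", "neutral"),
    "T": ("polar", "neutral"),
    "C": ("polar", "neutral"),
    "Y": ("polar", "neutral"),
    "N": ("polar", "neutral"),
    "S": ("polar", "neutral"),
    "E": ("polar", "negative"),
    "R": ("polar", "positive"),
    "H": ("polar", "positive (10%)"),  # only partly charged, as in the table
}


def changed_properties(snp):
    """Return the names of the properties that differ for an SNP such as 'Q125E'."""
    wt, mut = snp[0], snp[-1]
    names = ("polarity", "charge")
    return [
        name
        for name, wt_value, mut_value in zip(names, PROPERTIES[wt], PROPERTIES[mut])
        if wt_value != mut_value
    ]


if __name__ == "__main__":
    for snp in ["L17F", "M82I", "Q125E", "I213T", "C258Y",
                "T310R", "A328T", "I361V", "N404S", "R429H"]:
        print(snp, changed_properties(snp) or "no change")
```

A substitution with an empty list (for example L17F) can still be harmful through a change in size, so the output is a first filter rather than a prediction.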
Blosum62, PAM1, PAM250 scores
The scores for an amino acid exchange in the BLOSUM and PAM matrices give another hint as to whether the substitution affects the resulting protein. The higher the X in BLOSUMX, the shorter the evolutionary distance between the sequences the matrix was calculated from, while for PAMX the opposite is the case.
Position | wt-AA | mutant-AA | Blosum62-Score<ref>http://www.uky.edu/Classes/BIO/520/BIO520WWW/blosum62.htm</ref> | PAM1-Score<ref>http://www.icp.ucl.ac.be/~opperd/private/pam1.html</ref> | PAM250-Score<ref>http://www.icp.ucl.ac.be/~opperd/private/pam250.html</ref> |
17 | L | F | 0 | 13 | 13 |
82 | M | I | 1 | 5 | 2 |
125 | Q | E | 2 | 27 | 7 |
213 | I | T | -2 | 7 | 4 |
258 | C | Y | -2 | 3 | 4 |
310 | T | R | -1 | 2 | 5 |
328 | A | T | -1 | 32 | 11 |
361 | I | V | 1 | 33 | 9 |
404 | N | S | 1 | 20 | 5 |
429 | R | H | 0 | 10 | 6 |
3D-Structure
We used PyMOL's Mutagenesis Wizard to visualize the individual mutations. In each image the carbon atoms of the original residue are coloured green and those of the mutant residue pink. Oxygen atoms are coloured red, nitrogen blue and sulphur yellow for both residues. The L17F SNP is not displayed because this position is not contained in the PDB file we used (1DTW). The changes that are easiest to spot and look like they might have an effect are C258Y, T310R and R429H. In the C258Y mutation the protein loses a sulphur atom and gains an aromatic ring, while the T310R mutation greatly increases the size of the side chain. In the R429H mutation the protein again gains an aromatic ring.
PSI-BLAST - PSSM
The command we used to create the PSSM file, and the file itself, can be found in the journal.
The underlying database is big_db.
Results:
pos | wt-AA | mut-AA | pssm-score | pssm percent wt | pssm percent mut |
17 | L | F | 1 | 31 | 9 |
82 | M | I | 1 | 28 | 6 |
125 | Q | E | -1 | 97 | 0 |
213 | I | T | -2 | 41 | 2 |
258 | C | Y | -4 | 29 | 0 |
310 | T | R | -3 | 60 | 0 |
328 | A | T | 2 | 59 | 12 |
361 | I | V | 2 | 63 | 13 |
404 | N | S | 2 | 19 | 16 |
429 | R | H | -1 | 26 | 1 |
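The values in this table can be read from the ASCII PSSM file with a small script. The following is a minimal sketch, not the script we used. It assumes the usual NCBI layout, in which each data row holds the position, the residue, 20 scores, 20 observed percentages and two trailing values, with the columns in the order `ARNDCQEGHILKMFPSTWYV`, and it assumes that the fields are separated by whitespace. The function names `parse_pssm_row` and `lookup` are introduced here for illustration.

```python
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"  # assumed column order of the ASCII PSSM


def parse_pssm_row(line):
    """Parse one data row; return None for header, blank or footer lines."""
    fields = line.split()
    # position + residue + 20 scores + 20 percentages = 42 fields at least
    if len(fields) < 42 or not fields[0].isdigit():
        return None
    position, residue = int(fields[0]), fields[1]
    scores = dict(zip(AA_ORDER, map(int, fields[2:22])))
    percents = dict(zip(AA_ORDER, map(int, fields[22:42])))
    return position, residue, scores, percents


def lookup(pssm_lines, snp):
    """Return (score of mutant, percent wild type, percent mutant) for e.g. 'Q125E'."""
    wt, position, mut = snp[0], int(snp[1:-1]), snp[-1]
    for line in pssm_lines:
        row = parse_pssm_row(line)
        if row is None or row[0] != position:
            continue
        _, residue, scores, percents = row
        if residue != wt:
            raise ValueError(
                f"{snp}: PSSM has {residue} at position {position}, expected {wt}"
            )
        return scores[mut], percents[wt], percents[mut]
    raise KeyError(f"position {position} not found in PSSM")
```

Called with the lines of the PSSM file and an SNP name, `lookup` returns the three numbers of one table row. The check against the wild-type residue guards against an offset between the numbering of the query sequence and the numbering used for the SNPs.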
Tools
Comparison
A visual comparison of the predictions can be seen in Figure 1. For the comparison we only used the results from SNAP2. The first observation is that the number of SNPs predicted to have a negative effect varies greatly between the tools: PolyPhen predicts seven of the SNPs to be deleterious, SIFT five, and SNAP2 only four. All three methods predicted Q125E, T310R and C258Y to have a negative effect, which increases our confidence in these predictions. Two further SNPs were predicted to be deleterious by two tools each, L17F by PolyPhen and SIFT and I213T by PolyPhen and SNAP2, which again increases our confidence in these predictions. None of the tools predicted I361V or N404S to be deleterious. R429H was predicted only by SIFT, and M82I and A328T only by PolyPhen, which makes a deleterious effect less likely for these three.
When we add the Reprof prediction (Figure 2), the overlap between the predictions becomes even larger. Only two predictions, M82I and A328T from PolyPhen, are not supported by at least one other method. Two SNPs (L17F and R429H) are supported by exactly two methods, I213T and C258Y by three, and Q125E and T310R by all four.
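The three-tool comparison can be reproduced by counting, for each SNP, how many tools call it deleterious. The sketch below uses the calls from the PolyPhen2, SIFT and SNAP2 tables in this section. PolyPhen's "possibly damaging" is counted as deleterious, as in the text. The names `PREDICTED_DELETERIOUS` and `group_by_support` are introduced here for illustration, and the Reprof-based column is left out.

```python
from collections import Counter

ALL_SNPS = ["L17F", "M82I", "Q125E", "I213T", "C258Y",
            "T310R", "A328T", "I361V", "N404S", "R429H"]

# SNPs called deleterious by each tool, taken from the result tables.
PREDICTED_DELETERIOUS = {
    "PolyPhen2": {"L17F", "M82I", "Q125E", "I213T", "C258Y", "T310R", "A328T"},
    "SIFT": {"L17F", "Q125E", "C258Y", "T310R", "R429H"},
    "SNAP2": {"Q125E", "I213T", "C258Y", "T310R"},
}


def group_by_support(predictions, all_snps):
    """Map the number of supporting tools to the SNPs with that support."""
    votes = Counter()
    for snps in predictions.values():
        votes.update(snps)
    groups = {}
    for snp in all_snps:
        # Counter returns 0 for SNPs that no tool flagged
        groups.setdefault(votes[snp], []).append(snp)
    return groups


if __name__ == "__main__":
    groups = group_by_support(PREDICTED_DELETERIOUS, ALL_SNPS)
    for n_tools in sorted(groups, reverse=True):
        print(n_tools, "tools:", ", ".join(groups[n_tools]))
```

A simple vote count treats all tools as equally reliable and ignores the reliability values each tool reports, so it should be read as a summary of agreement and not as a combined predictor.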
Polyphen2
SNP | polyphen-score | sensitivity | specificity | description |
L17F | 0.995 | 0.68 | 0.97 | PROBABLY DAMAGING |
M82I | 0.468 | 0.89 | 0.90 | POSSIBLY DAMAGING |
Q125E | 0.544 | 0.88 | 0.91 | POSSIBLY DAMAGING |
I213T | 0.644 | 0.87 | 0.91 | POSSIBLY DAMAGING |
C258Y | 1.000 | 0.00 | 1.00 | PROBABLY DAMAGING |
T310R | 1.000 | 0.00 | 1.00 | PROBABLY DAMAGING |
A328T | 0.999 | 0.14 | 0.99 | PROBABLY DAMAGING |
I361V | 0.386 | 0.90 | 0.89 | BENIGN |
N404S | 0.000 | 1.00 | 1.00 | BENIGN |
R429H | 0.002 | 0.99 | 0.30 | BENIGN |
Snap/Snap2
SNP | snap-pred | snap-rel | snap-expected-acc | snap2-pred | snap2-rel | snap2-expected-acc |
L17F | Neutral | 3 | 78% | Neutral | 6 | 82% |
M82I | Neutral | 4 | 85% | Neutral | 4 | 72% |
Q125E | Non-neutral | 1 | 63% | Non-neutral | 4 | 71% |
I213T | Neutral | 3 | 78% | Non-neutral | 6 | 80% |
C258Y | Non-neutral | 2 | 70% | Non-neutral | 6 | 80% |
T310R | Non-neutral | 1 | 63% | Non-neutral | 6 | 80% |
A328T | Neutral | 4 | 85% | Neutral | 1 | 57% |
I361V | Neutral | 5 | 89% | Neutral | 8 | 93% |
N404S | Neutral | 7 | 94% | Neutral | 9 | 97% |
R429H | Neutral | 6 | 92% | Neutral | 4 | 72% |
SIFT
SNP | sift-prediction | sift-score | sift-conservation | sequences represented at pos |
L17F | AFFECT PROTEIN FUNCTION | 0.00 | 4.32 | 1 |
M82I | TOLERATED | 0.26 | 3.12 | 17 |
Q125E | AFFECT PROTEIN FUNCTION | 0.02 | 3.02 | 19 |
I213T | TOLERATED | 0.22 | 3.02 | 19 |
C258Y | AFFECT PROTEIN FUNCTION | 0.02 | 3.02 | 19 |
T310R | AFFECT PROTEIN FUNCTION | 0.01 | 3.02 | 19 |
A328T | TOLERATED | 1.00 | 3.02 | 19 |
I361V | TOLERATED | 0.06 | 3.02 | 19 |
N404S | TOLERATED | 0.99 | 3.02 | 19 |
R429H | AFFECT PROTEIN FUNCTION | 0.03 | 3.21 | 15 |
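The SIFT calls in the table follow directly from the score. As a sketch, the snippet below applies the cutoff of 0.05 that SIFT documents (scores at or below it are called damaging) to the scores from the table. It also flags rows whose median conservation is above 3.25, which to our understanding is the value above which SIFT attaches a low-confidence warning; treat that threshold as an assumption. The names `SIFT_ROWS`, `sift_call` and `low_confidence` are introduced here for illustration.

```python
SIFT_CUTOFF = 0.05               # score <= cutoff: predicted to affect function
MAX_MEDIAN_CONSERVATION = 3.25   # assumed low-confidence threshold, see text

# snp: (sift-score, sift-conservation), copied from the table above
SIFT_ROWS = {
    "L17F": (0.00, 4.32),
    "M82I": (0.26, 3.12),
    "Q125E": (0.02, 3.02),
    "I213T": (0.22, 3.02),
    "C258Y": (0.02, 3.02),
    "T310R": (0.01, 3.02),
    "A328T": (1.00, 3.02),
    "I361V": (0.06, 3.02),
    "N404S": (0.99, 3.02),
    "R429H": (0.03, 3.21),
}


def sift_call(score):
    """Translate a SIFT score into the label used in the table."""
    return "AFFECT PROTEIN FUNCTION" if score <= SIFT_CUTOFF else "TOLERATED"


def low_confidence(conservation):
    """True if the alignment at this position is too uniform to trust the call."""
    return conservation > MAX_MEDIAN_CONSERVATION


if __name__ == "__main__":
    for snp, (score, conservation) in SIFT_ROWS.items():
        note = " (low confidence)" if low_confidence(conservation) else ""
        print(f"{snp}: {sift_call(score)}{note}")
```

Under these assumptions only L17F is flagged, which fits the table: a single sequence is represented at that position, so its score of 0.00 carries little weight. I361V sits just above the cutoff with 0.06.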
Discussion
Given that all three prediction methods agree, we will assume that the mutations Q125E, T310R and C258Y are deleterious. The actual results in HGMD for our SNPs:
References
<references />