Task 6: MSUD - Sequence-based mutation analysis

From Bioinformatikpedia
Revision as of 19:13, 16 June 2012 by Wagnerr

Sequence-based mutation analysis

Task description

For this task, the group that reviewed our page last week was supposed to choose 10 SNPs for us to work on. We assume that we do not know which of the SNPs cause MSUD or affect the protein structure or function. As we had not received any message from that group by Thursday, we decided to choose the 10 SNPs ourselves: L17F, M82I, Q125E, I213T, C258Y, T310R, A328T, I361V, R429H, N404S

Comparison of wild type AA and mutant AA

Physicochemical properties<ref>http://en.wikipedia.org/wiki/Amino_acid</ref> and predicted secondary structure element (reprof with PSSM)

{| class="wikitable"
! Position !! wt-AA !! wt properties !! mutant-AA !! mutant properties !! expected effect on protein !! sec-struct-element (reprof)
|-
| 17 || L || Hydrophobic side chain, non-polar, charge: neutral || F || Hydrophobic side chain, non-polar, charge: neutral || - || L
|-
| 82 || M || Hydrophobic side chain, non-polar, charge: neutral || I || Hydrophobic side chain, non-polar, charge: neutral || - || E
|-
| 125 || Q || polar, charge: neutral || E || polar, charge: negative || + || L
|-
| 213 || I || Hydrophobic side chain, non-polar, charge: neutral || T || polar, charge: neutral || + || H
|-
| 258 || C || polar, charge: neutral || Y || polar, charge: neutral || - || L
|-
| 310 || T || polar, charge: neutral || R || polar, charge: positive || + || H
|-
| 328 || A || non-polar, charge: neutral || T || polar, charge: neutral || - || E
|-
| 361 || I || Hydrophobic side chain, non-polar, charge: neutral || V || non-polar, charge: neutral || - || H
|-
| 404 || N || polar, charge: neutral || S || polar, charge: neutral || - || L
|-
| 429 || R || polar, charge: positive || H || polar, charge: positive (10%), neutral (90%) || + || H
|}

Blosum62, PAM1, PAM250 scores

The scores for an amino acid exchange in the Blosum and PAM matrices should give another hint as to whether the substitution has an effect on the resulting protein or not. The higher the X in BlosumX, the shorter the evolutionary distance between the sequences it was calculated from, while for PAMX the opposite is the case.

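{| class="wikitable"
! Position !! wt-AA !! mutant-AA !! Blosum62-Score<ref>http://www.uky.edu/Classes/BIO/520/BIO520WWW/blosum62.htm</ref> !! PAM1-Score<ref>http://www.icp.ucl.ac.be/~opperd/private/pam1.html</ref> !! PAM250-Score<ref>http://www.icp.ucl.ac.be/~opperd/private/pam250.html</ref>
|-
| 17 || L || F || 0 || 13 || 13
|-
| 82 || M || I || 1 || 5 || 2
|-
| 125 || Q || E || 2 || 27 || 7
|-
| 213 || I || T || -2 || 7 || 4
|-
| 258 || C || Y || -2 || 3 || 4
|-
| 310 || T || R || -1 || 2 || 5
|-
| 328 || A || T || -1 || 32 || 11
|-
| 361 || I || V || 1 || 33 || 9
|-
| 404 || N || S || 1 || 20 || 5
|-
| 429 || R || H || 0 || 10 || 6
|}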
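To illustrate why a larger X in PAMX corresponds to a longer evolutionary distance, here is a minimal sketch assuming the usual construction in which the PAMX mutation probability matrix is the X-th power of PAM1. The three-letter alphabet and all numbers in `toy_pam1` are invented for illustration; they are not taken from the real PAM matrices or from the tables linked above.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(matrix, power):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    n = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in matrix]
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Hypothetical one-step mutation probabilities for a toy 3-letter alphabet.
# Entry [i][j] is the probability that letter i is replaced by letter j;
# each row sums to 1. These values are made up.
toy_pam1 = [
    [0.990, 0.007, 0.003],
    [0.006, 0.990, 0.004],
    [0.002, 0.008, 0.990],
]

toy_pam250 = mat_pow(toy_pam1, 250)

for label, matrix in (("toy PAM1", toy_pam1), ("toy PAM250", toy_pam250)):
    print(label)
    for row in matrix:
        print("  " + "  ".join(f"{p:.3f}" for p in row))
```

In the toy PAM1 almost all probability sits on the diagonal, while after 250 steps the off-diagonal entries are much larger. This is the sense in which PAM250 describes more distantly related sequences than PAM1.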

3D-Structure

We used PyMOL's Mutagenesis Wizard to visualize the individual mutations. In each image the carbon atoms of the original residue are coloured green and those of the SNP are coloured pink. Oxygen atoms are coloured red, nitrogen atoms blue and sulphur atoms yellow for both residues. The L17F SNP is not displayed because residue 17 is not contained in the PDB file we used (1DTW). The changes that are easiest to spot and look like they might have an effect are C258Y, T310R and R429H. In the C258Y mutation the protein loses a sulphur atom and gains an aromatic ring, while the T310R mutation greatly increases the size of the amino acid. In the R429H mutation the protein again gains an aromatic ring.

M82I
Q125E
I213T
C258Y
T310R
A328T
I361V
N404S
R429H

PSI-BLAST - PSSM

The command we used to create the PSSM file, and the file itself, can be found in the journal.

The underlying database is big_db.

Results:

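{| class="wikitable"
! pos !! wt-AA !! mut-AA !! pssm-score !! pssm percent wt !! pssm percent mut
|-
| 17 || L || F || 1 || 31 || 9
|-
| 82 || M || I || 1 || 28 || 6
|-
| 125 || Q || E || -1 || 97 || 0
|-
| 213 || I || T || -2 || 41 || 2
|-
| 258 || C || Y || -4 || 29 || 0
|-
| 310 || T || R || -3 || 60 || 0
|-
| 328 || A || T || 2 || 59 || 12
|-
| 361 || I || V || 2 || 63 || 13
|-
| 429 || R || H || -1 || 26 || 1
|-
| 404 || N || S || 2 || 19 || 16
|}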
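The three PSSM columns above can be read from a single row of the ASCII PSSM file. The following is a minimal sketch, assuming the usual PSI-BLAST layout in which each row holds the position, the wild type residue, 20 log-odds scores and 20 weighted observed percentages in the column order `ARNDCQEGHILKMFPSTWYV`, all separated by whitespace. The helper names and the values in `SAMPLE_ROW` are invented for illustration and do not come from our PSSM file.

```python
# Residue order of the score and percentage columns in an ASCII PSSM file.
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"


def parse_pssm_row(line):
    """Split one PSSM row into position, residue, scores and percentages."""
    tokens = line.split()
    position = int(tokens[0])
    residue = tokens[1]
    scores = dict(zip(AA_ORDER, (int(t) for t in tokens[2:22])))
    percents = dict(zip(AA_ORDER, (int(t) for t in tokens[22:42])))
    return position, residue, scores, percents


def snp_columns(line, mutant):
    """Return the score of the mutant residue and both observed percentages."""
    position, wild_type, scores, percents = parse_pssm_row(line)
    return {
        "pos": position,
        "wt": wild_type,
        "mut": mutant,
        "score": scores[mutant],
        "percent_wt": percents[wild_type],
        "percent_mut": percents[mutant],
    }


# Hypothetical row (made-up numbers): position 1, wild type M.
SAMPLE_ROW = (
    "    1 M    -1  -2  -3  -4  -2  -1  -3  -3  -2   1   2  -2   7   0  -3"
    "  -2  -1  -2  -1   1    0   0   0   0   0   0   0   0   0   6  12   0"
    "  78   0   0   0   0   0   0   4  0.55 0.12"
)

print(snp_columns(SAMPLE_ROW, "I"))
```

Splitting on whitespace is an assumption that holds as long as neighbouring columns do not run together; a fixed-width parser would be more robust for files where they do.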

Tools

Comparison

Figure 1. Venn diagram of the SNPs that PolyPhen, SIFT and SNAP predict to have a negative effect on protein functionality.

A visual comparison of the predictions is shown in Figure 1. For the comparison we used only the SNAP2 results. The first observation is that the number of SNPs predicted to have a negative effect varies greatly between the tools: PolyPhen predicts seven of the SNPs to be deleterious, SIFT five and SNAP only four. All three methods predict Q125E, T310R and C258Y to have a negative effect, which increases our confidence in these predictions. L17F is predicted to be deleterious by PolyPhen and SIFT, and I213T by PolyPhen and SNAP, which again increases our confidence in these predictions. None of the tools predicts I361V or N404S to be deleterious. R429H (SIFT only), M82I and A328T (both PolyPhen only) are each predicted by a single tool, which makes a deleterious effect less likely.

Figure 2. Venn diagram of the SNPs that PolyPhen, SIFT, SNAP and Reprof predict to have a negative effect on protein functionality.

When we add the Reprof prediction (Figure 2), the overlap between the predictions becomes even larger. Only two predictions, M82I and A328T from PolyPhen, are not supported by at least one other method. Two SNPs (L17F and R429H) are supported by exactly two methods, I213T and C258Y by three, and Q125E and T310R by all four.
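The overlaps described above can be recomputed from the per-tool calls. The following is a minimal sketch that uses the calls from the PolyPhen2, SIFT and SNAP2 tables on this page. For the fourth vote it assumes that the "+" entries of the "expected effect on protein" column in the first table are what Figure 2 counts as the Reprof prediction; this is our reading, and it reproduces the counts given in the text.

```python
ALL_SNPS = ["L17F", "M82I", "Q125E", "I213T", "C258Y",
            "T310R", "A328T", "I361V", "N404S", "R429H"]

# SNPs each tool predicts to be damaging / non-neutral (from the tables on this page).
three_tools = {
    "PolyPhen2": {"L17F", "M82I", "Q125E", "I213T", "C258Y", "T310R", "A328T"},
    "SIFT": {"L17F", "Q125E", "C258Y", "T310R", "R429H"},
    "SNAP2": {"Q125E", "I213T", "C258Y", "T310R"},
}

# Assumption: the "+" rows of the expected-effect column act as the fourth vote.
four_methods = dict(three_tools)
four_methods["expected effect"] = {"Q125E", "I213T", "T310R", "R429H"}


def count_votes(calls, snps):
    """Count for every SNP how many methods call it damaging."""
    return {snp: sum(snp in called for called in calls.values()) for snp in snps}


votes3 = count_votes(three_tools, ALL_SNPS)
votes4 = count_votes(four_methods, ALL_SNPS)

for snp in ALL_SNPS:
    print(f"{snp}: {votes3[snp]} of 3 tools, {votes4[snp]} of 4 methods")
```

Counting votes in this way treats all methods as equally reliable, which is a simplification; the reliability indices reported by SNAP could be used as weights instead.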

Polyphen2

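{| class="wikitable"
! SNP !! polyphen-score !! sensitivity !! specificity !! description
|-
| L17F || 0.995 || 0.68 || 0.97 || PROBABLY DAMAGING
|-
| M82I || 0.468 || 0.89 || 0.90 || POSSIBLY DAMAGING
|-
| Q125E || 0.544 || 0.88 || 0.91 || POSSIBLY DAMAGING
|-
| I213T || 0.644 || 0.87 || 0.91 || POSSIBLY DAMAGING
|-
| C258Y || 1.000 || 0.00 || 1.00 || PROBABLY DAMAGING
|-
| T310R || 1.000 || 0.00 || 1.00 || PROBABLY DAMAGING
|-
| A328T || 0.999 || 0.14 || 0.99 || PROBABLY DAMAGING
|-
| I361V || 0.386 || 0.90 || 0.89 || BENIGN
|-
| N404S || 0.000 || 1.00 || 1.00 || BENIGN
|-
| R429H || 0.002 || 0.99 || 0.30 || BENIGN
|}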

Snap/Snap2

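{| class="wikitable"
! SNP !! snap-pred !! snap-rel !! snap-expected-acc !! snap2-pred !! snap2-rel !! snap2-expected-acc
|-
| L17F || Neutral || 3 || 78% || Neutral || 6 || 82%
|-
| M82I || Neutral || 4 || 85% || Neutral || 4 || 72%
|-
| Q125E || Non-neutral || 1 || 63% || Non-neutral || 4 || 71%
|-
| I213T || Neutral || 3 || 78% || Non-neutral || 6 || 80%
|-
| C258Y || Non-neutral || 2 || 70% || Non-neutral || 6 || 80%
|-
| T310R || Non-neutral || 1 || 63% || Non-neutral || 6 || 80%
|-
| A328T || Neutral || 4 || 85% || Neutral || 1 || 57%
|-
| I361V || Neutral || 5 || 89% || Neutral || 8 || 93%
|-
| N404S || Neutral || 7 || 94% || Neutral || 9 || 97%
|-
| R429H || Neutral || 6 || 92% || Neutral || 4 || 72%
|}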

SIFT

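{| class="wikitable"
! SNP !! sift-prediction !! sift-score !! sift-conservation !! sequences represented at pos
|-
| L17F || AFFECT PROTEIN FUNCTION || 0.00 || 4.32 || 1
|-
| M82I || TOLERATED || 0.26 || 3.12 || 17
|-
| Q125E || AFFECT PROTEIN FUNCTION || 0.02 || 3.02 || 19
|-
| I213T || TOLERATED || 0.22 || 3.02 || 19
|-
| C258Y || AFFECT PROTEIN FUNCTION || 0.02 || 3.02 || 19
|-
| T310R || AFFECT PROTEIN FUNCTION || 0.01 || 3.02 || 19
|-
| A328T || TOLERATED || 1.00 || 3.02 || 19
|-
| I361V || TOLERATED || 0.06 || 3.02 || 19
|-
| N404S || TOLERATED || 0.99 || 3.02 || 19
|-
| R429H || AFFECT PROTEIN FUNCTION || 0.03 || 3.21 || 15
|}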
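The SIFT calls above follow from the score column. The following is a minimal sketch, assuming the commonly documented SIFT cutoff of 0.05 (a substitution is reported as affecting protein function if its score is at or below the cutoff). The second constant, a median conservation above 3.25 as a sign of a low-confidence prediction, is an assumption taken from our recollection of the SIFT documentation and should be checked there.

```python
SIFT_CUTOFF = 0.05                  # scores <= cutoff: "AFFECT PROTEIN FUNCTION"
LOW_CONFIDENCE_CONSERVATION = 3.25  # assumption, see the text above

# (sift-score, sift-conservation) per SNP, copied from the table on this page.
sift_rows = {
    "L17F": (0.00, 4.32),
    "M82I": (0.26, 3.12),
    "Q125E": (0.02, 3.02),
    "I213T": (0.22, 3.02),
    "C258Y": (0.02, 3.02),
    "T310R": (0.01, 3.02),
    "A328T": (1.00, 3.02),
    "I361V": (0.06, 3.02),
    "N404S": (0.99, 3.02),
    "R429H": (0.03, 3.21),
}


def sift_call(score, cutoff=SIFT_CUTOFF):
    """Translate a SIFT score into the two prediction labels."""
    return "AFFECT PROTEIN FUNCTION" if score <= cutoff else "TOLERATED"


calls = {snp: sift_call(score) for snp, (score, _) in sift_rows.items()}
low_confidence = sorted(
    snp for snp, (_, conservation) in sift_rows.items()
    if conservation > LOW_CONFIDENCE_CONSERVATION
)

for snp, call in calls.items():
    flag = " (low confidence)" if snp in low_confidence else ""
    print(f"{snp}: {call}{flag}")
```

Under these assumptions only L17F is flagged, which fits the fact that a single sequence is represented at that position. I361V, with a score of 0.06, lies just above the cutoff.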

Discussion

The actual results in HGMD for our SNPs:

References

<references />