Sequence-Based Mutation Analysis Hemochromatosis
Riddle of the Task
Coming soon... or later...
Short task description
Detailed description: Sequence-based mutation analysis
Protocol
A protocol with a description of the data acquisition and other scripts used for this task is available here.
SNPs
From MSUD: M35T V53M G93R Q127H A162S L183P T217I R224W E277K C282S
Amino acid features
<figtable id="TODO_ID">
Mutation | Hydrophobicity (wt) | Hydrophobicity (mt) | Polarity (wt) | Polarity (mt) | pI (wt) | pI (mt) | v.d.W. volume (wt) | v.d.W. volume (mt) |
---|---|---|---|---|---|---|---|---|
M35T | 1.9 | -0.7 | nonpolar | polar | 5.74 | 5.60 | 124 | 93 |
V53M | 4.2 | 1.9 | nonpolar | nonpolar | 6.00 | 5.74 | 105 | 124 |
G93R | -0.4 | -4.5 | nonpolar | polar | 6.06 | 10.76 | 48 | 148 |
Q127H | -3.5 | -3.2 | polar | polar | 5.65 | 7.60 | 114 | 118 |
A162S | 1.8 | -0.8 | nonpolar | polar | 6.01 | 5.68 | 67 | 73 |
L183P | 3.8 | -1.6 | nonpolar | nonpolar | 6.01 | 6.30 | 124 | 90 |
T217I | -0.7 | 4.5 | polar | nonpolar | 5.60 | 6.05 | 93 | 124 |
R224W | -4.5 | -0.9 | polar | nonpolar | 10.76 | 5.89 | 148 | 163 |
E277K | -3.5 | -3.9 | polar | polar | 3.15 | 9.60 | 109 | 135 |
C282S | 2.5 | -0.8 | polar | polar | 5.05 | 5.68 | 86 | 73 |
</figtable>
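The differences between wild-type and mutant features can be made explicit with a few lines of Python. The following is only a minimal sketch: the dictionary simply re-uses the values from the table above, and the name `feature_change` as well as the keys of the returned dictionary are hypothetical choices for illustration.

```python
# Per-residue values copied from the table above:
# (hydrophobicity, polarity, pI, v.d.W. volume)
AA_FEATURES = {
    "M": (1.9, "nonpolar", 5.74, 124),
    "T": (-0.7, "polar", 5.60, 93),
    "V": (4.2, "nonpolar", 6.00, 105),
    "G": (-0.4, "nonpolar", 6.06, 48),
    "R": (-4.5, "polar", 10.76, 148),
    "Q": (-3.5, "polar", 5.65, 114),
    "H": (-3.2, "polar", 7.60, 118),
    "A": (1.8, "nonpolar", 6.01, 67),
    "S": (-0.8, "polar", 5.68, 73),
    "L": (3.8, "nonpolar", 6.01, 124),
    "P": (-1.6, "nonpolar", 6.30, 90),
    "I": (4.5, "nonpolar", 6.05, 124),
    "W": (-0.9, "nonpolar", 5.89, 163),
    "E": (-3.5, "polar", 3.15, 109),
    "K": (-3.9, "polar", 9.60, 135),
    "C": (2.5, "polar", 5.05, 86),
}


def feature_change(mutation):
    """Return mutant-minus-wild-type differences for a mutation such as 'G93R'."""
    wt, mt = mutation[0], mutation[-1]  # first and last letter are the residues
    h_wt, pol_wt, pi_wt, vol_wt = AA_FEATURES[wt]
    h_mt, pol_mt, pi_mt, vol_mt = AA_FEATURES[mt]
    return {
        "d_hydrophobicity": round(h_mt - h_wt, 2),
        "polarity_changed": pol_wt != pol_mt,
        "d_pI": round(pi_mt - pi_wt, 2),
        "d_volume": vol_mt - vol_wt,
    }


for name in ("G93R", "L183P", "C282S"):
    print(name, feature_change(name))
```

Large absolute differences (for example the volume and pI change of G93R) are the cases where a structural or functional effect seems most plausible from the features alone.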
Evolutionary analysis
BLOSUM62/PAM1/PAM250
<figtable id="TODO_ID">
Mutation | BLOSUM62 | PAM1 | PAM250 |
---|---|---|---|
M35T | -1 | 6 | 500 |
V53M | 1 | 4 | 200 |
G93R | -2 | 0 | 200 |
Q127H | 0 | 20 | 700 |
A162S | 1 | 28 | 900 |
L183P | -3 | 2 | 300 |
T217I | -1 | 7 | 400 |
R224W | -3 | 2 | 200 |
E277K | 1 | 7 | 800 |
C282S | -1 | 11 | 700 |
</figtable>
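Since BLOSUM62 is a log-odds matrix, a negative score means that the substitution is observed less often than expected by chance. As a rough sketch, the mutations can be filtered and ranked by this score; the values are copied from the table above, and the function name `unfavourable_by_blosum62` is a hypothetical helper introduced only for this illustration.

```python
# Scores copied from the table above: (BLOSUM62, PAM1, PAM250)
SCORES = {
    "M35T": (-1, 6, 500),
    "V53M": (1, 4, 200),
    "G93R": (-2, 0, 200),
    "Q127H": (0, 20, 700),
    "A162S": (1, 28, 900),
    "L183P": (-3, 2, 300),
    "T217I": (-1, 7, 400),
    "R224W": (-3, 2, 200),
    "E277K": (1, 7, 800),
    "C282S": (-1, 11, 700),
}


def unfavourable_by_blosum62(scores):
    """Return mutations with a negative BLOSUM62 score, most negative first."""
    negative = [m for m, (blosum, _, _) in scores.items() if blosum < 0]
    # sorted() is stable, so ties keep their order from the table
    return sorted(negative, key=lambda m: scores[m][0])


print(unfavourable_by_blosum62(SCORES))
```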
PSSM
The complete matrix can be found here.
The given table provides the following data: <figtable id="TODO_ID">
Mutation | PSSM value (wt) | Frequency (wt) | PSSM value (mt) | Frequency (mt) |
---|---|---|---|---|
M35T | 3 | 16% | 5 | 78% |
V53M | 5 | 99% | 1 | 1% |
G93R | 3 | 29% | -2 | 1% |
Q127H | 2 | 16% | -2 | 0% |
A162S | 5 | 100% | 1 | 0% |
L183P | 4 | 95% | -3 | 0% |
T217I | 2 | 16% | -2 | 0% |
R224W | 6 | 100% | -3 | 0% |
E277K | 6 | 100% | 0 | 0% |
C282S | 10 | 100% | -1 | 0% |
</figtable>
These values lead to the following conclusions:
M35T would not be predicted as a disease-causing mutation, as the mutant residue occurs frequently (78%) at this position.
V53M would most probably be predicted as a disease-causing mutation, because the wild type is highly conserved at this position (99%) and the mutant occurs in only 1% of the sequences. The only thing that does not fit this prediction is the positive PSSM value of the mutant (1), meaning that the mutation is seen more often than expected, which could be a sign of a non-disease-causing mutation.
G93R is hard to predict from the given numbers, as the wild type is not very conserved (29%). However, because the mutant gets a value of -2 (meaning that it occurs less often than expected), the mutation is more likely to be disease-causing. In total, 7 different amino acids were observed at this position (A, R, D, E, G, T, V).
Q127H is difficult to predict from the conservation of wild type and mutant alone. As for G93R, the conclusion would be that the mutation is disease-causing because the mutant occurs less often than expected. This prediction is supported by the fact that only 3 different amino acids are observed at position 127 (Q, E, G), which might indicate the importance of this position for the protein.
A162S would be predicted as a disease-causing mutation, based on the 100% conservation of the wild type.
L183P would be predicted as a disease-causing mutation, based on the wild-type conservation (95%) and the lower-than-expected frequency of the mutant.
T217I is another difficult case when looking only at the conservation of wild type and mutant. It would probably be predicted as disease-causing because of the lower-than-expected frequency of the mutant. This prediction is supported by the fact that only two amino acids were observed at that position, meaning the position is fairly conserved.
R224W would be predicted as a disease-causing mutation because of the high wild-type conservation (100%), supported by the lower-than-expected frequency of the mutant.
E277K would be predicted as a disease-causing mutation because of the high wild-type conservation (100%).
C282S would be predicted as a disease-causing mutation because of the high wild-type conservation (100%), supported by the lower-than-expected frequency of the mutant.
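The reasoning above can be summarised as a simple decision rule. The sketch below is only an approximation of that reasoning, not a validated predictor: the thresholds `conserved=90` and `frequent=50` and the function name `predict` are assumptions chosen for illustration, and the input values are copied from the PSSM table.

```python
# mutation: (wt PSSM value, wt frequency %, mt PSSM value, mt frequency %)
PSSM = {
    "M35T": (3, 16, 5, 78),
    "V53M": (5, 99, 1, 1),
    "G93R": (3, 29, -2, 1),
    "Q127H": (2, 16, -2, 0),
    "A162S": (5, 100, 1, 0),
    "L183P": (4, 95, -3, 0),
    "T217I": (2, 16, -2, 0),
    "R224W": (6, 100, -3, 0),
    "E277K": (6, 100, 0, 0),
    "C282S": (10, 100, -1, 0),
}


def predict(wt_score, wt_freq, mt_score, mt_freq, conserved=90, frequent=50):
    """Classify a mutation from PSSM data (thresholds are illustrative assumptions)."""
    if mt_freq >= frequent:
        # the mutant residue is common in the alignment
        return "neutral"
    if wt_freq >= conserved or mt_score < 0:
        # conserved wild type, or mutant seen less often than expected
        return "disease-causing"
    return "unclear"


for name, values in PSSM.items():
    print(name, predict(*values))
```

With these thresholds, only M35T is classified as neutral, which matches the conclusions listed above. The rule does not capture the ambiguity discussed for V53M, where the mutant has a positive PSSM value despite the high wild-type conservation.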
MSA conservation
M35T
- Secondary structure assignment: sheet
SEQ (mt):     LRSHSLHYLFTGASEQDLGLS
DSSP (wt):    CCEEEEEEEEEEECCCCCCE
PsiPred (wt): CCCCCCCEEEEEEECCCCCCC
PsiPred (mt): CCCCCCCEEEEEEECCCCCCC
<figtable id="M35T_pymol">
</figtable>
V53M
- Secondary structure assignment: sheet
SEQ (mt):     GLSLFEALGYMDDQLFVFYDH
DSSP (wt):    CCECCEEEEEECCEEEEEEEC
PsiPred (wt): CCCEEEEEEEECCEEEEEEEC
PsiPred (mt): CCCEEEEEEEECCEEEEEEEC
<figtable id="V53M_pymol">
</figtable>
G93R
- Secondary structure assignment: helix
SEQ (mt):     MWLQLSQSLKRWDHMFTVDFW
DSSP (wt):    HHHHHHHHHHHHHHHHHHHHH
PsiPred (wt): HHHHHHHHHHHHHHHHHHHHH
PsiPred (mt): HHHHHHHHHHHHHHHHHHHHH
<figtable id="G93R_pymol">
</figtable>
Q127H
- Secondary structure assignment: coil
SEQ (mt):     TLQVILGCEMHEDNSTEGYWK
DSSP (wt):    EEEEEEEEEECCCCCEEEEEE
PsiPred (wt): EEEEECCCCCCCCCCCCCEEE
PsiPred (mt): CEEEECCCEECCCCCCCCCCE
<figtable id="Q127H_pymol">
</figtable>
A162S
- Secondary structure assignment: helix
SEQ (mt):     TLDWRAAEPRSWPTKLEWERH
DSSP (wt):    HCEEEECCHHHHHHHHHHHCC
PsiPred (wt): CCCEECCCCCHHHHHHHHHHH
PsiPred (mt): CCCEECCCCCHHHHHHHHHHH
<figtable id="A162S_pymol">
</figtable>
L183P
- Secondary structure assignment: helix (trusting DSSP)
SEQ (mt):     KIRARQNRAYPERDCPAQLQQ
DSSP (wt):    CHHHHHHHHHHHHHHHHHHHH
PsiPred (wt): HHHHHHHHCCCCCCHHHHHHH
PsiPred (mt): HHHHHHHHCCCCCCHHHHHHH
<figtable id="L183P_pymol">
</figtable>
T217I
- Secondary structure assignment: coil
SEQ (mt):     PPLVKVTHHVISSVTTLRCRA
DSSP (wt):    CCEEEEEEEECCCCEEEEEEE
PsiPred (wt): CCCEEEECCCCCCCCEEEEEE
PsiPred (mt): CCCEEEECCCCCCCCEEEEEE
<figtable id="T217I_pymol">
</figtable>
R224W
- Secondary structure assignment: sheet
SEQ (mt):     HHVTSSVTTLWCRALNYYPQN
DSSP (wt):    EEECCCCEEEEEEEEEEECCC
PsiPred (wt): CCCCCCCCEEEEEECCCCCCC
PsiPred (mt): CCCCCCCCEEEEEECCCCCCC
<figtable id="R224W_pymol">
</figtable>
E277K
- Secondary structure assignment: helix (trusting DSSP)
SEQ (mt):     WITLAVPPGEKQRYTCQVEHP
DSSP (wt):    EEEEEECCCHHHHEEEEEECC
PsiPred (wt): EEEEEECCCCCCCEEEEEECC
PsiPred (mt): EEEEEECCCCCCCEEEEEECC
<figtable id="E277K_pymol">
</figtable>
C282S
- Secondary structure assignment: sheet
SEQ (mt):     VPPGEEQRYTSQVEHPGLDQP
DSSP (wt):    ECCCHHHHEEEEEECCCCCCC
PsiPred (wt): ECCCCCCCEEEEEECCCCCCC
PsiPred (mt): ECCCCCCCCEEEEECCCCCCC
<figtable id="C282S_pymol">
</figtable>
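To see whether a mutation changes the predicted secondary structure, the PsiPred strings for wild type and mutant can be compared position by position. The following is a minimal sketch; the function name `ss_differences` is a hypothetical helper, the indices are 0-based within the displayed window (not protein positions), and only three of the PsiPred string pairs listed above are included as examples.

```python
from itertools import zip_longest


def ss_differences(wt, mt):
    """Return (index, wt_state, mt_state) for every position where the strings differ.

    A length mismatch is reported as a difference against the gap symbol '-'.
    """
    pairs = zip_longest(wt, mt, fillvalue="-")
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


# PsiPred predictions (wt, mt) copied from the sections above
PSIPRED = {
    "M35T": ("CCCCCCCEEEEEEECCCCCCC", "CCCCCCCEEEEEEECCCCCCC"),
    "Q127H": ("EEEEECCCCCCCCCCCCCEEE", "CEEEECCCEECCCCCCCCCCE"),
    "C282S": ("ECCCCCCCEEEEEECCCCCCC", "ECCCCCCCCEEEEECCCCCCC"),
}

for name, (wt, mt) in PSIPRED.items():
    diffs = ss_differences(wt, mt)
    print(name, "no change" if not diffs else diffs)
```

An empty result means that PsiPred predicts the same secondary structure for both sequences, so the mutation has no visible effect at this level of analysis.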