Difference between revisions of "Sequence-Based Mutation Analysis Hemochromatosis"

From Bioinformatikpedia

Revision as of 14:07, 16 June 2012



Riddle of the Task

Coming soon... or later...


Short task description

Detailed description: Sequence-based mutation analysis


Protocol

A protocol with a description of the data acquisition and other scripts used for this task is available here.


SNPs

From MSUD: M35T V53M G93R Q127H A162S L183P T217I R224W E277K C282S
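
The SNP labels above follow the usual substitution notation (wild-type residue, sequence position, mutant residue). A minimal sketch of how they could be split into their parts for the analyses below; the names `Substitution` and `parse_substitution` are illustrative and not part of the original protocol:

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wt: str        # wild-type residue (one-letter code)
    position: int  # 1-based position in the protein sequence
    mt: str        # mutant residue (one-letter code)


# One-letter codes of the 20 standard amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'C282S' into (wt, position, mt)."""
    match = _PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a valid substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


SNPS = "M35T V53M G93R Q127H A162S L183P T217I R224W E277K C282S".split()
parsed = [parse_substitution(label) for label in SNPS]

for sub in parsed:
    print(sub.wt, sub.position, sub.mt)
```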


Amino acid features

<figtable id="TODO_ID">

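{| class="wikitable"
|+ TODO: description.
! Mutation !! Hydrophobicity (wt) !! Hydrophobicity (mt) !! Polarity (wt) !! Polarity (mt) !! pI (wt) !! pI (mt) !! v.d.W. volume (wt) !! v.d.W. volume (mt)
|-
| M35T || 1.9 || -0.7 || nonpolar || polar || 5.74 || 5.60 || 124 || 93
|-
| V53M || 4.2 || 1.9 || nonpolar || nonpolar || 6.00 || 5.74 || 105 || 124
|-
| G93R || -0.4 || -4.5 || nonpolar || polar || 6.06 || 10.76 || 48 || 148
|-
| Q127H || -3.5 || -3.2 || polar || polar || 5.65 || 7.60 || 114 || 118
|-
| A162S || 1.8 || -0.8 || nonpolar || polar || 6.01 || 5.68 || 67 || 73
|-
| L183P || 3.8 || -1.6 || nonpolar || nonpolar || 6.01 || 6.30 || 124 || 90
|-
| T217I || -0.7 || 4.5 || polar || nonpolar || 5.60 || 6.05 || 93 || 124
|-
| R224W || -4.5 || -0.9 || polar || nonpolar || 10.76 || 5.89 || 148 || 163
|-
| E277K || -3.5 || -3.9 || polar || polar || 3.15 || 9.60 || 109 || 135
|-
| C282S || 2.5 || -0.8 || polar || polar || 5.05 || 5.68 || 86 || 73
|}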

</figtable>
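
To make the size of each physicochemical change easier to see, the per-residue values from the table can be turned into differences (mutant minus wild type). This is only a sketch: the dictionary `PROPS` repeats the numbers listed above, and the function name `property_change` is hypothetical.

```python
# Per-residue values copied from the table above:
# (hydrophobicity, polarity, pI, van der Waals volume)
PROPS = {
    "A": (1.8, "nonpolar", 6.01, 67),
    "C": (2.5, "polar", 5.05, 86),
    "E": (-3.5, "polar", 3.15, 109),
    "G": (-0.4, "nonpolar", 6.06, 48),
    "H": (-3.2, "polar", 7.60, 118),
    "I": (4.5, "nonpolar", 6.05, 124),
    "K": (-3.9, "polar", 9.60, 135),
    "L": (3.8, "nonpolar", 6.01, 124),
    "M": (1.9, "nonpolar", 5.74, 124),
    "P": (-1.6, "nonpolar", 6.30, 90),
    "Q": (-3.5, "polar", 5.65, 114),
    "R": (-4.5, "polar", 10.76, 148),
    "S": (-0.8, "polar", 5.68, 73),
    "T": (-0.7, "polar", 5.60, 93),
    "V": (4.2, "nonpolar", 6.00, 105),
    "W": (-0.9, "nonpolar", 5.89, 163),
}


def property_change(wt: str, mt: str) -> dict:
    """Return mutant-minus-wild-type differences for one substitution."""
    h_wt, pol_wt, pi_wt, vol_wt = PROPS[wt]
    h_mt, pol_mt, pi_mt, vol_mt = PROPS[mt]
    return {
        "hydrophobicity": round(h_mt - h_wt, 2),
        "polarity_changed": pol_wt != pol_mt,
        "pI": round(pi_mt - pi_wt, 2),
        "volume": vol_mt - vol_wt,
    }


for label in ["M35T", "V53M", "G93R", "Q127H", "A162S",
              "L183P", "T217I", "R224W", "E277K", "C282S"]:
    # First and last characters of the label are the wt and mt residues.
    print(label, property_change(label[0], label[-1]))
```

For G93R, for example, this reports a volume increase of 100 and a pI increase of 4.7, the largest changes among the ten mutations.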


Evolutionary analysis


BLOSUM62/PAM1/PAM250

<figtable id="TODO_ID">

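{| class="wikitable"
|+ TODO: description.
! Mutation !! BLOSUM62 !! PAM1 !! PAM250
|-
| M35T || -1 || 6 || 500
|-
| V53M || 1 || 4 || 200
|-
| G93R || -2 || 0 || 200
|-
| Q127H || 0 || 20 || 700
|-
| A162S || 1 || 28 || 900
|-
| L183P || -3 || 2 || 300
|-
| T217I || -1 || 7 || 400
|-
| R224W || -3 || 2 || 200
|-
| E277K || 1 || 7 || 800
|-
| C282S || -1 || 11 || 700
|}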

</figtable>
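
BLOSUM62 entries are log-odds scores, so a negative value means the substitution is seen less often in aligned protein blocks than expected by chance. A small sketch that sorts the mutations by the BLOSUM62 column of the table above and lists those with a negative score; the variable names are illustrative, and the PAM columns are left out because the page does not state how they were scaled:

```python
# BLOSUM62 scores copied from the table above.
BLOSUM62_SCORES = {
    "M35T": -1,
    "V53M": 1,
    "G93R": -2,
    "Q127H": 0,
    "A162S": 1,
    "L183P": -3,
    "T217I": -1,
    "R224W": -3,
    "E277K": 1,
    "C282S": -1,
}

# Sort from least to most favourable; ties keep the table order
# because Python's sort is stable.
ranked = sorted(BLOSUM62_SCORES, key=BLOSUM62_SCORES.get)

# Substitutions that occur less often than expected by chance.
unfavourable = [snp for snp in ranked if BLOSUM62_SCORES[snp] < 0]

print("ranked:", ranked)
print("negative BLOSUM62 score:", unfavourable)
```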



PSSM

The complete matrix can be found here.

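The given table provides the following data:

<figtable id="TODO_ID">

{| class="wikitable"
|+ TODO: description.
! rowspan="2" | SNP
! colspan="2" | wt
! colspan="2" | mt
|-
! PSSM value !! Frequency !! PSSM value !! Frequency
|-
| M35T || 3 || 16% || 5 || 78%
|-
| V53M || 5 || 99% || 1 || 1%
|-
| G93R || 3 || 29% || -2 || 1%
|-
| Q127H || 2 || 16% || -2 || 0%
|-
| A162S || 5 || 100% || 1 || 0%
|-
| L183P || 4 || 95% || -3 || 0%
|-
| T217I || 2 || 16% || -2 || 0%
|-
| R224W || 6 || 100% || -3 || 0%
|-
| E277K || 6 || 100% || 0 || 0%
|-
| C282S || 10 || 100% || -1 || 0%
|}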

</figtable>

This results in the following predictions:

M35T would NOT be predicted as a disease-causing mutation, as the mutant residue occurs frequently (78%).

V53M would most probably be predicted as a disease-causing mutation, because the wild type is highly conserved at this position (99%) and the mutant residue is observed in only 1% of the sequences. The only thing that does not fit this prediction is the positive PSSM value of the mutant (1), which means the mutation is seen more often than expected and could be a sign of a non-disease-causing mutation.

G93R is hard to predict from the given numbers, as the wild type is not very conserved (29%). However, because the mutant gets a PSSM value of -2 (meaning this residue occurs less often than expected), the mutation is more likely to be disease-causing. In total, 7 different amino acids were observed at this position (A, R, D, E, G, T, V).

Q127H is difficult to predict from the conservation of wild type and mutant alone. As for G93R, the conclusion would be that the mutation is disease-causing because the mutant residue occurs less often than expected. This prediction is supported by the fact that only 3 different amino acids are observed at position 127 (Q, E, G), which might indicate the importance of this position for the protein.

A162S would be predicted as a disease-causing mutation, based on the 100% conservation of the wild type.

L183P would be predicted as a disease-causing mutation, based on the wild-type conservation (95%) and the lower-than-expected frequency of the mutant residue.

T217I is another difficult case when looking only at wild-type and mutant conservation. It would probably be predicted as disease-causing because of the lower-than-expected frequency of the mutant residue. This prediction is supported by the fact that only two amino acids were observed at that position, meaning the position is fairly conserved.

R224W would be predicted as a disease-causing mutation because of the high wild-type conservation (100%), supported by the lower-than-expected frequency of the mutant residue.

E277K would be predicted as a disease-causing mutation because of the high wild-type conservation (100%).

C282S would be predicted as a disease-causing mutation because of the high wild-type conservation (100%), supported by the lower-than-expected frequency of the mutant residue.
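
The reasoning used for these predictions can be condensed into a simple rule. The sketch below is one possible reading of it, not the procedure actually used for this task: the cut-offs `FREQUENT_MUTANT` and `CONSERVED_WILDTYPE` are assumptions, since the text gives no numeric thresholds, and the function name `predict` is hypothetical.

```python
# (wt PSSM value, wt frequency %, mt PSSM value, mt frequency %)
# copied from the PSSM table above.
PSSM_TABLE = {
    "M35T": (3, 16, 5, 78),
    "V53M": (5, 99, 1, 1),
    "G93R": (3, 29, -2, 1),
    "Q127H": (2, 16, -2, 0),
    "A162S": (5, 100, 1, 0),
    "L183P": (4, 95, -3, 0),
    "T217I": (2, 16, -2, 0),
    "R224W": (6, 100, -3, 0),
    "E277K": (6, 100, 0, 0),
    "C282S": (10, 100, -1, 0),
}

# Assumed cut-offs (percent); not taken from the original analysis.
FREQUENT_MUTANT = 50
CONSERVED_WILDTYPE = 90


def predict(wt_pssm: int, wt_freq: int, mt_pssm: int, mt_freq: int) -> str:
    """Classify a substitution from its PSSM values and frequencies."""
    # wt_pssm is accepted for completeness but not used by this rule.
    if mt_freq >= FREQUENT_MUTANT:
        # The mutant residue is common at this position.
        return "neutral"
    if wt_freq >= CONSERVED_WILDTYPE or mt_pssm < 0:
        # Conserved wild type, or mutant seen less often than expected.
        return "disease-causing"
    return "unclear"


predictions = {snp: predict(*row) for snp, row in PSSM_TABLE.items()}
for snp, call in predictions.items():
    print(snp, call)
```

With these cut-offs, only M35T is called neutral and the other nine mutations are called disease-causing, which matches the predictions listed above.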



MSA conservation


M35T

  • Secondary structure assignment: sheet
SEQ (mt):     LRSHSLHYLFTGASEQDLGLS
DSSP (wt):     CCEEEEEEEEEEECCCCCCE
PsiPred (wt): CCCCCCCEEEEEEECCCCCCC
PsiPred (mt): CCCCCCCEEEEEEECCCCCCC

<figtable id="M35T_pymol">

Wildtype.
Mutant.
Table TODO: ...

</figtable>


V53M

  • Secondary structure assignment: sheet
SEQ (mt):     GLSLFEALGYMDDQLFVFYDH
DSSP (wt):    CCECCEEEEEECCEEEEEEEC
PsiPred (wt): CCCEEEEEEEECCEEEEEEEC
PsiPred (mt): CCCEEEEEEEECCEEEEEEEC

<figtable id="V53M_pymol">

Wildtype.
Mutant.
Table TODO: ...

</figtable>


G93R

  • Secondary structure assignment: helix
SEQ (mt):     MWLQLSQSLKRWDHMFTVDFW
DSSP (wt):    HHHHHHHHHHHHHHHHHHHHH
PsiPred (wt): HHHHHHHHHHHHHHHHHHHHH
PsiPred (mt): HHHHHHHHHHHHHHHHHHHHH

<figtable id="G93R_pymol">

Wildtype.
Mutant.
Table TODO: ...

</figtable>


Q127H

  • Secondary structure assignment: coil
SEQ (mt):     TLQVILGCEMHEDNSTEGYWK
DSSP (wt):    EEEEEEEEEECCCCCEEEEEE
PsiPred (wt): EEEEECCCCCCCCCCCCCEEE
PsiPred (mt): CEEEECCCEECCCCCCCCCCE

<figtable id="Q127H_pymol">

Wildtype.
Mutant.
Table TODO: ...

</figtable>


A162S

  • Secondary structure assignment: helix
SEQ (mt):     TLDWRAAEPRSWPTKLEWERH
DSSP (wt):    HCEEEECCHHHHHHHHHHHCC
PsiPred (wt): CCCEECCCCCHHHHHHHHHHH
PsiPred (mt): CCCEECCCCCHHHHHHHHHHH

<figtable id="A162S_pymol">

Wildtype.
Mutant.
Table TODO: ...

</figtable>


L183P

  • Secondary structure assignment: helix (trusting DSSP)
SEQ (mt):     KIRARQNRAYPERDCPAQLQQ
DSSP (wt):    CHHHHHHHHHHHHHHHHHHHH
PsiPred (wt): HHHHHHHHCCCCCCHHHHHHH
PsiPred (mt): HHHHHHHHCCCCCCHHHHHHH

<figtable id="L183P_pymol">

Wildtype.
Mutant.
Table TODO: ...

</figtable>


T217I

  • Secondary structure assignment: coil
SEQ (mt):     PPLVKVTHHVISSVTTLRCRA
DSSP (wt):    CCEEEEEEEECCCCEEEEEEE
PsiPred (wt): CCCEEEECCCCCCCCEEEEEE
PsiPred (mt): CCCEEEECCCCCCCCEEEEEE

<figtable id="T217I_pymol">

Wildtype.
Mutant.
Table TODO: ...

</figtable>


R224W

  • Secondary structure assignment: sheet
SEQ (mt):     HHVTSSVTTLWCRALNYYPQN
DSSP (wt):    EEECCCCEEEEEEEEEEECCC
PsiPred (wt): CCCCCCCCEEEEEECCCCCCC
PsiPred (mt): CCCCCCCCEEEEEECCCCCCC


E277K

  • Secondary structure assignment: helix (trusting DSSP)
SEQ (mt):     WITLAVPPGEKQRYTCQVEHP
DSSP (wt):    EEEEEECCCHHHHEEEEEECC
PsiPred (wt): EEEEEECCCCCCCEEEEEECC
PsiPred (mt): EEEEEECCCCCCCEEEEEECC


C282S

  • Secondary structure assignment: sheet
SEQ (mt):     VPPGEEQRYTSQVEHPGLDQP
DSSP (wt):    ECCCHHHHEEEEEECCCCCCC
PsiPred (wt): ECCCCCCCEEEEEECCCCCCC
PsiPred (mt): ECCCCCCCCEEEEECCCCCCC
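
The PsiPred strings for wild type and mutant can be compared position by position to see where the predicted secondary structure changes. A minimal sketch, assuming the mutated residue sits in the middle of each 21-residue window (as it does in the SEQ lines above); the function name `changed_positions` is illustrative:

```python
def changed_positions(wt_ss: str, mt_ss: str, snp_position: int):
    """List (sequence position, wt state, mt state) where predictions differ.

    Assumes both strings cover the same window and that the mutated
    residue is at the centre of that window.
    """
    if len(wt_ss) != len(mt_ss):
        raise ValueError("wild-type and mutant strings differ in length")
    centre = len(wt_ss) // 2
    return [
        (snp_position - centre + i, wt_state, mt_state)
        for i, (wt_state, mt_state) in enumerate(zip(wt_ss, mt_ss))
        if wt_state != mt_state
    ]


# PsiPred predictions copied from the sections above.
q127h = changed_positions("EEEEECCCCCCCCCCCCCEEE", "CEEEECCCEECCCCCCCCCCE", 127)
c282s = changed_positions("ECCCCCCCEEEEEECCCCCCC", "ECCCCCCCCEEEEECCCCCCC", 282)
m35t = changed_positions("CCCCCCCEEEEEEECCCCCCC", "CCCCCCCEEEEEEECCCCCCC", 35)

print("Q127H:", q127h)
print("C282S:", c282s)
print("M35T:", m35t)
```

Among these three examples, the prediction changes at five positions around Q127H and at one position near C282S, while the M35T strings are identical.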


References

<references/>