Difference between revisions of "Fabry:Sequence-based mutation analysis"
Rackersederj (talk | contribs) (→Polyphen2) |
Rackersederj (talk | contribs) m (→SIFT: SORTABLE) |
Revision as of 12:25, 13 June 2012
The following analyses were performed on the basis of the α-Galactosidase A sequence. Please consult the journal for the commands used to generate the results.
Dataset preparation
Q279E N215S I289V S65T R356W V316I P323T P40S R118H A143T
Amino acid properties
<figtable id="tab:aaProp">
Physicochemical properties of the chosen SNPs and the changes in these properties between wild type (wt) and mutant (mt). Abbreviations used in this table:
AA: Amino Acid, Pol: Side-chain polarity, Charge: Side-chain charge at pH 7.4, HI: Hydropathy index, RM: Residue Mass, iP: isoelectric point
SNP | wt AA | wt Pol | wt Charge | wt HI | wt RM | wt iP | mt AA | mt Pol | mt Charge | mt HI | mt RM | mt iP | change in Pol | change in Charge | change in HI | change in RM | change in iP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Q279E | Q | polar | neutral | −3.5 | 128.131 | 5.65 | E | polar | negative | −3.5 | 129.116 | 3.15 | none | neutral to negative | 0 | 0.985 | -2.5 |
N215S | N | polar | neutral | −3.5 | 114.104 | 5.41 | S | polar | neutral | −0.8 | 87.078 | 5.68 | none | none | 2.7 | -27.026 | 0.27 |
I289V | I | nonpolar | neutral | 4.5 | 113.160 | 6.05 | V | nonpolar | neutral | 4.2 | 99.133 | 6.00 | none | none | -0.3 | -14.027 | -0.05 |
S65T | S | polar | neutral | −0.8 | 87.078 | 5.68 | T | polar | neutral | −0.7 | 101.105 | 5.60 | none | none | 0.1 | 14.027 | -0.08 |
R356W | R | polar | positive | −4.5 | 156.188 | 10.76 | W | nonpolar | neutral | −0.9 | 186.213 | 5.89 | polar to nonpolar | positive to neutral | 3.6 | 30.025 | -4.87 |
V316I | V | nonpolar | neutral | 4.2 | 99.133 | 6.00 | I | nonpolar | neutral | 4.5 | 113.160 | 6.05 | none | none | 0.3 | 14.027 | 0.05 |
P323T | P | nonpolar | neutral | −1.6 | 97.117 | 6.30 | T | polar | neutral | −0.7 | 101.105 | 5.60 | nonpolar to polar | none | 0.9 | 3.988 | -0.7 |
P40S | P | nonpolar | neutral | −1.6 | 97.117 | 6.30 | S | polar | neutral | −0.8 | 87.078 | 5.68 | nonpolar to polar | none | 0.8 | -10.039 | -0.62 |
R118H | R | polar | positive | −4.5 | 156.188 | 10.76 | H | polar | pos(10%), neutr(90%) | −3.2 | 137.142 | 7.60 | none | positive to pos(10%), neutr(90%) | 1.3 | -19.046 | -3.16 |
A143T | A | nonpolar | neutral | 1.8 | 71.079 | 6.01 | T | polar | neutral | −0.7 | 101.105 | 5.60 | nonpolar to polar | none | −2.5 | 30.026 | -0.41 |
</figtable>
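The "change" columns are the mutant value minus the wild-type value. The following minimal sketch recomputes them from the property values listed in the table; the names `PROPS` and `property_changes` are introduced here for illustration only and are not taken from the journal.

```python
# Property values copied from the table above:
# amino acid -> (hydropathy index, residue mass, isoelectric point)
PROPS = {
    "Q": (-3.5, 128.131, 5.65),
    "E": (-3.5, 129.116, 3.15),
    "N": (-3.5, 114.104, 5.41),
    "S": (-0.8, 87.078, 5.68),
    "I": (4.5, 113.160, 6.05),
    "V": (4.2, 99.133, 6.00),
    "T": (-0.7, 101.105, 5.60),
    "R": (-4.5, 156.188, 10.76),
    "W": (-0.9, 186.213, 5.89),
    "P": (-1.6, 97.117, 6.30),
    "H": (-3.2, 137.142, 7.60),
    "A": (1.8, 71.079, 6.01),
}

SNPS = ["Q279E", "N215S", "I289V", "S65T", "R356W",
        "V316I", "P323T", "P40S", "R118H", "A143T"]


def property_changes(snp):
    """Return (dHI, dRM, diP) as mutant minus wild type, rounded to 3 decimals."""
    wt, mt = snp[0], snp[-1]
    return tuple(round(m - w, 3) for w, m in zip(PROPS[wt], PROPS[mt]))


for snp in SNPS:
    d_hi, d_rm, d_ip = property_changes(snp)
    print(f"{snp}\t{d_hi:+.1f}\t{d_rm:+.3f}\t{d_ip:+.2f}")
```

Running this over all ten SNPs is a quick way to cross-check the signs and magnitudes entered in the table by hand.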
Simple structural analysis
- Take into consideration where in the protein each mutation occurs and document it: create a picture with PyMOL showing the original and the mutated residue in the protein. More thorough structural analyses will be introduced in the next task.
Location
<figtable id="tab:Location"> Secondary structure at each SNP position as predicted by PSIPRED and ReProf and as assigned by DSSP. The "long" columns show a 21-residue window with the mutated position in the centre.
SNP | SecStruc Psipred | SecStruc Psipred long | SecStruc Reprof | SecStruc Reprof long | SecStruc DSSP | SecStruc DSSP long |
---|---|---|---|---|---|---|
Q279E | H | CCCCCCCHHHHHHHHHHHHHH | H | EECCCCCCHHHHHHHHHHHHH | H | CCCCC--HHHHHHHHHHHHHC |
N215S | H | CCCCCCCCCCHHHHHCCCCCC | H | CECCCCCCCCHHHHHHHHHHH | H | HHHCCCC---HHHHCCC-CEE |
I289V | H | HHHHHHHHHHHCCCEEEECCC | H | HHHHHHHHHHHHHCHHCCCCC | C | HHHHHHHHHHCC--EEE-C-C |
S65T | H | CCCCCCCCCCHHHHHHHHHHH | H | CCCCCCCHHHHHHHHHHHHHH | H | CCC-CCCC-CHHHHHHHHHHH |
R356W | C | CCEEEEEEECCCCCCEEEEEE | C | HHHHHHHHHHCCCCCCCCHHH | - | CEEEEEEEE---CCC-EEEEE |
V316I | H | HHHCCCCHHHHHHCCCCCCCC | E | HHHHCCCCCEEEECCCCCCCC | H | HHHHHH-HHHHHHHC-CC--- |
P323T | C | HHHHHHCCCCCCCCCEEEEEC | C | CCEEEECCCCCCCCCCEECCC | C | HHHHHHHC-CC----EEEE-C |
P40S | C | CCCCCCCCCCCCCCCCCCCCC | C | HHCCCCCCCCCCCHHHHHHEE | - | ---CC--CC--EEEECHHHHC |
R118H | H | CCCCCCCCHHHHHHHHHHCCC | H | CCCCCCHHHHHHHHHHHCCCC | H | -CCC-CCHHHHHHHHHHHCC- |
A143T | C | EECCCCCCCCCCCCCCCCHHH | C | EEECCCCCCCCCCCCCCCCCC | C | EEECCCE-CCCCE--CCCHHH |
</figtable>
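In each "long" column the mutated residue is the middle (11th) character of the 21-residue window. The sketch below reads that central state back from the windows and lists the SNPs for which the two predictors disagree. The windows are copied from the table; the helper names are illustrative assumptions.

```python
# Windows copied from the table above: SNP -> (PSIPRED, ReProf, DSSP)
WINDOWS = {
    "Q279E": ("CCCCCCCHHHHHHHHHHHHHH", "EECCCCCCHHHHHHHHHHHHH", "CCCCC--HHHHHHHHHHHHHC"),
    "N215S": ("CCCCCCCCCCHHHHHCCCCCC", "CECCCCCCCCHHHHHHHHHHH", "HHHCCCC---HHHHCCC-CEE"),
    "I289V": ("HHHHHHHHHHHCCCEEEECCC", "HHHHHHHHHHHHHCHHCCCCC", "HHHHHHHHHHCC--EEE-C-C"),
    "S65T":  ("CCCCCCCCCCHHHHHHHHHHH", "CCCCCCCHHHHHHHHHHHHHH", "CCC-CCCC-CHHHHHHHHHHH"),
    "R356W": ("CCEEEEEEECCCCCCEEEEEE", "HHHHHHHHHHCCCCCCCCHHH", "CEEEEEEEE---CCC-EEEEE"),
    "V316I": ("HHHCCCCHHHHHHCCCCCCCC", "HHHHCCCCCEEEECCCCCCCC", "HHHHHH-HHHHHHHC-CC---"),
    "P323T": ("HHHHHHCCCCCCCCCEEEEEC", "CCEEEECCCCCCCCCCEECCC", "HHHHHHHC-CC----EEEE-C"),
    "P40S":  ("CCCCCCCCCCCCCCCCCCCCC", "HHCCCCCCCCCCCHHHHHHEE", "---CC--CC--EEEECHHHHC"),
    "R118H": ("CCCCCCCCHHHHHHHHHHCCC", "CCCCCCHHHHHHHHHHHCCCC", "-CCC-CCHHHHHHHHHHHCC-"),
    "A143T": ("EECCCCCCCCCCCCCCCCHHH", "EEECCCCCCCCCCCCCCCCCC", "EEECCCE-CCCCE--CCCHHH"),
}


def centre_state(window):
    """Secondary-structure state of the residue in the middle of the window."""
    return window[len(window) // 2]


def predictor_disagreements(windows):
    """SNPs where PSIPRED and ReProf predict different states at the mutated position."""
    return [snp for snp, (psipred, reprof, _dssp) in windows.items()
            if centre_state(psipred) != centre_state(reprof)]


for snp, triple in WINDOWS.items():
    print(snp, *(centre_state(w) for w in triple))
print("PSIPRED vs ReProf disagree at:", predictor_disagreements(WINDOWS))
```

DSSP states are compared here exactly as written in the table; whether `-` should be treated as coil (`C`) is a choice left open in this sketch.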
Substitution matrices
<figtable id="tab:Subsmatr"> Substitution scores for all SNPs in both substitution matrices (BLOSUM62 and PAM250)
SNP | Value BLOSUM62 | Value PAM250 |
---|---|---|
Q279E | 2 | 2 |
N215S | 1 | 1 |
I289V | 3 | 4 |
S65T | 1 | 1 |
R356W | -3 | 2 |
V316I | 3 | 4 |
P323T | -1 | 0 |
P40S | -1 | 1 |
R118H | 0 | 2 |
A143T | 0 | 1 |
</figtable>
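To see where the two matrices point in different directions, the scores from the table can be compared directly. This is a small sketch over the tabulated values only; `SCORES` and `conflicting_snps` are names chosen for illustration.

```python
# SNP -> (BLOSUM62 score, PAM250 score), copied from the table above.
SCORES = {
    "Q279E": (2, 2), "N215S": (1, 1), "I289V": (3, 4), "S65T": (1, 1),
    "R356W": (-3, 2), "V316I": (3, 4), "P323T": (-1, 0), "P40S": (-1, 1),
    "R118H": (0, 2), "A143T": (0, 1),
}


def conflicting_snps(scores):
    """SNPs that BLOSUM62 penalises (score < 0) while PAM250 rewards them (score > 0)."""
    return [snp for snp, (blosum, pam) in scores.items() if blosum < 0 < pam]


print(conflicting_snps(SCORES))
```

Because both matrices are symmetric, the reciprocal exchanges I289V and V316I receive identical scores, which the table confirms.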
PSSM
- Getting a bit closer to evolution, create a PSSM (position-specific scoring matrix) for your protein sequence using PSI-BLAST (5 iterations). How conserved are the WT residues at your mutant positions? How frequent (conserved) is the mutant residue type at each of these positions? Is there anything interesting?
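Once PSI-BLAST has written an ASCII PSSM, the scores of the WT and mutant residue at a position can be read from the corresponding row. The sketch below assumes the usual ASCII PSSM layout (position, residue, then 20 integer scores in the column order `ARNDCQEGHILKMFPSTWYV`); the example row is invented purely to exercise the parser and is not a result for α-Galactosidase A.

```python
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"  # assumed column order of the ASCII PSSM


def parse_ascii_pssm(text):
    """Parse PSSM rows into {position: (residue, {amino_acid: score})}."""
    pssm = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with an integer position and carry at least 20 scores.
        if len(fields) < 22 or not fields[0].isdigit():
            continue
        scores = [int(x) for x in fields[2:22]]
        pssm[int(fields[0])] = (fields[1], dict(zip(AA_ORDER, scores)))
    return pssm


def wt_mt_scores(pssm, snp):
    """Return (WT score, mutant score) for a SNP written like 'Q279E'."""
    wt, pos, mt = snp[0], int(snp[1:-1]), snp[-1]
    residue, scores = pssm[pos]
    if residue != wt:
        raise ValueError(f"expected {wt} at position {pos}, found {residue}")
    return scores[wt], scores[mt]


# Invented example row (NOT real data), in the assumed layout.
EXAMPLE = """\
           A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
  279 Q   -1  0  0  0 -3  6  2 -2  0 -3 -2  1 -1 -3 -1  0 -1 -2 -1 -2
"""

print(wt_mt_scores(parse_ascii_pssm(EXAMPLE), "Q279E"))
```

A high WT score combined with a low or negative mutant score would indicate a conserved position at which the substitution is rarely observed.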
Multiple sequence alignment
- Another step closer to evolution: identify all mammalian homologous sequences and create a multiple sequence alignment for them with a method of your choice. From this alignment, calculate the conservation of the WT and mutant residues again, and compare it with the matrix- and PSSM-derived results.
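Conservation in an alignment can be expressed as the relative frequency of each residue in the column that corresponds to the mutated position. The following sketch shows one way to do this; the four-sequence alignment is a toy example made up for illustration, and the function names are assumptions rather than part of any alignment tool.

```python
from collections import Counter


def column_for_position(ref_aligned, pos):
    """Map a 1-based position in the ungapped reference to a 0-based alignment column."""
    seen = 0
    for col, ch in enumerate(ref_aligned):
        if ch != "-":
            seen += 1
            if seen == pos:
                return col
    raise ValueError(f"reference has fewer than {pos} residues")


def residue_frequencies(alignment, col):
    """Relative frequency of each residue in one column, ignoring gaps."""
    counts = Counter(seq[col] for seq in alignment if seq[col] != "-")
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


# Toy alignment (invented); the first sequence plays the role of the reference.
ALIGNMENT = ["MQ-LR", "MQALR", "ME-LK", "MQ-IR"]

col = column_for_position(ALIGNMENT[0], 2)   # reference position 2 (Q)
freqs = residue_frequencies(ALIGNMENT, col)
print(freqs.get("Q", 0.0), freqs.get("E", 0.0))
```

For a real SNP such as Q279E, `freqs.get("Q", 0.0)` and `freqs.get("E", 0.0)` would give the WT and mutant frequencies that can be set against the PSSM and substitution-matrix values.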
Scoring methods
SIFT
<figtable id="tab:Sift"> Sift Scores
SNP | Prediction | Sift Score | Sequences represented at this position |
---|---|---|---|
P40S | AFFECT PROTEIN FUNCTION | 0.00 | 41 |
S65T | AFFECT PROTEIN FUNCTION | 0.01 | 45 |
R118H | be TOLERATED | 0.06 | 48 |
A143T | AFFECT PROTEIN FUNCTION | 0.01 | 48 |
N215S | AFFECT PROTEIN FUNCTION | 0.01 | 48 |
Q279E | AFFECT PROTEIN FUNCTION | 0.00 | 48 |
I289V | AFFECT PROTEIN FUNCTION | 0.05 | 48 |
V316I | be TOLERATED | 0.75 | 48 |
P323T | AFFECT PROTEIN FUNCTION | 0.01 | 48 |
R356W | AFFECT PROTEIN FUNCTION | 0.01 | 47 |
</figtable>
Median sequence conservation: 2.99
Polyphen2
<figtable id="tab:Polyphen"> Polyphen Scores
SNP | rs ID | Sec Struc | Prediction | pph2 Class | pph2 Prob | pph2 FPR | pph2 TPR | pph2 FDR |
---|---|---|---|---|---|---|---|---|
Q279E | rs28935485 | H | probably damaging | deleterious | 0.983 | 0.0387 | 0.745 | 0.0657 |
N215S | rs28935197 | . | benign | neutral | 0.048 | 0.167 | 0.941 | 0.194 |
I289V | ? | H | probably damaging | deleterious | 0.975 | 0.0436 | 0.762 | 0.072 |
S65T | ? | . | probably damaging | deleterious | 0.995 | 0.0277 | 0.681 | 0.0521 |
R356W | ? | . | probably damaging | deleterious | 1 | 0.00026 | 0.00018 | 0.0109 |
V316I | ? | H | benign | neutral | 0.308 | 0.113 | 0.904 | 0.144 |
P323T | ? | T | possibly damaging | deleterious | 0.612 | 0.091 | 0.872 | 0.124 |
P40S | ? | . | probably damaging | deleterious | 1 | 0.00026 | 0.00018 | 0.0109 |
R118H | ? | H | benign | neutral | 0.015 | 0.209 | 0.956 | 0.229 |
A143T | ? | T | probably damaging | deleterious | 1 | 0.00026 | 0.00018 | 0.0109 |
</figtable>
SNAP
- SNAP is installed on the VirtualBox and should be used from the command line only. Because BLAST is the bottleneck of SNAP and has to be run anyway, we might as well look at all possible substitutions at the positions of our mutations. This way we can learn much more about the nature of a given mutation: is it problematic because it introduces an unwanted effect, or because the WT residue is essential and mutating it removes that function?
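Listing every possible substitution at the ten positions is straightforward to script. The sketch below generates the 19 alternatives per position in the same `Q279E` notation used on this page; whether SNAP expects exactly this notation in its mutation file is an assumption that should be checked against the installed version.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

SNPS = ["Q279E", "N215S", "I289V", "S65T", "R356W",
        "V316I", "P323T", "P40S", "R118H", "A143T"]


def all_substitutions(snp):
    """All 19 possible substitutions at the position of a SNP such as 'Q279E'."""
    wt, pos = snp[0], snp[1:-1]
    return [f"{wt}{pos}{aa}" for aa in AMINO_ACIDS if aa != wt]


# One mutation per line; the layout of this list is an assumption.
mutations = [m for snp in SNPS for m in all_substitutions(snp)]
print("\n".join(mutations[:5]))
print(len(mutations), "mutations in total")
```

If nearly all 19 substitutions at a position are predicted to be non-neutral, the WT residue itself is likely essential; if only the observed mutant is, the problem is more likely the specific residue that was introduced.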
Results and Conclusion
- Compare ALL results and create an overview table.
- Try to come up with a consensus between all the findings requested above.
- Check whether you are right in the HGMD – were you able to predict a change?
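As a starting point for the overview table, the SIFT and PolyPhen-2 calls listed above can be placed side by side. This sketch uses only the predictions from the two tables on this page; collapsing each tool's output to a yes/no "damaging" call is a simplification chosen here, not something either tool prescribes.

```python
# Predictions copied from the SIFT and PolyPhen-2 tables above.
SIFT = {
    "P40S": "AFFECT PROTEIN FUNCTION", "S65T": "AFFECT PROTEIN FUNCTION",
    "R118H": "be TOLERATED", "A143T": "AFFECT PROTEIN FUNCTION",
    "N215S": "AFFECT PROTEIN FUNCTION", "Q279E": "AFFECT PROTEIN FUNCTION",
    "I289V": "AFFECT PROTEIN FUNCTION", "V316I": "be TOLERATED",
    "P323T": "AFFECT PROTEIN FUNCTION", "R356W": "AFFECT PROTEIN FUNCTION",
}
PPH2_CLASS = {
    "Q279E": "deleterious", "N215S": "neutral", "I289V": "deleterious",
    "S65T": "deleterious", "R356W": "deleterious", "V316I": "neutral",
    "P323T": "deleterious", "P40S": "deleterious", "R118H": "neutral",
    "A143T": "deleterious",
}


def overview(sift, pph2):
    """Return {SNP: (SIFT damaging?, PolyPhen-2 damaging?)} for SNPs present in both."""
    return {snp: ("AFFECT" in sift[snp], pph2[snp] == "deleterious")
            for snp in sift if snp in pph2}


table = overview(SIFT, PPH2_CLASS)
for snp, (s, p) in sorted(table.items(), key=lambda kv: int(kv[0][1:-1])):
    verdict = "agree" if s == p else "DISAGREE"
    print(f"{snp}\tSIFT={'damaging' if s else 'tolerated'}\t"
          f"PolyPhen-2={'damaging' if p else 'neutral'}\t{verdict}")
```

Further columns (substitution-matrix scores, PSSM values, SNAP output, the HGMD annotation) can be added to the same dictionary once those results are available.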