Difference between revisions of "Canavan Disease: Task 08 - Sequence-based Mutation Analysis"
(→Comparison) |
(→Comparison) |
||
(48 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
'''Sequence-based mutation analysis''' is important, since mutations may affect protein stability or function. The analysis can also be used to predict whether or not a mutation is disease causing. |
'''Sequence-based mutation analysis''' is important, since mutations may affect protein stability or function. The analysis can also be used to predict whether or not a mutation is disease causing. |
||
+ | == Wild Type - Mutant Approach == |
||
− | == [[Canavan_Disease:_Task_08_-_Journal|LabJournal]] == |
||
+ | To get a feeling for interpreting data on different SNPs, ten amino acid mutations were randomly chosen from HGMD (disease causing) and dbSNP (non-synonymous mutations). They were sorted in ascending (alphabetical) order by their original amino acid to mix the HGMD and dbSNP entries, so that it is no longer possible to remember which database each mutation came from. |
||
− | |||
+ | '''<xr id="data"></xr>''' summarizes these mutations in terms of sidechain changes and secondary structure: |
||
− | == Wildtype - Mutant Approach == |
||
− | To get a feeling interpreting the data of different SNPs, ten aminoacid mutations were randomly chosen from HGMD (disease causing) and dbSNP (non-synonymous mutations). Those were sorted in ascending order to its original aminoacid to shuffle HGMD and dbSNP, so a memorization is not possible any longer. |
||
− | The following '''<xr id="data"></xr>''' should give a short summary about these mutations in terms of sidechain changes or the secondary structure: |
||
<figtable id="data"> |
<figtable id="data"> |
||
{| border="1" cellpadding="5" cellspacing="0" align="center" |
{| border="1" cellpadding="5" cellspacing="0" align="center" |
||
|- |
|- |
||
− | ! colspan="8" style="background:#87cefa;" | |
+ | ! colspan="8" style="background:#87cefa;" | Comparison of Changes from Wild Type to Mutation Type |
|- |
|- |
||
! style="background:#BFBFBF;" align="center" | Mutation |
! style="background:#BFBFBF;" align="center" | Mutation |
||
− | ! style="background:#BFBFBF;" align="center" colspan="2" | |
+ | ! style="background:#BFBFBF;" align="center" colspan="2" | Sidechain Polarity |
− | ! style="background:#BFBFBF;" align="center" colspan="2" | |
+ | ! style="background:#BFBFBF;" align="center" colspan="2" | Sidechain Charge |
− | ! style="background:#BFBFBF;" align="center" | |
+ | ! style="background:#BFBFBF;" align="center" | Visualization |
− | ! style="background:#BFBFBF;" align="center" | |
+ | ! style="background:#BFBFBF;" align="center" | Sec. Struc. |
− | ! style="background:#BFBFBF;" align="center" | |
+ | ! style="background:#BFBFBF;" align="center" | Uniprot Info |
|- |
|- |
||
! style="background:#E5E5E5;" align="center" | |
! style="background:#E5E5E5;" align="center" | |
||
Line 27: | Line 25: | ||
! style="background:#E5E5E5;" align="center" | |
! style="background:#E5E5E5;" align="center" | |
||
|- |
|- |
||
− | || Arg233Trp || |
+ | || Arg233Trp || basic polar || nonpolar || positive || neutral || [[Image:Canavan_Mutation_Arg233Trp.png|centre|thumb|200px|]] || LOOP || - |
|- |
|- |
||
|| Asn121Asp || polar || acidic polar || neutral || negative || [[Image:Canavan_Mutation_Asn121Asp.png|centre|thumb|200px|]] || LOOP || - |
|| Asn121Asp || polar || acidic polar || neutral || negative || [[Image:Canavan_Mutation_Asn121Asp.png|centre|thumb|200px|]] || LOOP || - |
||
|- |
|- |
||
− | || His21Pro || |
+ | || His21Pro || basic polar || nonpolar || neutral || neutral || [[Image:Canavan_Mutation_His21Pro.png|centre|thumb|200px|]] || LOOP || metal binding |
|- |
|- |
||
|| Ile157Thr || nonpolar || polar || neutral || neutral || [[Image:Canavan_Mutation_Ile157Thr.png|centre|thumb|200px|]] || LOOP || - |
|| Ile157Thr || nonpolar || polar || neutral || neutral || [[Image:Canavan_Mutation_Ile157Thr.png|centre|thumb|200px|]] || LOOP || - |
||
Line 37: | Line 35: | ||
|| Leu272Pro || nonpolar || nonpolar || neutral || neutral || [[Image:Mutation Leu272Pro.png|centre|thumb|200px|]] || LOOP || - |
|| Leu272Pro || nonpolar || nonpolar || neutral || neutral || [[Image:Mutation Leu272Pro.png|centre|thumb|200px|]] || LOOP || - |
||
|- |
|- |
||
− | || Lys213Glu || |
+ | || Lys213Glu || basic polar || acidic polar || positive || negative || [[Image:Canavan_Mutation_Lys213Glu.png|centre|thumb|200px|]] || LOOP || - |
|- |
|- |
||
|| Pro149Ala || nonpolar || nonpolar || neutral || neutral || [[Image:Canavan_Mutation_Pro149Ala.png|centre|thumb|200px|]] || LOOP <br> near HELIX || - |
|| Pro149Ala || nonpolar || nonpolar || neutral || neutral || [[Image:Canavan_Mutation_Pro149Ala.png|centre|thumb|200px|]] || LOOP <br> near HELIX || - |
||
|- |
|- |
||
− | || Pro257Arg || nonpolar || |
+ | || Pro257Arg || nonpolar || basic polar || neutral || positive || [[Image:Canavan_Mutation_Pro257Arg.png|centre|thumb|200px|]] || LOOP || - |
|- |
|- |
||
|| Thr166Ile || polar || nonpolar || neutral || neutral || [[Image:Canavan_Mutation_Thr166Ile.png|centre|thumb|200px|]] || LOOP <br> near HELIX || in binding region |
|| Thr166Ile || polar || nonpolar || neutral || neutral || [[Image:Canavan_Mutation_Thr166Ile.png|centre|thumb|200px|]] || LOOP <br> near HELIX || in binding region |
||
Line 48: | Line 46: | ||
|- |
|- |
||
|} |
|} |
||
− | <center><small>'''<caption>''' |
+ | <center><small>'''<caption>''' Investigation of changes in sidechain properties, secondary structure and functional residues (as listed in Uniprot). For each mutation the wild type is colored green and the mutation is colored blue.</caption></small></center> |
</figtable> |
</figtable> |
||
− | + | Interpreting '''<xr id="data"></xr>''' above, someone may assume the following observations: |
|
− | * '''Arg233Trp''': The sidechain polarity changes from basic polar to nonpolar. The charge changes from positive to neutral. The mutation site |
+ | * '''Arg233Trp''': The sidechain polarity changes from basic polar to nonpolar. The charge changes from positive to neutral. The mutation site is located within a LOOP region and there is no information on whether it is a functional residue (as listed in [http://www.uniprot.org/uniprot/P45381 Uniprot]). Without further investigation, the first impression is that it is '''possibly disease causing'''. |
− | * '''Asn121Asp''': The sidechain polarity changes from polar to acidic polar. The charge changes from neutral to negative. The mutation site |
+ | * '''Asn121Asp''': The sidechain polarity changes from polar to acidic polar. The charge changes from neutral to negative. The mutation site is located within a LOOP region and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is '''possibly disease causing'''. |
− | * '''His21Pro''': The sidechain polarity changes from basic polar to nonpolar. The charge does not change. The mutation site |
+ | * '''His21Pro''': The sidechain polarity changes from basic polar to nonpolar. The charge does not change. The mutation site is located within a LOOP region, but also within the active center. This position is needed for zinc binding. Without further investigation, the first impression is that it is '''disease causing'''. |
− | * '''Ile157Thr''': The sidechain polarity changes from nonpolar to polar. The charge does not change. The mutation site |
+ | * '''Ile157Thr''': The sidechain polarity changes from nonpolar to polar. The charge does not change. The mutation site is located within a LOOP region and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is '''not disease causing'''. |
− | * '''Leu272Pro''': The sidechain polarity does not change. The charge does not change. The mutation site |
+ | * '''Leu272Pro''': The sidechain polarity does not change. The charge does not change. The mutation site is located within a LOOP region and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is '''not disease causing'''. |
− | * '''Lys213Glu''': The sidechain polarity changes from basic polar to acidic polar. The charge changes from positive to negative. The mutation site |
+ | * '''Lys213Glu''': The sidechain polarity changes from basic polar to acidic polar. The charge changes from positive to negative. The mutation site is located within a LOOP region and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is '''disease causing'''. |
− | * '''Pro149Ala''': The sidechain polarity does not change. The charge does not change. The mutation site |
+ | * '''Pro149Ala''': The sidechain polarity does not change. The charge does not change. The mutation site is located within a LOOP region, close to a HELIX, and there is no information on whether it is a functional residue. Since proline is known as a typical helix breaker, this proline may be structurally necessary at this position. Without further investigation, the first impression is that it is '''possibly disease causing'''. |
− | * '''Pro257Arg''': The sidechain polarity changes from nonpolar to basic polar. The charge changes from neutral to positive. The mutation site |
+ | * '''Pro257Arg''': The sidechain polarity changes from nonpolar to basic polar. The charge changes from neutral to positive. The mutation site is located within a LOOP region and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is '''possibly disease causing'''. |
− | * '''Thr166Ile''': The sidechain polarity changes from polar to nonpolar. The charge does not change. The mutation site |
+ | * '''Thr166Ile''': The sidechain polarity changes from polar to nonpolar. The charge does not change. The mutation site is located within a LOOP region next to a HELIX. This position is known to be part of the binding region of aspartoacylase. Without further investigation, the first impression is that it is '''disease causing'''. |
− | * '''Tyr288Cys''': The sidechain polarity changes from polar to nonpolar. The charge does not change. The mutation site |
+ | * '''Tyr288Cys''': The sidechain polarity changes from polar to nonpolar. The charge does not change. The mutation site is located within a HELIX. This position is known to be a binding site. Without further investigation, the first impression is that it is '''disease causing'''. |
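The sidechain comparison in the list above can be sketched in a few lines of Python. This is only an illustration: the property table contains just the amino acids that occur in the ten mutations, with the classes used on this page (histidine is treated as basic polar but neutral), and the names <code>SIDECHAIN</code> and <code>describe</code> are hypothetical helpers, not part of any tool used in this Task.

```python
import re

# Sidechain (polarity, charge) classes as used in the table on this page.
# Only the amino acids occurring in the ten mutations are listed.
SIDECHAIN = {
    "Ala": ("nonpolar", "neutral"),
    "Arg": ("basic polar", "positive"),
    "Asn": ("polar", "neutral"),
    "Asp": ("acidic polar", "negative"),
    "Cys": ("nonpolar", "neutral"),
    "Glu": ("acidic polar", "negative"),
    "His": ("basic polar", "neutral"),
    "Ile": ("nonpolar", "neutral"),
    "Leu": ("nonpolar", "neutral"),
    "Lys": ("basic polar", "positive"),
    "Pro": ("nonpolar", "neutral"),
    "Thr": ("polar", "neutral"),
    "Trp": ("nonpolar", "neutral"),
    "Tyr": ("polar", "neutral"),
}


def describe(mutation):
    """Split a label such as 'Arg233Trp' and compare sidechain classes."""
    match = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", mutation)
    if match is None:
        raise ValueError(f"unexpected mutation label: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    wt_polarity, wt_charge = SIDECHAIN[wild_type]
    mut_polarity, mut_charge = SIDECHAIN[mutant]
    return {
        "position": position,
        "polarity": (wt_polarity, mut_polarity),
        "charge": (wt_charge, mut_charge),
        "polarity_changed": wt_polarity != mut_polarity,
        "charge_changed": wt_charge != mut_charge,
    }


if __name__ == "__main__":
    for label in ("Arg233Trp", "Leu272Pro", "Lys213Glu"):
        print(label, describe(label))
```

Such a lookup only reproduces the first two table columns; the secondary structure and Uniprot annotations still have to be checked by hand.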
− | + | For further investigation, the results of matrices as BLOSUM 62, PAM 1/250 and PSSM were taken into account (compare to '''<xr id="matrices"></xr>'''), as well as a comparison to the conservation of aspartoacylase within homologous species, as defined in '''<xr id="msa"></xr>'''. The PSSM matrix was calculated for aspartoacylase using PsiBlast with 5 iterations. |
|
<figtable id="matrices"> |
<figtable id="matrices"> |
||
{| border="1" cellpadding="5" cellspacing="0" align="center" |
{| border="1" cellpadding="5" cellspacing="0" align="center" |
||
|- |
|- |
||
− | ! colspan="8" style="background:#87cefa;" | |
+ | ! colspan="8" style="background:#87cefa;" | Comparison using Matrix Information |
|- |
|- |
||
! style="background:#BFBFBF;" align="center" | Mutation |
! style="background:#BFBFBF;" align="center" | Mutation |
||
− | ! style="background:#BFBFBF;" align="center" | |
+ | ! style="background:#BFBFBF;" align="center" | BLOSUM 62 |
! style="background:#BFBFBF;" align="center" | PAM 1/250 |
! style="background:#BFBFBF;" align="center" | PAM 1/250 |
||
! style="background:#BFBFBF;" align="center" | PSSM Matrix |
! style="background:#BFBFBF;" align="center" | PSSM Matrix |
||
− | ! style="background:#BFBFBF;" align="center" colspan="2" | PSSM |
+ | ! style="background:#BFBFBF;" align="center" colspan="2" | PSSM Conservation |
− | ! style="background:#BFBFBF;" align="center" colspan="2" | MSA |
+ | ! style="background:#BFBFBF;" align="center" colspan="2" | MSA Conservation |
|- |
|- |
||
! style="background:#E5E5E5;" align="center" | |
! style="background:#E5E5E5;" align="center" | |
||
Line 107: | Line 105: | ||
|- |
|- |
||
|} |
|} |
||
+ | <center><small>'''<caption>''' Representation of the effects of a mutation using standard matrices like BLOSUM and PAM. The PSSM was calculated via PsiBlast.<br>A multiple sequence alignment (MSA) was made to check the evolutionary conservation of the sites compared to homologous sequences from mammalian species. WT=wild type, mut=mutation.</caption></small></center> |
||
− | <center><small>'''<caption>''' text </caption></small></center> |
||
</figtable> |
</figtable> |
||
− | Considering '''<xr id="matrices"></xr>''' above, a |
+ | Considering '''<xr id="matrices"></xr>''' above, the matrices are examined more closely. As a short reminder: in BLOSUM, positive values indicate a common, chemically plausible substitution, and identities of common amino acids receive a low weight while those of rare amino acids receive a high weight. In PAM, strongly negative values correspond to a high mismatch penalty for the substitution. If the PSSM conservation of the wild type residue is high and that of the mutant residue is very low by comparison, the PSSM score is negative, which indicates that the mutation is more likely to be disease causing. The more negative the score, the more likely a disease causing effect. The multiple sequence alignment of homologous sequences of aspartoacylase shows the conservation of the original (wild type) and the mutated amino acid. A high value for the mutant residue would indicate a ''normal mutation'' that also occurs in homologous sequences from other mammalian species. |
− | * '''Arg233Trp''': The first impression was that it is ''possibly disease causing''. BLOSUM represents a negative value, PAM shows no real significant data. The PSSM conservation is high in the wild type |
+ | * '''Arg233Trp''': The first impression was that it is ''possibly disease causing''. BLOSUM gives a negative value; the PAM value is not significant. The PSSM conservation of the wild type amino acid is high and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression therefore changes from possibly disease causing to '''disease causing'''. |
− | * '''Asn121Asp''': The first impression was that it is ''possibly disease causing''. BLOSUM represents a positive value, PAM shows no real significant data. The PSSM conservation is medium high in the wild type |
+ | * '''Asn121Asp''': The first impression was that it is ''possibly disease causing''. BLOSUM gives a positive value; the PAM value is not significant. The PSSM conservation of the wild type amino acid is moderately high and the score is slightly negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains '''possibly disease causing'''. |
− | * '''His21Pro''': The first impression was that it is ''disease causing''. BLOSUM represents a negative value, PAM shows zero. The PSSM conservation is very high in the wild type |
+ | * '''His21Pro''': The first impression was that it is ''disease causing''. BLOSUM gives a negative value; PAM gives zero. The PSSM conservation of the wild type amino acid is very high and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains '''disease causing'''. |
− | * '''Ile157Thr''': The first impression was that it is ''not disease causing''. BLOSUM represents a slightly negative value, PAM shows zero. The PSSM conservation is higher in the mutation type than in the wild type, therefore score is positive. The mutated |
+ | * '''Ile157Thr''': The first impression was that it is ''not disease causing''. BLOSUM gives a slightly negative value; PAM gives zero. The PSSM conservation is higher for the mutant than for the wild type amino acid, so the score is positive. The mutated amino acid does not occur in any of the homologous species. The second impression remains '''not disease causing'''. |
− | * '''Leu272Pro''': The first impression was that it is ''not disease causing''. BLOSUM represents a negative value, PAM shows also a negative value. The PSSM conservation is slightly higher in the wild type |
+ | * '''Leu272Pro''': The first impression was that it is ''not disease causing''. BLOSUM gives a negative value, and PAM also gives a negative value. The PSSM conservation of the wild type amino acid is slightly higher and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression therefore changes from not disease causing to '''possibly disease causing'''. |
− | * '''Lys213Glu''': The first impression was that it is ''disease causing''. BLOSUM represents a positive value, PAM shows also a positive value. The PSSM conservation is slightly higher in the wild type |
+ | * '''Lys213Glu''': The first impression was that it is ''disease causing''. BLOSUM gives a positive value, and PAM also gives a positive value. The PSSM conservation of the wild type amino acid is slightly higher and the score is slightly positive. The mutated amino acid is conserved in some homologous species. Since the sidechain changes noted in the first impression are so extreme, the second impression remains '''disease causing'''. |
− | * '''Pro149Ala''': The first impression was that it is ''possibly disease causing''. BLOSUM represents a negative value, PAM shows a positive value. The PSSM conservation is slightly higher in the wild type |
+ | * '''Pro149Ala''': The first impression was that it is ''possibly disease causing''. BLOSUM gives a negative value; PAM gives a positive value. The PSSM conservation of the wild type amino acid is slightly higher and the score is zero. The mutated amino acid does not occur in any of the homologous species. The second impression remains '''possibly disease causing'''. |
− | * '''Pro257Arg''': The first impression was that it is ''possibly disease causing''. BLOSUM represents a negative value, PAM shows a zero. The PSSM conservation is higher in the wild type |
+ | * '''Pro257Arg''': The first impression was that it is ''possibly disease causing''. BLOSUM gives a negative value; PAM gives zero. The PSSM conservation of the wild type amino acid is higher and the score is slightly positive. The mutated amino acid does not occur in any of the homologous species. The second impression remains '''possibly disease causing'''. |
− | * '''Thr166Ile''': The first impression was that it is ''disease causing''. BLOSUM represents a negative value, PAM shows zero. The PSSM conservation is high in the wild type |
+ | * '''Thr166Ile''': The first impression was that it is ''disease causing''. BLOSUM gives a negative value; PAM gives zero. The PSSM conservation of the wild type amino acid is high and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains '''disease causing'''. |
− | * '''Tyr288Cys''': The first impression was that it is ''disease causing''. BLOSUM represents a negative value, PAM shows zero. The PSSM conservation is very high in the wild type |
+ | * '''Tyr288Cys''': The first impression was that it is ''disease causing''. BLOSUM gives a negative value; PAM gives zero. The PSSM conservation of the wild type amino acid is very high and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains '''disease causing'''. |
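The BLOSUM part of the list above can be reproduced with a small lookup. The following is a minimal sketch, assuming the standard BLOSUM62 matrix; only the ten substitution pairs of this Task are hard-coded (in one-letter code), and the values should be checked against a full copy of the matrix before being relied on. The names <code>BLOSUM62_SUBSET</code>, <code>blosum62</code> and <code>interpret</code> are hypothetical.

```python
# BLOSUM62 scores for the ten substitutions discussed above (one-letter code).
# Assumption: values quoted from the standard BLOSUM62 matrix; verify against
# a full matrix before reuse.
BLOSUM62_SUBSET = {
    ("R", "W"): -3,  # Arg233Trp
    ("N", "D"): 1,   # Asn121Asp
    ("H", "P"): -2,  # His21Pro
    ("I", "T"): -1,  # Ile157Thr and Thr166Ile (the matrix is symmetric)
    ("L", "P"): -3,  # Leu272Pro
    ("K", "E"): 1,   # Lys213Glu
    ("P", "A"): -1,  # Pro149Ala
    ("P", "R"): -2,  # Pro257Arg
    ("Y", "C"): -2,  # Tyr288Cys
}


def blosum62(wild_type, mutant):
    """Look up a score; BLOSUM matrices are symmetric, so try both orders."""
    if (wild_type, mutant) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(wild_type, mutant)]
    return BLOSUM62_SUBSET[(mutant, wild_type)]


def interpret(score):
    """Positive scores mark common substitutions, negative ones rare ones."""
    if score > 0:
        return "common substitution"
    if score < 0:
        return "rare substitution"
    return "neither favoured nor penalised"


if __name__ == "__main__":
    for wt, mut in (("R", "W"), ("K", "E"), ("T", "I")):
        score = blosum62(wt, mut)
        print(f"{wt}->{mut}: {score:+d} ({interpret(score)})")
```

A substitution score alone is not a verdict: Lys213Glu receives a positive BLOSUM62 score here although its charge is reversed, which is why the other columns of the table are considered as well.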
− | The following '''<xr id="msa"></xr>''' shows the homologous sequences used for the multiple sequence alignment conservation approach. Only one sequence per species was used to prevent a bias towards those sequences. |
+ | The following '''<xr id="msa"></xr>''' shows the homologous sequences used for the multiple sequence alignment conservation approach. They were obtained from a Blast search with aspartoacylase in Uniprot using the ''mammalian'' database. Only one sequence per species was used to avoid a bias towards individual species. |
<figtable id="msa"> |
<figtable id="msa"> |
||
{| border="1" cellpadding="5" cellspacing="0" align="center" |
{| border="1" cellpadding="5" cellspacing="0" align="center" |
||
|- |
|- |
||
− | ! colspan="3" style="background:#87cefa;" | |
+ | ! colspan="3" style="background:#87cefa;" | Mammalian Homologous Sequences |
|- |
|- |
||
! style="background:#BFBFBF;" align="center" | Homolog |
! style="background:#BFBFBF;" align="center" | Homolog |
||
Line 176: | Line 174: | ||
|- |
|- |
||
|} |
|} |
||
− | <center><small>'''<caption>''' |
+ | <center><small>'''<caption>''' List of all homologous sequences to aspartoacylase found by a Blast search against a mammalian database.</caption></small></center> |
</figtable> |
</figtable> |
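The MSA conservation values can be understood as residue frequencies in one alignment column. The sketch below assumes that simple definition; the four-sequence alignment is a made-up toy example and not the aspartoacylase alignment, and <code>column_frequencies</code> and <code>conservation</code> are hypothetical helper names.

```python
from collections import Counter


def column_frequencies(alignment, column):
    """Relative frequency of each residue in one column, ignoring gaps."""
    residues = [seq[column] for seq in alignment if seq[column] != "-"]
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


def conservation(alignment, column, wild_type, mutant):
    """Return (wild type frequency, mutant frequency) for one column."""
    freqs = column_frequencies(alignment, column)
    return freqs.get(wild_type, 0.0), freqs.get(mutant, 0.0)


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, one per species).
    toy_alignment = ["MKRT", "MKRT", "MKKT", "MERT"]
    wt_freq, mut_freq = conservation(toy_alignment, 2, "R", "W")
    print(f"wild type R: {wt_freq:.2f}, mutant W: {mut_freq:.2f}")
```

A mutant frequency of zero corresponds to the statement that the mutated amino acid does not occur in any of the homologous species.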
||
== Scoring Approach == |
== Scoring Approach == |
||
− | The next step was to use different methods available online to |
+ | The next step was to use different methods available online and check whether they predict a mutation to be disease causing or to have an effect on protein function. For this approach '''[http://sift.jcvi.org/www/SIFT_seq_submit2.html SIFT]''', '''[http://genetics.bwh.harvard.edu/pph2/ PolyPhen]''', '''[http://www.mutationtaster.org MutationTaster]''' and '''[https://rostlab.org/owiki/index.php/Snap SNAP2]''' were used. The individual results of each method, with their prediction probabilities, can be found in the '''[[Canavan_Disease:_Task_08_-_Sequence-based_Mutation_Analysis#Supplement|Supplement]]''' at the end of this Task. The results concerning protein function are summarized in '''<xr id="comparison"></xr>''' in the next section. |
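One way to read the four tool outputs side by side is a simple tally. The sketch below is only an illustration and not the procedure used in this Task: it counts the labels given for Lys213Glu in the comparison table and reports a majority, with <code>consensus</code> being a hypothetical helper.

```python
from collections import Counter


def consensus(predictions):
    """Return the most frequent label, or 'no consensus' if the top two tie."""
    ranked = Counter(predictions.values()).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "no consensus"
    return ranked[0][0]


if __name__ == "__main__":
    # Labels for Lys213Glu as listed in the comparison table of this Task.
    lys213glu = {
        "SIFT": "neutral",
        "PolyPhen": "neutral",
        "MutationTaster": "disease causing",
        "SNAP": "neutral",
    }
    print("Lys213Glu:", consensus(lys213glu))
```

For Lys213Glu the tally comes out as neutral, although the validation column lists it as disease causing (HGMD data), so a majority of tools should not be mistaken for a validated result.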
− | |||
− | ==== SIFT ==== |
||
− | |||
− | <figtable id="sift"> |
||
− | {| border="1" cellpadding="5" cellspacing="0" align="center" |
||
− | |- |
||
− | ! colspan="5" style="background:#87cefa;" | Results |
||
− | |- |
||
− | ! style="background:#BFBFBF;" align="center" | Mutation |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction score |
||
− | ! style="background:#BFBFBF;" align="center" | Median Sequence Conservation |
||
− | ! style="background:#BFBFBF;" align="center" | Sequences represented at this Position |
||
− | |- |
||
− | || Arg233Trp || affect function || 0.00 || 2.99 || 18 |
||
− | |- |
||
− | || Asn121Asp || affect function || 0.03 || 2.99 || 18 |
||
− | |- |
||
− | || His21Pro || affect function || 0.00 || 3.02 || 17 |
||
− | |- |
||
− | || Ile157Thr || tolerated || 0.49 || 3.02 || 17 |
||
− | |- |
||
− | || Leu272Pro || affect function || 0.00 || 3.01 || 17 |
||
− | |- |
||
− | || Lys213Glu || tolerated || 0.92 || 2.99 || 18 |
||
− | |- |
||
− | || Pro149Ala || tolerated || 0.60 || 3.02 || 17 |
||
− | |- |
||
− | || Pro257Arg || tolerated || 0.13 || 3.01 || 17 |
||
− | |- |
||
− | || Thr166Ile || affect function || 0.05 || 3.02 || 17 |
||
− | |- |
||
− | || Tyr288Cys || affect function || 0.00 || 3.01 || 17 |
||
− | |- |
||
− | |} |
||
− | <center><small>'''<caption>''' text </caption></small></center> |
||
− | </figtable> |
||
− | |||
− | |||
− | ==== Polyphen ==== |
||
− | |||
− | |||
− | <figtable id="poly"> |
||
− | {| border="1" cellpadding="5" cellspacing="0" align="center" |
||
− | |- |
||
− | ! colspan="5" style="background:#87cefa;" | Results |
||
− | |- |
||
− | ! style="background:#BFBFBF;" align="center" | Mutation |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction score |
||
− | ! style="background:#BFBFBF;" align="center" | sensitivity |
||
− | ! style="background:#BFBFBF;" align="center" | specificity |
||
− | |- |
||
− | || Arg233Trp || probably damaging || 1.000 || 0.00 || 1.00 |
||
− | |- |
||
− | || Asn121Asp || probably damaging || 0.978 || 0.76 || 0.96 |
||
− | |- |
||
− | || His21Pro || probably damaging || 1.000 || 0.00 || 1.00 |
||
− | |- |
||
− | || Ile157Thr || benign || 0.023 || 0.95 || 0.81 |
||
− | |- |
||
− | || Leu272Pro || probably damaging || 0.991 || 0.71 || 0.97 |
||
− | |- |
||
− | || Lys213Glu || benign || 0.003 || 0.98 || 0.44 |
||
− | |- |
||
− | || Pro149Ala || possibly damaging || 0.901 || 0.82 || 0.94 |
||
− | |- |
||
− | || Pro257Arg || possibly damaging || 0.13 || 0.00 || 0.00 |
||
− | |- |
||
− | || Thr166Ile || probably damaging || 0.993 || 0.70 || 0.97 |
||
− | |- |
||
− | || Tyr288Cys || probably damaging || 1.000 || 0.00 || 1.00 |
||
− | |- |
||
− | |} |
||
− | <center><small>'''<caption>''' text </caption></small></center> |
||
− | </figtable> |
||
− | |||
− | |||
− | ==== MutationTaster ==== |
||
− | |||
− | <figtable id="muttast"> |
||
− | {| border="1" cellpadding="5" cellspacing="0" align="center" |
||
− | |- |
||
− | ! colspan="5" style="background:#87cefa;" | Results |
||
− | |- |
||
− | ! style="background:#BFBFBF;" align="center" | Mutation |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction Probability |
||
− | |- |
||
− | || Arg233Trp || disease causing || 0.99 |
||
− | |- |
||
− | || Asn121Asp || disease causing || 0.99 |
||
− | |- |
||
− | || His21Pro || disease causing || 0.99 |
||
− | |- |
||
− | || Ile157Thr || disease causing || 0.98 |
||
− | |- |
||
− | || Leu272Pro || disease causing || 0.99 |
||
− | |- |
||
− | || Lys213Glu || disease causing || 0.61 |
||
− | |- |
||
− | || Pro149Ala || disease causing || 0.99 |
||
− | |- |
||
− | || Pro257Arg || disease causing || 0.99 |
||
− | |- |
||
− | || Thr166Ile || disease causing || 0.99 |
||
− | |- |
||
− | || Tyr288Cys || disease causing || 0.99 |
||
− | |- |
||
− | |} |
||
− | <center><small>'''<caption>''' text </caption></small></center> |
||
− | </figtable> |
||
− | |||
− | |||
− | |||
− | |||
− | ==== SNAP ==== |
||
− | |||
− | <figtable id="snap"> |
||
− | {| border="1" cellpadding="5" cellspacing="0" align="center" |
||
− | |- |
||
− | ! colspan="5" style="background:#87cefa;" | Results |
||
− | |- |
||
− | ! style="background:#BFBFBF;" align="center" | Mutation |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction |
||
− | ! style="background:#BFBFBF;" align="center" | SNAP score (simplified) |
||
− | ! style="background:#BFBFBF;" align="center" | SNAP score |
||
− | ! style="background:#BFBFBF;" align="center" | Prediction Probability |
||
− | |- |
||
− | || Arg233Trp || Non-neutral || 6 || 65 || 80% |
||
− | |- |
||
− | || Asn121Asp || Non-neutral || 7 || 73 || 85% |
||
− | |- |
||
− | || His21Pro || Non-neutral || 5 || 50 || 75% |
||
− | |- |
||
− | || Ile157Thr || Neutral || 8 || -84 || 93% |
||
− | |- |
||
− | || Leu272Pro || Non-neutral || 3 || 39 || 66% |
||
− | |- |
||
− | || Lys213Glu || Neutral || 9 || -90 || 97% |
||
− | |- |
||
− | || Pro149Ala || Neutral || 6 || -65 || 82% |
||
− | |- |
||
− | || Pro257Arg || Neutral || 8 || -82 || 93% |
||
− | |- |
||
− | || Thr166Ile || Non-neutral || 7 || 71 || 85% |
||
− | |- |
||
− | || Tyr288Cys || Non-neutral || 9 || 90 || 95% |
||
− | |- |
||
− | |} |
||
− | <center><small>'''<caption>''' text </caption></small></center> |
||
− | </figtable> |
||
== Comparison == |
== Comparison == |
||
− | To get a better overview of all methods used in this |
+ | To give a better overview of all methods used in this Task, the following '''<xr id="comparison"></xr>''' shows the prediction for each mutation. The color-coding indicates: |
− | * red - predicted to be '''disease causing''' |
+ | * red - predicted to be '''disease causing''' (or having a functional effect for the mutation) |
* yellow - predicted to be '''possibly disease causing''' and therefore '''maybe''' disease causing |
* yellow - predicted to be '''possibly disease causing''' and therefore '''maybe''' disease causing |
||
* green - predicted to be '''not disease causing''' and therefore '''neutral''' |
* green - predicted to be '''not disease causing''' and therefore '''neutral''' |
||
Line 343: | Line 189: | ||
{| border="1" cellpadding="5" cellspacing="0" align="center" |
{| border="1" cellpadding="5" cellspacing="0" align="center" |
||
|- |
|- |
||
− | ! colspan=" |
+ | ! colspan="9" style="background:#87cefa;" | Prediction of Different Approaches and Validation |
|- |
|- |
||
! style="background:#BFBFBF;" align="center" | Mutation |
! style="background:#BFBFBF;" align="center" | Mutation |
||
− | ! style="background:#BFBFBF;" align="center" | |
+ | ! style="background:#BFBFBF;" align="center" | First Personal<br>Impression |
− | ! style="background:#BFBFBF;" align="center" | |
+ | ! style="background:#BFBFBF;" align="center" | Second Personal<br>Impression |
! style="background:#BFBFBF;" align="center" | SIFT |
! style="background:#BFBFBF;" align="center" | SIFT |
||
− | ! style="background:#BFBFBF;" align="center" | |
+ | ! style="background:#BFBFBF;" align="center" | PolyPhen |
! style="background:#BFBFBF;" align="center" | MutationTaster |
! style="background:#BFBFBF;" align="center" | MutationTaster |
||
! style="background:#BFBFBF;" align="center" | SNAP |
! style="background:#BFBFBF;" align="center" | SNAP |
||
− | ! style="background:#BFBFBF;" align="center" | Validation |
+ | ! style="background:#BFBFBF;" align="center" colspan="2" | Validation |
|- |
|- |
||
|| Arg233Trp |
|| Arg233Trp |
||
Line 361: | Line 207: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
− | |style="background:#F3FC42;"| not sure<br>(dbSNP data) |
+ | |style="background:#F3FC42;" width="17"| <-- || not sure<br>(dbSNP data) |
|- |
|- |
||
|| Asn121Asp |
|| Asn121Asp |
||
Line 370: | Line 216: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
− | |style="background:#FF455E;"| (dbSNP data) but<br>another mutation known to be disease causing |
+ | |style="background:#FF455E;"| <-- || (dbSNP data) but<br>another mutation known to be disease causing |
|- |
|- |
||
|| His21Pro |
|| His21Pro |
||
Line 379: | Line 225: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
− | |style="background:#FF455E;"| |
+ | |style="background:#FF455E;"| <-- || definitively disease causing<br>(HGMD data) |
|- |
|- |
||
|| Ile157Thr |
|| Ile157Thr |
||
Line 388: | Line 234: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#45D66B;"| neutral |
|style="background:#45D66B;"| neutral |
||
− | |style="background:#45D66B;"| (dbSNP data) but<br>SNPdbe without a reference to Canavan Disease |
+ | |style="background:#45D66B;"| <-- || (dbSNP data) but<br>SNPdbe without a reference to Canavan Disease |
|- |
|- |
||
|| Leu272Pro |
|| Leu272Pro |
||
Line 397: | Line 243: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
− | |style="background:#FF455E;"| |
+ | |style="background:#FF455E;"| <-- || definitively disease causing<br>(HGMD data) |
|- |
|- |
||
|| Lys213Glu |
|| Lys213Glu |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
− | |style="background:# |
+ | |style="background:#FF455E;"| dis.caus. |
|style="background:#45D66B;"| neutral |
|style="background:#45D66B;"| neutral |
||
|style="background:#45D66B;"| neutral |
|style="background:#45D66B;"| neutral |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#45D66B;"| neutral |
|style="background:#45D66B;"| neutral |
||
− | |style="background:#FF455E;"| |
+ | |style="background:#FF455E;"| <-- || definitively disease causing<br>(HGMD data) |
|- |
|- |
||
|| Pro149Ala |
|| Pro149Ala |
||
Line 415: | Line 261: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#45D66B;"| neutral |
|style="background:#45D66B;"| neutral |
||
− | |style="background:#F3FC42;"| not sure<br>(dbSNP data) |
+ | |style="background:#F3FC42;"| <-- || not sure<br>(dbSNP data) |
|- |
|- |
||
|| Pro257Arg |
|| Pro257Arg |
||
Line 424: | Line 270: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#45D66B;"| neutral |
|style="background:#45D66B;"| neutral |
||
− | |style="background:#F3FC42;"| not sure<br>(dbSNP data) |
+ | |style="background:#F3FC42;"| <-- || not sure<br>(dbSNP data) |
|- |
|- |
||
|| Thr166Ile |
|| Thr166Ile |
||
Line 433: | Line 279: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
− | |style="background:#FF455E;"| |
+ | |style="background:#FF455E;"| <-- || definitively disease causing<br>(HGMD data) |
|- |
|- |
||
|| Tyr288Cys |
|| Tyr288Cys |
||
Line 442: | Line 288: | ||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
|style="background:#FF455E;"| dis.caus. |
|style="background:#FF455E;"| dis.caus. |
||
− | |style="background:#FF455E;"| |
+ | |style="background:#FF455E;"| <-- || definitively disease causing<br>(HGMD data) |
|- |
|- |
||
|} |
|} |
||
− | <center><small>'''<caption>''' |
+ | <center><small>'''<caption>''' Resulting predictions for each method. Red = disease causing / functional effect, yellow = possibly disease causing, green = neutral, no functional effect.<br>Validation - using the information from Task 07.</caption></small></center> |
</figtable> |
</figtable> |
||
− | As it can be seen in |
+ | As it can be seen in '''<xr id="comparison"></xr>''' the personal impressions from the wild type to mutation type approach are quite comparable to those predicted with available online methods. Someone should consider to run those methods, if the personal impression stays with the ''maybe disease causing'' interpretation. Using matrices as BLOSUM, PAM or PSSM definitely influenced the personal impression in a positive manner compared to the validation result. Interestingly Lys213Glu was only predicted ''correctly'' from MutationTaster and the personal impression from the simple approach, which shows that a simple approach (looking at the data) could be a good way to filter first.<br> |
+ | A further validation of the positions showed, that position 121 (here Asn->Asp) is known to be associated with Canavan Disease in HGMD (Asn->Ile). Therefore the assumption that this position is disease causing. Position 157 (here Ile->Thr) can also be found in SNPdbe, but without any association to Canavan Disease. This leads to the assumption that it is neutral. For all other positions any further validation was not possible. |
||
+ | |||
+ | ==[[Canavan_Disease:_Task_08_-_Supplement|Supplement]]== |
||
== Tasks == |
== Tasks == |
Latest revision as of 12:28, 5 September 2013
Sequence-based mutation analysis is important because mutations may affect protein stability or function. The analysis can also be used to predict whether a mutation is disease causing.
Wild Type - Mutant Approach
To get a feeling for interpreting the data of different SNPs, ten amino acid mutations were randomly chosen from HGMD (disease causing) and dbSNP (non-synonymous mutations). They were sorted in ascending order by their original amino acid to shuffle the HGMD and dbSNP entries, so that it is no longer possible to remember which database a mutation came from. <xr id="data"></xr> gives a short summary of these mutations in terms of sidechain changes and secondary structure:
<figtable id="data">
Comparison of Changes from Wild Type to Mutation Type

Mutation | Sidechain Polarity (from) | Sidechain Polarity (to) | Sidechain Charge (from) | Sidechain Charge (to) | Sec. Struc. | Uniprot Info |
---|---|---|---|---|---|---|
Arg233Trp | basic polar | nonpolar | positive | neutral | LOOP | - |
Asn121Asp | polar | acidic polar | neutral | negative | LOOP | - |
His21Pro | basic polar | nonpolar | neutral | neutral | LOOP | metal binding |
Ile157Thr | nonpolar | polar | neutral | neutral | LOOP | - |
Leu272Pro | nonpolar | nonpolar | neutral | neutral | LOOP | - |
Lys213Glu | basic polar | acidic polar | positive | negative | LOOP | - |
Pro149Ala | nonpolar | nonpolar | neutral | neutral | LOOP near HELIX | - |
Pro257Arg | nonpolar | basic polar | neutral | positive | LOOP | - |
Thr166Ile | polar | nonpolar | neutral | neutral | LOOP near HELIX | in binding region |
Tyr288Cys | polar | nonpolar | neutral | neutral | HELIX | binding site |
</figtable>
From <xr id="data"></xr> above, the following observations can be made:
- Arg233Trp: The sidechain polarity changes from basic polar to nonpolar. The charge changes from positive to neutral. The mutation site is located within a LOOP region, and Uniprot provides no information on whether it is a functional residue. Without further investigation, the first impression is that it is possibly disease causing.
- Asn121Asp: The sidechain polarity changes from polar to acidic polar. The charge changes from neutral to negative. The mutation site is located within a LOOP region, and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is possibly disease causing.
- His21Pro: The sidechain polarity changes from basic polar to nonpolar. The charge does not change. The mutation site is located within a LOOP region, but within the active center. This position is needed for zinc binding. Without further investigation, the first impression is that it is disease causing.
- Ile157Thr: The sidechain polarity changes from nonpolar to polar. The charge does not change. The mutation site is located within a LOOP region, and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is not disease causing.
- Leu272Pro: The sidechain polarity does not change. The charge does not change. The mutation site is located within a LOOP region, and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is not disease causing.
- Lys213Glu: The sidechain polarity changes from basic polar to acidic polar. The charge changes from positive to negative. The mutation site is located within a LOOP region, and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is disease causing.
- Pro149Ala: The sidechain polarity does not change. The charge does not change. The mutation site is located within a LOOP region, quite near a HELIX, and there is no information on whether it is a functional residue. Since proline is known to be a typical helix breaker, this proline may be necessary at this position. Without further investigation, the first impression is that it is possibly disease causing.
- Pro257Arg: The sidechain polarity changes from nonpolar to basic polar. The charge changes from neutral to positive. The mutation site is located within a LOOP region, and there is no information on whether it is a functional residue. Without further investigation, the first impression is that it is possibly disease causing.
- Thr166Ile: The sidechain polarity changes from polar to nonpolar. The charge does not change. The mutation site is located within a LOOP region next to a HELIX. This position is known to be part of the binding region of aspartoacylase. Without further investigation, the first impression is that it is disease causing.
- Tyr288Cys: The sidechain polarity changes from polar to nonpolar. The charge does not change. The mutation site is located within a HELIX. This position is known to be a binding site. Without further investigation, the first impression is that it is disease causing.
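The sidechain comparison used for these first impressions can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build the table: the dictionaries `POLARITY` and `CHARGE` and the helpers `parse_mutation` and `describe_change` are hypothetical names, and the classification is simply copied from the table above for the residues that occur in the ten mutations.

```python
import re

# Assumption: simplified sidechain classes as listed in the table above,
# restricted to the residues that occur in the ten chosen mutations.
POLARITY = {
    "Arg": "basic polar", "Lys": "basic polar", "His": "basic polar",
    "Asp": "acidic polar", "Glu": "acidic polar",
    "Asn": "polar", "Thr": "polar", "Tyr": "polar",
    "Ala": "nonpolar", "Cys": "nonpolar", "Ile": "nonpolar",
    "Leu": "nonpolar", "Pro": "nonpolar", "Trp": "nonpolar",
}
# Residues not listed here are treated as neutral (His is neutral in the table).
CHARGE = {"Arg": "positive", "Lys": "positive", "Asp": "negative", "Glu": "negative"}


def parse_mutation(name):
    """Split a name such as 'Arg233Trp' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", name)
    if match is None:
        raise ValueError(f"not a three-letter mutation name: {name!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def describe_change(name):
    """Report polarity and charge of wild type and mutant and whether they differ."""
    wt, position, mut = parse_mutation(name)
    polarity = (POLARITY[wt], POLARITY[mut])
    charge = (CHARGE.get(wt, "neutral"), CHARGE.get(mut, "neutral"))
    return {
        "position": position,
        "polarity": polarity,
        "charge": charge,
        "polarity_changed": polarity[0] != polarity[1],
        "charge_changed": charge[0] != charge[1],
    }


if __name__ == "__main__":
    for mutation in ("Lys213Glu", "Leu272Pro"):
        print(mutation, describe_change(mutation))
```

The sketch only reproduces the polarity and charge columns. Secondary structure and the Uniprot annotation still have to be looked up separately, and the final impression remains a manual judgement.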
For further investigation, the results of substitution matrices such as BLOSUM 62, PAM 1/250 and PSSM were taken into account (compare <xr id="matrices"></xr>), as well as the conservation of aspartoacylase among homologous species, as listed in <xr id="msa"></xr>. The PSSM was calculated for aspartoacylase using PSI-BLAST with 5 iterations.
<figtable id="matrices">
Comparison using Matrix Information

Mutation | BLOSUM 62 | PAM 1/250 | PSSM Matrix | PSSM Conservation (WT) | PSSM Conservation (mut) | MSA Conservation (WT) | MSA Conservation (mut) |
---|---|---|---|---|---|---|---|
Arg233Trp | -3 | 2 | -5 | 30% | 0% | 0.95 | 0.0 |
Asn121Asp | 1 | 2 | -1 | 15% | 2% | 1.0 | 0.0 |
His21Pro | -2 | 0 | -9 | 99% | 0% | 1.0 | 0.0 |
Ile157Thr | -1 | 0 | 1 | 4% | 8% | 1.0 | 0.0 |
Leu272Pro | -3 | -3 | -3 | 4% | 2% | 1.0 | 0.0 |
Lys213Glu | 1 | 1 | 1 | 8% | 5% | 0.95 | 0.05 |
Pro149Ala | -1 | 1 | 0 | 10% | 7% | 1.0 | 0.0 |
Pro257Arg | -2 | 0 | 1 | 23% | 8% | 1.0 | 0.0 |
Thr166Ile | -1 | 0 | -3 | 16% | 1% | 1.0 | 0.0 |
Tyr288Cys | -2 | 0 | -4 | 81% | 0% | 1.0 | 0.0 |
A multiple sequence alignment (MSA) was made to look for evolutionary conservation of the sites compared to homologous sequences from mammalian species. WT = wild type, mut = mutation.
</figtable>
Based on <xr id="matrices"></xr> above, the matrices are examined more closely. As a short reminder: in BLOSUM, positive values indicate a common, chemically similar substitution. Common amino acids have a low weight and rare amino acids a high weight. In PAM, highly negative values correspond to a high mismatch penalty for the mutation. If the PSSM conservation is high for the wild type and, relative to this value, very low for the mutation type, the PSSM matrix value is negative and indicates that the mutation is more likely to be disease causing. The more negative the value, the higher the probability of a disease causing effect. The multiple sequence alignment of homologous sequences of aspartoacylase shows the conservation of the original (wild type) and the mutated amino acid. A high value for the mutation type would indicate that the substitution also occurs among the homologous sequences from other mammalian species.
- Arg233Trp: The first impression was that it is possibly disease causing. BLOSUM gives a negative value, PAM shows no really significant value. The PSSM conservation is high for the wild type amino acid and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression therefore changes from possibly disease causing to disease causing.
- Asn121Asp: The first impression was that it is possibly disease causing. BLOSUM gives a positive value, PAM shows no really significant value. The PSSM conservation is moderately high for the wild type amino acid and the score is slightly negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains that it is possibly disease causing.
- His21Pro: The first impression was that it is disease causing. BLOSUM gives a negative value, PAM shows zero. The PSSM conservation is very high for the wild type amino acid and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains that it is disease causing.
- Ile157Thr: The first impression was that it is not disease causing. BLOSUM gives a slightly negative value, PAM shows zero. The PSSM conservation is higher for the mutation type than for the wild type, therefore the score is positive. The mutated amino acid does not occur in any of the homologous species. The second impression remains that it is not disease causing.
- Leu272Pro: The first impression was that it is not disease causing. BLOSUM gives a negative value, and PAM also shows a negative value. The PSSM conservation is slightly higher for the wild type amino acid and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression therefore changes from not disease causing to possibly disease causing.
- Lys213Glu: The first impression was that it is disease causing. BLOSUM gives a positive value, and PAM also shows a positive value. The PSSM conservation is slightly higher for the wild type amino acid and the score is slightly positive. The mutated amino acid is conserved in some homologous species. Since the sidechain changes noted in the first impression are so extreme, the second impression remains that it is disease causing.
- Pro149Ala: The first impression was that it is possibly disease causing. BLOSUM gives a negative value, PAM shows a positive value. The PSSM conservation is slightly higher for the wild type amino acid and the score is zero. The mutated amino acid does not occur in any of the homologous species. The second impression remains that it is possibly disease causing.
- Pro257Arg: The first impression was that it is possibly disease causing. BLOSUM gives a negative value, PAM shows zero. The PSSM conservation is higher for the wild type amino acid and the score is slightly positive. The mutated amino acid does not occur in any of the homologous species. The second impression remains that it is possibly disease causing.
- Thr166Ile: The first impression was that it is disease causing. BLOSUM gives a negative value, PAM shows zero. The PSSM conservation is high for the wild type amino acid and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains that it is disease causing.
- Tyr288Cys: The first impression was that it is disease causing. BLOSUM gives a negative value, PAM shows zero. The PSSM conservation is very high for the wild type amino acid and the score is negative. The mutated amino acid does not occur in any of the homologous species. The second impression remains that it is disease causing.
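The way the matrix values feed into the second impression can be made explicit with a small sketch. The numbers are copied from the matrix comparison table above. The function `evidence_flags` and its four simple rules are assumptions made for illustration only; the second impressions listed above were formed by hand and also take the first impression into account.

```python
# Values copied from the matrix comparison table above:
# (BLOSUM 62, PAM 1/250, PSSM score, MSA conservation of the mutant residue)
MATRIX_DATA = {
    "Arg233Trp": (-3, 2, -5, 0.0),
    "Asn121Asp": (1, 2, -1, 0.0),
    "His21Pro": (-2, 0, -9, 0.0),
    "Ile157Thr": (-1, 0, 1, 0.0),
    "Leu272Pro": (-3, -3, -3, 0.0),
    "Lys213Glu": (1, 1, 1, 0.05),
    "Pro149Ala": (-1, 1, 0, 0.0),
    "Pro257Arg": (-2, 0, 1, 0.0),
    "Thr166Ile": (-1, 0, -3, 0.0),
    "Tyr288Cys": (-2, 0, -4, 0.0),
}


def evidence_flags(blosum, pam, pssm, msa_mut):
    """Collect the hints that speak against a substitution.

    Hypothetical rule set: a negative matrix score or a mutant residue
    that is absent from all homologs counts as one hint each.
    """
    flags = []
    if blosum < 0:
        flags.append("BLOSUM negative")
    if pam < 0:
        flags.append("PAM negative")
    if pssm < 0:
        flags.append("PSSM negative")
    if msa_mut == 0.0:
        flags.append("mutant absent from MSA")
    return flags


if __name__ == "__main__":
    for mutation, values in MATRIX_DATA.items():
        flags = evidence_flags(*values)
        print(f"{mutation}: {len(flags)} hint(s): {', '.join(flags) or 'none'}")
```

The number of hints is not a verdict. Ile157Thr, for example, collects two hints but was still judged neutral, and Lys213Glu collects none although it was kept as disease causing because of its extreme sidechain change.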
<xr id="msa"></xr> lists the homologous sequences used for the multiple sequence alignment conservation approach. They were obtained from a BLAST search with aspartoacylase in Uniprot against the mammalian database. Only one sequence per species was used to prevent a bias towards single species.
<figtable id="msa">
Mammalian Homologous Sequences | ||
---|---|---|
Homolog | Protein | Organism |
H2QBW4 | Aspartoacylase (Canavan disease) | Pan troglodytes |
G3QQC1 | Uncharacterized protein | Gorilla gorilla |
Q5R9E0 | Aspartoacylase | Pongo abelii |
G1S5Z4 | Uncharacterized protein | Nomascus leucogenys |
G7PT66 | Aspartoacylase | Macaca fascicularis |
F6WMI4 | Uncharacterized protein | Equus caballus |
B1PK17 | Aspartoacylase | Sus scrofa |
P46446 | Aspartoacylase | Bos taurus |
M3Y3U3 | Uncharacterized protein | Mustela putorius furo |
D2HZN6 | Uncharacterized protein (Fragment) | Ailuropoda melanoleuca |
E2R8M6 | Uncharacterized protein | Canis familiaris |
M3X3I5 | Uncharacterized protein | Felis catus |
G1SPT6 | Uncharacterized protein | Oryctolagus cuniculus |
I3N0V6 | Uncharacterized protein | Spermophilus tridecemlineatus |
G5B939 | Aspartoacylase | Heterocephalus glaber |
G3TAV8 | Uncharacterized protein | Loxodonta africana |
G1P679 | Uncharacterized protein | Myotis lucifugus |
H0WW85 | Uncharacterized protein | Otolemur garnettii |
H0UYA8 | Uncharacterized protein (Fragment) | Cavia porcellus |
Q9R1T5 | Aspartoacylase | Rattus norvegicus |
Q8R3P0 | Aspartoacylase | Mus musculus |
</figtable>
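The MSA conservation values can be read as residue frequencies in one alignment column. The following is a minimal sketch under stated assumptions: the helpers `residue_frequencies` and `wt_mut_conservation` are hypothetical, gaps are ignored, and the example column is invented for illustration rather than taken from the actual alignment.

```python
from collections import Counter


def residue_frequencies(column):
    """Relative frequency of each residue in one alignment column.

    Assumption: gap characters ('-') are ignored when counting.
    """
    residues = [residue for residue in column if residue != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    counts = Counter(residues)
    return {residue: n / len(residues) for residue, n in counts.items()}


def wt_mut_conservation(column, wt, mut):
    """Return the rounded frequencies of the wild type and mutant residue."""
    freqs = residue_frequencies(column)
    return round(freqs.get(wt, 0.0), 2), round(freqs.get(mut, 0.0), 2)


if __name__ == "__main__":
    # Hypothetical column: 20 sequences carry Lys (K), one carries Glu (E).
    column = "K" * 20 + "E"
    print(wt_mut_conservation(column, "K", "E"))
```

One-letter codes are used here because alignment columns are usually stored that way. A mutant frequency of 0.0 means that the substituted residue does not occur at this position in any of the aligned sequences.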
Scoring Approach
The next step was to use different methods available online to check whether they predict a mutation to be disease causing or to have an effect on protein function. For this approach SIFT, PolyPhen, MutationTaster and SNAP2 were used. The individual results with their prediction probabilities for each method can be found in the Supplement at the end of this Task. The results concerning protein function are summarized in <xr id="comparison"></xr> in the next section.
Comparison
To give a better overview of all methods used in this Task, <xr id="comparison"></xr> shows the prediction for each mutation, with the following color-coding:
- red - predicted to be disease causing (or having a functional effect for the mutation)
- yellow - predicted to be possibly disease causing (listed as maybe)
- green - predicted to be not disease causing and therefore neutral
<figtable id="comparison">
Prediction of Different Approaches and Validation | ||||||||
---|---|---|---|---|---|---|---|---|
Mutation | First Personal Impression | Second Personal Impression | SIFT | PolyPhen | MutationTaster | SNAP | | Validation |
Arg233Trp | maybe | dis.caus. | dis.caus. | dis.caus. | dis.caus. | dis.caus. | <-- | not sure (dbSNP data) |
Asn121Asp | maybe | maybe | dis.caus. | dis.caus. | dis.caus. | dis.caus. | <-- | (dbSNP data) but another mutation known to be disease causing |
His21Pro | dis.caus. | dis.caus. | dis.caus. | dis.caus. | dis.caus. | dis.caus. | <-- | definitively disease causing (HGMD data) |
Ile157Thr | neutral | neutral | neutral | neutral | dis.caus. | neutral | <-- | (dbSNP data) but SNPdbe without a reference to Canavan Disease |
Leu272Pro | neutral | maybe | dis.caus. | dis.caus. | dis.caus. | dis.caus. | <-- | definitively disease causing (HGMD data) |
Lys213Glu | dis.caus. | dis.caus. | neutral | neutral | dis.caus. | neutral | <-- | definitively disease causing (HGMD data) |
Pro149Ala | maybe | maybe | neutral | maybe | dis.caus. | neutral | <-- | not sure (dbSNP data) |
Pro257Arg | maybe | maybe | neutral | maybe | dis.caus. | neutral | <-- | not sure (dbSNP data) |
Thr166Ile | dis.caus. | dis.caus. | dis.caus. | dis.caus. | dis.caus. | dis.caus. | <-- | definitively disease causing (HGMD data) |
Tyr288Cys | dis.caus. | dis.caus. | dis.caus. | dis.caus. | dis.caus. | dis.caus. | <-- | definitively disease causing (HGMD data) |
Validation - using the information from Task 07.
</figtable>
As can be seen in <xr id="comparison"></xr>, the personal impressions from the wild type to mutation type approach are quite comparable to the predictions of the available online methods. Running those methods should be considered if the personal impression remains at the maybe disease causing interpretation. Using matrices such as BLOSUM, PAM or PSSM moved the personal impression closer to the validation result (for example, Leu272Pro changed from neutral to maybe). Interestingly, Lys213Glu was predicted correctly only by MutationTaster and by the personal impression from the simple approach, which shows that a simple approach (looking at the data) can be a good first filter.
A further validation of the positions showed that position 121 (here Asn->Asp) is known to be associated with Canavan Disease in HGMD (Asn->Ile). This leads to the assumption that this position is disease causing. Position 157 (here Ile->Thr) can also be found in SNPdbe, but without any association to Canavan Disease. This leads to the assumption that it is neutral. For all other positions no further validation was possible.
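The comparison can also be tallied for the five mutations that the validation marks as definitively disease causing (HGMD data). The sketch below copies those rows from the comparison table; the names `METHODS`, `HGMD_PREDICTIONS` and `hits_per_method` are hypothetical, and only a call of dis.caus. is counted as a hit, which is an assumption about how to score a maybe.

```python
METHODS = [
    "First Personal Impression", "Second Personal Impression",
    "SIFT", "PolyPhen", "MutationTaster", "SNAP",
]

# Rows copied from the comparison table for the mutations validated as
# definitively disease causing (HGMD data), in the column order of METHODS.
HGMD_PREDICTIONS = {
    "His21Pro": ["dis.caus."] * 6,
    "Leu272Pro": ["neutral", "maybe", "dis.caus.", "dis.caus.", "dis.caus.", "dis.caus."],
    "Lys213Glu": ["dis.caus.", "dis.caus.", "neutral", "neutral", "dis.caus.", "neutral"],
    "Thr166Ile": ["dis.caus."] * 6,
    "Tyr288Cys": ["dis.caus."] * 6,
}


def hits_per_method(predictions, methods):
    """Count, per method, how many mutations were called disease causing.

    Assumption: 'maybe' and 'neutral' both count as a miss.
    """
    hits = {method: 0 for method in methods}
    for calls in predictions.values():
        for method, call in zip(methods, calls):
            if call == "dis.caus.":
                hits[method] += 1
    return hits


if __name__ == "__main__":
    total = len(HGMD_PREDICTIONS)
    for method, hits in hits_per_method(HGMD_PREDICTIONS, METHODS).items():
        print(f"{method}: {hits}/{total}")
```

With only five validated mutations such a tally is a rough orientation and not a benchmark of the methods.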
Supplement
Tasks
- Link to Task 01: Canavan Disease
- Link to Task 02: Alignments
- Link to Task 03: Sequence-based Predictions
- Link to Task 04: Structural Alignments
- Link to Task 05: Homology Modelling
- Link to Task 06: Protein Structure Prediction from Evolutionary Sequence Variation
- Link to Task 07: Researching SNPs
- Link to Task 08: Sequence-based Mutation Analysis
- Link to Task 09: Structure-based Mutation Analysis
- Link to Task 10: Normal Mode Analysis