Difference between revisions of "Sequence-based mutation analysis HEXA"
(→rs121907982: Ile -> Val) |
(→Analysis of the mutations) |
||
(95 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
== Mutations == |
== Mutations == |
||
+ | |||
+ | The following table lists all mutations used in the subsequent analyses. |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
Line 47: | Line 49: | ||
|AAC -> GAC |
|AAC -> GAC |
||
|- |
|- |
||
+ | |rs1800431 |
||
− | |rs121907982 |
||
|436 |
|436 |
||
|Ile -> Val |
|Ile -> Val |
||
Line 58: | Line 60: | ||
|- |
|- |
||
|} |
|} |
||
+ | <br><br>Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs disease]]<br><br> |
||
== Analysis of the mutations == |
== Analysis of the mutations == |
||
+ | We created a separate page for each mutation. A summary of the analyses is given in the Summary page section below. |
||
− | ===rs4777505: Asn -> Ser === |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs4777505 rs4777505: Asn -> Ser]] |
||
− | '''pysicochemical properities''' |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs121907979 rs121907979: Leu -> Arg]] |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Asn |
||
− | |Ser |
||
− | |consequences |
||
− | |- |
||
− | |polar, small, hydrophilic, negatively charged |
||
− | |polar, tiny, hydrophilic, neutral |
||
− | |Both amino acids are polar and hydrophilic. Ser is tiny, Asn therefore is a small amino acid. The biggest difference between these two amino acid is, that Asn is negatively charged and Ser is neutral. But this is not that big difference and therefore we suggest, that this mutation do not delete the structure and function of the protein. |
||
− | |- |
||
− | |} |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs61731240 rs61731240: His -> Asp]] |
||
− | '''Visualisation of the mutation''' |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs121907974 rs121907974: Phe -> Ser]] |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |picture original aa |
||
− | |picture mutated aa |
||
− | |combined picture |
||
− | |- |
||
− | |[[Image:N29_2.png|thumb|150px|Amino acid Asparagine]] |
||
− | |[[Image:29S_2.png|thumb|150px|Amino acid Serine]] |
||
− | |[[Image:N29S.png|thumb|150px|Picture which visualize the mutation]] |
||
− | |- |
||
− | |} |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs61747114 rs61747114: Leu -> Phe]] |
||
− | '''Subsitution Matrices values''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="3" | PAM 1 |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
− | |- |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |- |
||
− | |20 |
||
− | |36 (Asp) |
||
− | |0 (Cys, Met) |
||
− | |5 |
||
− | |7 (Asp) |
||
− | |2 (Cys, Leu, Phe, Trp) |
||
− | |1 |
||
− | |1 (Asp, His, Ser) |
||
− | | -4 (Trp) |
||
− | |- |
||
− | |} |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs1054374 rs1054374: Ser -> Ile]] |
||
− | '''Conservation analysis with multiple alignments''' |
||
− | [[Image:mut_1.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs121907967 rs121907967: Trp -> TER]] |
||
− | ===rs121907979: Leu -> Arg=== |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs1800430 rs1800430: Asn -> Asp]] |
||
− | '''pysicochemical properities''' |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs1800431 rs1800431: Ile -> Val]] |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Leu |
||
− | |Arg |
||
− | |consequences |
||
− | |- |
||
− | |aliphatic, hydrophobic, neutral |
||
− | |positive charged, polar, hydrophilic |
||
− | |Leucine is smaller and without a positive charge. Therefore, Arg is too big for the position of Leu, therefore, the change of Leu with Arg has to cause changes in the 3D structure of the protein. Furthermore, Leu is a hydrophobic amino acid, whereas Arg is hydrophilic. This is the complete contrary and therefore we suggest, that the protein will not function any longer. |
||
− | |- |
||
− | |} |
||
+ | *[[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/rs121907968 rs121907968: Trp -> Arg]] |
||
− | '''Visualisation of the mutation''' |
||
+ | <br><br>Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs disease]]<br><br> |
||
+ | == Summary page == |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |picture original aa |
||
− | |picture mutated aa |
||
− | |combined picture |
||
− | |- |
||
− | |[[Image:L39.png|thumb|150px|Amino acid Leucine]] |
||
− | |[[Image:39R.png|thumb|150px|Amino acid Arginine]] |
||
− | |[[Image:L39R.png|thumb|150px|Picture which visualize the mutation]] |
||
− | |- |
||
− | |} |
||
+ | Here we summarize all analyses we performed for the mutations: |
||
− | '''Subsitution Matrices values''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="3" | PAM 1 |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
− | |- |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |- |
||
− | |1 |
||
− | |22 (Ile) |
||
− | |0 (Asp, Cys) |
||
− | |4 |
||
− | |20 (Met) |
||
− | |2 (Cys) |
||
− | | -2 |
||
− | |0 (Phe) |
||
− | | -4 (Asp, Gly) |
||
− | |- |
||
− | |} |
||
+ | === Results === |
||
− | '''Conservation analysis with multiple alignments''' |
||
− | [[Image:mut_2.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | First, we explain how we classified the result of each method as neutral or non-neutral. We analysed many sequence-based properties of this protein and, after each analysis, had to decide whether the mutation appears to be neutral or non-neutral. The following list gives the criterion we used for each property. |
||
− | ===rs61731240: His -> Asp=== |
||
+ | |||
+ | * physicochemical properties: we called a mutation neutral if the properties of the mutated amino acid are very similar to those of the original amino acid. Otherwise, we called it non-neutral. |
||
+ | * visual analysis: a mutation is called neutral if the structure of the substituted amino acid is very similar to the structure of the original amino acid. |
||
+ | * PAM1, PAM250, BLOSUM62 and PSSM analysis: a mutation is called neutral if the score of the substitution is close to the score of the most frequent substitution of the original amino acid. |
||
+ | * multiple alignment: a mutation is called non-neutral if the original amino acid is highly conserved in the alignment. If the conservation rate is below 50%, we called the mutation neutral. |
||
+ | * analysis with JPred, PsiPred: if the mutated position has no predicted secondary structure (coil), we called the mutation neutral. |
||
+ | * analysis with the real structure: here we check whether the mutation lies in a secondary structure element. Instead of DSSP, we used the real structure. Normally the real structure is not available, so this value cannot be used in a prediction. We therefore did not use this value when deciding whether the mutation is neutral or not. |
||
+ | * SNAP, SIFT and PolyPhen2 prediction: these are the three mutation prediction methods we used in our analysis. A mutation is called neutral if the program predicts it as neutral. |
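The multiple-alignment and substitution-matrix criteria from the list above can be sketched as small decision functions. This is only an illustration under stated assumptions: the function names, the `tolerance` parameter that defines "close to" the most frequent substitution, and the treatment of conservation at or above 50% as non-neutral are hypothetical and not taken from the original analysis.

```python
def conservation_call(conserved_fraction, threshold=0.5):
    """Multiple-alignment rule: below 50% conservation counts as neutral.

    Treating everything at or above the threshold as non-neutral is an assumption.
    """
    return "neutral" if conserved_fraction < threshold else "non-neutral"


def matrix_call(score, most_frequent, rarest, tolerance=0.25):
    """Substitution-matrix rule with a hypothetical 'closeness' tolerance.

    score         -- matrix score of the observed substitution
    most_frequent -- score of the most frequent substitution of the original residue
    rarest        -- score of the rarest substitution of the original residue
    """
    span = most_frequent - rarest
    if span <= 0:
        return "no statement"
    position = (score - rarest) / span  # 0 = rarest, 1 = most frequent
    if position >= 1 - tolerance:
        return "neutral"
    if position <= tolerance:
        return "non-neutral"
    return "no statement"  # score lies between the two extremes


# Illustrative numbers only, not values from the analysis.
print(conservation_call(0.3))
print(matrix_call(6, most_frequent=7, rarest=2))
print(matrix_call(4, most_frequent=7, rarest=2))
```

A score in the middle of the range gives "no statement", which mirrors the cases in the summary table where a substitution was neither clearly common nor clearly rare.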
||
− | '''pysicochemical properities''' |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |rowspan="2" | method |
||
− | |His |
||
+ | |colspan="10" | mutations |
||
− | |Asp |
||
− | |consequences |
||
|- |
|- |
||
+ | |Asn -> Ser (rs4777505) |
||
− | |aromatic, positive charged, polar, hydrophilic |
||
+ | |Leu -> Arg (rs121907979) |
||
− | |negative charged, small, polar, hydrophilic |
||
+ | |His -> Asp (rs61731240) |
||
− | |On the one side, both amino acids are polar, but on the other side, His is positively charged, while Asp is negatively charged, which is an essential difference between these both amino acids. Therefore it is very likely, that this change causes big changes in the structure of the protein and the protein therefore will probably not work any longer. Furthermore, the structure of the two amino acids is very different, because of the aromatic ring of the His. |
||
+ | |Phe -> Ser (rs121907974) |
||
+ | |Leu -> Phe (rs61747114) |
||
+ | |Ser -> Ile (rs1054374) |
||
+ | |Trp -> TER (rs121907967) |
||
+ | |Asn -> Asp (rs1800430) |
||
+ | |Ile -> Val (rs1800431) |
||
+ | |Trp -> Arg (rs121907968) |
||
|- |
|- |
||
+ | |physicochemical properties |
||
− | |} |
||
+ | |neutral |
||
− | |||
+ | |non-neutral |
||
− | '''Visualisation of the mutation''' |
||
+ | |non-neutral |
||
− | |||
+ | |non-neutral |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |neutral |
||
− | |picture original aa |
||
+ | |non-neutral |
||
− | |picture mutated aa |
||
+ | |non-neutral |
||
− | |combined picture |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |non-neutral |
||
|- |
|- |
||
+ | |visual analysis |
||
− | |[[Image:H179.png|thumb|150px|Amino acid Histidine]] |
||
+ | |neutral |
||
− | |[[Image:179D.png|thumb|150px|Amino acid Aspartate]] |
||
+ | |non-neutral |
||
− | |[[Image:H179D.png|thumb|150px|Picture which visualize the mutation]] |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |neutral |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |neutral |
||
+ | |non-neutral |
||
|- |
|- |
||
+ | |PAM1 |
||
− | |} |
||
+ | |neutral |
||
− | |||
+ | |non-neutral |
||
− | |||
+ | |non-neutral |
||
− | '''Subsitution Matrices values''' |
||
+ | |non-neutral |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |no statement |
||
− | |colspan="3" | PAM 1 |
||
+ | |non-neutral |
||
− | |colspan="3" | Pam 250 |
||
+ | |no information |
||
− | |colspan="3" | BLOSOUM 62 |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |neutral |
||
|- |
|- |
||
+ | |PAM250 |
||
− | |value aa |
||
+ | |neutral |
||
− | |most frequent substitution |
||
+ | |non-neutral |
||
− | |rarest substitution |
||
+ | |no statement |
||
− | |value aa |
||
+ | |non-neutral |
||
− | |most frequent substitution |
||
+ | |no statement |
||
− | |rarest substitution |
||
+ | |no statement |
||
− | |value aa |
||
+ | |no information |
||
− | |most frequent substitution |
||
+ | |neutral |
||
− | |rarest substitution |
||
+ | |neutral |
||
+ | |neutral |
||
|- |
|- |
||
+ | |BLOSUM62 |
||
− | |3 |
||
+ | |neutral |
||
− | |20 (Gln) |
||
+ | |no statement |
||
− | |0 (Ile, Met) |
||
+ | |no statement |
||
− | |4 |
||
+ | |non-neutral |
||
− | |7 (Gln) |
||
+ | |neutral |
||
− | |2 (Ala, Cys, Gly, Ile, Leu, Met, Phe, Thr, Trp, Val) |
||
+ | |non-neutral |
||
− | | -1 |
||
+ | |no information |
||
− | |2 (Tyr) |
||
+ | |neutral |
||
− | | -3 (Cys, Ile, Leu, Val) |
||
+ | |neutral |
||
+ | |non-neutral |
||
|- |
|- |
||
+ | |PSSM analysis |
||
− | |} |
||
+ | |neutral |
||
− | |||
+ | |non-neutral |
||
− | '''Conservation analysis with multiple alignments''' |
||
+ | |non-neutral |
||
− | [[Image:mut_3.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | |non-neutral |
||
− | |||
+ | |non-neutral |
||
− | ===rs121907974: Phe -> Ser === |
||
+ | |non-neutral |
||
− | |||
+ | |no information |
||
− | '''pysicochemical properities''' |
||
+ | |no statement |
||
− | |||
+ | |non-neutral |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |non-neutral |
||
− | |Phe |
||
− | |Ser |
||
− | |consequences |
||
|- |
|- |
||
+ | |multiple alignment |
||
− | |polar, tiny, hydrophilic, neutral |
||
− | | |
+ | |neutral |
+ | |non-neutral |
||
− | |Ile is much bigger than Ser and also is branched, because it is an aliphatic amino acid. Therefore the structure of both amino acids is really different and Ile is to big for the position where Ser was. Therefore, there has to be a big change in the 3D structure of the protein and the protein probably will loose its function. |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |non-neutral |
||
|- |
|- |
||
+ | |analysis with Jpred |
||
− | |} |
||
+ | |non-neutral |
||
− | |||
+ | |non-neutral |
||
− | '''Visualisation of the mutation''' |
||
+ | |neutral |
||
− | |||
+ | |neutral |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |neutral |
||
− | |picture original aa |
||
+ | |neutral |
||
− | |picture mutated aa |
||
+ | |neutral |
||
− | |combined picture |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |neutral |
||
|- |
|- |
||
+ | |analysis with PsiPred |
||
− | |[[Image:F211.png|thumb|150px|Amino acid Phenylalanine]] |
||
+ | |non-neutral |
||
− | |[[Image:211S.png|thumb|150px|Amino acid Serine]] |
||
+ | |non-neutral |
||
− | |[[Image:F211S.png|thumb|150px|Picture which visualize the mutation]] |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |non-neutral |
||
+ | |neutral |
||
+ | |neutral |
||
|- |
|- |
||
+ | |analysis with real structure |
||
− | |} |
||
+ | |non-neutral |
||
− | |||
+ | |no statement |
||
− | '''Subsitution Matrices values''' |
||
+ | |neutral |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |neutral |
||
− | |colspan="3" | PAM 1 |
||
+ | |non-neutral |
||
− | |colspan="3" | Pam 250 |
||
+ | |neutral |
||
− | |colspan="3" | BLOSOUM 62 |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |- |
||
+ | |SNAP Prediction |
||
+ | |neutral |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |non-neutral |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |no information |
||
+ | |neutral |
||
+ | |neutral |
||
+ | |non-neutral |
||
|- |
|- |
||
+ | |SIFT Prediction |
||
− | |value aa |
||
+ | |neutral |
||
− | |most frequent substitution |
||
+ | |non-neutral |
||
− | |rarest substitution |
||
+ | |non-neutral |
||
− | |value aa |
||
+ | |non-neutral |
||
− | |most frequent substitution |
||
+ | |neutral |
||
− | |rarest substitution |
||
+ | |neutral |
||
− | |value aa |
||
+ | |no information |
||
− | |most frequent substitution |
||
+ | |neutral |
||
− | |rarest substitution |
||
+ | |neutral |
||
+ | |non-neutral |
||
|- |
|- |
||
+ | |PolyPhen2 Prediction |
||
− | |2 |
||
+ | |neutral |
||
− | |28 (Tyr) |
||
+ | |non-neutral |
||
− | |0 (Asp, Cys, Glu, Lys, Pro, Val) |
||
+ | |non-neutral |
||
− | |2 |
||
+ | |non-neutral |
||
− | |20 (Tyr) |
||
+ | |neutral |
||
− | |1 (Arg, Asp, Cys, Gln, Glu, Gly, Lys, Pro) |
||
+ | |neutral |
||
− | | -2 |
||
+ | |no information |
||
− | |3 (Tyr) |
||
+ | |neutral |
||
− | | -4 (Pro) |
||
+ | |neutral |
||
+ | |non-neutral |
||
|- |
|- |
||
|} |
|} |
||
+ | <br><br>Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs disease]]<br><br> |
||
+ | === Own prediction === |
||
+ | The table above summarizes all analyses we performed in order to make a statement about each mutation. Here we sum up the results for each mutation and record our prediction in an extra table. Finally, we wanted to compare our predictions with reality, and therefore we checked from which database each mutation was extracted. |
||
− | '''Conservation analysis with multiple alignments''' |
||
− | [[Image:mut_4.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | * Asn -> Ser |
||
− | ===rs61747114: Leu -> Phe=== |
||
+ | The first mutation we looked at is a substitution from Asn to Ser. As the summary table shows, every analysis called this mutation neutral except the secondary structure analyses, which means that the mutation lies in a secondary structure element. However, the two amino acids are very similar, so a mutation inside a secondary structure element should not change the structure dramatically. All three prediction tools also predicted this mutation as neutral. In sum, we predict this mutation as neutral. |
||
− | '''pysicochemical properities''' |
||
+ | * Leu -> Arg |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Leu |
||
− | |Phe |
||
− | |consequences |
||
− | |- |
||
− | |aliphatic, hydrophobic, neutral |
||
− | |aromatic, hydrophobic, neutral |
||
− | |Leu is an aliphatic amino acid, wheras Phe is an aromatic amino acid. This means, that Phe has an aromatic ring in its structure. But both amino acids are relatively big and so it is possible, that the exchange of this amino acids do not change the structure of the protein that much. Therefore, we suggest it is possible, that the protein will work. |
||
− | |- |
||
− | |} |
||
+ | The effect of this mutation is somewhat harder to predict than that of the previous one, because the individual categories give conflicting results. Overall, most categories classified this mutation as non-neutral. With BLOSUM62 no statement was possible, because the score of this substitution lies between the lowest and the highest score, so the mutation could not be assigned to either category. The position is well conserved in the multiple alignment, so by this analysis the mutation appears to be non-neutral. |
||
− | '''Visualisation of the mutation''' |
||
+ | For the comparison with the real structure no clear statement was possible, because this amino acid is located at the border between a secondary structure element and a coiled region, and we do not know whether a mutation at the last position of a secondary structure element really changes the structure dramatically. This is not important for our prediction, however, because we do not take the comparison with the real structure into account. In sum, we predicted this mutation as non-neutral, which agrees with the result of the three prediction methods. |
||
+ | * His -> Asp |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |picture original aa |
||
− | |picture mutated aa |
||
− | |combined picture |
||
− | |- |
||
− | |[[Image:L248.png|thumb|150px|Amino acid Leucine]] |
||
− | |[[Image:248W.png|thumb|150px|Amino acid Phenylalanine]] |
||
− | |[[Image:L248W.png|thumb|150px|Picture which visualize the mutation]] |
||
− | |- |
||
− | |} |
||
+ | In this case we have to distinguish between the secondary structure analysis and the other analyses. The other analyses indicate that this mutation is probably not neutral. With PAM250 and BLOSUM62 no statement was possible: the substitution occurs occasionally, but it is neither a very common nor a very rare one. The secondary structure analysis called the mutation neutral, which means that it does not lie in a secondary structure element. However, our secondary structure criterion is very simple. We do not consider any contacts with other amino acids in the structure, although such contacts exist. A mutation in a loop region can therefore still affect the structure and function of the protein, especially if the physicochemical properties differ. Furthermore, disordered regions, which can be essential for the function of a protein, have no defined secondary structure and are therefore predicted as coiled regions. This is a difficult case, but because of the different physicochemical properties, the very different structures of the two amino acids, and the result of the multiple alignment, we decided to predict this mutation as non-neutral. |
||
− | '''Subsitution Matrices values''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="3" | PAM 1 |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
− | |- |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |- |
||
− | |13 |
||
− | |45 (Met) |
||
− | |0 (Asp, Cys) |
||
− | |13 |
||
− | |20 (Met) |
||
− | |2 (Cys) |
||
− | |0 |
||
− | |2 (Ile, Met) |
||
− | | -4 (Asp, Gly) |
||
− | |- |
||
− | |} |
||
− | '''Conservation analysis with multiple alignments''' |
||
− | [[Image:mut_5.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | * Phe -> Ser |
||
− | ===rs1054374: Ser -> Ile === |
||
+ | Here the situation is the same as before. All our analyses indicate that this mutation is non-neutral, except the secondary structure analysis. As mentioned above, a mutation in a coiled region can still have a large impact on the structure of the protein, especially if the region might be disordered. We therefore consider the secondary structure a less reliable criterion for the function of the protein than the substitution frequency or the physicochemical properties. We decided to predict this mutation as non-neutral, which is consistent with the results of the three prediction methods. |
||
− | '''pysicochemical properities''' |
||
+ | * Leu -> Phe |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Ser |
||
− | |Ile |
||
− | |consequences |
||
− | |- |
||
− | |polar, tiny, hydrophilic, neutral |
||
− | |aliphatic, hydrophobic, neutra |
||
− | |Ile is much bigger than Ser and also is branched, because it is an aliphatic amino acid. Therefore the structure of both amino acids is really different and Ile is to big for the position where Ser was. Therefore, there has to be a big change in the 3D structure of the protein and the protein probably will loose its function. |
||
− | |- |
||
− | |} |
||
+ | This mutation is a very interesting case, because the methods point in different directions. First, both amino acids have identical physicochemical properties, which is always a strong hint that the mutation does not destroy the function of the protein. On the other hand, the structures of Leu and Phe differ considerably, which hints at a structural change. No statement about the effect of this mutation was possible from the PAM matrices, but the BLOSUM matrix rates it as neutral. The PSSM analysis and the multiple alignment analysis, however, suggest that the mutation is non-neutral. The result of the secondary structure analysis is particularly interesting. According to PsiPred and JPred the mutation is neutral, because it lies in a coiled region. Both methods predicted the secondary structure incorrectly, though: in the real structure the mutation lies in a secondary structure element. It is therefore important to keep in mind that we work with predictions, which can be wrong. As stated before, the real structure is not considered in our manual prediction, and we decided that this mutation is neutral for the following reasons. First, the physicochemical properties are equal, which is a very important point. Second, the structures of the residues are not similar, but the mutation is predicted to lie in a coiled region, where a structural change would be less dramatic than in a secondary structure element. Furthermore, BLOSUM62 rates this substitution as neutral. In sum, we have more neutral than non-neutral predictions. Of course, the multiple alignment is a strong hint that the mutation is non-neutral, but as mentioned above, we do not know whether the alignment is correct, and two secondary structure methods gave us the same result. Therefore we have to trust the predictions. |
||
+ | Our prediction therefore agrees with the result of the prediction methods. |
||
+ | * Ser -> Ile |
||
− | '''Visualisation of the mutation''' |
||
+ | Interestingly, in this case the physicochemical properties are not identical and the substitution matrices score this substitution as non-neutral, but the rest of our analyses indicate that the mutation is neutral. The residues have a similar structure, the position is not conserved in the alignment, and the position-specific scoring matrix of the PSI-BLAST run shows no conservation of this residue either. Furthermore, the mutation lies in a coiled region. Although the physicochemical properties and the substitution matrices are very important hints for the effect of a mutation, we decided to predict this mutation as neutral. First, 5 analyses call this mutation neutral and only 3 call it non-neutral. Second, the physicochemical properties are perhaps less important for a residue located in a coiled region, especially if this residue has few contacts with other residues. The general substitution matrices rate this substitution as non-neutral, but the PSSM, which also takes the position of the substitution in the sequence into account, rates it as neutral. It is therefore possible that this substitution is usually not neutral in other proteins, but is neutral in this particular case. Our prediction agrees with the predictions of the methods. |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |picture original aa |
||
− | |picture mutated aa |
||
− | |combined picture |
||
− | |- |
||
− | |[[Image:S293.png|thumb|150px|Amino acid Serine]] |
||
− | |[[Image:293I.png|thumb|150px|Amino acid Isoleucine]] |
||
− | |[[Image:S293I.png|thumb|150px|Picture which visualize the mutation]] |
||
− | |- |
||
− | |} |
||
+ | * Trp -> TER |
||
− | '''Subsitution Matrices values''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="3" | PAM 1 |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
− | |- |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |- |
||
− | |2 |
||
− | |38 (Thr) |
||
− | |1 (Leu) |
||
− | |5 |
||
− | |9 (Ala, Gly, Pro, Thr) |
||
− | |3 (Phe) |
||
− | | -2 |
||
− | |1 (Ala, Asn, Thr) |
||
− | | -3 (Trp) |
||
− | |- |
||
− | |} |
||
− | '''Conservation analysis with multiple alignments''' |
||
− | [[Image:mut_6.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | In this case it is not necessary to look at the individual analyses. This mutation is located in the middle of the protein and leads to a truncated protein, which almost certainly cannot fold correctly and therefore cannot function. Therefore, this mutation is non-neutral. Unfortunately, the prediction methods could not predict the effect of a mutation that leads to a truncated protein, so we cannot compare their results with our prediction. This is a limitation, because a truncating mutation can also be neutral if it occurs at the very end of the protein. In this case, however, the mutation occurs in the middle of the protein, and we therefore predict it as non-neutral. |
||
− | ===rs121907967: Trp -> TER === |
||
+ | * Asn -> Asp |
||
− | '''pysicochemical properities''' |
||
+ | This mutation is a clear case, because among our analyses only the visual analysis and the PsiPred analysis do not call it neutral. The PsiPred result is not very surprising: in the real structure of the protein, this amino acid is located directly at the border between a secondary structure element and a coiled region. The rest of our analyses, especially the physicochemical properties, the multiple alignment and the substitution matrices (except the PSSM), clearly indicate that this mutation is neutral. The prediction methods gave the same result. |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Trp |
||
− | |TER |
||
− | |consequences |
||
− | |- |
||
− | |aromatic, polar, hydrophobic |
||
− | |TER |
||
− | |By this change, the protein is not complete, therefore it is not possible for the protein to fold and to function. |
||
− | |- |
||
− | |} |
||
+ | * Ile -> Val |
||
+ | This mutation is also very easy to classify, because almost all of our categories called the mutation neutral. Only the comparison with the real structure and the PSSM analysis hint that this mutation is perhaps not neutral; according to the real structure, it lies in a secondary structure element. However, we do not take the comparison with the real structure into account, and the secondary structure prediction methods failed at this position. In addition, the structures of the residues and the physicochemical properties are very similar, so the mutation should not have a large effect on the structure of the protein even if it is located inside a secondary structure element. The PSSM is a strong hint that this mutation is possibly non-neutral, but it is the only such hint, and the rest of our analyses point the other way. Therefore, we predicted this mutation as neutral, which was also the result of the three prediction methods. |
||
− | '''Visualisation of the mutation''' |
||
+ | |||
+ | * Trp -> Arg |
||
+ | |||
+ | For the last mutation we analysed, only the secondary structure methods and the substitution matrices call the mutation neutral. All other categories scored it as non-neutral. The secondary structure prediction failed here, because in the real structure this mutation is located in a secondary structure element. We predicted this mutation as non-neutral. We have 5 calls for non-neutral and 4 for neutral, and a difference of only one category is in general not enough to make a prediction. However, the most important categories (physicochemical properties, alignment, PSSM) call this mutation non-neutral, and we weighted these categories more heavily than, for example, the secondary structure. Therefore, we decided to predict this mutation as non-neutral, which is consistent with the results of the three prediction methods. |
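The counting argument for Trp -> Arg can be reproduced with a short tally. The sketch below copies the per-method calls for this mutation from the summary table, leaving out the real-structure row and the three external predictors, as described in the text. The function name `tally` and the numeric weight for the key categories are assumptions made for illustration; the original analysis weighted these categories by judgment, not by a fixed factor.

```python
from collections import Counter

# Calls for Trp -> Arg (rs121907968), copied from the summary table.
calls = {
    "physicochemical properties": "non-neutral",
    "visual analysis": "non-neutral",
    "PAM1": "neutral",
    "PAM250": "neutral",
    "BLOSUM62": "non-neutral",
    "PSSM analysis": "non-neutral",
    "multiple alignment": "non-neutral",
    "JPred": "neutral",
    "PsiPred": "neutral",
}

# Categories the text treats as more important than the others.
KEY_CATEGORIES = {"physicochemical properties", "multiple alignment", "PSSM analysis"}


def tally(calls, key_weight=1):
    """Count neutral / non-neutral calls; key_weight > 1 is a hypothetical weighting."""
    votes = Counter()
    for method, call in calls.items():
        if call not in ("neutral", "non-neutral"):
            continue  # skip "no statement" and "no information"
        votes[call] += key_weight if method in KEY_CATEGORIES else 1
    return votes


plain = tally(calls)
weighted = tally(calls, key_weight=2)
print("unweighted:", plain["non-neutral"], "non-neutral vs", plain["neutral"], "neutral")
print("weighted:  ", weighted["non-neutral"], "non-neutral vs", weighted["neutral"], "neutral")
```

The unweighted count gives the narrow 5 to 4 split mentioned in the text. Giving the key categories any extra weight widens the margin in favour of non-neutral, which matches the direction of the manual decision.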
||
+ | |||
+ | We summarize our predictions in the following table so that the reader can see them at a glance. Because we want to verify our predictions in the next section, we also list the results of the other prediction methods once more: |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |mutation |
||
− | |picture original aa |
||
+ | |our prediction |
||
− | |picture mutated aa |
||
+ | |SNAP |
||
− | |combined picture |
||
+ | |SIFT |
||
+ | |PolyPhen2 |
||
|- |
|- |
||
+ | |Asn -> Ser (rs4777505) |
||
− | |[[Image:W329.png|thumb|150px|Amino acid Tryptophan]] |
||
+ | |neutral |
||
− | | |
||
+ | |neutral |
||
− | |[[Image:prot_ter.png|thumb|150px|Visualization of the mutated protein]] |
||
+ | |neutral |
||
+ | |neutral |
||
|- |
|- |
||
+ | |Leu -> Arg (rs121907979) |
||
− | |} |
||
+ | |non-neutral |
||
− | |||
+ | |non-neutral |
||
− | '''Subsitution Matrices values''' |
||
+ | |non-neutral |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |non-neutral |
||
− | |colspan="3" | PAM 1 |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
|- |
|- |
||
+ | |His -> Asp (rs61731240) |
||
− | |value aa |
||
+ | |non-neutral |
||
− | |most frequent substitution |
||
+ | |non-neutral |
||
− | |rarest substitution |
||
+ | |non-neutral |
||
− | |value aa |
||
+ | |non-neutral |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
|- |
|- |
||
+ | |Phe -> Ser (rs121907974) |
||
− | |X |
||
+ | |non-neutral |
||
− | |2 (Arg) |
||
+ | |non-neutral |
||
− | |0 (all, except Arg, Phe, Ser, Tyr) |
||
+ | |non-neutral |
||
− | |X |
||
+ | |non-neutral |
||
− | |2 (Arg) |
||
− | |0 (all, except Arg, His, Leu, Phe, Ser, Tyr) |
||
− | |X |
||
− | |2 (Tyr) |
||
− | | -4 (Asn, Asp, Pro) |
||
|- |
|- |
||
+ | |Leu -> Phe (rs61747114) |
||
− | |} |
||
+ | |neutral |
||
− | |||
+ | |neutral |
||
− | '''Conservation analysis with multiple alignments''' |
||
+ | |neutral |
||
− | [[Image:mut_6.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | |neutral |
||
− | |||
− | ===rs1800430: Asn -> Asp=== |
||
− | |||
− | '''pysicochemical properities''' |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Asn |
||
− | |Asp |
||
− | |consequences |
||
|- |
|- |
||
+ | |Ser -> Ile (rs1054374) |
||
− | |polar, small, hydrophilic, negatively charged |
||
+ | |neutral |
||
− | |polar, small, hydrophilic, negatively charged |
||
+ | |neutral |
||
− | |Both amino acids have the same properities and therefore we suggest that an exchange of these two amino acids do not destroy the protein structure and function |
||
+ | |neutral |
||
+ | |neutral |
||
|- |
|- |
||
+ | |Trp -> TER (rs121907967) |
||
− | |} |
||
+ | |non-neutral |
||
− | |||
+ | |no information |
||
− | '''Visualisation of the mutation''' |
||
+ | |no information |
||
− | |||
+ | |no information |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |picture original aa |
||
− | |picture mutated aa |
||
− | |combined picture |
||
|- |
|- |
||
+ | |Asn -> Asp (rs1800430) |
||
− | |[[Image:N399.png|thumb|150px|Amino acid Asparagine]] |
||
+ | |neutral |
||
− | |[[Image:399D.png|thumb|150px|Amino acid Aspartic acid]] |
||
+ | |neutral |
||
− | |[[Image:N399D.png|thumb|150px|Picture which visualize the mutation]] |
||
+ | |neutral |
||
− | |- |
||
+ | |neutral |
||
− | |} |
||
− | |||
− | '''Subsitution Matrices values''' |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |colspan="3" | PAM 1 |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
|- |
|- |
||
+ | |Ile -> Val (rs121907982) |
||
− | |value aa |
||
+ | |neutral |
||
− | |most frequent substitution |
||
+ | |neutral |
||
− | |rarest substitution |
||
+ | |neutral |
||
− | |value aa |
||
+ | |neutral |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
|- |
|- |
||
+ | |Trp -> Arg (rs121907968) |
||
− | |36 |
||
+ | |non-neutral |
||
− | |36 (Asp) |
||
+ | |non-neutral |
||
− | |0 (Cys, Met) |
||
+ | |non-neutral |
||
− | |7 |
||
+ | |non-neutral |
||
− | |7 (Asp) |
||
− | |2 (Cys, Leu, Phe, Trp) |
||
− | |1 |
||
− | |1 (Asp, His, Ser) |
||
− | | -4 (Trp) |
||
|- |
|- |
||
|} |
|} |
||
+ | The table above shows 100% agreement between our predictions and those of the three methods (except for Trp -> TER, because the methods could not predict the effect of a chain termination). It is therefore useful to compare the predictions with the real effects of the mutations, which is done in the next section. |
||
− | '''Conservation analysis with multiple alignments''' |
||
+ | <br><br>Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs disease]]<br><br> |
||
− | [[Image:mut_8.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | === Comparison with the databases === |
||
+ | Here we wanted to find out whether we and the methods predicted the mutations correctly. To do so, we reveal which database each mutation was taken from. We already know that mutations annotated only in SNP-DB are silent, whereas mutations annotated in HGMD or in both databases are non-neutral. |
||
− | ===rs121907982: Ile -> Val=== |
||
− | |||
− | '''pysicochemical properities''' |
||
{| border="1" style="text-align:center; border-spacing:0;" |
{| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |rowspan="2" | mutation |
||
− | |Ile |
||
+ | |rowspan="2" | database |
||
− | |Val |
||
+ | |colspan="4" | predictions |
||
− | |consequences |
||
|- |
|- |
||
+ | |our |
||
− | |aliphatic, hydrophobic, neutra |
||
+ | |SNAP |
||
− | |aliphatic, hydrophobic, neutral |
||
+ | |SIFT |
||
− | |In this case, the pysicochemical properties are equal. Furthermore, they almost agree in their size. Therefore, we suggest, that there is no big effect on the 3D structure of the protein and therefore, also no big effect on the protein function. |
||
+ | |PolyPhen2 |
||
|- |
|- |
||
+ | |Asn -> Ser (rs4777505) |
||
− | |} |
||
+ | |SNP-DB (neutral) |
||
− | |||
+ | |right |
||
− | '''Visualisation of the mutation''' |
||
+ | |right |
||
− | |||
+ | |right |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |right |
||
− | |picture original aa |
||
− | |picture mutated aa |
||
− | |combined picture |
||
|- |
|- |
||
+ | |Leu -> Arg (rs121907979) |
||
− | |[[Image:I436.png|thumb|150px|Amino acid Isoleucine]] |
||
+ | |HGMD and SNP-DB (non-neutral) |
||
− | |[[Image:436V.png|thumb|150px|Amino acid Valin]] |
||
+ | |right |
||
− | |[[Image:I436V.png|thumb|150px|Picture which visualize the mutation]] |
||
+ | |right |
||
+ | |right |
||
+ | |right |
||
|- |
|- |
||
+ | |His -> Asp (rs61731240) |
||
− | |} |
||
+ | |SNP-DB (neutral) |
||
− | |||
+ | |wrong |
||
− | '''Subsitution Matrices values''' |
||
+ | |wrong |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |wrong |
||
− | |colspan="3" | PAM 1 |
||
+ | |wrong |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
|- |
|- |
||
+ | |Phe -> Ser (rs121907974) |
||
− | |value aa |
||
+ | |HGMD and SNP-DB (non-neutral) |
||
− | |most frequent substitution |
||
+ | |right |
||
− | |rarest substitution |
||
+ | |right |
||
− | |value aa |
||
+ | |right |
||
− | |most frequent substitution |
||
+ | |right |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
|- |
|- |
||
+ | |Leu -> Phe (rs61747114) |
||
− | |33 |
||
− | | |
+ | |SNP-DB (neutral) |
+ | |right |
||
− | |0 (Gly, Pro, Trp) |
||
+ | |right |
||
− | |9 |
||
+ | |right |
||
− | |9 (Val) |
||
+ | |right |
||
− | |1 (Trp) |
||
− | |3 |
||
− | |3 (Val) |
||
− | | -4 (Gly) |
||
|- |
|- |
||
+ | |Ser -> Ile (rs1054374) |
||
− | |} |
||
+ | |SNP-DB (neutral) |
||
− | |||
+ | |right |
||
− | '''Conservation analysis with multiple alignments''' |
||
+ | |right |
||
− | [[Image:mut_9.png|thumb|center|600px|Mutation in the multiple alignment]] |
||
+ | |right |
||
− | |||
+ | |right |
||
− | ===rs121907968: Trp -> Arg=== |
||
− | |||
− | '''pysicochemical properities''' |
||
− | |||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
− | |Trp |
||
− | |Arg |
||
− | |consequences |
||
|- |
|- |
||
+ | |Trp -> TER (rs121907967) |
||
− | |aromatic, polar, hydrophobic, neutral |
||
+ | |HGMD and SNP-DB (non-neutral) |
||
− | |positive charged, polar, hydrophilic |
||
+ | |right |
||
− | |Trp is very big, because of two aromatic rings in its structure. Furthermore, it is hydrophobic, whereas, Arg is a hydrophilic amino acid. Therefore, the changes in the 3D structure might be extreme and delete the function of the protein. |
||
+ | |no information |
||
+ | |no information |
||
+ | |no information |
||
|- |
|- |
||
+ | |Asn -> Asp (rs1800430) |
||
− | |} |
||
+ | |SNP-DB (neutral) |
||
− | |||
+ | |right |
||
− | '''Visualisation of the mutation''' |
||
+ | |right |
||
− | |||
+ | |right |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |right |
||
− | |picture original aa |
||
− | |picture mutated aa |
||
− | |combined picture |
||
|- |
|- |
||
+ | |Ile -> Val (rs121907982) |
||
− | |[[Image:W485.png|thumb|150px|Amino acid Tryptophan]] |
||
+ | |SNP-DB (neutral) |
||
− | |[[Image:485R.png|thumb|150px|Amino acid Arginine]] |
||
+ | |right |
||
− | |[[Image:W485R.png|thumb|150px|Picture which visualize the mutation]] |
||
+ | |right |
||
+ | |right |
||
+ | |right |
||
|- |
|- |
||
+ | |Trp -> Arg (rs121907968) |
||
− | |} |
||
+ | |HGMD and SNP-DB (non-neutral) |
||
− | |||
+ | |right |
||
− | '''Subsitution Matrices values''' |
||
+ | |right |
||
− | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | |right |
||
− | |colspan="3" | PAM 1 |
||
+ | |right |
||
− | |colspan="3" | Pam 250 |
||
− | |colspan="3" | BLOSOUM 62 |
||
|- |
|- |
||
+ | |right/all |
||
− | |value aa |
||
+ | | |
||
− | |most frequent substitution |
||
+ | |9/10 |
||
− | |rarest substitution |
||
+ | |8/10 |
||
− | |value aa |
||
+ | |8/10 |
||
− | |most frequent substitution |
||
+ | |8/10 |
||
− | |rarest substitution |
||
− | |value aa |
||
− | |most frequent substitution |
||
− | |rarest substitution |
||
− | |- |
||
− | |2 |
||
− | |2 (Arg) |
||
− | |0 (all, except Arg, Phe, Ser, Tyr) |
||
− | |2 |
||
− | |2 (Arg) |
||
− | |0 (all, except Arg, His, Leu, Phe, Ser, Tyr) |
||
− | | -3 |
||
− | |2 (Tyr) |
||
− | | -4 (Asn, Asp, Pro) |
||
|- |
|- |
||
|} |
|} |
||
+ | As the table above shows, the prediction of the effect of the SNPs works very well. We predicted 90% of all cases correctly, whereas the prediction methods were correct in 89% of the cases (8 of 9, disregarding the chain termination). We think it is a pity that none of the methods predicts what happens if the chain terminates too early, because in almost all such cases the mutation is clearly non-neutral. |
||
− | === Summary page === |
||
+ | The only prediction that was wrong was the mutation His -> Asp, which was predicted as non-neutral although the mutation is neutral. |
||
+ | If we look at the table with the results of our individual analyses, it seems clear in this case that the mutation is non-neutral. Therefore, there must be some other influences which we did not consider in our analysis. We think that this mutation is a special case, or that it is perhaps wrongly annotated or still missing from the HGMD database. |
||
+ | But in general, we can say that the predictions worked very well. |
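The fractions quoted above can be recomputed from the comparison table. The following is a minimal sketch, assuming the database annotation is taken as the ground truth. The names `RESULTS`, `PREDICTORS` and `accuracy` are hypothetical, and `None` marks the cases where a method gave no information.

```python
# Order of the calls in each tuple below.
PREDICTORS = ("our", "SNAP", "SIFT", "PolyPhen2")

N, NN = "neutral", "non-neutral"

# rs id -> (database annotation, calls in PREDICTORS order), copied from the tables above.
RESULTS = {
    "rs4777505":   (N,  (N, N, N, N)),
    "rs121907979": (NN, (NN, NN, NN, NN)),
    "rs61731240":  (N,  (NN, NN, NN, NN)),      # the one case every predictor got wrong
    "rs121907974": (NN, (NN, NN, NN, NN)),
    "rs61747114":  (N,  (N, N, N, N)),
    "rs1054374":   (N,  (N, N, N, N)),
    "rs121907967": (NN, (NN, None, None, None)),  # chain termination: tools give no call
    "rs1800430":   (N,  (N, N, N, N)),
    "rs121907982": (N,  (N, N, N, N)),
    "rs121907968": (NN, (NN, NN, NN, NN)),
}


def accuracy(results, index):
    """Return (correct, scored) for one predictor, skipping mutations it could not score."""
    correct = scored = 0
    for truth, calls in results.values():
        call = calls[index]
        if call is None:
            continue
        scored += 1
        correct += int(call == truth)
    return correct, scored


if __name__ == "__main__":
    for i, name in enumerate(PREDICTORS):
        correct, scored = accuracy(RESULTS, i)
        print(f"{name}: {correct}/{scored} = {correct / scored:.1%}")
```

Under these assumptions the manual prediction is right in 9 of 10 cases, and each tool is right in 8 of the 9 cases it could score.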
||
+ | |||
+ | === Comparison of the different prediction methods === |
||
+ | |||
+ | First of all, every prediction method works very well. With a manual prediction you of course learn more about the protein and the specific mutation, but it is very time-consuming, and in the end the results are mostly no better than those of the prediction methods. All prediction methods showed the same results, so all of them worked equally well. In our opinion, SNAP and SIFT are a little better than PolyPhen2, because SNAP and SIFT allow you to upload a list of many mutations, whereas PolyPhen2 has to be started again for each mutation. Therefore, SNAP and SIFT are more user-friendly than PolyPhen2. |
||
+ | Comparing SNAP and SIFT, there is no big difference. However, we want to mention that SNAP is the slowest method. On the other hand, the output from SNAP is very good and easy to parse, whereas the output from SIFT comes as HTML tables or pictures, which is a difficult format to parse. |
||
+ | Therefore, we suggest using SNAP if there is enough time. |
||
+ | <br><br>Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs disease]]<br><br> |
||
+ | === Important points for predicting the effect of a SNP === |
||
+ | |||
+ | We have now learned a lot about the different factors that influence the effect of a SNP and about what is useful to consider when deciding whether a SNP is neutral or non-neutral. |
||
+ | But not every category is as important as the others. Here we want to rank the categories, because when different categories give different results, it is useful to know which of them have more impact on the effect than others. |
||
+ | In our opinion, the most important category is the physicochemical properties of the residues. |
||
− | [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Sequence-based_mutation_analysis_HEXA/Mutation_Summary Here]] you can find all results of the different analysis in one table. |
||
+ | Next, the size of the residues is also very important, especially if a small amino acid is replaced by a very large one. |
||
+ | Also very important, in our opinion, are the PSSM value and the conservation in the multiple alignment. We weight the PSSM value and the conservation more heavily than the values of the substitution matrices, because the values in the matrices are position-independent, whereas the effect of a substitution in our case depends on its position. Conservation in particular is a very important point, because it is a good hint of how important the residue is for the protein family. It is important to keep in mind, however, that we do not know what the correct alignment looks like or whether we worked with a good alignment. |
||
+ | In our view, the location of the substitution is less important. The mutation may lie in a disordered region, in which case the argument that a mutation in a coiled region is less harmful than one in a secondary structure element is simply wrong. Furthermore, if the amino acids are very similar in properties and size, and the residue is not important for the function of the protein, the mutation could be located in a secondary structure element without changing anything in the protein. In addition, many residues are located at the border of a secondary structure element. In this case, it is absolutely unclear whether a substitution changes anything in the function of the protein, even if the secondary structure element ends two or three residues earlier than in the non-mutated sequence. |
||
+ | In sum, we think it is important to consider all possible categories, but if there are doubts about the effect of the mutation, it is also important to weight the categories and decide on the basis of this weighting. |
||
+ | <br><br>Back to [[http://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Tay-Sachs_Disease Tay-Sachs disease]]<br><br> |
Latest revision as of 13:36, 29 September 2011
Mutations
The following table lists all mutations used in the subsequent analyses.
SNP-id | codon number | mutation codon | mutation triplet |
rs4777505 | 29 | Asn -> Ser | AAC -> AGC |
rs121907979 | 39 | Leu -> Arg | CTT -> CGT |
rs61731240 | 179 | His -> Asp | CAT -> GAT |
rs121907974 | 211 | Phe -> Ser | TTC -> TCC |
rs61747114 | 248 | Leu -> Phe | CTT -> TTT |
rs1054374 | 293 | Ser -> Ile | AGT -> ATT |
rs121907967 | 329 | Trp -> TER | TGG -> TAG |
rs1800430 | 399 | Asn -> Asp | AAC -> GAC |
rs1800431 | 436 | Ile -> Val | ATA -> GTA |
rs121907968 | 485 | Trp -> Arg | TGG -> CGG |
Back to [Tay-Sachs disease]
Analysis of the mutations
We created a separate page for each mutation. A summary of the analyses can be found in the Summary section.
Back to [Tay-Sachs disease]
Summary page
Here we sum up all the analyses we did for the mutations:
Results
First of all, we want to explain how we decided whether to record the result of a method as neutral or non-neutral. We analysed many different sequence-based aspects of this protein. After the different analyses, we had to decide whether the mutation seems to be neutral or non-neutral. The following list explains how we made this decision for each property.
- physicochemical properties: we called a mutation neutral if the properties of the mutated amino acid are very similar to those of the original amino acid. Otherwise, it is called non-neutral.
- visual analysis: a mutation is called neutral if the structure of the changed amino acid is very similar to the structure of the original amino acid.
- PAM1, PAM250, BLOSUM62 and PSSM analysis: a mutation is called neutral if its substitution score is close to the score of the most frequent substitution for the original amino acid.
- multiple alignment: a mutation is called non-neutral if the original amino acid is highly conserved in the alignment. If the conservation rate is less than 50%, we called the mutation neutral.
- analysis with JPred, PsiPred: if the mutated position has no predicted secondary structure (coil), we called the mutation neutral.
- analysis with the real structure: here we check whether the mutation takes place in a secondary structure element or not. Instead of DSSP, we used the real structure. Normally, the real structure is not available, so this value cannot be used in a prediction. Therefore, we did not use this value when deciding whether the mutation is neutral or not.
- SNAP, SIFT and PolyPhen2 prediction: these are the three mutation prediction methods we used in our analysis. Here, a mutation is called neutral if the program predicts it as neutral.
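The conservation and substitution-matrix criteria from the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how the analysis was actually carried out: the function names, the handling of gaps, the treatment of exactly 50% conservation and the `margin` parameter are all hypothetical. The matrix scores in the usage example are the values reported for Trp -> Arg (BLOSUM62: score -3, most frequent 2, rarest -4; PAM250: score 2, most frequent 2, rarest 0), while the alignment columns are made up.

```python
from collections import Counter


def conservation_rate(column: str, wild_type: str) -> float:
    """Fraction of non-gap residues in one alignment column that match the wild type."""
    residues = [aa for aa in column.upper() if aa != "-"]  # assumption: gaps are ignored
    if not residues:
        return 0.0
    return Counter(residues)[wild_type.upper()] / len(residues)


def classify_by_conservation(column: str, wild_type: str, threshold: float = 0.5) -> str:
    """Conservation below the threshold gives 'neutral', otherwise 'non-neutral'."""
    if conservation_rate(column, wild_type) < threshold:
        return "neutral"
    return "non-neutral"  # assumption: exactly 50% already counts as conserved


def classify_by_matrix(score: float, most_frequent: float, rarest: float,
                       margin: float = 1 / 3) -> str:
    """Place a substitution score between the rarest and most frequent score of its row."""
    if most_frequent == rarest:
        return "no statement"
    position = (score - rarest) / (most_frequent - rarest)  # 0 = rarest, 1 = most frequent
    if position >= 1 - margin:
        return "neutral"
    if position <= margin:
        return "non-neutral"
    return "no statement"  # score lies in the middle band (hypothetical margin)


if __name__ == "__main__":
    # Hypothetical alignment columns, one character per aligned sequence.
    print(classify_by_conservation("WWWWWRWW-W", "W"))
    print(classify_by_conservation("SSTAISNGSP", "S"))
    # Trp -> Arg scores as reported for BLOSUM62 and PAM250.
    print(classify_by_matrix(-3, 2, -4))
    print(classify_by_matrix(2, 2, 0))
```

The middle band that yields "no statement" mirrors the cases where a score lies between the rarest and the most frequent substitution and cannot be assigned to either class.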
method | mutations | |||||||||
Asn -> Ser (rs4777505) | Leu -> Arg (rs121907979) | His -> Asp (rs61731240) | Phe -> Ser (rs121907974) | Leu -> Phe (rs61747114) | Ser -> Ile (rs1054374) | Trp -> TER (rs121907967) | Asn -> Asp (rs1800430) | Ile -> Val (rs121907982) | Trp -> Arg (rs121907968) | |
physicochemical properties | neutral | non-neutral | non-neutral | non-neutral | neutral | non-neutral | non-neutral | neutral | neutral | non-neutral |
visual analysis | neutral | non-neutral | non-neutral | non-neutral | non-neutral | neutral | non-neutral | non-neutral | neutral | non-neutral |
PAM1 | neutral | non-neutral | non-neutral | non-neutral | no statement | non-neutral | no information | neutral | neutral | neutral |
PAM250 | neutral | non-neutral | no statement | non-neutral | no statement | no statement | no information | neutral | neutral | neutral |
BLOSUM62 | neutral | no statement | no statement | non-neutral | neutral | non-neutral | no information | neutral | neutral | non-neutral |
PSSM analysis | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | no information | no statement | non-neutral | non-neutral |
multiple alignment | neutral | non-neutral | non-neutral | non-neutral | non-neutral | neutral | neutral | neutral | neutral | non-neutral |
analysis with Jpred | non-neutral | non-neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral |
analysis with PsiPred | non-neutral | non-neutral | neutral | neutral | neutral | neutral | neutral | non-neutral | neutral | neutral |
analysis with real structure | non-neutral | no statement | neutral | neutral | non-neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral |
SNAP Prediction | neutral | non-neutral | non-neutral | non-neutral | neutral | neutral | no information | neutral | neutral | non-neutral |
SIFT Prediction | neutral | non-neutral | non-neutral | non-neutral | neutral | neutral | no information | neutral | neutral | non-neutral |
PolyPhen2 Prediction | neutral | non-neutral | non-neutral | non-neutral | neutral | neutral | no information | neutral | neutral | non-neutral |
Back to [Tay-Sachs disease]
Own prediction
The table above summarises all the analyses we did in order to be able to make a statement about each mutation. Here we sum up the results for each mutation and record our prediction in a separate table. In the end, we wanted to compare our summary with reality, so we checked from which database each mutation was extracted.
- Asn -> Ser
The first mutation we looked at is a substitution from Asn to Ser. As you can see in our summary table, every analysis predicted this mutation to be silent, except for the secondary structure analyses. This means that the mutation is in a secondary structure element. But these two amino acids seem to be very similar, so a mutation in a secondary structure element is not that harmful here, because the structure will not change dramatically. Each of the prediction tools also predicted this mutation as neutral. In sum, we therefore predict this mutation as neutral.
- Leu -> Arg
It is a little more complicated to predict the effect of this mutation than of the previous one, because the individual categories give conflicting predictions. Overall, most of the categories rated this mutation as non-neutral. With BLOSUM62 it was not possible to make a statement, because the score of this mutation lies between the lowest and the highest score, so the mutation could not be assigned to one of the two categories. The multiple alignment is well conserved at this position, so according to this analysis the mutation seems to be non-neutral. In the comparison with the real structure it was not possible to make a clear statement, because this amino acid is located at the border between a secondary structure element and a coiled region, and we do not know whether a mutation at the last position of a secondary structure element really changes the structure dramatically. But this is not that important for our prediction, because we do not take the real structure into account. In sum, we predicted this mutation as non-neutral, which is the same result as the methods gave us.
- His -> Asp
In this case we have to distinguish between the secondary structure analysis and the other analyses. The other analyses showed that this mutation might not be neutral. In the analyses with PAM250 and BLOSUM62 it was not possible to make a statement: the amino acid is substituted sometimes, but the substitution is neither very common nor very rare. The secondary structure analysis predicted a neutral mutation, which means that the mutation does not take place in a secondary structure element. But our analysis method for the secondary structure is very simple. We do not consider any contacts with other amino acids in the structure (which would certainly exist). So it is not impossible that the mutation of an amino acid in a loop region affects the structure and function of the protein, especially if the physicochemical properties differ. Furthermore, we know that there are disordered regions which are essential for the function of a protein, which do not have a defined secondary structure and will therefore be predicted as coiled regions. So this is a difficult case, but because of the different physicochemical properties, the very different structures of the two amino acids and the results of the multiple alignment, we decided to predict this mutation as non-neutral.
- Phe -> Ser
In this case we have the same situation as before. All our analyses hinted that this mutation is non-neutral, except the secondary structure analysis. As mentioned before, a mutation can have a big impact on the structure of the protein even if it takes place in a coil region, especially if we keep in mind the possibility of a disordered region. So we think that secondary structure is not as strong a criterion for the function of the protein as the mutation rate or the physicochemical properties. Therefore, we decided to predict this mutation as non-neutral, which is consistent with the results of the three prediction methods.
- Leu -> Phe
This mutation is a very interesting case, because many of the methods gave us conflicting hints. First of all, both amino acids have identical physicochemical properties, which is always a strong hint that the mutation does not destroy the function of the protein. On the other hand, if we look at the structures of the amino acids, there is a big difference between Leu and Phe, which hints at a structural change. It was not possible to make a statement about the effect of this mutation from the PAM matrices, but the BLOSUM matrix rates this mutation as neutral. The PSSM analysis and the multiple alignment analysis, however, suggest that the mutation is non-neutral. The result of the secondary structure analysis is very interesting. Judging from the results of PsiPred and JPred, we would conclude that this mutation is neutral, because it takes place in a coiled region. But both methods predicted the secondary structure wrongly: the real structure shows that the mutation takes place in a secondary structure element. So it is important to keep in mind that we work with predictions, which can be wrong. But as we said before, the real structure is not considered in our manual prediction, and we decided that this mutation is neutral for the following reasons. First, the physicochemical properties are equal, which is a very important point. Second, the structures of the residues are not similar, but the mutation is predicted to lie in a coiled region, where a changed structure would not be as dramatic as in a secondary structure element. Furthermore, BLOSUM62 told us that this substitution is neutral. So in sum, we have more neutral predictions than non-neutral predictions.
Of course, the multiple alignment is a strong hint that the mutation is non-neutral, but as we mentioned above, we do not know whether the alignment is right, and we have two secondary structure methods which gave us the same result. Therefore, we have to trust the predictions, and we predicted the same effect as the methods did.
- Ser -> Ile
Interestingly, in this case the physicochemical properties are not identical, and the substitution matrices also scored this substitution as non-neutral, but the rest of our predictions show that the effect of this mutation is neutral. The structures of the residues are similar, the alignment is not conserved, and the position-specific scoring matrix of the PSI-BLAST run does not show any conservation of this residue either. Furthermore, the mutation takes place in a coiled region. Although the physicochemical properties and the substitution matrices are very important hints for the effect of a mutation, we decided to predict this mutation as silent. First of all, there are 5 predictions which rate this mutation as silent and only 3 which see an effect. An argument for a silent mutation is that the physicochemical properties are perhaps not that important for a residue located in a coiled region, especially if this residue does not have many contacts with other residues. In general, the substitution matrices showed that this mutation is not neutral, but the PSSM predicts it as neutral. The PSSM also takes the position of the substitution in the sequence into account. It is therefore possible that this mutation is normally not silent (in other proteins), but is neutral in this special case. This prediction is equal to the predictions of the methods.
- Trp -> TER
In this case it is not necessary to look at the individual analyses. This mutation is located in the middle of the protein and leads to a shortened protein, which surely cannot fold in the right way and therefore cannot function anymore. Therefore, this mutation is non-neutral. Sadly, the methods could not predict the effect of a mutation which leads to a shortened protein, so it is not possible to compare their results with our prediction. This is unfortunate, because a mutation which leads to a shortened protein can also be neutral if it takes place at the very end of the protein. But in this case the mutation takes place in the middle of the protein, and we therefore predict it as non-neutral.
- Asn -> Asp
This mutation is a clear case, because of our categories only the visual analysis and the PsiPred method predict it as non-neutral. The PsiPred result is not very surprising, because the real structure of the protein shows that this amino acid is located directly at the border between a secondary structure element and a coiled region. The rest of our predictions, especially the physicochemical properties, the multiple alignment and the substitution matrices (except the PSSM), showed clearly that this mutation is neutral. The prediction methods also gave us the same result.
- Ile -> Val
This mutation is also very easy to classify, because almost all of our categories predicted it as neutral. Only the comparison with the real structure and the PSSM analysis hinted that the mutation might be non-neutral. The real structure shows that the mutation lies within a secondary structure element, which the secondary structure prediction methods failed to detect, but we do not take the real structure into account in our prediction. Moreover, the structures of the two residues and their physicochemical properties are very similar, so the substitution should not have a big effect on the structure of the protein, even though it is located inside a secondary structure element. The PSSM is a strong hint that this mutation is possibly non-neutral, but it is the only such hint, and the rest of our analyses point the other way. Therefore, we predicted this mutation as neutral, which was also the result of the three prediction methods.
- Trp -> Arg
For the last mutation we analysed, only the secondary structure methods and the PAM substitution matrices predict neutrality. All other categories rated this mutation as non-neutral. As the real structure shows, the secondary structure prediction failed, because this mutation is located within a secondary structure element. We predicted this mutation as non-neutral. We have 5 predictions for non-neutral and 4 for neutral, and a difference of only one category is in general not enough to make a prediction. However, the most important categories (physicochemical properties, alignment, PSSM) predict this mutation as non-neutral, and we weight these categories more heavily than, for example, secondary structure. Therefore, we decided to predict this mutation as non-neutral, which is consistent with the results of the three prediction methods.
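The weighting argument used for Trp -> Arg can be illustrated with a small weighted vote. This is a minimal sketch under assumptions: the text only states that some categories count more than others, so the numeric weight of 2, the function name `weighted_vote` and the dictionary layout are hypothetical choices for illustration. The calls are copied from the Trp -> Arg column of the summary table, leaving out the real structure and the three prediction tools.

```python
# Categories treated as more important (physicochemical properties, alignment, PSSM).
KEY_CATEGORIES = {"physicochemical properties", "PSSM analysis", "multiple alignment"}

# Calls for Trp -> Arg (rs121907968) from the summary table.
TRP_ARG_CALLS = {
    "physicochemical properties": "non-neutral",
    "visual analysis": "non-neutral",
    "PAM1": "neutral",
    "PAM250": "neutral",
    "BLOSUM62": "non-neutral",
    "PSSM analysis": "non-neutral",
    "multiple alignment": "non-neutral",
    "Jpred": "neutral",
    "PsiPred": "neutral",
}


def weighted_vote(calls, key_weight=2.0):
    """Sum category votes; key categories count key_weight, all others count 1."""
    totals = {"neutral": 0.0, "non-neutral": 0.0}
    for category, call in calls.items():
        if call not in totals:  # skip 'no statement' and 'no information'
            continue
        totals[call] += key_weight if category in KEY_CATEGORIES else 1.0
    if totals["neutral"] == totals["non-neutral"]:
        return "no statement", totals
    return max(totals, key=totals.get), totals


if __name__ == "__main__":
    print(weighted_vote(TRP_ARG_CALLS, key_weight=1.0))  # plain count of the categories
    print(weighted_vote(TRP_ARG_CALLS))                  # key categories count double
```

With equal weights the count is 5 to 4 for non-neutral; doubling the three key categories widens it to 8 to 4.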
We summarise the predictions in the following table so that the reader can see them at a glance. Because we want to verify our predictions in the next section, we also list the results of the other prediction methods once more:
mutation | our prediction | SNAP | SIFT | PolyPhen2 |
Asn -> Ser (rs4777505) | neutral | neutral | neutral | neutral |
Leu -> Arg (rs121907979) | non-neutral | non-neutral | non-neutral | non-neutral |
His -> Asp (rs61731240) | non-neutral | non-neutral | non-neutral | non-neutral |
Phe -> Ser (rs121907974) | non-neutral | non-neutral | non-neutral | non-neutral |
Leu -> Phe (rs61747114) | neutral | neutral | neutral | neutral |
Ser -> Ile (rs1054374) | neutral | neutral | neutral | neutral |
Trp -> TER (rs121907967) | non-neutral | no information | no information | no information |
Asn -> Asp (rs1800430) | neutral | neutral | neutral | neutral |
Ile -> Val (rs121907982) | neutral | neutral | neutral | neutral |
Trp -> Arg (rs121907968) | non-neutral | non-neutral | non-neutral | non-neutral |
The table above shows 100% agreement between our predictions and those of the different methods (except for Trp -> TER, because the other methods are not able to predict the effect of a chain termination). It is therefore useful to compare the prediction results with the real effects of the mutations, which is done in the next section.
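The agreement stated above can be checked with a small script. This is only a sketch: the rows are copied from the summary table, while the names `PREDICTIONS` and `consensus` and the use of `None` for "no information" are choices made for this illustration.

```python
# Rows copied from the summary table: (mutation, ours, SNAP, SIFT, PolyPhen2).
# None stands for "no information" (the tools cannot score a chain termination).
PREDICTIONS = [
    ("Asn -> Ser", "neutral", "neutral", "neutral", "neutral"),
    ("Leu -> Arg", "non-neutral", "non-neutral", "non-neutral", "non-neutral"),
    ("His -> Asp", "non-neutral", "non-neutral", "non-neutral", "non-neutral"),
    ("Phe -> Ser", "non-neutral", "non-neutral", "non-neutral", "non-neutral"),
    ("Leu -> Phe", "neutral", "neutral", "neutral", "neutral"),
    ("Ser -> Ile", "neutral", "neutral", "neutral", "neutral"),
    ("Trp -> TER", "non-neutral", None, None, None),
    ("Asn -> Asp", "neutral", "neutral", "neutral", "neutral"),
    ("Ile -> Val", "neutral", "neutral", "neutral", "neutral"),
    ("Trp -> Arg", "non-neutral", "non-neutral", "non-neutral", "non-neutral"),
]


def consensus(rows):
    """Return (agreeing, comparable) over rows with at least one tool result."""
    agreeing = 0
    comparable = 0
    for _mutation, ours, *tools in rows:
        available = [call for call in tools if call is not None]
        if not available:
            continue  # no tool result to compare against
        comparable += 1
        if all(call == ours for call in available):
            agreeing += 1
    return agreeing, comparable


agreeing, comparable = consensus(PREDICTIONS)
print(f"{agreeing}/{comparable} comparable mutations in full agreement")
```

Trp -> TER is skipped because none of the three tools returned a result for it, so the agreement refers to the nine remaining mutations.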
Back to [Tay-Sachs disease]
Comparison with the databases
Here we wanted to find out whether we and the prediction methods classified the mutations correctly. To do so, we looked up which database each mutation was taken from. We already know that mutations annotated only in SNP-DB are treated as neutral (silent), whereas mutations annotated in HGMD or in both databases are non-neutral.
mutation | database | our prediction | SNAP | SIFT | PolyPhen2 |
Asn -> Ser (rs4777505) | SNP-DB (neutral) | right | right | right | right |
Leu -> Arg (rs121907979) | HGMD and SNP-DB (non-neutral) | right | right | right | right |
His -> Asp (rs61731240) | SNP-DB (neutral) | wrong | wrong | wrong | wrong |
Phe -> Ser (rs121907974) | HGMD and SNP-DB (non-neutral) | right | right | right | right |
Leu -> Phe (rs61747114) | SNP-DB (neutral) | right | right | right | right |
Ser -> Ile (rs1054374) | SNP-DB (neutral) | right | right | right | right |
Trp -> TER (rs121907967) | HGMD and SNP-DB (non-neutral) | right | no information | no information | no information |
Asn -> Asp (rs1800430) | SNP-DB (neutral) | right | right | right | right |
Ile -> Val (rs121907982) | SNP-DB (neutral) | right | right | right | right |
Trp -> Arg (rs121907968) | HGMD and SNP-DB (non-neutral) | right | right | right | right |
right/all | | 9/10 | 8/10 | 8/10 | 8/10 |
As the table above shows, the prediction of the effect of the SNPs works very well. We predicted 90% of all cases correctly (9 of 10), whereas the prediction methods were correct in about 89% of the cases (8 of 9, not counting the chain termination). We think it is a pity that none of the methods predicts what happens if the chain terminates too early, because in almost all such cases the mutation is clearly non-neutral. The only wrong prediction was the mutation His -> Asp, which was predicted as non-neutral although it is annotated as neutral. If we look at the table with the results of our individual analyses, it seems clear in this case that the mutation is non-neutral. Therefore, there must be other influences that we did not consider in our analysis. We think that this mutation is a special case, or that it is perhaps wrongly annotated or not yet included in the HGMD database. In general, however, we can say that the predictions worked very well.
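The ratios in the comparison can be recomputed from the two tables. The following is a minimal sketch: the labels are copied from the tables above, while `ROWS`, `METHODS` and `tally` are names made up for this example and `None` again stands for "no information".

```python
# (mutation, database label, ours, SNAP, SIFT, PolyPhen2), copied from the tables.
N, D = "neutral", "non-neutral"
ROWS = [
    ("Asn -> Ser", N, N, N, N, N),
    ("Leu -> Arg", D, D, D, D, D),
    ("His -> Asp", N, D, D, D, D),  # annotated neutral, predicted non-neutral
    ("Phe -> Ser", D, D, D, D, D),
    ("Leu -> Phe", N, N, N, N, N),
    ("Ser -> Ile", N, N, N, N, N),
    ("Trp -> TER", D, D, None, None, None),  # tools give no information
    ("Asn -> Asp", N, N, N, N, N),
    ("Ile -> Val", N, N, N, N, N),
    ("Trp -> Arg", D, D, D, D, D),
]
METHODS = ("our", "SNAP", "SIFT", "PolyPhen2")


def tally(rows):
    """Map each method to (right, scored, total) against the database label."""
    result = {}
    for offset, name in enumerate(METHODS):
        pairs = [(row[1], row[2 + offset]) for row in rows]
        scored = [(truth, call) for truth, call in pairs if call is not None]
        right = sum(1 for truth, call in scored if truth == call)
        result[name] = (right, len(scored), len(rows))
    return result


for name, (right, scored, total) in tally(ROWS).items():
    print(f"{name}: {right}/{total} of all, {right}/{scored} of scored "
          f"({100 * right / scored:.1f}%)")
```

Reporting both denominators makes explicit that the 8/10 in the table counts the missing chain-termination result as not right, while the percentage for the tools is taken over the nine mutations they actually scored.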
Comparison of the different prediction methods
First of all, every prediction method works very well. If you make a manual prediction, you of course learn more about the protein and the specific mutation, but it is very time-consuming, and in the end the results are mostly no better than those of the prediction methods. All prediction methods gave the same results, so all of them worked equally well on our mutations. In our opinion, SNAP and SIFT are slightly better than PolyPhen2, because SNAP and SIFT accept a list of many mutations at once, whereas PolyPhen2 has to be started again for each mutation. Therefore, SNAP and SIFT are more user-friendly than PolyPhen2.
Comparing SNAP and SIFT, there is no big difference. However, we want to mention that SNAP is the slowest method. On the other hand, the output of SNAP is very good and easy to parse, whereas SIFT delivers its output as HTML tables or pictures, which is a difficult format to parse.
Therefore, we recommend using SNAP if there is enough time.
Important points for predicting the effect of a SNP
We have now learned a lot about the different factors that influence the effect of a SNP and about what is useful to consider when deciding whether a SNP is neutral or non-neutral. However, not every category is as important as the others. Here we want to rank the categories, because when different categories give different results, it is useful to know which of them have more impact on the effect than others.
In our opinion, the most important category is the physicochemical properties of the residues involved.
Also very important is the size of the residues, especially if a small amino acid is replaced by a very large one.
Also very important in our opinion are the PSSM value and the conservation in the multiple alignment. We weighted the PSSM value and the conservation more heavily than the values of the substitution matrices, because the matrix values are position-independent, whereas the effect of a substitution depends on its position. Conservation in particular is a very important point, because it is a good hint of how important the residue is for the protein family. It is important to keep in mind, however, that we do not know what the correct alignment looks like or whether we worked with a good alignment.
In our view, the location of the substitution is less important. The mutation may lie in a disordered region, and in that case the argument that a mutation in a coil region is less harmful than one in a secondary structure element is wrong. Furthermore, if the amino acids are very similar in properties and size, and the residue is not very important for the function of the protein, the mutation can be located in a secondary structure element without changing anything in the protein. In addition, many residues are located at the border of a secondary structure element. In this case, it is unclear whether a substitution changes anything in the function of the protein, even if the secondary structure element ends two or three residues earlier than in the non-mutated sequence.
In sum, we think it is important to consider all available categories, but if there are doubts about the effect of the mutation, it is also important to weight the categories and to decide based on this weighting.
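One way to turn this ranking into a decision rule is a weighted vote. The sketch below is only an illustration of the idea: the category names, the numeric weights in `WEIGHTS` and the function `weighted_call` are assumptions made for this example and were not used in our analysis. Only the order of the weights follows the ranking described above.

```python
# Hypothetical weights; only their order reflects the ranking in the text
# (physicochemical properties > size > PSSM/conservation > matrices > location).
WEIGHTS = {
    "physicochemical": 3.0,
    "size": 2.5,
    "pssm": 2.0,
    "conservation": 2.0,
    "substitution_matrix": 1.0,
    "secondary_structure": 0.5,
}


def weighted_call(votes, weights=WEIGHTS):
    """Combine per-category votes ('neutral' / 'non-neutral') into one call.

    Returns (call, score); a positive score means non-neutral.
    A score of exactly zero is reported as neutral.
    """
    score = 0.0
    for category, vote in votes.items():
        sign = 1.0 if vote == "non-neutral" else -1.0
        score += sign * weights[category]
    call = "non-neutral" if score > 0 else "neutral"
    return call, score


# Votes modelled on the Trp -> Arg case: only the substitution matrices and
# the secondary structure methods say neutral.
example_votes = {
    "physicochemical": "non-neutral",
    "size": "non-neutral",
    "pssm": "non-neutral",
    "conservation": "non-neutral",
    "substitution_matrix": "neutral",
    "secondary_structure": "neutral",
}
print(weighted_call(example_votes))
```

With such a scheme, a narrow majority of unweighted votes (such as 5 against 4) no longer decides the outcome on its own; the categories ranked as more important dominate the score.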