Fabry:Sequence-based mutation analysis

From Bioinformatikpedia

Revision as of 15:00, 11 June 2012

Fabry Disease » Sequence-based mutation analysis


The following analyses were performed on the basis of the α-Galactosidase A sequence. Please consult the journal for the commands used to generate the results.

Dataset preparation

  • Pick 10 mutations (SNPs) from your dataset: some missense mutations from the HGMD and some found only in dbSNP (they change the amino acid sequence but are not listed in the HGMD). Shuffle them, and PLEASE do not try to memorize whether they cause the disease! The goal is to pretend that we do NOT know what is going on. Ideally, include the most common disease-causing mutations as well.
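The blinding step can be sketched in a few lines of Python. This is only an illustration: the function name `shuffle_mutations` and the placeholder labels `mut01` to `mut10` are assumptions, not part of the task, and should be replaced by your own picks.

```python
import random

def shuffle_mutations(hgmd, dbsnp_only, seed=None):
    """Pool both sets and return them in random order, without source labels."""
    pooled = list(hgmd) + list(dbsnp_only)
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    rng.shuffle(pooled)
    return pooled

# Hypothetical placeholder names; substitute the mutations you selected.
hgmd_picks = ["mut01", "mut02", "mut03", "mut04", "mut05"]
dbsnp_picks = ["mut06", "mut07", "mut08", "mut09", "mut10"]

blinded = shuffle_mutations(hgmd_picks, dbsnp_picks, seed=2012)
print(blinded)
```

Keeping the source lists in a separate file and working only with the shuffled list makes it easier not to remember which mutation came from which database.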

Amino acid properties

  • The simplest approach is to compare the WT (wild-type) and mutant amino acids. For each of the 10 mutations, write a short summary of the physicochemical properties and how they change.
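A lookup table is one way to produce such summaries consistently. The sketch below uses one coarse textbook grouping of the 20 amino acids; other schemes exist (histidine, glycine, proline and cysteine in particular are classified differently by different sources). The function name `describe` and the example mutation D50N are made up for illustration and are not taken from the HGMD or dbSNP.

```python
import re

# One coarse classification; treat it as an assumption, not a standard.
GROUPS = [
    ("AVLIMGP", "nonpolar"),
    ("FWY", "aromatic"),
    ("STNQC", "polar uncharged"),
    ("KRH", "positively charged"),
    ("DE", "negatively charged"),
]
CLASS = {aa: label for residues, label in GROUPS for aa in residues}

def describe(mutation):
    """Summarize a mutation written as WT residue, position, mutant residue."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation: {mutation!r}")
    wt, pos, mut = match.group(1), int(match.group(2)), match.group(3)
    if wt not in CLASS or mut not in CLASS:
        raise ValueError(f"unknown amino acid in {mutation!r}")
    change = "class conserved" if CLASS[wt] == CLASS[mut] else "class changed"
    return f"{mutation}: {CLASS[wt]} -> {CLASS[mut]} at position {pos} ({change})"

print(describe("D50N"))  # made-up example mutation
```

Size, hydrophobicity scales or side-chain volume could be added to the table in the same way if the summary should be more detailed.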

Simple structural analysis

  • Now take into account where in the protein the mutation occurs, and document it: use PyMOL's Mutagenesis wizard (http://www.pymolwiki.org/index.php/Mutagenesis) to create a picture showing the original and the mutated residue in the protein. More thorough structural analyses will be introduced in the next task.

Location

  • Using your secondary structure predictions from the previous tasks, investigate whether the mutations are inside secondary structure elements (Helix, Strand) or not.
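If the prediction is available as a per-residue string, the lookup is a simple index operation. The sketch below assumes a three-state string with H for helix, E for strand and any other character for everything else; the function name `element_at` and the toy string are made up, so adapt the alphabet to the output of the predictor you used.

```python
def element_at(ss_string, position):
    """Return 'Helix', 'Strand' or 'Other' for a 1-based sequence position."""
    if not 1 <= position <= len(ss_string):
        raise IndexError(f"position {position} is outside the sequence")
    state = ss_string[position - 1]  # convert to 0-based index
    return {"H": "Helix", "E": "Strand"}.get(state, "Other")

# Toy prediction string, not the real α-Galactosidase A prediction.
toy_prediction = "--HHHH--EEE-"
for pos in (1, 3, 9):
    print(pos, element_at(toy_prediction, pos))
```

Note that sequence numbering must match between the mutation list and the prediction; an offset (for example from a signal peptide) has to be corrected before the lookup.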

Substitution matrices

  • Look at the BLOSUM62, PAM1 and PAM250 matrices. What are the scores for the amino acid substitutions? Is each one the worst possible substitution for its wild-type residue or not? Can we say anything about the phenotype from this?
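The "worst possible substitution" question amounts to comparing one score with the minimum of its matrix row. The sketch below shows this for a single row; the arginine row of BLOSUM62 was typed in by hand for illustration and should be verified against the matrix file you actually use. The function name `rank_substitution` is an assumption, and the same function works for PAM rows.

```python
AA = "ARNDCQEGHILKMFPSTWYV"

# BLOSUM62 row for arginine (R), copied by hand; verify before relying on it.
BLOSUM62_R = dict(zip(AA, [-1, 5, 0, -2, -3, 1, 0, -2, 0, -3,
                           -2, 2, -1, -3, -2, -1, -1, -3, -2, -3]))

def rank_substitution(row, wt, mut):
    """Return (score, worst score in the row, whether the substitution is the worst)."""
    others = {aa: score for aa, score in row.items() if aa != wt}
    worst = min(others.values())
    return row[mut], worst, row[mut] == worst

print(rank_substitution(BLOSUM62_R, "R", "H"))
print(rank_substitution(BLOSUM62_R, "R", "C"))
```

Several substitutions can share the worst score, so "worst possible" is often a tie rather than a single residue.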

PSSM

  • Getting a bit closer to evolution, create a PSSM (position-specific scoring matrix) for your protein sequence using PSI-BLAST (5 iterations). How conserved are the WT residues at the mutated positions? How frequently does the mutant residue type occur at those positions? Anything interesting?
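Reading the scores for a given position out of the PSSM can be sketched as follows. The parser assumes the ASCII PSSM layout written by PSI-BLAST: whitespace-separated rows with the position, the residue, 20 log-odds scores, 20 weighted percentages and two trailing numbers, in the column order A R N D C Q E G H I L K M F P S T W Y V. Check this against your own file, since the layout is an assumption here. The function name `parse_pssm` and the demo row are made up.

```python
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"  # assumed column order of the ASCII PSSM

def parse_pssm(text):
    """Map position -> (residue, {aa: score}, {aa: percent}) from ASCII PSSM text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a position number; headers and footers do not.
        if len(fields) < 42 or not fields[0].isdigit():
            continue
        position, residue = int(fields[0]), fields[1]
        scores = dict(zip(AA_ORDER, map(int, fields[2:22])))
        percent = dict(zip(AA_ORDER, map(int, fields[22:42])))
        rows[position] = (residue, scores, percent)
    return rows

# Made-up demo row, only to show the expected shape of a data line.
demo_scores = [-1, -1, -2, -3, -1, 0, -2, -3, -2, 1,
               2, -1, 5, 0, -2, -1, -1, -1, -1, 1]
demo_percent = [0] * 12 + [100] + [0] * 7
demo_line = "1 M " + " ".join(map(str, demo_scores + demo_percent)) + " 0.55 0.12"

pssm = parse_pssm(demo_line)
residue, scores, percent = pssm[1]
print(residue, scores["M"], percent["M"], scores["D"])
```

For a mutation at position p, `scores[wt]` versus `scores[mut]` and the corresponding percentages answer the two conservation questions directly.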

Multiple sequence alignment

  • Another step closer to evolution: identify all mammalian homologous sequences and create a multiple sequence alignment of them with a method of your choice. Use this alignment to calculate the conservation of the WT and mutant residues again, and compare the result with the matrix- and PSSM-derived results.
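One simple conservation measure is the relative frequency of each residue in the alignment column that corresponds to the mutated position. The sketch below assumes the alignment is already loaded as a list of equal-length strings with "-" for gaps; the function name `column_frequencies` and the toy alignment are made up. Gaps are ignored when computing frequencies, which is a design choice, not the only option.

```python
from collections import Counter

def column_frequencies(alignment, query_index, query_position):
    """Residue frequencies in the column aligned to a 1-based, ungapped query position."""
    query = alignment[query_index]
    seen = 0
    for col, char in enumerate(query):
        if char == "-":
            continue  # gaps in the query do not count as sequence positions
        seen += 1
        if seen == query_position:
            residues = [seq[col] for seq in alignment if seq[col] != "-"]
            counts = Counter(residues)
            return {aa: n / len(residues) for aa, n in counts.items()}
    raise ValueError(f"query has no position {query_position}")

# Toy alignment, not real α-Galactosidase A homologs.
toy_alignment = ["MK-LA", "MR-LA", "MKQLG", "MK-IA"]
print(column_frequencies(toy_alignment, 0, 2))
print(column_frequencies(toy_alignment, 0, 3))
```

The frequency of the WT residue gives its conservation, and the frequency of the mutant residue (zero if absent) shows whether the substitution is ever observed among the homologs.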

Scoring methods

  • Finally, we use three different approaches to score our mutants.
    • SIFT (http://sift.jcvi.org/www/SIFT_seq_submit2.html)
    • PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/)
    • SNAP (https://rostlab.org/owiki/index.php/Snap) is installed on the VirtualBox and should be used from the command line only. Since BLAST is the bottleneck of SNAP and you are running it anyway, we might as well look at all possible substitutions at the positions of our mutations. This tells us much more about the nature of a given mutation: is it problematic because it introduces an unwanted effect, or because the WT residue is essential and mutating it removes that?
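Listing all possible substitutions at a position is easy to automate. The sketch below writes each mutation as WT residue, position, mutant residue (for example K2A); whether SNAP expects exactly this notation in its mutation file is an assumption, so check the SNAP documentation before using the output. The function name `all_substitutions` and the toy sequence are made up.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def all_substitutions(sequence, position):
    """List the 19 non-synonymous substitutions at a 1-based position."""
    if not 1 <= position <= len(sequence):
        raise IndexError(f"position {position} is outside the sequence")
    wt = sequence[position - 1]
    return [f"{wt}{position}{aa}" for aa in AMINO_ACIDS if aa != wt]

# Toy sequence, not the α-Galactosidase A sequence.
toy_sequence = "MKLA"
substitutions = all_substitutions(toy_sequence, 2)
print("\n".join(substitutions))
```

If nearly every substitution at a position is predicted to have an effect, the WT residue itself is likely essential; if only a few are, the specific mutant residue is more likely the problem.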

Results and Conclusion

  • Compare ALL results and create an overview table.
  • Try to come up with a consensus between all the findings requested above.
  • Check against the HGMD whether you were right: were you able to predict a change?
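A consensus can be as simple as a majority vote over the individual calls. The sketch below is one possible scheme, not a prescribed one: it assumes every method's result has first been reduced to either "effect" or "neutral", and it reports ties as "unclear". The function name `consensus`, the labels and the example calls are all made up.

```python
def consensus(calls):
    """Majority vote over per-method calls; ties are reported as 'unclear'."""
    effect = sum(1 for call in calls.values() if call == "effect")
    neutral = sum(1 for call in calls.values() if call == "neutral")
    if effect > neutral:
        return "effect"
    if neutral > effect:
        return "neutral"
    return "unclear"

# Made-up calls for one hypothetical mutation, for illustration only.
example_calls = {
    "BLOSUM62": "effect",
    "PSSM": "effect",
    "MSA": "neutral",
    "SIFT": "effect",
    "PolyPhen-2": "effect",
    "SNAP": "neutral",
}
print(consensus(example_calls))
```

A plain vote weights all methods equally; since several of them rely on the same evolutionary information, their agreement is not independent evidence, which is worth keeping in mind when interpreting the table.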