Task 6 - Sequence-based mutation analysis 2011

From Bioinformatikpedia
Revision as of 11:05, 19 June 2011 by Offman

All the proteins studied in this practical are involved in monogenetic diseases. These diseases can be caused by single point mutations.

Introductory talks

The following topics will be addressed in the talks:

  • General overview of amino acids and their physical/chemical properties
  • Amino acid substitution matrices
  • SNAP
  • PolyPhen
  • SIFT

Tasks

In this task we try to learn about the effects mutations can have on protein function/stability, just by looking at sequence changes. For this we will employ several different tools, but also apply some methods you have been introduced to during the course of this practical.

First, pick 10 mutations (SNPs) from your dataset: some from the HGMD (missense mutations) and some that were found only in dbSNP (silent point mutations). Shuffle them and PLEASE do not try to memorize whether they cause the disease! The goal is to pretend that we do NOT know what is going on. Ideally, the most common disease-causing mutations should be included, too.

The simplest approach is to look at the differences between the WT (wild-type) and mutant amino acids. For each of the 10 mutations, please write a short summary of the physicochemical properties of both residues and how they change.
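As a starting point for these summaries, the comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hydropathy values are the Kyte-Doolittle index, the four residue classes are one common textbook grouping (other sources group residues differently), and histidine is treated as uncharged at neutral pH. The function name `describe_substitution` is made up for this example.

```python
# Kyte-Doolittle hydropathy index (positive = hydrophobic).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# One common (not the only) grouping of the 20 amino acids.
CLASS = {}
for residues, name in [("AVLIMFWPG", "nonpolar"), ("STNQYC", "polar"),
                       ("KRH", "basic"), ("DE", "acidic")]:
    for r in residues:
        CLASS[r] = name

# Approximate side-chain charge at neutral pH (assumption: His counted as 0).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def describe_substitution(wt, mut):
    """Summarise how basic properties change from WT to mutant residue."""
    return {
        "substitution": f"{wt}->{mut}",
        "class_change": f"{CLASS[wt]} -> {CLASS[mut]}",
        "same_class": CLASS[wt] == CLASS[mut],
        "charge_change": CHARGE.get(mut, 0) - CHARGE.get(wt, 0),
        "hydropathy_change": round(HYDROPATHY[mut] - HYDROPATHY[wt], 1),
    }


if __name__ == "__main__":
    # Hypothetical example substitution, not taken from any dataset.
    print(describe_substitution("R", "W"))
```

Such a table only lists the raw differences; size, flexibility (Gly, Pro) and special roles such as disulfide bonds (Cys) still need to be discussed in words.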

Also, create a close-up picture with PyMOL[1] showing the original and the mutated residue in the protein. This is purely for visualization; structural analysis will be introduced in the next task.
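One possible way to make the close-ups reproducible is to generate a small PyMOL command script per mutation. The sketch below only writes the script text; the file names, chain and residue number are placeholders, and the helper name `closeup_script` is hypothetical. The mutated side chain itself is not built here: it can be modelled interactively with PyMOL's Mutagenesis wizard before the image is rendered.

```python
def closeup_script(pdb_file, chain, resi, out_png):
    """Return PyMOL commands for a close-up of one residue as sticks."""
    lines = [
        f"load {pdb_file}, protein",
        "hide everything",
        "show cartoon, protein",
        # Named selection for the residue of interest.
        f"select site, protein and chain {chain} and resi {resi}",
        "show sticks, site",
        "color yellow, site and elem C",
        # Zoom on the residue with an 8 Angstrom buffer around it.
        "zoom site, 8",
        f"png {out_png}, dpi=300, ray=1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder inputs; replace with your own structure and position.
    with open("closeup.pml", "w") as handle:
        handle.write(closeup_script("protein.pdb", "A", 123, "closeup.png"))
```

The resulting file can then be run without the GUI, for example with `pymol -cq closeup.pml`.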

Next, we can look at the BLOSUM and PAM matrices. What are the scores for the amino acid substitutions? Is each one the worst possible substitution for the WT residue or not? Can we say anything about the phenotype from this?
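The lookup and the "worst possible substitution" check can be sketched as follows. The parser assumes the whitespace-separated text layout in which NCBI distributes its matrix files; the embedded matrix is only a four-residue excerpt of BLOSUM62 to keep the example short, so a real analysis needs the full matrix file. The function names are made up for this illustration.

```python
STANDARD = set("ARNDCQEGHILKMFPSTWYV")

# Excerpt of BLOSUM62 (A, R, N, D only); load the full file for real use.
BLOSUM62_EXCERPT = """\
   A  R  N  D
A  4 -1 -2 -2
R -1  5  0 -2
N -2  0  6  1
D -2 -2  1  6
"""


def parse_matrix(text):
    """Parse a whitespace-separated substitution matrix into nested dicts."""
    rows = [ln.split() for ln in text.splitlines()
            if ln.strip() and not ln.startswith("#")]
    columns = rows[0]
    return {row[0]: dict(zip(columns, map(int, row[1:]))) for row in rows[1:]}


def substitution_report(matrix, wt, mut):
    """Score of wt->mut and the lowest score any substitution of wt can get."""
    row = matrix[wt]
    # Ignore the identity and non-standard columns such as B, Z, X, *.
    others = {aa: s for aa, s in row.items() if aa != wt and aa in STANDARD}
    worst = min(others.values())
    return {"score": row[mut], "worst_score": worst,
            "is_worst": row[mut] == worst}


if __name__ == "__main__":
    matrix = parse_matrix(BLOSUM62_EXCERPT)
    print(substitution_report(matrix, "N", "D"))
```

The same two functions work for a PAM matrix in the same text layout, which makes it easy to put both scores side by side.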

Getting a bit closer to evolution, you will have to create a PSSM (position-specific scoring matrix) for your protein sequence using PSI-BLAST (5 iterations). How conserved are the WT residues at your mutant positions? How does the mutant residue score at each of these positions? Anything interesting?
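Reading the WT and mutant scores out of the PSSM can be sketched as below. The sketch assumes the ASCII PSSM layout written by BLAST+ (for example `psiblast -query protein.fasta -db <database> -num_iterations 5 -out_ascii_pssm protein.pssm`, where the file names are placeholders): each data row starts with the position and the query residue, followed by 20 log-odds scores in the column order ARNDCQEGHILKMFPSTWYV. If the PSI-BLAST installation used in the practical writes a different format, the parser needs adjusting. The function names are hypothetical.

```python
# Column order of the score block in a PSI-BLAST ASCII PSSM.
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"


def parse_ascii_pssm(text):
    """Return {position: (query_residue, {amino_acid: score})}."""
    pssm = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows: position, residue, 20 scores, then percentages etc.
        if len(fields) >= 22 and fields[0].isdigit():
            position, residue = int(fields[0]), fields[1]
            scores = [int(x) for x in fields[2:22]]
            pssm[position] = (residue, dict(zip(AA_ORDER, scores)))
    return pssm


def pssm_scores(pssm, position, wt, mut):
    """Return (WT score, mutant score) at a 1-based sequence position."""
    residue, scores = pssm[position]
    if residue != wt:
        raise ValueError(
            f"position {position} is {residue} in the query, not {wt}")
    return scores[wt], scores[mut]


if __name__ == "__main__":
    with open("protein.pssm") as handle:  # placeholder file name
        pssm = parse_ascii_pssm(handle.read())
    print(pssm_scores(pssm, 123, "R", "W"))  # placeholder mutation
```

The residue check guards against off-by-one errors when the mutation numbering does not match the sequence used as the PSI-BLAST query.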

Now take another step closer to evolution: identify all mammalian homologous sequences and create a multiple sequence alignment for them with a method of your choice. Using this alignment, you can calculate the conservation of the WT and mutant residues again. Compare this to the matrix- and PSSM-derived results.
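A minimal sketch of the conservation calculation, assuming the alignment has been saved in aligned FASTA format with `-` as the gap character: the position in the ungapped query sequence is first mapped to an alignment column, then the fractions of sequences carrying the WT and the mutant residue in that column are counted. Plain residue frequency is used here as the conservation measure, which is a simplification; the function and sequence names are made up.

```python
from collections import Counter


def read_fasta(text):
    """Parse (aligned) FASTA text into {sequence_id: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None and line:
            records[name] += line.upper()
    return records


def column_for_position(aligned_query, position):
    """Map a 1-based ungapped position to a 0-based alignment column."""
    seen = 0
    for column, char in enumerate(aligned_query):
        if char != "-":
            seen += 1
            if seen == position:
                return column
    raise ValueError(f"query has fewer than {position} residues")


def residue_frequencies(alignment, query_id, position, wt, mut):
    """Fractions of non-gap sequences with WT / mutant residue in the column."""
    column = column_for_position(alignment[query_id], position)
    residues = [seq[column] for seq in alignment.values() if seq[column] != "-"]
    counts = Counter(residues)
    return counts[wt] / len(residues), counts[mut] / len(residues)


if __name__ == "__main__":
    # Tiny made-up alignment, only to show the call.
    toy = read_fasta(">human\nMK-R\n>mouse\nMKAR\n>rat\nMQ-W\n")
    print(residue_frequencies(toy, "human", 3, "R", "W"))
```

A mutant residue that already occurs in other mammals at that position is a hint that the substitution may be tolerated.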

Finally, we use three different approaches to score our mutants. SNAP is installed on the VirtualBox and should be used from the command line only.

As a comparison we use SIFT and PolyPhen.

Compare ALL results and create an overview table. Try to come up with a consensus between all the findings requested above. Then check against the HGMD whether you were right: were you able to predict a change? For this task it is very important to us that you properly interpret and discuss your results. The production of the data should not take that long, so you have more time to do real science!
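One simple way to derive a consensus for the overview table is a majority vote over the individual calls, sketched below. This is only one possible scheme and not one prescribed by the task; it assumes each method's result has already been reduced to a yes/no call, and that a mutation absent from the HGMD set is treated as neutral. The mutation and function names are hypothetical.

```python
def consensus(calls):
    """Majority vote over {method: True if predicted damaging}."""
    votes = sum(calls.values())
    if votes * 2 > len(calls):
        return "damaging"
    if votes * 2 < len(calls):
        return "neutral"
    return "undecided"


def overview_rows(predictions, hgmd_disease):
    """Build (mutation, consensus, annotation, agrees) rows for the table."""
    rows = []
    for mutation, calls in predictions.items():
        verdict = consensus(calls)
        # Assumption: not listed as disease-causing in HGMD means neutral.
        truth = "damaging" if mutation in hgmd_disease else "neutral"
        rows.append((mutation, verdict, truth, verdict == truth))
    return rows


if __name__ == "__main__":
    # Made-up calls for two hypothetical mutations.
    predictions = {
        "R10W": {"SNAP": True, "SIFT": True, "PolyPhen": False},
        "A20V": {"SNAP": False, "SIFT": False, "PolyPhen": False},
    }
    for row in overview_rows(predictions, hgmd_disease={"R10W"}):
        print(row)
```

A vote hides how strongly each method disagrees, so the table should still be accompanied by a discussion of the individual scores.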