Difference between revisions of "Task 5 - Mapping SNPs 2011"

From Bioinformatikpedia
Revision as of 15:53, 14 June 2011

All the proteins studied in this practical are involved in monogenic diseases. These diseases can be caused by single point mutations, the so-called missense and nonsense mutations.

Due to the Whit holiday (Pfingstferien), this week's meeting is postponed. The introductory talks will be given next week, June 21st. The following tasks should already be completed by then.


Introductory talks

The first talk gives background information on:

  • DSSP
  • HSSP
  • UniProt (if there is time)

The second talk introduces:

  • Genes and mutations
  • HGMD
  • OMIM


Tasks and questions

Your task for this week is to map approximately 100 point mutations onto your protein sequence. In HGMD (The Human Gene Mutation Database), look for missense and nonsense mutations. In dbSNP (Short Genetic Variations), look for silent mutations. In most cases, significantly more than 100 mutations are known. The residue numbers in the two databases may not correspond to each other, so you need to find a way to reconcile positions between HGMD and dbSNP. You can present the results in a table or graphically. In the following, the task is described in more detail using cystic fibrosis as an example. You are free to use a different approach!
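
The distinction between silent, missense and nonsense mutations can be checked per codon. The following is a minimal sketch, not part of the original task description: the function name `classify_point_mutation` is hypothetical, and it assumes you already know the reference and the mutated codon in the correct reading frame.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_point_mutation(ref_codon, alt_codon):
    """Classify a codon substitution (hypothetical helper).

    Returns "silent" if the amino acid is unchanged, "nonsense" if the
    mutated codon is a stop codon, and "missense" otherwise. A change
    from a stop codon to an amino acid is also reported as "missense".
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


if __name__ == "__main__":
    for ref, alt in [("GGA", "GGC"), ("GGT", "GAT"), ("GGA", "TGA")]:
        print(ref, "->", alt, classify_point_mutation(ref, alt))
```

The codons in the usage lines are generic examples, not entries taken from HGMD or dbSNP.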


Example cystic fibrosis

Cystic fibrosis (CF or mucoviscidosis) is a recessive monogenic disease that affects the respiratory, digestive and reproductive systems. CF involves the production of abnormally thick mucus linings in the lungs [WHO, www.who.int]. CF is caused by mutations in the gene coding for the cystic fibrosis transmembrane conductance regulator (CFTR).


1. HGMD

  • Search for "gene symbol" CFTR
  • Which mutation types occur? Give a short definition.
  • Get the "missense/nonsense" mutations. How many are given in the non-professional version of HGMD?
  • The protein sequence can be obtained via the accession number (here "NM_000492.3"), which links to NCBI.

At NCBI, you can download the protein sequence in FASTA format.

  • Alternatively, you can obtain the protein sequence via the cDNA sequence. The "switch view" link shows the protein sequence in three-letter code.
  • If you use the cDNA/switch view option, you have to find a tool to convert the three-letter amino acid code to the one-letter code, for example: http://molbiol.ru/eng/scripts/01_17.html

Find a tool to translate the DNA sequence to one-letter amino acid code. You can also write simple scripts yourself.
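
A simple script of this kind could look as follows. This is only a sketch under assumptions: the function names `three_to_one` and `translate` are hypothetical, the three-letter input is assumed to contain nothing but residue codes (with "Ter" for a stop), and the DNA input is assumed to be the coding sequence starting at the start codon, without untranslated regions.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Ter": "*",  # assumption: stop codons are written as "Ter"
}

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def three_to_one(text):
    """Convert three-letter residue codes to a one-letter string.

    Separators such as spaces, dashes or line breaks are ignored.
    An unknown code raises KeyError instead of being skipped silently.
    """
    codes = re.findall(r"[A-Z][a-z]{2}", text)
    return "".join(THREE_TO_ONE[code] for code in codes)


def translate(cds):
    """Translate a coding DNA sequence up to the first stop codon."""
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(three_to_one("Met Gln Arg Ser"))
    print(translate("ATGCAGAGGTCGTAA"))
```

Comparing the output of both routes with the FASTA sequence downloaded from NCBI is a quick consistency check.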


2. dbSNP

  • Search for "SNP" and "gene" CFTR
  • Take care that you look at the results for Homo sapiens.
  • The results of the "SNP" search can be displayed for example as "Graphic Summary" and "FASTA". Choose the display that is most helpful to you.
  • From the "gene" search, you can go to the "SNP Geneview Report". The link is found in the section "Genotypes".
  • In dbSNP, you may find many deletions and insertions. If only a single nucleotide is deleted or inserted, these are, in principle, point mutations as well. Do not include deletions and insertions on your map. Why can the effect of deletions and insertions be significant? When would it be less severe?
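
One way to approach the question about deletions and insertions is to check whether the length change is a multiple of three, since only then does the downstream reading frame stay intact. The sketch below is an illustration only: the function name `indel_effect` is hypothetical, and it assumes alleles are given as nucleotide strings with "-" standing for an empty allele.

```python
def indel_effect(ref_allele, alt_allele):
    """Classify an allele pair by its effect on the reading frame.

    Returns "substitution" if both alleles have the same length,
    "in-frame" if the length change is a multiple of three, and
    "frameshift" otherwise.
    """
    ref = ref_allele.replace("-", "")
    alt = alt_allele.replace("-", "")
    length_change = len(alt) - len(ref)
    if length_change == 0:
        return "substitution"
    if length_change % 3 == 0:
        return "in-frame"
    return "frameshift"


if __name__ == "__main__":
    print(indel_effect("A", "-"))    # one base deleted
    print(indel_effect("CTT", "-"))  # three bases deleted
    print(indel_effect("-", "AG"))   # two bases inserted
```

Note that an in-frame change still removes or adds whole residues, so it is not necessarily harmless.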


3. Mutation map

  • Generate a map of your protein sequence showing (about 100) point mutations. Mark silent and disease-causing mutations.
  • To map the mutations from HGMD and dbSNP onto the same sequence, you may have to align the two sequences.
  • The results can be presented in a table or graphically.
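
The mapping step can be sketched as follows. This is a rough illustration, not a prescribed method: the function names `position_map` and `mutation_track` are hypothetical, and Python's `difflib.SequenceMatcher` is used here as a simple stand-in for a proper pairwise sequence alignment. It works when the two sequences are nearly identical; for more divergent sequences a real alignment tool is the safer choice.

```python
from difflib import SequenceMatcher


def position_map(seq_a, seq_b):
    """Map 1-based positions in seq_a to 1-based positions in seq_b.

    Only positions inside identical matching blocks are mapped.
    """
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    mapping = {}
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            mapping[block.a + offset + 1] = block.b + offset + 1
    return mapping


def mutation_track(sequence, mutations):
    """Build a one-line text map for a sequence.

    mutations maps a 1-based position to a one-character label,
    for example "D" for disease-causing and "s" for silent.
    """
    track = ["."] * len(sequence)
    for position, label in mutations.items():
        track[position - 1] = label
    return "".join(track)


if __name__ == "__main__":
    # Toy sequences: seq_b has two extra residues at the start.
    seq_a = "MQRSPLEKAS"
    seq_b = "GSMQRSPLEKAS"
    a_to_b = position_map(seq_a, seq_b)

    # Hypothetical mutations numbered on seq_a and seq_b respectively.
    disease = {a_to_b[pos]: "D" for pos in [3]}
    silent = {9: "s"}

    print(seq_b)
    print(mutation_track(seq_b, {**disease, **silent}))
```

The sequences and mutation positions in the usage lines are toy values chosen for illustration. The same position dictionary can also be written out as a table instead of a text track.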