Task 5: Researching SNPs

From Bioinformatikpedia

In this task we research SNPs associated with our protein, odba_human (BCKDHA), in several databases.

HGMD (free version)

HGMD (Human Gene Mutation Database) is a collection of published mutations. It therefore contains only mutations associated with a disease. In our case, publications from 1993 to 2008 are covered. The sequences and a list of all mutations can be found in the protocol-msud-task5 protocol.

HGMD - BCKDHA (non-profit)
Mutation type Number of entries
Missense/nonsense 36
Small deletions 3
Small insertions 1
Gross deletions 2
Complex rearrangements 1

dbSNP

The table below lists the silent (synonymous) mutations we found in dbSNP. The complete list of all mutations from dbSNP (including missense and frameshift mutations) can be found in the protocol-msud-task5 protocol.

Insertions or deletions of nucleotides can cause frameshifts. Because the genetic code is read in triplets, inserting or deleting a single nucleotide (or any number of nucleotides not divisible by three) does not just change one amino acid: it moves all following codons out of frame, changing every amino acid after the insertion or deletion. For a graphical representation, please see figure 1. Insertions or deletions tend to be less severe when they occur near the end of the coding sequence, as this limits the number of amino acids changed.

Figure 1. Example of a frameshift mutation caused by an insertion.
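
The frameshift effect described above can also be reproduced in a few lines of Python. This is only a minimal sketch: the 18-nucleotide sequence is a made-up toy example (not part of BCKDHA), and the helper names `translate` and `insert_base` are our own. It translates the toy sequence once unchanged, once with an insertion near the start, and once with an insertion near the end.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding sequence codon by codon until a stop codon.

    A trailing incomplete codon (1 or 2 nucleotides) is ignored.
    """
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(dna, index, base):
    """Return a copy of dna with one base inserted before position index."""
    return dna[:index] + base + dna[index:]


# Toy sequence (an assumption for illustration only).
wild_type = "ATGGCCCTGCGTGGCTAA"
early = insert_base(wild_type, 3, "A")   # insertion right after the start codon
late = insert_base(wild_type, 12, "A")   # insertion before the last sense codon

print(translate(wild_type))  # MALRG
print(translate(early))      # MSPAWL
print(translate(late))       # MALRRL
```

With the early insertion every residue after the start methionine changes and the original stop codon is lost, whereas the late insertion leaves the first four residues intact.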
rs Cluster Id Function AA Pos Codon Pos From Nucleotide To Nucleotide From AA To AA
rs17173144 synonymous 5 3 C T Ile [I] Ile [I]
rs34541442 synonymous 12 1 C A Arg [R] Arg [R]
rs140322984 synonymous 21 3 C T Ala [A] Ala [A]
rs62637712 synonymous 38 3 C G, T Pro [P] Pro [P], Pro [P]
rs80014754 synonymous 39 3 C A Pro [P] Pro [P]
rs138025447 synonymous 71 3 C T Asn [N] Asn [N]
rs143167070 synonymous 80 3 C T Arg [R] Arg [R]
rs148571328 synonymous 96 3 C T His [H] His [H]
rs11549937 synonymous 97 3 G C Leu [L] Leu [L]
rs142967869 synonymous 98 3 G A Pro [P] Pro [P]
rs150700696 synonymous 112 3 T G Leu [L] Leu [L]
rs143608852 synonymous 140 3 G A Thr [T] Thr [T]
rs146804716 synonymous 141 3 C T His [H] His [H]
rs144995574 synonymous 206 3 C T Ser [S] Ser [S]
rs10404506 synonymous 213 3 C T Ile [I] Ile [I]
rs114716391 synonymous 216 3 G T Ala [A] Ala [A]
rs151227241 synonymous 221 3 C T Tyr [Y] Tyr [Y]
rs146932786 synonymous 236 3 C T Phe [F] Phe [F]
rs137960127 synonymous 248 3 C T Ala [A] Ala [A]
rs61737367 synonymous 280 3 C T Arg [R] Arg [R]
rs187669174 synonymous 297 3 C T Arg [R] Arg [R]
rs139390622 synonymous 308 3 C T Asn [N] Asn [N]
rs190858285 synonymous 316 3 G T Arg [R] Arg [R]
rs284652 synonymous 324 3 C T Phe [F] Phe [F]
rs55940366 synonymous 325 3 C T Leu [L] Leu [L]
rs144276456 synonymous 347 3 G A Ser [S] Ser [S]
rs190202447 synonymous 383 3 G A Arg [R] Arg [R]
rs145595627 synonymous 403 3 C T Pro [P] Pro [P]
rs4674 synonymous 407 3 A C, G, T Leu [L] Leu [L], Leu [L], Leu [L]
rs147021347 synonymous 417 3 C T Pro [P] Pro [P]
rs34492894 synonymous 420 3 C T Leu [L] Leu [L]
rs148224513 synonymous 443 3 C T Phe [F] Phe [F]
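
Rows in the layout shown above can be read programmatically, for example to tally at which codon position the synonymous changes occur. The following is a rough sketch under the assumption that the first four whitespace-separated fields are always the rs ID, the function class, the amino acid position and the codon position; the function name `parse_row` is ours, and only four rows copied from the table are used.

```python
from collections import Counter

# A few rows copied from the table above (not the full list).
ROWS = """\
rs17173144 synonymous 5 3 C T Ile [I] Ile [I]
rs34541442 synonymous 12 1 C A Arg [R] Arg [R]
rs62637712 synonymous 38 3 C G, T Pro [P] Pro [P], Pro [P]
rs4674 synonymous 407 3 A C, G, T Leu [L] Leu [L], Leu [L], Leu [L]
"""


def parse_row(line):
    """Extract the leading fixed fields of one table row.

    Only the first five fields are used, so rows listing several
    alternative alleles ("C, G, T") are handled as well.
    """
    fields = line.split()
    return {
        "rs_id": fields[0],
        "function": fields[1],
        "aa_pos": int(fields[2]),
        "codon_pos": int(fields[3]),
        "from_nt": fields[4],
    }


records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
codon_position_counts = Counter(r["codon_pos"] for r in records)

for position, count in sorted(codon_position_counts.items()):
    print(f"codon position {position}: {count} SNP(s)")
```

Applied to the full table, such a tally shows that nearly all listed synonymous SNPs sit at the third codon position, where the genetic code is most redundant.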

SNPdbe

As SNPdbe contains both predicted and experimental SNPs, we first had to parse the output of the database search. The list of all mutations can be found in the protocol-msud-task5 protocol. Unfortunately, we found only non-validated SNPs, 8 of which lead to MSUD.

SNPedia

In SNPedia we found the following four SNPs for MSUD.

  • rs12021720
  • rs17856511
  • rs28934895, also known as R183P (the most common SNP in Ashkenazi Jews); the risk genotype is CC
  • rs28940288

Of these, the first two have been merged into a single entry in dbSNP and do not concern the BCKDHA protein. The third entry does not concern BCKDHA either, and the fourth is already contained in dbSNP.

OMIM

OMIM contains 9 entries for BCKDHA (odba_human - 608348), all leading to MSUD. UniProt is given as the reference sequence.

Number Phenotype Mutation dbSNP
.0001 MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA BCKDHA, TYR393ASN -
.0002 MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA BCKDHA, 8-BP DEL, 887-894 -
.0003 MAPLE SYRUP URINE DISEASE, INTERMEDIATE, TYPE IA BCKDHA, GLY245ARG -
.0004 MAPLE SYRUP URINE DISEASE, INTERMEDIATE, TYPE IA; MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA, INCLUDED BCKDHA, PHE364CYS -
.0005 MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA BCKDHA, ARG220TRP -
.0006 MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA BCKDHA, GLY204SER -
.0007 MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA BCKDHA, THR265ARG -
.0008 MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA BCKDHA, CYS219TRP -
.0009 MAPLE SYRUP URINE DISEASE, CLASSIC, TYPE IA BCKDHA, 1-BP DEL, 117C -
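
To compare the OMIM missense entries with other databases, it can help to convert the OMIM notation (e.g. TYR393ASN) into the short one-letter form (Y393N). The snippet below is a small sketch of such a conversion; the function name `to_short_notation` is our own, and it deliberately handles only simple substitutions, not the deletion entries (.0002 and .0009).

```python
import re

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# Three-letter residue, position, three-letter residue (e.g. TYR393ASN).
SUBSTITUTION = re.compile(r"([A-Z]{3})(\d+)([A-Z]{3})")


def to_short_notation(mutation):
    """Convert e.g. 'TYR393ASN' to 'Y393N'; raise ValueError otherwise."""
    match = SUBSTITUTION.fullmatch(mutation.strip().upper())
    if match is None:
        raise ValueError(f"not a simple substitution: {mutation!r}")
    ref, position, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        raise ValueError(f"unknown residue name in {mutation!r}")
    return f"{THREE_TO_ONE[ref]}{int(position)}{THREE_TO_ONE[alt]}"


# The seven missense entries from the OMIM table above.
omim_missense = [
    "TYR393ASN", "GLY245ARG", "PHE364CYS", "ARG220TRP",
    "GLY204SER", "THR265ARG", "CYS219TRP",
]
short_forms = [to_short_notation(m) for m in omim_missense]
print(short_forms)
```

Note that the residue numbers are passed through unchanged; whether they match the numbering used by another database has to be checked separately.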

Mutation map

We first tried (with little success) to build a graphical per-nucleotide representation of the sequence. The result can be found in figure 2.

Figure 2. First experiment with a graphical representation of a mutation map. The nucleotide sequence is shown in red, green, blue and black. The scale is marked at intervals of 50 and 100 nucleotides. SNPs are marked red when missense and green when synonymous.

After that we decided to go for a structural representation using PyMOL. Two images of the colored representation can be found below in figures 3 and 4.

Figure 3. Regions marked green are synonymous mutations, red are missense, and orange are positions where both synonymous and missense mutations were found.
Figure 4. Regions marked green are synonymous mutations, red are missense, and orange are positions where both synonymous and missense mutations were found.


To take a closer look at the mapping, we recommend loading it directly in PyMOL, for which we provide the PyMOL session file here.
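
The color scheme used in figures 3 and 4 can be expressed as a short script that generates PyMOL `color` commands. This is not the script used to produce the session file, only a sketch of the idea: the function names are ours, the positions below are illustrative placeholders rather than the actual mapping, and we assume that the residue numbering of the loaded structure matches the numbering of the SNP positions.

```python
def resi_selection(positions):
    """Build a PyMOL residue selection such as 'resi 5+21+38'."""
    return "resi " + "+".join(str(p) for p in sorted(positions))


def color_commands(synonymous, missense):
    """Return PyMOL color commands following the green/red/orange scheme.

    Positions carrying both mutation types are colored orange and are
    excluded from the green and red groups.
    """
    both = synonymous & missense
    groups = [
        ("green", synonymous - both),
        ("red", missense - both),
        ("orange", both),
    ]
    return [
        f"color {color}, {resi_selection(positions)}"
        for color, positions in groups
        if positions  # skip empty groups
    ]


# Placeholder positions for illustration only (not the real mapping).
synonymous_positions = {5, 21, 38}
missense_positions = {38, 204, 220}

commands = color_commands(synonymous_positions, missense_positions)
for command in commands:
    print(command)
```

The printed lines can be pasted into the PyMOL command line after loading the structure.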

Conclusion

Will be added on Tuesday...