Researching SNPs TSD

From Bioinformatikpedia
Revision as of 22:23, 8 June 2012 by Reeb

Oh it was gorgeousness and gorgeosity made flesh. The trombones crunched redgold under my bed, and behind my gulliver the trumpets three-wise silverflamed, and there by the door the timps rolling through my guts and out again crunched like candy thunder. Oh, it was wonder of wonders. And then, a bird of like rarest spun heavenmetal, or like silvery wine flowing in a spaceship, gravity all nonsense now, came the violin solo above all the other strings, and those strings were like a cage of silk round my bed. Then flute and oboe bored, like worms of like platinum, into the thick thick toffee gold and silver. I was in such bliss, my brothers.

-A Clockwork Orange

The journal for this task can be found here.

Sequence mapping

The databases index their SNP data against different reference sequences. In the following, the reference protein sequence remains P06865; however, all databases also base their annotations on nucleotide sequences. While the final annotations will only be displayed mapped onto the protein sequence, NM_000520.4 will be used as the nucleotide reference sequence in the background. This entry describes an mRNA of HEXA and is also linked from the UniProt entry of P06865.
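A minimal sketch of how a nucleotide position could be mapped onto the protein sequence. It assumes the position is given in coding-sequence coordinates, i.e. 1-based and counted from the A of the start codon; positions counted along the full mRNA entry would first need the offset of the coding region subtracted, which is not done here. The function name `residue_for_cds_position` is hypothetical and not part of any database tool.

```python
def residue_for_cds_position(cds_position):
    """Map a 1-based coding-sequence position to a protein residue.

    Returns (residue_number, codon_offset), where codon_offset is
    0, 1 or 2 for the first, second or third base of the codon.
    Assumption: cds_position counts from the A of the start codon.
    """
    if cds_position < 1:
        raise ValueError("coding-sequence positions start at 1")
    residue_number = (cds_position - 1) // 3 + 1
    codon_offset = (cds_position - 1) % 3
    return residue_number, codon_offset


if __name__ == "__main__":
    # The three bases of codon 436 occupy coding positions 1306 to 1308.
    for position in (1306, 1307, 1308):
        print(position, residue_for_cds_position(position))
```

Under this assumption, all three bases of one codon map to the same residue number, so nucleotide-level annotations from different databases end up on a common protein index.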

HGMD lists NM_000520.3 as its reference, a previous version of NM_000520.4, which was chosen as the reference for this task. A Needleman-Wunsch pairwise alignment of the two nucleotide sequences shows two single-nucleotide differences in the last third of the sequence; in addition, the more recent version is 117 nucleotides longer at its beginning. Since this region is annotated as belonging to an exon, the question remains whether this affects the protein sequence. A short comparison shows a single differing residue at position 436, where a Val in NM_000520.3 is substituted by an Ile in NM_000520.4. However, since HGMD does not list a SNP at this position, this is not an issue.
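The comparison described above can be sketched as follows. This is a plain Needleman-Wunsch implementation with a simple scoring scheme (match +1, mismatch -1, linear gap penalty -1); these scores are assumptions for illustration and not necessarily the parameters used for the actual alignment. The toy sequences stand in for the two entry versions; the real sequences would have to be loaded from the database records.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner along one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def substitutions(aligned_a, aligned_b):
    """List (pos_a, char_a, pos_b, char_b) for mismatched columns.

    Positions are 1-based in the ungapped sequences; gap columns are skipped.
    """
    found = []
    pos_a = pos_b = 0
    for x, y in zip(aligned_a, aligned_b):
        pos_a += x != "-"
        pos_b += y != "-"
        if x != y and "-" not in (x, y):
            found.append((pos_a, x, pos_b, y))
    return found


if __name__ == "__main__":
    older, newer = "GATTACA", "CCGATTGCA"  # toy stand-ins, not HEXA data
    aligned_older, aligned_newer = needleman_wunsch(older, newer)
    print(aligned_older)
    print(aligned_newer)
    print(substitutions(aligned_older, aligned_newer))
```

The same `substitutions` helper works on aligned protein sequences, which is how a single differing residue between two translations could be located.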


A table here, with the different reference sequences per database? SNPdbe has a lot, though! HGMD lists 'NM_000520.3' as


Notation of mutations

The Human Genome Variation Society provides a detailed nomenclature <ref name="hgvsnomen">http://www.hgvs.org/mutnomen/</ref> that will be used for all annotations within this task. Most importantly, a substitution is written as p.[wtResidue][position][mtResidue].
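A small sketch of how substitutions in this notation could be written and read programmatically. It assumes three-letter residue codes and covers only simple substitutions among the 20 standard amino acids; the helper names `format_substitution` and `parse_substitution` are hypothetical. The Val/Ile difference at position 436 between the two reference versions serves purely as a formatting example.

```python
import re

# One-letter to three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
ONE_LETTER = {three: one for one, three in THREE_LETTER.items()}

# p.[wtResidue][position][mtResidue], e.g. p.Val436Ile
SUBSTITUTION = re.compile(r"^p\.([A-Z][a-z]{2})([1-9][0-9]*)([A-Z][a-z]{2})$")


def format_substitution(wt, position, mt):
    """Build the notation from one-letter residues and a 1-based position."""
    return f"p.{THREE_LETTER[wt]}{position}{THREE_LETTER[mt]}"


def parse_substitution(text):
    """Return (wt, position, mt) with one-letter residues."""
    match = SUBSTITUTION.match(text)
    if match is None:
        raise ValueError(f"not a simple protein substitution: {text!r}")
    wt, position, mt = match.groups()
    if wt not in ONE_LETTER or mt not in ONE_LETTER:
        raise ValueError(f"unknown residue code in {text!r}")
    return ONE_LETTER[wt], int(position), ONE_LETTER[mt]


if __name__ == "__main__":
    print(format_substitution("V", 436, "I"))
    print(parse_substitution("p.Val436Ile"))
```

Keeping the annotations from all databases in one such normalized form makes it straightforward to compare them position by position.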

HGMD

dbSNP

SNPdbe

OMIM

don't know yet where to look

SNPedia

Mutation map

References

<references/>