Researching SNPs (Phenylketonuria)



Single nucleotide polymorphisms (SNPs) are variations of a single nucleotide in the DNA, where the variant allele has a frequency of at least 1% in the population. Since each amino acid is encoded by a triplet of nucleotides, such an exchange may result in a different amino acid and may thus disturb the function or structure of a protein. If the SNP lies in a coding region, it might therefore cause a disease. Most amino acids are encoded by more than one triplet, so an exchange of the last base of a triplet sometimes yields the same amino acid (a synonymous exchange). Here we are only interested in SNPs in coding regions which change the amino acid, also called non-synonymous SNPs. These SNPs can be found in different databases. HGMD and OMIM only report SNPs which are associated with a disease, whereas dbSNP, for example, includes every SNP that has been reported so far.
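The difference between synonymous and non-synonymous exchanges can be illustrated with a minimal sketch. The codon table below is only a tiny excerpt of the standard genetic code, and the function name `classify_snp` is introduced here purely for illustration; it is not part of any of the databases discussed.

```python
# Excerpt of the standard genetic code (DNA codons); only the codons used below.
CODON_TABLE = {
    "CGG": "R",  # arginine
    "CGA": "R",  # arginine
    "TGG": "W",  # tryptophan
    "TGA": "*",  # stop codon
}


def classify_snp(codon, position, new_base):
    """Classify a single-base exchange within one codon.

    position is the 0-based index of the exchanged base within the codon.
    Returns (mutant codon, wild-type amino acid, mutant amino acid, kind).
    """
    mutant = codon[:position] + new_base + codon[position + 1:]
    wild_aa = CODON_TABLE[codon]
    mutant_aa = CODON_TABLE[mutant]
    if wild_aa == mutant_aa:
        kind = "synonymous"
    elif mutant_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return mutant, wild_aa, mutant_aa, kind


print(classify_snp("CGG", 2, "A"))  # exchange of the last base, Arg stays Arg
print(classify_snp("CGG", 0, "T"))  # exchange of the first base, Arg becomes Trp
print(classify_snp("TGG", 2, "A"))  # Trp becomes a stop codon
```

Only the second and third case would count as non-synonymous in the sense used on this page.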

Research SNPs


Database research

HGMD (The Human Gene Mutation Database) is a database of mutations associated with human disease. A free public version, which is not fully up to date, is available to users from academic and nonprofit institutions. The database contains single base-pair substitutions in coding (missense, nonsense), regulatory and splicing-relevant regions of human nuclear genes, as well as small deletions, small insertions, small combined insertions/deletions (indels), repeat expansions, gross deletions, gross insertions/duplications and complex rearrangements (including inversions). The information on mutations and polymorphisms is updated weekly <ref>, retrieved on July 22nd in year 2013</ref> and is collected by manual and computer-based search methods. To improve the coverage of the data, online library screenings, PubMed and other freely available mutation databases are used. Silent mutations and mutations that are inadequately analyzed or poorly described in the literature are excluded. <ref name="hgmd"> Peter D. Stenson, Matthew Mort, Edward V. Ball, Katy Howells, Andrew D. Phillips, Nick S. T. Thomas and David N. Cooper (2009): "The Human Gene Mutation Database: 2008 update". Genome Medicine 1: 13. doi:10.1186/gm13 </ref> The last update of HGMD Professional was in March 2013. The public version includes entries that have been part of HGMD Professional for at least three years.

Found mutations in protein PAH

The following mutations were found in HGMD on the 22nd June 2013: <figtable id="hgmd">

{| class="wikitable"
|+ Mutation types for PAH found in HGMD with the corresponding number of mutations and definition
! Mutation type !! # of mutations !! Definition
|-
| Missense/nonsense || 415 || x
|-
| Splicing || 82 || x
|-
| Regulatory || 1 || x
|-
| Small deletions || 66 || x
|-
| Small insertions || 8 || x
|-
| Small indels || 6 || x
|-
| Gross deletions || 27 || x
|-
| Gross insertions/duplications || 3 || x
|-
| Complex rearrangements || 1 || x
|-
| Repeat variations || 0 || x
|-
| colspan="3" | Total number of mutations: 609 (HGMD professional 2013.1: 720)
|}

</figtable> Of the mutations in <xr id="hgmd"/>, 543 are known to cause Phenylketonuria, 58 to cause Hyperphenylalaninaemia, one to cause increased activity, and one is associated with Schizophrenia. A further five mutations are assumed to cause Phenylketonuria and one is assumed to cause Hyperphenylalaninaemia.
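As a quick consistency check, the counts per mutation type and the counts per phenotype reported above should both add up to the stated total of 609. The following sketch uses only the numbers given in the text; the dictionary names are chosen here for illustration.

```python
# Counts per mutation type as listed in the HGMD table above.
hgmd_type_counts = {
    "Missense/nonsense": 415,
    "Splicing": 82,
    "Regulatory": 1,
    "Small deletions": 66,
    "Small insertions": 8,
    "Small indels": 6,
    "Gross deletions": 27,
    "Gross insertions/duplications": 3,
    "Complex rearrangements": 1,
    "Repeat variations": 0,
}

# Counts per phenotype as stated in the paragraph above.
hgmd_phenotype_counts = {
    "Phenylketonuria": 543,
    "Hyperphenylalaninaemia": 58,
    "Increased activity": 1,
    "Association with Schizophrenia": 1,
    "Phenylketonuria (assumed)": 5,
    "Hyperphenylalaninaemia (assumed)": 1,
}

type_total = sum(hgmd_type_counts.values())
phenotype_total = sum(hgmd_phenotype_counts.values())

print("Total by mutation type:", type_total)
print("Total by phenotype:", phenotype_total)

# Share of single base-pair substitutions in coding regions.
share = hgmd_type_counts["Missense/nonsense"] / type_total
print(f"Missense/nonsense share: {share:.1%}")
```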

PAH is located on chromosome 12 in the region 12q22-q24.2, and its cDNA sequence has the accession number NM_000277.1.


dbSNP (Single Nucleotide Polymorphism Database) is a freely available repository for short genetic variations. It mainly contains single nucleotide substitutions (SNPs), but also a small number of microsatellite repeats and small insertion/deletion polymorphisms. dbSNP is updated after each new data submission <ref name="dbsnp"> Elizabeth M. Smigielski, Karl Sirotkin, Minghong Ward and Stephen T. Sherry (2000): "dbSNP: a database of single nucleotide polymorphisms". Nucleic Acids Research Vol.28(1): 352-355. PubMed:10592272 </ref> and contains two classes of information: original observations of sequence variation (submitted data) and content generated by computation on the submitted data (computed content). The last complete release was Build 137 on June 26, 2012. Although a newer release, Build 138, appeared in April 2013, our results are not based on it, because Build 138 is not yet complete (Phase I) and is only available for some species, which do not include human. It is expected to be updated by the end of 2013.

The following results were taken from dbSNP on 10th July 2013. There are 709 SNPs annotated as missense or nonsense mutations in dbSNP and 75 which result in a synonymous codon. Altogether, including all mutation types, 2511 substitutions can be found for PAH. However, some IDs appear more than once and some have been merged into another ID, so these numbers must be treated with care.
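The effect of repeated and merged IDs on such counts can be sketched as follows. The rs-IDs and the merge table are invented placeholders, and `resolve` and `count_unique` are hypothetical helper names; this is not how dbSNP itself reports merges, only one way to count distinct entries once a list of hits and a list of merges is at hand.

```python
def resolve(rs_id, merged_into):
    """Follow merge links until an ID is reached that was not merged further."""
    seen = set()
    while rs_id in merged_into and rs_id not in seen:
        seen.add(rs_id)  # guards against accidental cycles in the merge table
        rs_id = merged_into[rs_id]
    return rs_id


def count_unique(rs_ids, merged_into):
    """Count distinct IDs after removing duplicates and resolving merges."""
    return len({resolve(rs_id, merged_into) for rs_id in rs_ids})


# Placeholder data: one ID is listed twice, two IDs were merged into another one.
hits = ["rsX1", "rsX2", "rsX1", "rsX3", "rsX4"]
merged_into = {"rsX3": "rsX2", "rsX4": "rsX3"}

print("raw hits:", len(hits))
print("distinct SNPs:", count_unique(hits, merged_into))
```

In this toy example five raw hits collapse to two distinct SNPs, which shows why the raw numbers above are only an upper bound.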


SNPdbe (nsSNP database of functional effects) is a database for non-synonymous SNPs. The SNPs are taken from dbSNP and the 1000 Genomes collection, together with variants from UniProt and PMD. The effects of the amino acid substitutions are based on predictions by SNAP and SIFT. These predictions are supplemented with information from other databases such as PMD, OMIM or UniProt. SNPdbe covers a wide range of organisms, and for each amino acid substitution it reports functional and structural information gained from experiments as well as the predicted functional effects. Furthermore, associated diseases and the conservation of the wildtype and the mutant amino acid can be viewed, among other information. <ref name="snpdbe"> Christian Schaefer, Alice Meier, Burkhard Rost and Yana Bromberg (2012): "SNPdbe: constructing an nsSNP functional impacts database". Bioinformatics Vol.28(4): 601-602. doi:10.1093/bioinformatics/btr705 </ref>

The last update was on 5th March 2012. SNPdbe lists 440 mutations for PAH across all species, 385 of them for human and 187 for PKU.


OMIM (Online Mendelian Inheritance in Man) is a comprehensive, freely available online knowledgebase of human genes and genetic disorders that is updated daily. An entry in OMIM includes the primary and sometimes an alternative title and symbol, a gene map locus which gives the cytogenetic location of the gene or disorder, multiple map locations if a disease is known to be genetically heterogeneous, links to NCBI’s ‘neighboring’ feature, and further information and links to many useful genetic resources. Allelic variants with functional significance are maintained within the relevant gene entry. A few polymorphisms associated with particular common disorders are included as well. OMIM makes the growing body of information in human genetics easy to use. The information is derived from the biomedical literature and is written and edited at the Johns Hopkins University with input from scientists and physicians all over the world. <ref name="omim"> Ada Hamosh, Alan F Scott, Joanna S Amberger, Carol A Bocchini and Victor A McKusick (2005): "Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders". Nucleic Acids Research Vol.33 (Database issue): D514-D517. doi:10.1093/nar/gki033, PubMed:15608251</ref>

Detailed information about the PAH gene and its mutations in OMIM can be found on this page, and the summarizing table of the allelic variants, retrieved on 10th July 2013, can be viewed here. Altogether 67 entries are listed as allelic variants of PAH. However, three of these entries have been moved to another entry, so only 64 remain. Of these, 50 entries are associated with PHENYLKETONURIA and 16 with HYPERPHENYLALANINEMIA, NON-PKU. Two entries are associated with both PKU and non-PKU.
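The OMIM numbers can be cross-checked in two independent ways: by subtracting the moved entries from the listed ones, and by combining the two phenotype groups while counting the shared entries only once. The sketch below uses only the numbers stated above; the variable names are chosen for illustration.

```python
# Numbers as stated in the text above.
listed_entries = 67   # allelic variants listed for PAH
moved_entries = 3     # entries moved to another entry
pku = 50              # associated with PHENYLKETONURIA
non_pku = 16          # associated with HYPERPHENYLALANINEMIA, NON-PKU
both = 2              # associated with both phenotypes

# Check 1: remaining entries after removing the moved ones.
remaining = listed_entries - moved_entries

# Check 2: size of the union of both phenotype groups;
# entries in both groups must not be counted twice.
union = pku + non_pku - both

print("remaining entries:", remaining)
print("entries with a phenotype:", union)
print("consistent:", remaining == union)
```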


SNPedia is a wiki resource which can be updated daily by its users. However, its content always depends on the releases of other databases. The data is edited and updated by both automatic and manual means, from individual sources as well as bulk sources. Wiki users can add data on a continuous basis, and this is augmented by regular updates from public data sources. The entries mainly cite publications, given as PubMed PMID or DOI identifiers. On SNPedia you can search genes, genomes, genosets, genotypes, medicines and medical conditions for SNPs. The SNPs are listed with their rs-ID and amino acid exchange. Links to other databases with further information are included. <ref name="snpedia"> Michael Cariaso and Greg Lennon (2012): "SNPedia: a wiki supporting personal genome annotation, interpretation and analysis". Nucleic Acids Research Vol.40 (Database issue): D1308-D1312. doi:10.1093/nar/gkr798</ref>

As of 24th June 2013, SNPedia lists 58 SNPs for Phenylketonuria on this page.

Mutation Map

In <xr id="mutMap"/>, 100 SNPs are plotted at their position in the nucleotide sequence. We only chose SNPs listed in dbSNP, OMIM or HGMD. All SNPs found in OMIM are also included in dbSNP, which is why we added only a few further SNPs from dbSNP. The complete list can be viewed in the Lab journal.
<figure id="mutMap">

100 SNPs from dbSNP, OMIM and HGMD. Each red bar marks a nucleotide position at which an exchange results in a different amino acid.

</figure> As you can see the SNPs that cause PKU are distributed over the whole sequence.
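A rough text version of such a mutation map can be produced by binning the SNP positions along the sequence. The positions and the sequence length below are made-up example values, not the 100 SNPs from the Lab journal, and `bin_positions` and `render_map` are hypothetical helper names.

```python
def bin_positions(positions, seq_length, n_bins):
    """Count 1-based nucleotide positions per equally sized bin."""
    counts = [0] * n_bins
    for pos in positions:
        if not 1 <= pos <= seq_length:
            raise ValueError(f"position {pos} lies outside the sequence")
        counts[(pos - 1) * n_bins // seq_length] += 1
    return counts


def render_map(counts):
    """Draw one character per bin: '|' if the bin contains a SNP, '.' otherwise."""
    return "".join("|" if count else "." for count in counts)


# Made-up example: six SNP positions on a sequence of 1200 nucleotides.
example_positions = [5, 130, 131, 400, 777, 1190]
counts = bin_positions(example_positions, seq_length=1200, n_bins=12)

print(counts)
print(render_map(counts))
```

If the bars appear in most bins, as in the figure above, the SNPs are spread over the whole sequence rather than clustered in one region.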