Canavan Disease: Task 07 - Researching SNPs

From Bioinformatikpedia
Revision as of 12:50, 30 August 2013 by Mahlich

Researching SNPs: As already known from Task 1, Canavan Disease is primarily caused by point mutations. These point mutations are either synonymous or non-synonymous, and almost all of those with an effect are non-synonymous SNPs. For this task, all known disease-causing SNPs associated with Canavan Disease were looked up. In general, non-synonymous mutations are the SNPs of interest. However, insertions and especially deletions can also be of interest if they occur in specific parts of the protein. Deletions of residues that make up the binding pocket may, for example, disrupt the function of the protein, and the insertion of a proline within a helix may have a significant impact on secondary structure. Deletions and insertions in loop regions or near the start or end of the amino acid chain, however, are expected to have no severe effect.
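To make the distinction between synonymous and non-synonymous point mutations concrete, the following is a minimal sketch that classifies a single-base substitution within one codon using the standard genetic code. It is not part of the original analysis; the function name `classify_substitution` and the example codon are illustrative assumptions and are not taken from the ASPA sequence.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, index: int, new_base: str) -> str:
    """Classify replacing codon[index] by new_base (hypothetical helper)."""
    codon = codon.upper()
    new_base = new_base.upper()
    if codon not in CODON_TABLE or index not in (0, 1, 2) or new_base not in BASES:
        raise ValueError("expected a DNA codon, an index 0-2 and a base from TCAG")
    mutated = codon[:index] + new_base + codon[index + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"   # encoded residue unchanged (silent)
    if after == "*":
        return "nonsense"     # amino acid codon turned into a stop codon
    if before == "*":
        return "stop-loss"    # stop codon turned into an amino acid codon
    return "missense"         # one amino acid replaced by another


# Illustrative codon GAA (Glu), not taken from ASPA:
print(classify_substitution("GAA", 2, "G"))  # GAG still encodes Glu
print(classify_substitution("GAA", 1, "T"))  # GTA encodes Val
print(classify_substitution("GAA", 0, "T"))  # TAA is a stop codon
```

Missense and nonsense substitutions are both non-synonymous; the sketch only separates them because the databases below report them as distinct mutation types.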

LabJournal

Overview of Databases

Five databases were used to find SNPs associated with aspartoacylase or with Canavan Disease itself. The following <xr id="data">Table</xr> gives an overview of the information obtained from these databases:

<figtable id="data">

{|
|+ Results for database searches
! Database !! Further information !! Mutation type !! Mutations !! Positions !! Canavan Disease
|-
| rowspan="3" | HGMD
| rowspan="3" | refers to BIOBASE; collates published gene lesions responsible for human inherited disease; public version is four years out of date (results for this task are from 7 August 2013)
| missense, nonsense || 49 || 40 || all
|-
| indels || 23 || ||
|-
| splicing || 5 || ||
|-
| rowspan="2" | dbSNP
| rowspan="2" | dbSNP build 137 (newer version not completely released); contains short genetic variations (results for this task are from 7 August 2013)
| SNP (silent mutations) || 12 || 12 || none
|-
| SNP (Canavan) || 10 || 9 || all
|-
| rowspan="2" | SNPdbe
| rowspan="2" | refers to dbSNP, SwissProt, SwissVar, PMD and 1000Genomes; offers information such as experimental derivation and evidence, prediction of functional effects, disease associations, heterozygosity, evolutionary conservation and links to external databases (results for this task are from 7 August 2013)
| SNP (no association) || 26 || 22 || none
|-
| SNP (Canavan) || 29 || 24 || all
|-
| rowspan="2" | SNPedia
| rowspan="2" | wiki with information about risk alleles and effects of DNA variation; often refers to dbSNP (results for this task are from 21 June 2013)
| SNP (no association) || 6 || 5 || none
|-
| SNP (Canavan) || 4 || 4 || all
|-
| rowspan="2" | OMIM
| rowspan="2" | refers to dbSNP; updated daily (results for this task are from 21 June 2013)
| SNP || 9 || 8 || all
|-
| indels || 3 || ||
|}
Overview of the database searches for SNPs in aspartoacylase, whether or not they are associated with Canavan Disease. Different mutations often occur at the same position of the protein. The column "Positions" therefore gives the number of distinct positions covered by the mutations found.

</figtable>

HGMD lists only mutations associated with Canavan Disease. For dbSNP, two searches were made: one for silent mutations and one for SNPs associated with Canavan Disease. In SNPdbe, the search was run against aspartoacylase, and the SNPs associated with Canavan Disease were then selected by filtering. SNPedia also allows searching with different inputs, so two searches were done, one for aspartoacylase and one for Canavan Disease. Since OMIM is organized by disease, the search was restricted to mutations specific to Canavan Disease.
More detailed results per database can be found in the following sections. For completeness, a list of all specific mutations is given in the Supplement.

HGMD

<xr id="hgmd">Table</xr> gives a more detailed view of the information HGMD provides when searching for aspartoacylase (ASPA):

<figtable id="hgmd">

{|
|+ HGMD Data for ASPA
! Mutation type !! Explanation !! Number of mutations
|-
| Missense (Nonsense) || Single base-pair substitutions in coding regions (resulting in a stop codon) || 49 (5)
|-
| Splicing || Mutations with consequences for mRNA splicing || 5
|-
| Regulatory || Substitutions causing regulatory abnormalities || 0
|-
| Small Deletions || Micro-deletions (20 bp or less) || 12
|-
| Small Insertions || Micro-insertions (20 bp or less) || 2
|-
| Small Indels || Micro-indels (20 bp or less) || 1
|-
| Gross Deletions || Information regarding the nature and location of each lesion || 8
|-
| Gross Insertions / Duplications || Information regarding the nature and location of each lesion || 0
|-
| Complex Rearrangements || Information regarding the nature and location of each lesion || 0
|-
| Repeat Variations || Information regarding the nature and location of each lesion || 0
|-
| Total || (see the HGMD website) || 77
|}
Result list from the HGMD search for ASPA

</figtable>
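As a quick consistency check between the HGMD counts and the overview of all databases, the per-category numbers can be tallied in a few lines. This is only a sketch: the grouping of the small deletions, small insertions, small indels and gross deletions into one "indels" group is an assumption made here, chosen because it reproduces the 23 indels reported in the overview; the variable names are hypothetical.

```python
# Number of mutations per HGMD category, copied from the HGMD result list for ASPA.
# The 5 nonsense mutations are already contained in the 49 missense/nonsense entries.
hgmd_counts = {
    "missense/nonsense": 49,
    "splicing": 5,
    "regulatory": 0,
    "small deletions": 12,
    "small insertions": 2,
    "small indels": 1,
    "gross deletions": 8,
    "gross insertions/duplications": 0,
    "complex rearrangements": 0,
    "repeat variations": 0,
}

# Assumption: these four categories together form the "indels" row of the overview.
INDEL_CATEGORIES = (
    "small deletions",
    "small insertions",
    "small indels",
    "gross deletions",
)

indels = sum(hgmd_counts[category] for category in INDEL_CATEGORIES)
total = sum(hgmd_counts.values())

print(f"indels: {indels}")  # 12 + 2 + 1 + 8 = 23
print(f"total:  {total}")   # 49 + 5 + 23 = 77
```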

SNPdbe

Since SNPdbe allows searching for the experimental evidence behind its data, <xr id="snpdbe">Table</xr> shows the kinds of experimental evidence and the number of entries for each (multiple entries per mutation are possible):

<figtable id="snpdbe">

{|
|+ Experimental Evidence in SNPdbe
! Experimental Evidence !! Number of entries
|-
| 1000Genomes || 4
|-
| by cluster || 10
|-
| by frequency || 5
|-
| not validated || 43
|}
Experimental Evidence of SNPdbe entries

</figtable>

The data provided by SNPdbe can also be used for a quick-and-dirty estimate of whether a mutation has an effect. By calculating the averages of the PSSM (position-specific scoring matrix) and PERC (percentage) scores separately for the wild type and the mutant type, a simple conservation-based criterion can be built. A SNP is assumed to be disease-causing if all four of the following conditions are true:

  • the PSSM score of the wild type is larger than its average
  • the PSSM score of the mutant type is smaller than its average
  • the PERC score of the wild type is larger than its average
  • the PERC score of the mutant type is smaller than its average

This method yielded 12 SNPs (out of the original 55), three of which already have a known dbSNP ID. Nine of these twelve SNPs are already associated with Canavan Disease.
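The four conditions listed above can be sketched as follows. This is a minimal illustration, assuming that each SNP comes with one PSSM and one PERC score for the wild-type residue and one of each for the mutant residue, and that each average is taken over all SNPs in the result set. The `SnpScores` record, its field names and the toy values are hypothetical and are not SNPdbe output.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SnpScores:
    """Hypothetical record holding the four scores of one SNP."""
    name: str
    pssm_wt: float   # PSSM score of the wild-type residue
    pssm_mut: float  # PSSM score of the mutant residue
    perc_wt: float   # PERC score of the wild-type residue
    perc_mut: float  # PERC score of the mutant residue


def predict_disease_causing(snps):
    """Return the names of SNPs that satisfy all four average-based conditions."""
    if not snps:
        return []
    # Assumption: each average is computed over the whole result set.
    avg_pssm_wt = mean(s.pssm_wt for s in snps)
    avg_pssm_mut = mean(s.pssm_mut for s in snps)
    avg_perc_wt = mean(s.perc_wt for s in snps)
    avg_perc_mut = mean(s.perc_mut for s in snps)
    return [
        s.name
        for s in snps
        if s.pssm_wt > avg_pssm_wt      # wild type better conserved than average
        and s.pssm_mut < avg_pssm_mut   # mutant less expected than average
        and s.perc_wt > avg_perc_wt     # wild type more frequent than average
        and s.perc_mut < avg_perc_mut   # mutant less frequent than average
    ]


# Made-up toy values, for illustration only:
toy_snps = [
    SnpScores("toy_A", pssm_wt=8, pssm_mut=-4, perc_wt=90, perc_mut=1),
    SnpScores("toy_B", pssm_wt=2, pssm_mut=1, perc_wt=30, perc_mut=20),
    SnpScores("toy_C", pssm_wt=5, pssm_mut=3, perc_wt=60, perc_mut=2),
]
print(predict_disease_causing(toy_snps))  # ['toy_A']
```

Because the thresholds are averages over the result set itself, the outcome depends on which SNPs are included, so such a filter is a rough screen rather than a calibrated predictor.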

Comparison

For a better overview, the two Venn diagrams in <xr id="vennSNPA"></xr> and <xr id="vennSNPB"></xr> show the number of SNPs shared among the databases and the number of shared SNP positions, in both cases restricted to SNPs associated with Canavan Disease:

<figure id="vennSNPA">
Figure 1: Venn Diagram, showing the number of common SNPs associated with Canavan Disease using different databases
</figure>
<figure id="vennSNPB">
Figure 2: Venn Diagram, showing the number of common SNP positions associated with Canavan Disease using different databases
</figure>

Hot-Spots

From the <xr id="vennSNPB">Venn Diagram</xr>, the hot-spots associated with Canavan Disease can be read off from the overlapping regions. This yields six hot-spot positions.
One position was found in all database searches:

  • position 24 of the aspartoacylase protein sequence, which is important for binding the zinc atom in the active center (annotated in UniProt)

Five positions were found in at least three databases (dbSNP, SNPdbe and HGMD). Some of them lie within secondary structure elements, but UniProt provides no further information about any of them:

  • position 152: beginning of a beta sheet
  • position 231: loop region
  • position 249: loop region
  • position 285: part of helix
  • position 305: end of a beta sheet
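Reading hot-spots off the Venn diagram amounts to counting, for each position, how many databases report it. A minimal sketch under that assumption follows. The input contains only the six positions named above, not the full search results, and the function names are hypothetical.

```python
from collections import defaultdict


def positions_by_support(db_positions):
    """Map each position to the set of databases that report it."""
    support = defaultdict(set)
    for database, positions in db_positions.items():
        for position in positions:
            support[position].add(database)
    return support


def hot_spots(db_positions, min_databases):
    """Return sorted positions reported by at least min_databases databases."""
    support = positions_by_support(db_positions)
    return sorted(
        position
        for position, databases in support.items()
        if len(databases) >= min_databases
    )


# Reduced input: only the positions named in the text. Position 24 is in all
# five searches; for the other five, only dbSNP, SNPdbe and HGMD are stated.
shared = {152, 231, 249, 285, 305}
canavan_positions = {
    "HGMD": {24} | shared,
    "dbSNP": {24} | shared,
    "SNPdbe": {24} | shared,
    "SNPedia": {24},
    "OMIM": {24},
}

print(hot_spots(canavan_positions, min_databases=5))  # [24]
print(hot_spots(canavan_positions, min_databases=3))  # [24, 152, 231, 249, 285, 305]
```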

Mutation Map

As an overview of the mutations in aspartoacylase, <xr id="mutation"></xr> shows disease-causing mutations in red and silent mutations in green: <figure id="mutation">

Figure: Mutation map of disease-causing mutations (red) and silent mutations (green)

</figure>

Supplement

Tasks