Canavan Disease: Task 07 - Researching SNPs
Researching SNPs: As established in Task 1, Canavan Disease is primarily caused by point mutations. Within coding regions, these point mutations are either synonymous or non-synonymous, and those with an effect are almost always non-synonymous. For this task, all known disease-causing SNPs for Canavan Disease were looked up. In general, non-synonymous mutations are the SNPs of interest. However, insertions and especially deletions can also be of interest if they occur in specific parts of the protein. For example, deletions of residues that make up the binding pocket may disrupt the function of the protein, and the insertion of a proline within a helix may have a significant impact on secondary structure. Deletions and insertions in loop regions or near the start or end of the amino acid chain, by contrast, are generally expected to have no severe effect.
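The distinction between synonymous and non-synonymous point mutations can be illustrated with a minimal Python sketch. It is not part of the original workflow: the function name `classify_substitution` and the label `stop-loss` are introduced here for illustration, and the sketch assumes the standard genetic code and a single-base change within one codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-base substitution within one codon (illustrative helper)."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must have exactly three bases")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("expected exactly one differing base")

    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid, silent at the protein level
    if alt_aa == "*":
        return "nonsense"     # amino acid replaced by a stop codon
    if ref_aa == "*":
        return "stop-loss"    # stop codon replaced by an amino acid
    return "missense"         # amino acid replaced by a different one


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("GAA", "GCA"))  # Glu -> Ala
print(classify_substitution("TAT", "TAA"))  # Tyr -> stop
```

Missense, nonsense and stop-loss changes are all non-synonymous; only the first category returned by the helper is silent at the protein level.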
LabJournal
Overview of Databases
Five databases were used to find SNPs associated with the protein aspartoacylase or directly associated with Canavan Disease. The following <xr id="data">Table</xr> gives an overview of the information obtained from those databases:
<figtable id="data">
Results for database searches | |||||
---|---|---|---|---|---|
Database | Further information | Mutation type | Mutations | Positions | Canavan Disease |
HGMD | maintained by BIOBASE; collates published gene lesions responsible for human inherited disease; the public version is four years out of date (results for this task are from August 7, 2013) | missense, nonsense | 49 | 40 | all |
 | | indels | 23 | | |
 | | splicing | 5 | | |
dbSNP | dbSNP build 137 (newer version not completely released); contains short genetic variations (results for this task are from August 7, 2013) | SNP (silent mutations) | 12 | 12 | none |
 | | SNP (Canavan) | 10 | 9 | all |
SNPdbe | draws on dbSNP, SwissProt, SwissVar, PMD and 1000Genomes; offers information such as experimental evidence, prediction of functional effects, disease associations, heterozygosity, evolutionary conservation and links to external databases (results for this task are from August 7, 2013) | SNP (no association) | 26 | 22 | none |
 | | SNP (Canavan) | 29 | 24 | all |
SNPedia | wiki with information about risk alleles and effects of DNA variation; often refers to dbSNP (results for this task are from June 21, 2013) | SNP (no association) | 6 | 5 | none |
 | | SNP (Canavan) | 4 | 4 | all |
OMIM | refers to dbSNP; updated daily (results for this task are from June 21, 2013) | SNP | 9 | 8 | all |
 | | indels | 3 | | |
</figtable>
HGMD lists only mutations associated with Canavan Disease. For dbSNP, two searches were made: one for silent mutations and one for SNPs associated with Canavan Disease. In SNPdbe, the search was run against aspartoacylase, and the results associated with Canavan Disease were then selected by filtering. SNPedia also allows searching with different inputs, so two searches were done: one for aspartoacylase and one for Canavan Disease. Since OMIM is organized by disease, the search was restricted to mutations specific to Canavan Disease.
Detailed results per database can be found in the following sections. For completeness, a list of all specific mutations can be found in the Supplement.
HGMD
<xr id="hgmd">Table</xr> gives a more detailed view of the information HGMD provides when searching for aspartoacylase:
<figtable id="hgmd">
HGMD Data for ASPA | ||
---|---|---|
Mutation Type | Explanation | Number of Mutations |
Missense (Nonsense) | Single base-pair substitutions in coding regions (resulting in a stop codon) | 49 (5) |
Splicing | Mutations with consequences for mRNA splicing | 5 |
Regulatory | Substitutions causing regulatory abnormalities | 0 |
Small Deletions | Micro-deletions (20 bp or less) | 12 |
Small Insertions | Micro-insertions (20 bp or less) | 2 |
Small Indels | Micro-indels (20 bp or less) | 1 |
Gross Deletions | Information regarding the nature and location of each lesion | 8 |
Gross Insertions / Duplications | Information regarding the nature and location of each lesion | 0 |
Complex Rearrangements | Information regarding the nature and location of each lesion | 0 |
Repeat Variations | Information regarding the nature and location of each lesion | 0 |
Total | (see the HGMD website) | 77 |
</figtable>
SNPdbe
Since SNPdbe allows searching for the experimental evidence behind its data, <xr id="snpdbe">Table</xr> shows the kinds of experimental evidence and the number of entries for each (multiple entries per mutation are possible):
<figtable id="snpdbe">
Experimental Evidence in SNPdbe | |
---|---|
Experimental Evidence | Number of entries |
1000Genomes | 4 |
by cluster | 10 |
by frequency | 5 |
not validated | 43 |
</figtable>
The data provided by SNPdbe can also be used for a quick-and-dirty estimate of whether a mutation has an effect. By calculating the averages of the PSSM (position-specific scoring matrix) and PERC (percentage) scores for the wild type and the mutant type, a conservation score can be built. A SNP is assumed to be disease-causing if all four of the following conditions hold:
- the PSSM score of the wild type is larger than its average
- the PSSM score of the mutant type is smaller than its average
- the PERC score of the wild type is larger than its average
- the PERC score of the mutant type is smaller than its average
This method yielded 12 SNPs (out of 55 originally), three of which already have a dbSNP ID. Nine of these twelve SNPs are already associated with Canavan Disease.
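The four conditions can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the script used for the task: it assumes that each average is taken over all SNPs in the result set, the record keys (`pssm_wt`, `pssm_mt`, `perc_wt`, `perc_mt`) and the function name `flag_candidates` are hypothetical, and the score values in `toy_snps` are invented purely to exercise the code.

```python
from statistics import mean

SCORE_KEYS = ("pssm_wt", "pssm_mt", "perc_wt", "perc_mt")


def flag_candidates(snps):
    """Return the SNP records that satisfy all four conservation conditions.

    Assumption: each average is computed over all records in `snps`.
    """
    avg = {key: mean(snp[key] for snp in snps) for key in SCORE_KEYS}
    return [
        snp
        for snp in snps
        if snp["pssm_wt"] > avg["pssm_wt"]      # wild type scores above average
        and snp["pssm_mt"] < avg["pssm_mt"]     # mutant type scores below average
        and snp["perc_wt"] > avg["perc_wt"]     # wild type frequent in the alignment
        and snp["perc_mt"] < avg["perc_mt"]     # mutant type rare in the alignment
    ]


# Invented toy records (not SNPdbe data), only to show the call.
toy_snps = [
    {"id": "A", "pssm_wt": 6, "pssm_mt": -4, "perc_wt": 80, "perc_mt": 1},
    {"id": "B", "pssm_wt": 1, "pssm_mt": 0, "perc_wt": 20, "perc_mt": 15},
    {"id": "C", "pssm_wt": 2, "pssm_mt": 1, "perc_wt": 30, "perc_mt": 10},
]

flagged_ids = [snp["id"] for snp in flag_candidates(toy_snps)]
print(flagged_ids)
```

Because every threshold is an average over the same result set, the outcome depends on which SNPs are included in the search, which is one reason the method is only a rough first filter.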
Comparison
For a better overview, the two Venn diagrams in <xr id="vennSNPA"></xr> and <xr id="vennSNPB"></xr> show the number of SNPs shared among the databases and the number of shared SNP positions, in both cases restricted to those associated with Canavan Disease:
<figure id="vennSNPA"></figure> | <figure id="vennSNPB"></figure> |
Hot-Spots
From the <xr id="vennSNPB"> Venn Diagram </xr>, the hot-spots associated with Canavan Disease can be read off from the overlapping regions. This yields six hot-spot positions:
One position appears in all database searches:
- Position 24 of the aspartoacylase protein sequence, which is important for binding the zinc atom in the active center (referenced in UniProt)
Five positions appear in at least three databases (dbSNP, SNPdbe and HGMD), and some of them are part of a secondary structure element. UniProt provides no information on these positions:
- position 152: beginning of a beta sheet
- position 231: loop region
- position 249: loop region
- position 285: part of helix
- position 305: ending of a beta sheet
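Reading hot-spots off the Venn diagram amounts to counting, for each position, how many databases report it. The following Python sketch shows that counting step under stated assumptions: the function name `hotspot_positions`, the database labels `db_a` to `db_d` and the positions in `toy_positions` are placeholders and do not correspond to the actual search results, which are listed in the Supplement.

```python
from collections import Counter


def hotspot_positions(positions_by_db, min_databases=3):
    """Map each position to the number of databases reporting it,
    keeping only positions found in at least `min_databases` databases."""
    counts = Counter(
        pos
        for positions in positions_by_db.values()
        for pos in set(positions)  # count each position once per database
    )
    return {pos: n for pos, n in sorted(counts.items()) if n >= min_databases}


# Placeholder input (hypothetical databases and positions).
toy_positions = {
    "db_a": {10, 20, 30},
    "db_b": {10, 20},
    "db_c": {10, 20, 40},
    "db_d": {10},
}

hotspots = hotspot_positions(toy_positions)
print(hotspots)
```

With the real per-database position lists as input, a threshold of three databases would correspond to the criterion used for the five positions above, and a threshold equal to the number of databases to the criterion used for position 24.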
Mutation Map
For an overview of the mutations in aspartoacylase, <xr id="mutation"></xr> shows disease mutations in red and silent mutations in green: <figure id="mutation">
</figure>
Supplement
Tasks
- Link to Task 01: Canavan Disease
- Link to Task 02: Alignments
- Link to Task 03: Sequence-based Predictions
- Link to Task 04: Structural Alignments
- Link to Task 05: Homology Modelling
- Link to Task 06: Protein Structure Prediction from Evolutionary Sequence Variation
- Link to Task 07: Researching SNPs
- Link to Task 08: Sequence-based Mutation Analysis
- Link to Task 09: Structure-based Mutation Analysis
- Link to Task 10: Normal Mode Analysis