Task 5 - Mapping SNPs Canavan

First impression

Protocol

Further information can be found in the protocol.

First impression

HGMD

74 (79 in 2012 professional) total for cDNA sequence NM_000049.2 and amino acid sequence NP_000040.1, out of which

47 missense/nonsense
5 splicing
12 small deletions
2 small insertions
1 indel
7 gross deletions

link to search

SNPdbe

55 total, includes predicted functional effect. 29 of these 55 labelled as involved in Canavan Disease.

link to search

dbSNP

same identifiers for sequence as HGMD
"synonymous-codon"[Function_Class] AND ASPA[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] yields only 9 results
505 results for SNPs in general in human
- 458 for NP_000040.1 (coding: 23)
- 493 for NP_001121557.1 (coding: 23)
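
The dbSNP query above can also be assembled programmatically. The following is a minimal sketch that only builds the search term and a URL for the public NCBI E-utilities `esearch` endpoint, without sending a request. The helper names `build_snp_query` and `build_esearch_url` are hypothetical, and the field tags (`[Function_Class]`, `[SNP_CLASS]`) are copied from the query above; they may no longer be accepted by the current dbSNP search interface.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_query(gene, organism="human", function_class=None):
    """Assemble a dbSNP search term in the style of the query shown above.

    Hypothetical helper; the field tags are taken from the wiki text.
    """
    parts = []
    if function_class is not None:
        parts.append(f'"{function_class}"[Function_Class]')
    parts.append(f"{gene}[GENE]")
    parts.append(f'"{organism}"[ORGN]')
    parts.append('"snp"[SNP_CLASS]')
    return " AND ".join(parts)


def build_esearch_url(term, retmax=20):
    """Return an esearch URL for the snp database; retmax caps the returned IDs."""
    return ESEARCH_URL + "?" + urlencode({"db": "snp", "term": term, "retmax": retmax})


synonymous_term = build_snp_query("ASPA", function_class="synonymous-codon")
all_snps_term = build_snp_query("ASPA")
synonymous_url = build_esearch_url(synonymous_term)

print(synonymous_term)
print(synonymous_url)
```

Dropping the `function_class` argument gives the general query for all human ASPA SNPs.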

link to search

SNPedia

links to other pages

OMIM

12 allelic variants listed for Aspartoacylase => all linked to Canavan Disease

to ASPA variants

Coding SNPs

From the SNP databases mentioned above we extracted all coding SNPs for Aspartoacylase. We used the sequence position of the listed SNPs as identifier to create a unique list of known SNPs. <xr id="coding_snp_table"/> shows the resulting list of unique SNPs. For each polymorphism, the source (the database from which the SNP was extracted) is given, as well as the annotated validation evidence.
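
The merging step can be sketched roughly as follows. This is an illustration under assumptions, not the script we used: the record layout `(position, mutation, database, identifier)` and the function names are hypothetical. Entries are keyed by position and amino acid exchange, and a second exchange at the same position receives a `_2` suffix. The suffix order here is alphabetical by mutation, which is also an assumption.

```python
def merge_snps(records):
    """Merge (position, mutation, database, identifier) tuples into unique entries."""
    merged = {}
    for position, mutation, database, identifier in records:
        entry = merged.setdefault(
            (position, mutation), {"databases": [], "identifiers": []}
        )
        if database not in entry["databases"]:
            entry["databases"].append(database)
        # Some databases list a variant without an identifier of their own.
        if identifier and identifier not in entry["identifiers"]:
            entry["identifiers"].append(identifier)
    return merged


def label_rows(merged):
    """Label distinct exchanges at one position as '71', '71_2', and so on."""
    seen = {}
    rows = []
    for position, mutation in sorted(merged):
        seen[position] = seen.get(position, 0) + 1
        count = seen[position]
        label = str(position) if count == 1 else f"{position}_{count}"
        rows.append((label, mutation, merged[(position, mutation)]))
    return rows


# A few records as they appear in the databases (values taken from the table).
records = [
    (71, "R71H", "dbSNP", "rs104894553"),
    (71, "R71H", "HGMD", "CM060201"),
    (71, "R71H", "SNPdbe", ""),
    (71, "R71K", "SNPdbe", ""),
    (24, "E24G", "dbSNP", "rs104894551"),
    (24, "E24G", "HGMD", "CM023602"),
]

rows = label_rows(merge_snps(records))
for label, mutation, entry in rows:
    print(label, mutation, " ".join(entry["databases"]), " ".join(entry["identifiers"]))
```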

**<xr nolink id="coding_snp_table"/>** This table lists all coding SNPs for Aspartoacylase that could be found in HGMD, dbSNP, SNPdbe and OMIM. Mutations in red are reported to cause the Canavan Disease.
Residue Position	Identifier	Reference DB	Validation evidence	SNP Type	Mutation
4	rs142041344	dbSNP SNPDBe	1000Genomes	missense	C4R
14	CM063852	HGMD	Zeng (2006) Mol Genet Metab 89	missense	V14G
16	CM960084	SNPDBe HGMD	Kaul (1996) Am J Hum Genet 59	missense	I16T
18	CM067343	HGMD	Zeng (2006) Adv Exp Med Biol 576	missense	G18R
21	CM001608	SNPDBe HGMD	Sistermans (2000) Eur J Hum Genet 8	missense	H21P
24	rs104894551 CM023602	dbSNP SNPDBe OMIM HGMD	Multiple independent submissions to the refSNP cluster Zeng (2002) J Inherit Metab Dis 25	missense	E24G
26	rs145616193	dbSNP		synonymous	T26T
27	CM960085	SNPDBe HGMD	Kaul (1996) Am J Hum Genet 59	missense	G27R
33	rs138158568	dbSNP SNPDBe		missense	H33R
53	rs17850703	dbSNP SNPDBe		missense	T53A
57	CM001609	SNPDBe HGMD	Sistermans (2000) Eur J Hum Genet 8	missense	A57T
68	CM023603	SNPDBe HGMD	Zeng (2002) J Inherit Metab Dis 25	missense	D68A
71	rs104894553 CM060201	dbSNP SNPDBe HGMD	multiple independent submissions to the refSNP cluster Janson (2006) Ann Neurol 59	missense	R71H
71_2		SNPDBe		missense	R71K
82	rs80099330	dbSNP SNPDBe	Multiple independent submissions to the refSNP cluster Validated by frequency or genotype data 1000 Genome project	missense	M82T
93	rs144639820	dbSNP	Validated by frequency or genotype data	synonymous	A93A
109	CM990192	HGMD	Elpeleg (1999) J Inherit Metab Dis 22	nonsense	Y109Ter
111	rs181347986	dbSNP SNPDBe	1000 Genome project	missense	I111V
114	CM023014	SNPDBe HGMD	Olsen (2002) J Med Genet 39	missense	D114Y
114_2	CM960086	HGMD	Kaul (1996) Am J Hum Genet 59	missense	D114E
121	rs148451498	dbSNP SNPDBe		missense	N121D
121_2	CM063846	HGMD	Zeng (2006) Mol Genet Metab 89	missense	N121I
123	CM960087	SNPDBe HGMD	Kaul (1996) Am J Hum Genet 59	missense	G123E
143	rs199565861	dbSNP SNPDBe	1000 Genome project	missense	I143V
143_2	CM063849	HGMD	Zeng (2006) Mol Genet Metab 89	missense	I143F
143_3	CM980125	HGMD	Kobayashi (1998) Hum Mutat S1, S308	missense	I143T
152	rs104894548 CM950102	dbSNP SNPDBe OMIM HGMD	Multiple independent submissions to the refSNP cluster Kaul (1995) Hum Mutat 5	missense	C152R
152_2	CM023604	HGMD	Zeng (2002) J Inherit Metab Dis 25	missense	C152W
152_3	CM960088	HGMD	Kaul (1996) Am J Hum Genet 59, 95	missense	C152Y
153	rs141755746	dbSNP		synonymous	Y153Y
154	rs147193431 rs2228435	dbSNP SNPDBe		missense	V154I
157	rs140357187	dbSNP SNPDBe		missense	I157T
164		SNPDBe		missense	Y164F
166	CM063847	HGMD	Zeng (2006) Mol Genet Metab 89	missense	T166I
168	CM001610	SNPDBe HGMD	Sistermans (2000) Eur J Hum Genet 8	missense	R168H
168_2	CM960089	HGMD	Kaul (1996) Am J Hum Genet 59	missense	R168C
170	rs144321760	dbSNP SNPDBe	Validated by frequency or genotype data	missense	I170T
178		SNPDBe		missense	E178A
181_2	CM063850	HGMD	Zeng (2006) Mol Genet Metab 89	missense	P181L
181	CM001611	SNPDBe HGMD	Sistermans (2000) Eur J Hum Genet 8	missense	P181T
183	CM990193	SNPDBe HGMD	Elpeleg (1999) J Inherit Metab Dis 22	missense	P183H
184	CM023605	HGMD	Zeng (2002) J Inherit Metab Dis 25	nonsense	Q184Ter
186	CM990194	SNPDBe HGMD	Elpeleg (1999) J Inherit Metab Dis 22	missense	V186F
195	CM990195	SNPDBe HGMD	Elpeleg (1999) J Inherit Metab Dis 22	missense	M195R
202	rs147763700	dbSNP SNPDBe		missense	A202S
213	CM055097	HGMD	Tacke (2005) Neuropediatrics 36	missense	K213E
214	CM023606	HGMD	Zeng (2002) J Inherit Metab Dis 25	nonsense	E214Ter
218	rs104894549 CM950103	dbSNP HGMD	Multiple independent submissions to the refSNP cluster Shaag (1995) Am J Hum Genet 57	nonsense	C218Ter
220	rs139053885	dbSNP SNPDBe		missense	I220T
226_2	CM086530	HGMD	Di Pietro (2008) Clin Biochem 41	missense	I226T
226	rs201887670	dbSNP		missense	I226K
231	rs104894550 CM994594	dbSNP SNPDBe OMIM HGMD	Multiple independent submissions to the refSNP cluster Rady (1999) Am J Med Genet 87	missense	Y231C
231_2	CM940123	HGMD	Kaul (1994) Am J Hum Genet 55	nonsense	Y231Ter
235	rs149842031	dbSNP SNPDBe	Multiple independent submissions to the refSNP cluster Validated by frequency or genotype data 1000 Genome project	missense	E235K
236	rs149189911	dbSNP		synonymous	N236N
239	rs145085349	dbSNP SNPDBe		missense	I239T
244	CM023607	SNPDBe HGMD	Zeng (2002) J Inherit Metab Dis 25	missense	H244R
244_2	CM063848	HGMD	Zeng (2006) Mol Genet Metab 89	missense	H244L
249	rs104894552 CM023015	dbSNP SNPDBe HGMD	Multiple independent submissions to the refSNP cluster Olsen (2002) J Med Genet 39	missense	D249V
270	rs200126822	dbSNP SNPDBe	1000 Genome project	missense	I270T
272	CM063851	HGMD	Zeng (2006) Mol Genet Metab 89	missense	L272P
274	CM950104	SNPDBe HGMD	Shaag (1995) Am J Hum Genet 57	missense	G274R
277	rs78677072	dbSNP	Multiple independent submissions to the refSNP cluster 1000 Genome project	synonymous	T277T
278	rs140581464	dbSNP SNPDBe	Multiple independent submissions to the refSNP cluster Validated by frequency or genotype data 1000 Genome project	missense	V278M
279	rs145717248	dbSNP SNPDBe		missense	Y279H
280	rs148081446	dbSNP	Multiple independent submissions to the refSNP cluster Validated by frequency or genotype data 1000 Genome project	synonymous	P280P
280_2	CM990197	HGMD	Elpeleg (1999) J Inherit Metab Dis 22	missense	P280S
281	rs141858640	dbSNP SNPDBe		missense	V281M
285	rs28940279 CM930046	dbSNP SNPDBe OMIM HGMD	Multiple independent submissions to the refSNP cluster Validated by frequency or genotype data Kaul (1993) Nat Genet 5	missense	E285A
285_2		SNPDBe		missense	E285D
286	rs138062143	dbSNP		synonymous	A286A
287	CM990198	SNPDBe HGMD	Elpeleg (1999) J Inherit Metab Dis 22	missense	A287T
288		SNPDBe		missense	Y288F
288_2	CM034717	HGMD	Surendran (2003) Mol Genet Metab 80	missense	Y288C
295	CM950105	SNPDBe HGMD	Shaag (1995) Am J Hum Genet 57	missense	F295S
305	rs28940574 CM940124	dbSNP SNPDBe OMIM HGMD	Multiple independent submissions to the refSNP cluster Kaul (1994) Am J Hum Genet 55	missense	A305E
310		SNPDBe		missense	C310G
314	CM023608	HGMD	Zeng (2002) J Inherit Metab Dis 25	missense	Ter314W

</figtable>

SNP Visualization

The coding SNPs are written below the reference sequence. An 'X' denotes a nonsense mutation that introduces a stop codon.

MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKP  50
   R         G T R  P  G TR     R                 
                                                  
FITNPRAVKKCTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLF  100
  A   T          A  H          T          A       
                    K                             
GPKDSEDSYDIIFDLHNTTSNMGCTLILEDSRNNFLIQMFHYIKTSLAPL  150
        X V  Y      D E                   V       
             E      I                     F       
                                          T
PCYVYLIEHPSLKYATTRSIAKYPVGIEVGPQPQGVLRADILDQMRKMIK  200
 RYI  T      F I H T       A  T HX F        R     
 W               C            L                   
 Y
HALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIAAIIHPNLQDQ  250
 S          EX   X T     K    C   KN  T    R    V 
                         T    X            L      
DWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK  300
                   T P R  TMHPM   AATF      S     
                             S    D  C            
LTLNAKSIRCCLH-  314
    E    G   W
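
A layout like the one above can be generated with a few lines of code. The following sketch is only an illustration (the function name `render_mutations` and the input format are assumptions): it prints the sequence in fixed-width blocks and stacks the variant letters of each position in the rows below.

```python
def render_mutations(sequence, mutations, width=50):
    """Return text lines: each sequence block followed by rows of variant letters.

    mutations maps a 1-based residue position to a list of variant letters,
    e.g. {71: ["H", "K"]}. Hypothetical helper for illustration.
    """
    lines = []
    for start in range(0, len(sequence), width):
        block = sequence[start:start + width]
        # Label each block with the position of its last residue.
        lines.append(f"{block}  {start + len(block)}")
        positions = range(start + 1, start + len(block) + 1)
        # Number of stacked rows needed for this block.
        depth = max((len(mutations.get(pos, [])) for pos in positions), default=0)
        for level in range(depth):
            row = []
            for pos in positions:
                variants = mutations.get(pos, [])
                row.append(variants[level] if level < len(variants) else " ")
            lines.append("".join(row).rstrip())
    return lines


# First 20 residues of Aspartoacylase with the variants listed in the table.
lines = render_mutations(
    "MTSCHIAEEHIQKVAIFGGT",
    {4: ["R"], 14: ["G"], 16: ["T"], 18: ["R"]},
    width=10,
)
print("\n".join(lines))
```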

Hotspots

Visual inspection

For the visual inspection, we looked at two kinds of reported SNPs: first, those listed in HGMD, which are reported to cause the Canavan Disease. Second, SNPs from dbSNP and xx that are not reported to cause the Canavan Disease (however, they are also not reported to be harmless).

For the HGMD data, SNPs were scattered all over the structure. We were surprised that many SNPs on the surface of the protein, not even close to the dimer interaction site, are also reported to cause the Canavan Disease. See <xr id="hgmd_all" /> and <xr id ="hgmd_surface"/>.

<xr nolink id="hgmd_all"/>
Residues of Aspartoacylase whose mutations cause the Canavan Disease are coloured in red. Many of these mutations are close to the binding site (around the zinc ion), as can be expected.

</figure>

<xr nolink id="hgmd_surface"/>
Mutations of Aspartoacylase in red - surface view. Many of the mutations that cause the Canavan Disease can be found on the surface, far away from both the binding site and the dimer interaction site, which we found surprising.

</figure>

</figtable>

For SNPs NOT reported to cause the Canavan Disease, SNPs again were scattered all over the structure. We had expected to visually be able to see some concentration of mutations in certain - apparently functionally unimportant or structurally flexible - regions, but could not validate this expectation with the given data, at least not visually. See <xr id="others_all" /> and <xr id ="others_surface"/>.

<xr nolink id="others_all"/>
Mutated residues of Aspartoacylase are coloured in blue. Visually, we cannot detect a pattern or hotspots.

</figure>

<xr nolink id="others_surface"/>
Mutations of Aspartoacylase in blue - surface view. Mutations that are not reported to cause the Canavan Disease can be found both on the surface, as seen in this picture, as well as on the inside of the protein (as seen on the left).

</figure>

</figtable>
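
Colouring the mutated residues can be scripted instead of selecting them by hand. The sketch below assumes a PyMOL-style viewer, which is an assumption and not necessarily the tool used for the figures above. It only generates the `select` and `color` command strings, and it assumes that the residue numbering of the structure file matches the sequence positions of the table. The function name `pymol_commands` is hypothetical.

```python
def pymol_commands(name, positions, colour):
    """Build PyMOL-style commands that select and colour the given residues."""
    # PyMOL joins several residue numbers in one selection with '+'.
    resi = "+".join(str(p) for p in sorted(set(positions)))
    return [f"select {name}, resi {resi}", f"color {colour}, {name}"]


# A few HGMD positions from the table (disease-causing, red) and a few
# positions listed only in dbSNP/SNPdbe (not reported as disease-causing, blue).
hgmd_commands = pymol_commands("hgmd", [14, 16, 18, 21, 16], "red")
other_commands = pymol_commands("others", [4, 33, 53], "blue")

for command in hgmd_commands + other_commands:
    print(command)
```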

Frequency Distribution

Since visual inspection did not reveal any pattern, we looked at the frequency distribution of the disease-causing and non-disease-causing SNPs along the sequence. These can be found in <xr id = "can"/> and <xr id="no_can"/>. For neither the disease-causing nor the non-disease-causing SNPs were we able to identify remarkable hotspot regions (except for the region towards the end of the sequence, which is curious since it involves residue 288).

<xr nolink id="can"/>
Frequency distribution of disease-causing mutations along the sequence. Mutations are fairly evenly distributed.

</figure>

<xr nolink id="no_can"/>
Frequency distribution of mutations not reported to cause Canavan Disease. Again, the distribution is fairly even, except for a slight accumulation towards the end of the sequence in the region 277-288.

</figure>
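
A frequency distribution of this kind can be computed by counting mutated positions in fixed-size bins along the sequence. The following is a minimal sketch under assumptions: the bin size of 10 residues and the function name `bin_counts` are illustrative choices, not the settings used for the plots above. The example positions are variants from the table that are not listed in HGMD, and the length of 313 residues is taken from the sequence display.

```python
from collections import Counter


def bin_counts(positions, length, bin_size=10):
    """Count mutated positions per bin; returns a list of (start, end, count)."""
    # Position p (1-based) falls into bin (p - 1) // bin_size.
    counts = Counter((p - 1) // bin_size for p in positions)
    bins = []
    for index in range((length + bin_size - 1) // bin_size):
        start = index * bin_size + 1
        end = min(start + bin_size - 1, length)
        bins.append((start, end, counts.get(index, 0)))
    return bins


# Positions near the C-terminus that are listed in dbSNP or SNPdbe but not in HGMD.
non_disease_positions = [270, 277, 278, 279, 280, 281, 286, 288]
bins = bin_counts(non_disease_positions, length=313, bin_size=10)

for start, end, count in bins:
    if count:
        print(f"{start}-{end}: {'#' * count}")
```

With a fine bin size, the accumulation around residues 277-288 is split across two neighbouring bins, so the choice of bin size affects how pronounced a hotspot appears.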
