Mapping mutations of ARS A

Mutations are changes in the genomic nucleotide sequence of an organism. These changes are introduced accidentally, e.g. if wrong bases are incorporated during DNA replication. The common types of mutations are insertions into the DNA, deletions from it, and nucleotide substitutions.
Depending on the type of mutation at the DNA level, the structure or function of a protein may be affected to a different extent.

Frameshift mutation: If a sequence is inserted or deleted and its length is not divisible by 3, the reading frame of the downstream coding sequence is shifted by one or two nucleotides. This leads to a completely different translation of the mRNA into protein, and if the downstream region is long, the protein is likely to be dysfunctional.
In a nonsense mutation, the codon of an amino acid within the protein is changed such that a premature stop codon arises. This leads to truncation of the downstream protein sequence. The protein is likely to be dysfunctional if the truncated sequence is long.
A missense mutation is an alteration of the codon such that the amino acid in the protein is changed. Depending on the properties and location of the mutated amino acid, the effect on structure and function can be more or less dramatic.
In a silent mutation, the mutated codon still encodes the same amino acid. Thus the amino acid sequence of the protein is unchanged and no structural or functional alteration should be observed.
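
The four definitions above can be turned into a small classifier. The following is a minimal sketch in Python, assuming the standard genetic code; the function names `classify_substitution` and `is_frameshift` are chosen here for illustration and are not part of any database tooling. It assumes the reference codon is not itself a stop codon.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon, mut_codon):
    """Classify a codon substitution as silent, nonsense or missense.

    Assumes ref_codon encodes an amino acid (not a stop codon).
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    mut_aa = CODON_TABLE[mut_codon.upper()]
    if ref_aa == mut_aa:
        return "silent"
    if mut_aa == "*":  # '*' marks a stop codon
        return "nonsense"
    return "missense"


def is_frameshift(indel_length):
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


print(classify_substitution("CTG", "CCG"))  # Leu -> Pro
print(classify_substitution("TGG", "TGA"))  # Trp -> stop
print(classify_substitution("GGC", "GGT"))  # Gly -> Gly
print(is_frameshift(1), is_frameshift(3))
```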

In the following, we map known nonsense, missense and silent (= synonymous) mutations from the databases dbSNP and HGMD onto the sequence and the structure of the lysosomal enzyme ARSA.

HGMD

The Human Gene Mutation Database (HGMD) <ref> Krawczak M, Cooper DN: The human gene mutation database (HGMD). Genome Digest 3: 7-8, 1996. </ref> provides a comprehensive collection of mutations within human genes that underlie or are associated with diseases. We used the protocol described in Task 5 - Mapping SNPs to get all missense and nonsense mutations of ARSA. The table of all 90 missense/nonsense mutations is shown at the end of this section. Furthermore, we mapped all 90 mutations onto the sequence of ARSA and colored them in red to get an impression of the distribution of the mutations (see below).

>sp|P15289|ARSA_HUMAN

MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT

DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM

AGKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIP

LLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAE

RSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRC

GKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAALAGAPLPNVTLDGFDLSP

LLLGTGKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTADPACHASSSL

TAHEPPLLYDLSKDPGENYNLLGGVAGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARG

EDPALQICCHPGCTPRPACCHCPDPHA
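
Marking mutated residues in the sequence can also be done in plain text. The sketch below lowercases the residues at a given set of 1-based positions, as a stand-in for coloring; the function name `mark_positions` is hypothetical, and only the first 60 residues of the sequence above are used to keep the example short.

```python
def mark_positions(sequence, positions, width=60):
    """Lowercase the residues at the given 1-based positions and wrap the result."""
    marked = "".join(
        aa.lower() if index in positions else aa
        for index, aa in enumerate(sequence, start=1)
    )
    return [marked[start:start + width] for start in range(0, len(marked), width)]


# First 60 residues of P15289 (ARSA_HUMAN), copied from the sequence above.
PREFIX = "MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT"

# Positions 29, 32 and 52 are the first three HGMD entries in the table.
for line in mark_positions(PREFIX, {29, 32, 52}):
    print(line)
```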

Accession Number	Codon change	Amino acid change	position
CM042298	cGAC-AAC	Asp-Asn	29
CM990171	cGGC-AGC	Gly-Ser	32
CM065974	CTG-CCG	Leu-Pro	52
CM990172	CTG-CCG	Leu-Pro	68
CM950092	CCG-CTG	Pro-Leu	82
CM940096	CGG-CAG	Arg-Gln	84
CM990173	tCGG-TGG	Arg-Trp	84
CM940097	GGC-GAC	Gly-Asp	86
CM990174	gCCC-GCC	Pro-Ala	94
CM970109	AGC-AAC	Ser-Asn	95
CM910049	TCC-TTC	Ser-Phe	96
CM910050	GGC-GAC	Gly-Asp	99
CM990175	GGC-GTC	Gly-Val	99
CM970110	aGGA-AGA	Gly-Arg	119
CM940098	cGGC-AGC	Gly-Ser	122
CM980118	CTG-CCG	Leu-Pro	135
CM940099	CCC-CTC	Pro-Leu	136
CM990176	gCCC-TCC	Pro-Ser	136
CM004461	tCGA-GGA	Arg-Gly	143
CM990177	CCG-CTG	Pro-Leu	148
CM970111	cGAC-TAC	Asp-Tyr	152
CM962419	CAGg-CAC	Gln-His	153
CM940100	GGC-GAC	Gly-Asp	154
CM940101	CCC-CGC	Pro-Arg	155
CM032834	CCC-CTC	Pro-Leu	155
CM042299	cTGC-CGC	Cys-Arg	156
CM940102	CCT-CGT	Pro-Arg	167
CM940103	cGAC-AAC	Asp-Asn	169
CM950093	TGT-TAT	Cys-Tyr	172
CM910051	ATC-AGC	Ile-Ser	179
CM032835	CTG-CAG	Leu-Gln	181
CM950094	CAGc-CAC	Gln-His	190
CM990178	gCCC-ACC	Pro-Thr	191
CM990179	TGGc-TGA	Trp-Term	193
CM950095	TAC-TGC	Tyr-Cys	201
CM930039	GCC-GTC	Ala-Val	212
CM050538	cTTC-GTC	Phe-Val	219
CM930040	GCC-GTC	Ala-Val	224
CM990180	cCAC-TAC	His-Tyr	227
CM940104	cCCT-ACT	Pro-Thr	231
CM940105	cCGC-TGC	Arg-Cys	244
CM970112	CGC-CAC	Arg-His	244
CM930041	cGGG-AGG	Gly-Arg	245
CM034715	TTT-TCT	Phe-Ser	247
CM970113	TCC-TAC	Ser-Tyr	250
CM024340	gGAG-AAG	Glu-Lys	253
CM960078	gGAT-CAT	Asp-His	255
CM930042	ACG-ATG	Thr-Met	274
CM074714	ACT-ATT	Thr-Ile	279
CM993444	aGAC-TAC	Asp-Tyr	281
CM023013	gACC-CCC	Thr-Pro	286
CM990181	CGT-CAT	Arg-His	288
CM940106	gCGT-TGT	Arg-Cys	288
CM042300	cGGC-AGC	Gly-Ser	293
CM044574	GGC-GAC	Gly-Asp	293
CM042301	TGC-TAC	Cys-Tyr	294
CM930043	TCC-TAC	Ser-Tyr	295
CM980119	TTG-TCG	Leu-Ser	298
CM990182	TGT-TTT	Cys-Phe	300
CM032836	cTAC-CAC	Tyr-His	306
HM060041	cGAG-AAG	Glu-Lys	307
CM990183	GGC-GAC	Gly-Asp	308
CM962420	GGC-GTC	Gly-Val	308
CM930044	cGGT-AGT	Gly-Ser	309
CM950096	CGA-CAA	Arg-Gln	311
CM001061	GAGc-GAT	Glu-Asp	312
CM970114	tGCC-ACC	Ala-Thr	314
CM004546	TGGc-TGA	Trp-Term	318
CM032837	cGGC-AGC	Gly-Ser	325
CM990184	ACC-ATC	Thr-Ile	327
CM065973	cGAG-TAG	Glu-Term	329
CM940107	GAC-GTC	Asp-Val	335
CM890013	AAT-AGT	Asn-Ser	350
CM970115	AAGa-AAC	Lys-Asn	367
CM940108	CGG-CAG	Arg-Gln	370
CM940109	tCGG-TGG	Arg-Trp	370
CM940110	CCG-CTG	Pro-Leu	377
CM930045	cGAG-AAG	Glu-Lys	382
CM970116	cCGT-TGT	Arg-Cys	384
CM980120	CGG-CAG	Arg-Gln	390
CM940111	gCGG-TGG	Arg-Trp	390
CM910052	ACT-AGT	Thr-Ser	391
CM980121	tCAC-TAC	His-Tyr	397
CM065972	cAGT-GGT	Ser-Gly	406
CM012065	ACC-ATC	Thr-Ile	408
CM940112	ACT-ATT	Thr-Ile	409
CM990185	gCCC-ACC	Pro-Thr	425
CM940113	CCG-CTG	Pro-Leu	426
CM970117	CTC-CCC	Leu-Pro	428
CM032838	TAT-TCT	Tyr-Ser	429
CM990186	GCC-GTC	Ala-Val	464
CM034716	GCT-GGT	Ala-Gly	469
CM940114	gCAG-TAG	Gln-Term	486
CM044573	cTGT-GGT	Cys-Gly	489
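
As a sanity check on the mapping, the wild-type amino acid of each table row can be compared with the residue at that position in the protein sequence. The following is a rough sketch, assuming the rows are tab-separated as in the table above; `parse_hgmd_row` and `reference_matches` are illustrative names, and only the first 120 residues of the sequence are included, so only rows up to position 120 can be checked here.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Term": "*",  # premature stop codon
}


def parse_hgmd_row(row):
    """Split a row into accession, wild-type aa, mutant aa and 1-based position."""
    accession, _codon_change, aa_change, position = row.split("\t")
    ref_aa, mut_aa = aa_change.split("-")
    return accession, THREE_TO_ONE[ref_aa], THREE_TO_ONE[mut_aa], int(position)


def reference_matches(sequence, row):
    """True if the sequence carries the row's wild-type residue at its position."""
    _, ref_aa, _, position = parse_hgmd_row(row)
    return sequence[position - 1] == ref_aa


# First 120 residues of P15289 (ARSA_HUMAN), copied from the sequence above.
PREFIX = (
    "MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT"
    "DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM"
)

ROWS = [
    "CM042298\tcGAC-AAC\tAsp-Asn\t29",
    "CM940096\tCGG-CAG\tArg-Gln\t84",
    "CM970110\taGGA-AGA\tGly-Arg\t119",
]

for row in ROWS:
    print(parse_hgmd_row(row)[0], reference_matches(PREFIX, row))
```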

dbSNP

The "SNP" search for ARSA yielded 123 known human mutations for the protein. This is more than we obtained by selecting the SNPs via the GeneView interface. The reason is that two isoforms of ARSA exist; we checked manually that many of the additional mutations belong to the other isoform (NP_001078897). We kept the isoform that we have used so far (NP_000478) and chose the gene sequence within GeneView such that it matched the sequence version used so far.

SNP ID	SNP type	nucleotide (mutation)	amino acid (mutation)	nucleotide (reference)	amino acid (reference)	position
rs6151428	missense	A	His [H]	G	Arg [R]	496
rs117341984	missense	A	Arg [R]	G	Gly [G]	447
rs6151427	missense	G	Ser [S]	A	Asn [N]	440
rs6151425	synonymous	T	Asp [D]	C	Asp [D]	381
rs6151422	missense	G	Val [V]	T	Phe [F]	356
rs113990230	synonymous	C	His [H]	T	His [H]	206
rs62001867	missense	A	Thr [T]	G	Ala [A]	205
rs34457249	synonymous	T	Pro [P]	C	Pro [P]	195
rs6151415	missense	T	Cys [C]	G	Trp [W]	193
rs113209108	synonymous	T	Ser [S]	C	Ser [S]	186
rs6151412	synonymous	T	His [H]	C	His [H]	151
rs60504011	missense	G	Ala [A]	C	Pro [P]	136
rs6151411	missense	G	Leu [L]	C	Pro [P]	82
rs6151410	missense	T	Gly [G]	C	Gly [G]	79
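
The SNP type in the table can be cross-checked against the two amino acid columns: a synonymous entry should have identical mutant and reference residues, a missense entry different ones. The sketch below assumes tab-separated rows in the column order of the table above; `parse_dbsnp_row` and `label_is_consistent` are illustrative names. Applied to the table, such a check flags rs6151410, which is labeled missense although both residues are Gly; from the table alone it is not possible to say which field is wrong.

```python
import re


def one_letter(field):
    """Extract the one-letter code from a field such as 'His [H]'."""
    return re.search(r"\[(.)\]", field).group(1)


def parse_dbsnp_row(row):
    """Split one tab-separated row of the dbSNP table into named fields."""
    snp_id, snp_type, _mut_nt, mut_aa, _ref_nt, ref_aa, position = row.split("\t")
    return {
        "id": snp_id,
        "type": snp_type,
        "mut_aa": one_letter(mut_aa),
        "ref_aa": one_letter(ref_aa),
        "position": int(position),
    }


def label_is_consistent(row):
    """True if 'synonymous' coincides with an unchanged amino acid."""
    record = parse_dbsnp_row(row)
    unchanged = record["mut_aa"] == record["ref_aa"]
    return unchanged == (record["type"] == "synonymous")


ROWS = [
    "rs6151428\tmissense\tA\tHis [H]\tG\tArg [R]\t496",
    "rs6151425\tsynonymous\tT\tAsp [D]\tC\tAsp [D]\t381",
    "rs6151410\tmissense\tT\tGly [G]\tC\tGly [G]\t79",
]

for row in ROWS:
    print(parse_dbsnp_row(row)["id"], label_is_consistent(row))
```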

>sp|P15289|ARSA_HUMAN

MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT

DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM

AGKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIP

LLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAE

RSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRC

GKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAALAGAPLPNVTLDGFDLSP

LLLGTGKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTADPACHASSSL

TAHEPPLLYDLSKDPGENYNLLGGVAGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARG

EDPALQICCHPGCTPRPACCHCPDPHA

The mutation map

>sp|P15289|ARSA_HUMAN

MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT

DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM

AGKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIP

LLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAE

RSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRC

GKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAALAGAPLPNVTLDGFDLSP

LLLGTGKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTADPACHASSSL

TAHEPPLLYDLSKDPGENYNLLGGVAGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARG

EDPALQICCHPGCTPRPACCHCPDPHA

Three mutated residues are annotated in both databases, at the following positions:

193: The mutations are different. In HGMD, the mutation results in a premature stop codon, so a large part of the protein is truncated. In dbSNP, there is an amino acid substitution (W -> C).
136: The mutations are different amino acid substitutions. P -> L is annotated in HGMD and P -> A is annotated in dbSNP.
82: The mutation is identical in both databases and leads to the substitution P -> L.
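
The positions shared by both databases can be recovered with a set intersection. The following sketch uses the positions from the HGMD and dbSNP tables above (duplicates collapsed); the function name `shared_positions` is chosen here for illustration.

```python
# Unique residue positions from the HGMD table above.
HGMD_POSITIONS = {
    29, 32, 52, 68, 82, 84, 86, 94, 95, 96, 99, 119, 122, 135, 136, 143,
    148, 152, 153, 154, 155, 156, 167, 169, 172, 179, 181, 190, 191, 193,
    201, 212, 219, 224, 227, 231, 244, 245, 247, 250, 253, 255, 274, 279,
    281, 286, 288, 293, 294, 295, 298, 300, 306, 307, 308, 309, 311, 312,
    314, 318, 325, 327, 329, 335, 350, 367, 370, 377, 382, 384, 390, 391,
    397, 406, 408, 409, 425, 426, 428, 429, 464, 469, 486, 489,
}

# Residue positions from the dbSNP table above.
DBSNP_POSITIONS = {
    496, 447, 440, 381, 356, 206, 205, 195, 193, 186, 151, 136, 82, 79,
}


def shared_positions(first, second):
    """Return the sorted positions that occur in both collections."""
    return sorted(set(first) & set(second))


print(shared_positions(HGMD_POSITIONS, DBSNP_POSITIONS))
```

A shared position only means that the same residue is affected; whether the amino acid change is the same still has to be compared per entry, as done in the list above.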

Structure of ARSA. Synonymous mutations are shown in green, missense/nonsense mutations in red. The active site is depicted in yellow.

References
