Difference between revisions of "Mapping mutations of ARS A"
Revision as of 16:22, 16 June 2011
Mutations in general
Mutations are changes in the genomic nucleotide sequence of an organism. These changes are introduced accidentally, e.g. if wrong bases are incorporated during DNA replication. The common types of mutations are insertions into the DNA, deletions from it, and nucleotide substitutions.
Depending on its type at the DNA level, a mutation can influence the structure or function of a protein to a different extent.
- Frameshift mutation: If a sequence is inserted or deleted and its length is not divisible by 3, the reading frame of the downstream sequence is shifted by one or two nucleotides. The mRNA downstream of the mutation is then translated into a completely different amino acid sequence, and if the downstream region is long, the protein is likely to be dysfunctional.
- In a nonsense mutation, the codon of an amino acid within the protein is changed such that a premature stop codon arises. This leads to truncation of the downstream protein sequence. The protein is likely to be dysfunctional if the truncated part is long.
- A missense mutation describes an alteration of the codon, such that the amino acid in the protein is changed. Depending on the properties and location of the mutated amino acid, changes in structure and function can have a more or less dramatic effect.
- In a silent mutation, the mutated codon still encodes the same amino acid. Thus the amino acid sequence of the protein is not changed and no structural or functional alteration should be observed.
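The distinction between the substitution types above can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the function names `classify_substitution` and `is_frameshift` are hypothetical helpers introduced only for this illustration and are not part of the protocol we used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon substitution as silent, nonsense or missense.

    Simplification: a change that removes an existing stop codon is
    reported as "missense" here.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"      # same amino acid (or stop) is encoded
    if alt_aa == "*":
        return "nonsense"    # premature stop codon arises
    return "missense"        # a different amino acid is encoded


def is_frameshift(indel_length):
    """An insertion or deletion shifts the reading frame unless its length is divisible by 3."""
    return indel_length % 3 != 0


print(classify_substitution("GAC", "AAC"))  # Asp to Asn
print(classify_substitution("TGG", "TGA"))  # Trp to stop
print(is_frameshift(2))
```

The classification works on the codon level only; whether a missense mutation is actually harmful depends on the properties and location of the residue, as described above.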
In the following, we will map known nonsense, missense and silent (= synonymous) mutations from the databases dbSNP and HGMD on the sequence and the structure of the lysosomal enzyme ARS A.
HGMD
The Human Gene Mutation Database (HGMD) <ref> Krawczak M, Cooper DN: The human gene mutation database (HGMD). Genome Digest 3: 7-8, 1996. </ref> provides a comprehensive collection of mutations within human genes that underlie or are associated with diseases. We used the protocol as described here to get all missense and nonsense mutations of ARSA. All mutations found were known to be associated with Metachromatic Leukodystrophy. The table of all 90 missense/nonsense mutations is shown at the end of this section. Furthermore, we mapped all 90 mutations onto the sequence of ARSA and colored them in red to get an impression of their distribution (see below). Together with the sequence and the location of the mutations, we marked important binding sites in the graphical illustration below: "*" marks metal binding sites, "." marks substrate binding sites and ":" marks the active site. One can see that these important functional sites are always near a known mutation; such mutations are therefore likely to impair the function of the enzyme.
>sp|P15289|ARSA_HUMAN
**
MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT
*
DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM
. : .
AGKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIP
.
LLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAE
** .
RSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRC
GKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAALAGAPLPNVTLDGFDLSP
LLLGTGKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTADPACHASSSL
TAHEPPLLYDLSKDPGENYNLLGGVAGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARG
EDPALQICCHPGCTPRPACCHCPDPHA
Accession Number | Codon change | Amino acid change | Position |
---|---|---|---|
CM042298 | cGAC-AAC | Asp-Asn | 29 |
CM990171 | cGGC-AGC | Gly-Ser | 32 |
CM065974 | CTG-CCG | Leu-Pro | 52 |
CM990172 | CTG-CCG | Leu-Pro | 68 |
CM950092 | CCG-CTG | Pro-Leu | 82 |
CM940096 | CGG-CAG | Arg-Gln | 84 |
CM990173 | tCGG-TGG | Arg-Trp | 84 |
CM940097 | GGC-GAC | Gly-Asp | 86 |
CM990174 | gCCC-GCC | Pro-Ala | 94 |
CM970109 | AGC-AAC | Ser-Asn | 95 |
CM910049 | TCC-TTC | Ser-Phe | 96 |
CM910050 | GGC-GAC | Gly-Asp | 99 |
CM990175 | GGC-GTC | Gly-Val | 99 |
CM970110 | aGGA-AGA | Gly-Arg | 119 |
CM940098 | cGGC-AGC | Gly-Ser | 122 |
CM980118 | CTG-CCG | Leu-Pro | 135 |
CM940099 | CCC-CTC | Pro-Leu | 136 |
CM990176 | gCCC-TCC | Pro-Ser | 136 |
CM004461 | tCGA-GGA | Arg-Gly | 143 |
CM990177 | CCG-CTG | Pro-Leu | 148 |
CM970111 | cGAC-TAC | Asp-Tyr | 152 |
CM962419 | CAGg-CAC | Gln-His | 153 |
CM940100 | GGC-GAC | Gly-Asp | 154 |
CM940101 | CCC-CGC | Pro-Arg | 155 |
CM032834 | CCC-CTC | Pro-Leu | 155 |
CM042299 | cTGC-CGC | Cys-Arg | 156 |
CM940102 | CCT-CGT | Pro-Arg | 167 |
CM940103 | cGAC-AAC | Asp-Asn | 169 |
CM950093 | TGT-TAT | Cys-Tyr | 172 |
CM910051 | ATC-AGC | Ile-Ser | 179 |
CM032835 | CTG-CAG | Leu-Gln | 181 |
CM950094 | CAGc-CAC | Gln-His | 190 |
CM990178 | gCCC-ACC | Pro-Thr | 191 |
CM990179 | TGGc-TGA | Trp-Term | 193 |
CM950095 | TAC-TGC | Tyr-Cys | 201 |
CM930039 | GCC-GTC | Ala-Val | 212 |
CM050538 | cTTC-GTC | Phe-Val | 219 |
CM930040 | GCC-GTC | Ala-Val | 224 |
CM990180 | cCAC-TAC | His-Tyr | 227 |
CM940104 | cCCT-ACT | Pro-Thr | 231 |
CM940105 | cCGC-TGC | Arg-Cys | 244 |
CM970112 | CGC-CAC | Arg-His | 244 |
CM930041 | cGGG-AGG | Gly-Arg | 245 |
CM034715 | TTT-TCT | Phe-Ser | 247 |
CM970113 | TCC-TAC | Ser-Tyr | 250 |
CM024340 | gGAG-AAG | Glu-Lys | 253 |
CM960078 | gGAT-CAT | Asp-His | 255 |
CM930042 | ACG-ATG | Thr-Met | 274 |
CM074714 | ACT-ATT | Thr-Ile | 279 |
CM993444 | aGAC-TAC | Asp-Tyr | 281 |
CM023013 | gACC-CCC | Thr-Pro | 286 |
CM990181 | CGT-CAT | Arg-His | 288 |
CM940106 | gCGT-TGT | Arg-Cys | 288 |
CM042300 | cGGC-AGC | Gly-Ser | 293 |
CM044574 | GGC-GAC | Gly-Asp | 293 |
CM042301 | TGC-TAC | Cys-Tyr | 294 |
CM930043 | TCC-TAC | Ser-Tyr | 295 |
CM980119 | TTG-TCG | Leu-Ser | 298 |
CM990182 | TGT-TTT | Cys-Phe | 300 |
CM032836 | cTAC-CAC | Tyr-His | 306 |
HM060041 | cGAG-AAG | Glu-Lys | 307 |
CM990183 | GGC-GAC | Gly-Asp | 308 |
CM962420 | GGC-GTC | Gly-Val | 308 |
CM930044 | cGGT-AGT | Gly-Ser | 309 |
CM950096 | CGA-CAA | Arg-Gln | 311 |
CM001061 | GAGc-GAT | Glu-Asp | 312 |
CM970114 | tGCC-ACC | Ala-Thr | 314 |
CM004546 | TGGc-TGA | Trp-Term | 318 |
CM032837 | cGGC-AGC | Gly-Ser | 325 |
CM990184 | ACC-ATC | Thr-Ile | 327 |
CM065973 | cGAG-TAG | Glu-Term | 329 |
CM940107 | GAC-GTC | Asp-Val | 335 |
CM890013 | AAT-AGT | Asn-Ser | 350 |
CM970115 | AAGa-AAC | Lys-Asn | 367 |
CM940108 | CGG-CAG | Arg-Gln | 370 |
CM940109 | tCGG-TGG | Arg-Trp | 370 |
CM940110 | CCG-CTG | Pro-Leu | 377 |
CM930045 | cGAG-AAG | Glu-Lys | 382 |
CM970116 | cCGT-TGT | Arg-Cys | 384 |
CM980120 | CGG-CAG | Arg-Gln | 390 |
CM940111 | gCGG-TGG | Arg-Trp | 390 |
CM910052 | ACT-AGT | Thr-Ser | 391 |
CM980121 | tCAC-TAC | His-Tyr | 397 |
CM065972 | cAGT-GGT | Ser-Gly | 406 |
CM012065 | ACC-ATC | Thr-Ile | 408 |
CM940112 | ACT-ATT | Thr-Ile | 409 |
CM990185 | gCCC-ACC | Pro-Thr | 425 |
CM940113 | CCG-CTG | Pro-Leu | 426 |
CM970117 | CTC-CCC | Leu-Pro | 428 |
CM032838 | TAT-TCT | Tyr-Ser | 429 |
CM990186 | GCC-GTC | Ala-Val | 464 |
CM034716 | GCT-GGT | Ala-Gly | 469 |
CM940114 | gCAG-TAG | Gln-Term | 486 |
CM044573 | cTGT-GGT | Cys-Gly | 489 |
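The codon changes in the table can be cross-checked against the listed amino acid changes. The following Python sketch assumes that uppercase letters in the "Codon change" column form the codon and that a lowercase letter is a flanking base outside the codon; this reading is inferred from the table rows and is an assumption, as are the helper names `parse_codon_change` and `row_is_consistent`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter codes; "Term" follows the notation used in the table.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Term",
}


def parse_codon_change(text):
    """Split e.g. 'cGAC-AAC' into ('GAC', 'AAC'), dropping lowercase flanking bases."""
    ref, alt = text.split("-")
    ref_codon = "".join(ch for ch in ref if ch.isupper())
    alt_codon = "".join(ch for ch in alt if ch.isupper())
    return ref_codon, alt_codon


def row_is_consistent(codon_change, aa_change):
    """Check that translating both codons reproduces the listed amino acid change."""
    ref_codon, alt_codon = parse_codon_change(codon_change)
    ref_aa = THREE_LETTER[CODON_TABLE[ref_codon]]
    alt_aa = THREE_LETTER[CODON_TABLE[alt_codon]]
    return f"{ref_aa}-{alt_aa}" == aa_change


# A few rows copied from the table above.
rows = [
    ("cGAC-AAC", "Asp-Asn"),
    ("CAGg-CAC", "Gln-His"),
    ("TGGc-TGA", "Trp-Term"),
    ("tCGA-GGA", "Arg-Gly"),
]
for codon_change, aa_change in rows:
    print(codon_change, aa_change, row_is_consistent(codon_change, aa_change))
```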
dbSNP
The "SNP" search for ARSA yielded 123 known human mutations for the protein. This is more than one obtains when selecting the SNPs via the GeneView interface. The reason is that there are two isoforms of ARSA; we manually checked that many of the additional mutations belong to the other isoform (NP_001078897). We kept the isoform that we have used so far (NP_000478) and chose the gene sequence within GeneView such that it matches the sequence version used so far.
SNP ID | SNP type | nucleotide (mutation) | amino acid (mutation) | nucleotide (reference) | amino acid (reference) | position |
---|---|---|---|---|---|---|
rs6151428 | missense | A | His [H] | G | Arg [R] | 496 |
rs117341984 | missense | A | Arg [R] | G | Gly [G] | 447 |
rs6151427 | missense | G | Ser [S] | A | Asn [N] | 440 |
rs6151425 | synonymous | T | Asp [D] | C | Asp [D] | 381 |
rs6151422 | missense | G | Val [V] | T | Phe [F] | 356 |
rs113990230 | synonymous | C | His [H] | T | His [H] | 206 |
rs62001867 | missense | A | Thr [T] | G | Ala [A] | 205 |
rs34457249 | synonymous | T | Pro [P] | C | Pro [P] | 195 |
rs6151415 | missense | T | Cys [C] | G | Trp [W] | 193 |
rs113209108 | synonymous | T | Ser [S] | C | Ser [S] | 186 |
rs6151412 | synonymous | T | His [H] | C | His [H] | 151 |
rs60504011 | missense | G | Ala [A] | C | Pro [P] | 136 |
rs6151411 | missense | G | Leu [L] | C | Pro [P] | 82 |
rs6151410 | missense | T | Gly [G] | C | Gly [G] | 79 |
>sp|P15289|ARSA_HUMAN
MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT
DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM
AGKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIP
LLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAE
RSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRC
GKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAALAGAPLPNVTLDGFDLSP
LLLGTGKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTADPACHASSSL
TAHEPPLLYDLSKDPGENYNLLGGVAGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARG
EDPALQICCHPGCTPRPACCHCPDPHA
The mutation map
>sp|P15289|ARSA_HUMAN
MGAPRSLLLALAAGLAVARPPNIVLIFADDLGGGDLGCYGHPSSTTPNLDQLLAGGLRFT
DFYVPVSLLTPSRAALLTGGLPPPRRGGYPGVLVPPSSGGGPLEEVTVAEVLAARGYLTGG
AGGWHLGVGPEGAFLLPPHQGFHRRLGIPPSHHDQGPCNLTCFPPATPPDDGCCQGLVPII
LLANLSSEAQQPWWPPLEARYYAFAAHLMADAARQDRPFFLYYAAHHHHYPPFSGQSFAE
RSGRRGFFDSSMEEDDAVGTLMTAIGDLGLLEETTVIFTTDDGPETTRRSRGGGCSLLLCC
KGTTYYEGGRREAAAFWPGHIAPGGTTELASSLDDLPTLAALAGAPLPNNTLDGFFLSP
LLLGTGKKPRRSLFFYPPYPDDERRVFAVRRTKYKAHHFTQGSAHSSTTTDPACHASSSL
TAHEPPPLLYLSKDPGENYNNLGGVAGGTPEVLQALKQLQLLKAALDAAATFGPSQVARG
EDPALQICCCPGCTPRRACCHCPDPHA
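A plain-text mutation map can also be produced without coloring by printing a marker line under each sequence line. The sketch below is one possible way to do this, not the method used for the map above; the function name `mutation_map`, the marker character and the line width of 60 are assumptions chosen for illustration. Positions are 1-based, as in the tables.

```python
def mutation_map(sequence, positions, width=60, marker="^"):
    """Return the sequence in lines of `width` residues, each followed by a
    marker line that flags the given 1-based positions."""
    lines = []
    for start in range(0, len(sequence), width):
        chunk = sequence[start:start + width]
        marks = "".join(
            marker if (start + offset + 1) in positions else " "
            for offset in range(len(chunk))
        )
        lines.append(chunk)
        lines.append(marks.rstrip())
    return "\n".join(lines)


# First 60 residues of ARSA (sp|P15289) and the first three HGMD positions.
first_line = "MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT"
print(mutation_map(first_line, {29, 32, 52}))
```

Passing the full sequence and the union of all positions from both tables would give a map of the whole protein.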
Three mutated residues are annotated in both databases, at the following positions:
- 193: The mutations are different. In HGMD, the mutation results in a premature stop codon, so most of the protein is truncated. In dbSNP, there is an amino acid substitution (W -> C).
- 136: The mutations are different amino acid substitutions. P -> L and P -> S are annotated in HGMD, and P -> A is annotated in dbSNP.
- 82: The mutation is identical in both databases and leads to the substitution P -> L.
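The comparison of the two databases amounts to intersecting the annotated positions and then comparing the amino acid changes at each shared position. A minimal Python sketch follows; the dictionaries contain only an excerpt of the two tables, and the name `compare_annotations` is a hypothetical helper.

```python
# Excerpt of the tables above: position -> set of amino acid changes.
hgmd = {
    82: {"P>L"},
    84: {"R>Q", "R>W"},
    136: {"P>L", "P>S"},
    193: {"W>*"},
}
dbsnp = {
    79: {"G>G"},
    82: {"P>L"},
    136: {"P>A"},
    151: {"H>H"},
    193: {"W>C"},
}


def compare_annotations(db_a, db_b):
    """For every position present in both databases, report whether at least
    one amino acid change is annotated identically in both."""
    result = {}
    for position in sorted(db_a.keys() & db_b.keys()):
        shared_changes = db_a[position] & db_b[position]
        result[position] = "identical" if shared_changes else "different"
    return result


for position, status in compare_annotations(hgmd, dbsnp).items():
    print(position, status)
```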
References
<references/>