Difference between revisions of "Mapping SNPs GLA"

From Bioinformatikpedia

Revision as of 01:18, 27 June 2011

by Benjamin Drexler and Fabian Grandke

Introduction

Databases

We used two databases to research SNPs in our protein: HGMD and dbSNP.

HGMD

Human Gene Mutation Database (HGMD) is a database maintained by the Institute of Medical Genetics in Cardiff.

The table below shows the different types of mutations that are listed in HGMD:

Mutation Type | Number of Mutations | Description
Missense/nonsense | 344 | different amino acid/stop codon
Splicing | 26 | different splice site
Regulatory | 1 | substitutions cause regulatory abnormalities
Small deletions | 69 | micro-deletions (20 bp or less)
Small insertions | 28 | micro-insertions (20 bp or less)
Small indels | 8 | micro-indels (20 bp or less)
Gross deletions | 14 | deletions in regions of data with variable quality
Gross insertions/duplications | 1 | insertions/duplications in regions of data with variable quality
Complex rearrangements | 3 | sequence is not continuous
Repeat variations | 0 | variations of the sequence are repeated
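
As a quick sanity check on the table above, the counts can be tallied to see how large the missense/nonsense share of the free HGMD listing is. The following Python snippet is only an illustrative sketch: the dictionary restates the numbers from the table, and the variable names are chosen for illustration.

```python
# Counts copied from the HGMD mutation-type table above (free version).
hgmd_counts = {
    "Missense/nonsense": 344,
    "Splicing": 26,
    "Regulatory": 1,
    "Small deletions": 69,
    "Small insertions": 28,
    "Small indels": 8,
    "Gross deletions": 14,
    "Gross insertions/duplications": 1,
    "Complex rearrangements": 3,
    "Repeat variations": 0,
}

# Total number of listed mutations and the fraction that is missense/nonsense.
total = sum(hgmd_counts.values())
missense_share = hgmd_counts["Missense/nonsense"] / total

print(f"total mutations: {total}")
print(f"missense/nonsense share: {missense_share:.1%}")
```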

The professional version of HGMD provides many more mutations. The mutations in the free version are at least 2.5 years old (by date of publication).


The table below shows all missense and nonsense mutations from HGMD. To gather them, we searched HGMD for gla with the drop-down option "Gene symbol" selected, copied and pasted the text from the website into a file, and parsed out the required information.

Position Amino Acids Codons
1 Met->Thr ATG-ACG
1 Met->Arg ATG-AGG
1 Met->Ile ATGc-ATA
14 Leu->Pro CTT-CCT
16 Leu->Pro CTT-CCT
18 Phe->Ser TTC-TCC
19 Leu->Pro CTG-CCG
20 Ala->Pro gGCC-CCC
31 Ala->Val GCA-GTA
32 Leu->Pro CTG-CCG
34 Asn->Ser AAT-AGT
34 Asn->Lys AATg-AAG
35 Gly->Arg tGGA-AGA
40 Pro->Leu CCT-CTT
40 Pro->Ser gCCT-TCT
41 Thr->Ile ACC-ATC
42 Met->Thr ATG-ACG
42 Met->Leu cATG-CTG
42 Met->Val cATG-GTG
43 Gly->Asp GGC-GAC
43 Gly->Val GGC-GTC
43 Gly->Arg gGGC-CGC
44 Trp->Ter TGG-TAG
46 His->Arg CAC-CGC
46 His->Tyr gCAC-TAC
47 Trp->Gly cTGG-GGG
47 Trp->Leu TGG-TTG
48 Glu->Lys gGAG-AAG
49 Arg->Pro CGC-CCC
49 Arg->Leu CGC-CTC
49 Arg->Ser gCGC-AGC
49 Arg->Gly gCGC-GGC
50 Phe->Cys TTC-TGC
51 Met->Lys ATG-AAG
51 Met->Ile ATGt-ATA
52 Cys->Arg gTGC-CGC
52 Cys->Ser TGC-TCC
52 Cys->Ter TGCa-TGA
54 Leu->Pro CTT-CCT
56 Cys->Gly cTGC-GGC
56 Cys->Tyr TGC-TAC
56 Cys->Phe TGC-TTC
56 Cys->Ter TGCc-TGA
59 Glu->Lys aGAG-AAG
63 Cys->Tyr TGC-TAC
66 Glu->Gly GAG-GGG
66 Glu->Lys tGAG-AAG
66 Glu->Gln tGAG-CAG
68 Leu->Phe gCTC-TTC
72 Met->Arg ATG-AGG
72 Met->Ile ATGg-ATA
72 Met->Val gATG-GTG
73 Ala->Val GCA-GTA
76 Met->Thr ATG-ACG
78 Ser->Ter TCA-TGA
79 Glu->Ter aGAA-TAA
81 Trp->Ter TGG-TAG
81 Trp->Ser TGG-TCG
85 Gly->Asp GGT-GAT
86 Tyr->Cys TAT-TGT
86 Tyr->Ter TATg-TAG
88 Tyr->Asp gTAC-GAC
89 Leu->Pro CTC-CCC
89 Leu->Arg CTC-CGC
91 Ile->Thr ATT-ACT
92 Asp->Asn tGAT-AAT
92 Asp->His tGAT-CAT
92 Asp->Tyr tGAT-TAT
93 Asp->Gly GAC-GGC
93 Asp->Val GAC-GTC
93 Asp->Asn tGAC-AAC
94 Cys->Tyr TGT-TAT
94 Cys->Ser TGT-TCT
95 Trp->Ter TGG-TAG
95 Trp->Ser TGG-TCG
97 Ala->Val GCT-GTT
97 Ala->Pro gGCT-CCT
99 Gln->Ter cCAA-TAA
100 Arg->Lys AGA-AAA
100 Arg->Thr AGA-ACA
102 Ser->Ter TCA-TGA
103 Glu->Gln aGAA-CAA
106 Leu->Arg CTT-CGT
107 Gln->Ter tCAG-TAG
112 Arg->His CGC-CAC
112 Arg->Ser gCGC-AGC
112 Arg->Cys gCGC-TGC
113 Phe->Leu cTTT-CTT
113 Phe->Ser TTT-TCT
118 Arg->Cys tCGC-TGC
119 Gln->Ter cCAG-TAG
121 Ala->Pro aGCT-CCT
125 His->Pro CAC-CCC
126 Ser->Gly cAGC-GGC
127 Lys->Ter cAAA-TAA
128 Gly->Glu GGA-GAA
129 Leu->Pro CTG-CCG
131 Leu->Pro CTA-CCA
132 Gly->Arg aGGG-AGG
134 Tyr->Ser TAT-TCT
134 Tyr->Ter TATg-TAG
135 Ala->Val GCA-GTA
136 Asp->His aGAT-CAT
138 Gly->Glu GGA-GAA
138 Gly->Arg tGGA-AGA
141 Thr->Ile ACC-ATC
142 Cys->Arg cTGC-CGC
142 Cys->Tyr TGC-TAC
142 Cys->Ter TGCg-TGA
142 Cys->Trp TGCg-TGG
143 Ala->Thr cGCA-ACA
143 Ala->Pro cGCA-CCA
144 Gly->Val GGC-GTC
146 Pro->Ser cCCT-TCT
147 Gly->Arg tGGG-AGG
148 Ser->Asn AGT-AAT
148 Ser->Arg AGTt-AGG
151 Tyr->Ter TACt-TAG
152 Tyr->Ter TACg-TAA
155 Asp->His tGAT-CAT
156 Ala->Val GCC-GTC
156 Ala->Thr tGCC-ACC
157 Gln->Ter cCAG-TAG
162 Trp->Arg cTGG-CGG
162 Trp->Ter TGG-TAG
162 Trp->Cys TGGg-TGC
163 Gly->Val GGA-GTA
165 Asp->His aGAT-CAT
165 Asp->Val GAT-GTT
166 Leu->Val tCTG-GTG
167 Leu->Pro CTA-CCA
168 Lys->Arg AAA-AGA
169 Phe->Ser TTT-TCT
170 Asp->Val GAT-GTT
170 Asp->His tGAT-CAT
171 Gly->Asp GGT-GAT
171 Gly->Arg tGGT-CGT
172 Cys->Tyr TGT-TAT
172 Cys->Phe TGT-TTT
172 Cys->Trp TGTt-TGG
172 Cys->Arg tTGT-CGT
172 Cys->Gly tTGT-GGT
173 Tyr->Ter TACt-TAA
177 Leu->Ter TTG-TAG
183 Gly->Asp GGT-GAT
183 Gly->Ser tGGT-AGT
187 Met->Thr ATG-ACG
187 Met->Val cATG-GTG
191 Leu->Gln CTG-CAG
191 Leu->Pro CTG-CCG
194 Thr->Ile ACT-ATT
199 Val->Met tGTG-ATG
201 Ser->Tyr TCC-TAC
201 Ser->Phe TCC-TTC
202 Cys->Tyr TGT-TAT
202 Cys->Trp TGTg-TGG
204 Trp->Ter TGG-TAG
205 Pro->Arg CCT-CGT
205 Pro->Leu CCT-CTT
205 Pro->Thr gCCT-ACT
207 Tyr->Ser TAT-TCT
215 Asn->Ser AAT-AGT
216 Tyr->Asp tTAT-GAT
220 Arg->Ter cCGA-TGA
221 Gln->Ter aCAG-TAG
222 Tyr->Ter TACt-TAA
223 Cys->Arg cTGC-CGC
223 Cys->Gly cTGC-GGC
223 Cys->Tyr TGC-TAC
224 Asn->Ser AAT-AGT
224 Asn->Asp cAAT-GAT
225 His->Arg CAC-CGC
226 Trp->Arg cTGG-CGG
226 Trp->Ter TGG-TAG
226 Trp->Cys TGGc-TGT
227 Arg->Gln CGA-CAA
227 Arg->Ter gCGA-TGA
230 Ala->Thr tGCT-ACT
231 Asp->Val GAC-GTC
231 Asp->Asn tGAC-AAC
234 Asp->Glu GATt-GAG
234 Asp->Tyr tGAT-TAT
235 Ser->Cys TCC-TGC
235 Ser->Phe TCC-TTC
236 Trp->Arg cTGG-CGG
236 Trp->Ter TGG-TAG
236 Trp->Leu TGG-TTG
236 Trp->Cys TGGa-TGC
239 Ile->Thr ATA-ACA
242 Ile->Asn ATC-AAC
243 Leu->Phe TTGg-TTC
244 Asp->Asn gGAC-AAC
244 Asp->His gGAC-CAC
245 Trp->Ter TGG-TAG
247 Ser->Pro aTCT-CCT
247 Ser->Cys TCT-TGT
250 Gln->Ter cCAG-TAG
251 Glu->Ter gGAG-TAG
257 Ala->Pro tGCT-CCT
258 Gly->Val GGA-GTA
258 Gly->Arg tGGA-CGA
259 Pro->Arg CCA-CGA
259 Pro->Leu CCA-CTA
260 Gly->Ala GGG-GCG
261 Gly->Asp GGT-GAT
262 Trp->Ter TGG-TAG
262 Trp->Cys TGGa-TGC
263 Asn->Ser AAT-AGT
264 Asp->Val GAC-GTC
264 Asp->Tyr tGAC-TAC
265 Pro->Arg CCA-CGA
265 Pro->Leu CCA-CTA
266 Asp->Asn aGAT-AAT
266 Asp->His aGAT-CAT
266 Asp->Val GAT-GTT
266 Asp->Glu GATa-GAA
267 Met->Arg ATG-AGG
267 Met->Ile ATGg-ATA
268 Leu->Ser TTA-TCA
269 Val->Met aGTG-ATG
269 Val->Ala GTG-GCG
270 Ile->Thr ATT-ACT
271 Gly->Val GGC-GTC
271 Gly->Ser tGGC-AGC
271 Gly->Cys tGGC-TGC
272 Asn->Lys AACt-AAA
276 Ser->Asn AGC-AAC
276 Ser->Gly cAGC-GGC
277 Trp->Ter TGG-TAG
279 Gln->Arg CAG-CGG
279 Gln->His CAGc-CAC
279 Gln->Glu tCAG-GAG
280 Gln->His CAAg-CAT
280 Gln->Lys gCAA-AAA
282 Thr->Ala aACT-GCT
282 Thr->Asn ACT-AAT
283 Gln->Pro CAG-CCG
284 Met->Thr ATG-ACG
285 Ala->Asp GCC-GAC
285 Ala->Pro gGCC-CCC
287 Trp->Gly cTGG-GGG
287 Trp->Ter TGGg-TGA
287 Trp->Cys TGGg-TGT
288 Ala->Asp GCT-GAT
288 Ala->Pro gGCT-CCT
289 Ile->Ser ATC-AGC
289 Ile->Phe tATC-TTC
290 Met->Ile ATGg-ATA
292 Ala->Thr tGCT-ACT
293 Pro->Thr tCCT-ACT
293 Pro->Ala tCCT-GCT
293 Pro->Ser tCCT-TCT
294 Leu->Ter TTA-TGA
296 Met->Ile ATGt-ATA
296 Met->Val cATG-GTG
297 Ser->Cys TCT-TGT
297 Ser->Phe TCT-TTT
298 Asn->Ser AAT-AGT
298 Asn->Lys AATg-AAG
298 Asn->His tAAT-CAT
300 Leu->Phe cCTC-TTC
300 Leu->His CTC-CAC
301 Arg->Gly cCGA-GGA
301 Arg->Ter cCGA-TGA
301 Arg->Gln CGA-CAA
301 Arg->Pro CGA-CCA
303 Ile->Asn ATC-AAC
306 Gln->Ter tCAA-TAA
308 Lys->Asn AAAg-AAT
310 Leu->Phe tCTC-TTC
312 Gln->Arg CAG-CGG
312 Gln->His CAGg-CAT
313 Asp->Tyr gGAT-TAT
316 Val->Glu GTA-GAA
317 Ile->Asn ATT-AAT
317 Ile->Thr ATT-ACT
320 Asn->Ile AAT-ATT
320 Asn->Lys AATc-AAG
320 Asn->Tyr cAAT-TAT
321 Gln->Arg CAG-CGG
321 Gln->Leu CAG-CTG
321 Gln->Glu tCAG-GAG
321 Gln->Ter tCAG-TAG
325 Gly->Asp GGC-GAC
327 Gln->Lys gCAA-AAA
327 Gln->Glu gCAA-GAA
328 Gly->Arg aGGG-AGG
328 Gly->Ala GGG-GCG
328 Gly->Val GGG-GTG
330 Gln->Ter cCAG-TAG
333 Gln->Ter aCAG-TAG
338 Glu->Lys tGAA-AAA
338 Glu->Ter tGAA-TAA
340 Trp->Arg gTGG-CGG
340 Trp->Ter TGGg-TGA
341 Glu->Asp GAAc-GAC
341 Glu->Lys gGAA-AAA
342 Arg->Ter aCGA-TGA
342 Arg->Gln CGA-CAA
344 Leu->Pro CTC-CCC
345 Ser->Pro cTCA-CCA
348 Ala->Pro aGCC-CCC
349 Trp->Ter TGG-TAG
350 Ala->Pro gGCT-CCT
352 Ala->Asp GCT-GAT
355 Asn->Lys AACc-AAA
356 Arg->Trp cCGG-TGG
357 Gln->Ter gCAG-TAG
358 Glu->Ala GAG-GCG
358 Glu->Gly GAG-GGG
358 Glu->Lys gGAG-AAG
360 Gly->Ser tGGT-AGT
360 Gly->Cys tGGT-TGT
361 Gly->Arg tGGA-AGA
362 Pro->Leu CCT-CTT
363 Arg->His CGC-CAC
363 Arg->Cys tCGC-TGC
365 Tyr->Ter TATa-TAA
373 Gly->Ser gGGT-AGT
373 Gly->Asp GGT-GAT
377 Ala->Asp GCC-GAC
378 Cys->Arg cTGT-CGT
378 Cys->Tyr TGT-TAT
382 Cys->Tyr TGC-TAC
382 Cys->Trp TGCt-TGG
384 Ile->Asn ATC-AAC
385 Thr->Pro cACA-CCA
386 Gln->Ter aCAG-TAG
386 Gln->Pro CAG-CCG
398 Glu->Lys tGAA-AAA
398 Glu->Ter tGAA-TAA
399 Trp->Ter TGG-TAG
401 Ser->Ter TCA-TGA
403 Leu->Ser TTA-TCA
407 Ile->Lys ATA-AAA
409 Pro->Thr tCCC-ACC
409 Pro->Ala tCCC-GCC
409 Pro->Ser tCCC-TCC
410 Thr->Lys ACA-AAA
410 Thr->Pro cACA-CCA
410 Thr->Ala cACA-GCA
411 Gly->Asp GGC-GAC
414 Leu->Ser TTG-TCG
415 Leu->Pro CTT-CCT
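
The parsing step described before the table can be sketched in Python as follows. This is not the script that was actually used; it is a minimal sketch that assumes the pasted text has one mutation per line in the layout of the table above. The function names `parse_hgmd_row` and `parse_hgmd_text` are hypothetical, and the reading of lowercase letters in the codon column as flanking bases is an assumption.

```python
import re

# One table row looks like "42 Met->Leu cATG-CTG":
# position, wild-type -> mutant residue, wild-type - mutant codon.
ROW_RE = re.compile(
    r"^(\d+)\s+([A-Z][a-z]{2})->([A-Z][a-z]{2})\s+([A-Za-z]+)-([A-Za-z]+)$"
)


def parse_hgmd_row(line):
    """Return a dict for one table row, or None if the line is not a row."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    position, wt_aa, mut_aa, wt_codon, mut_codon = match.groups()
    return {
        "position": int(position),
        "wildtype": wt_aa,
        "mutant": mut_aa,
        # Assumption: lowercase letters are flanking bases, not codon bases.
        "wt_codon": "".join(b for b in wt_codon if b.isupper()),
        "mut_codon": "".join(b for b in mut_codon if b.isupper()),
        # "Ter" marks a stop codon, i.e. a nonsense mutation.
        "type": "nonsense" if mut_aa == "Ter" else "missense",
    }


def parse_hgmd_text(text):
    """Parse all rows in a block of text, skipping headers and blank lines."""
    rows = (parse_hgmd_row(line) for line in text.splitlines())
    return [row for row in rows if row is not None]


# A few rows taken from the table above.
sample = """Position Amino Acids Codons
1 Met->Ile ATGc-ATA
20 Ala->Pro gGCC-CCC
44 Trp->Ter TGG-TAG
"""
records = parse_hgmd_text(sample)
positions = {record["position"] for record in records}
```

Collecting the positions into a set, as in the last line, is convenient later on, because several mutations can share one amino acid position.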

dbSNP

We used dbSNP to search for silent mutations, as they are not listed in HGMD (because they do not lead to a disease). The query "synonymous-codon"[Function_Class] AND gla[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] returned only two results, which are listed in the table below. Insertions and deletions have not been taken into account because they usually cause frameshifts, that is, they change the reading frame used during translation. Such a mutation does not merely change a single amino acid; it most probably changes the whole downstream protein sequence. In addition, the new reading frame may contain a premature stop codon, so that translation terminates too early, or the normal stop codon may be missed, so that the protein becomes too long. In all of these cases the mutation cannot be mapped to the original sequence, and it is therefore excluded from this task.


Identifier | AA-Position | Triplet | Allele | Residue
rs77934640 | 143 | GCA -> GCG | A -> G | A -> A
rs74795363 | 112 | CGC -> CGT | C -> T | R -> R
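
The difference between a silent substitution and a frameshift can be illustrated with a few lines of Python. The sketch below builds the standard genetic code, checks that the two triplet changes from the table above are synonymous, and translates a short toy sequence before and after a single-base deletion. The toy sequence is invented for illustration and is not part of the GLA gene; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" = stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def is_synonymous(ref_codon, alt_codon):
    """True if both codons encode the same amino acid (silent mutation)."""
    return CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]


def translate(dna):
    """Translate complete codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# The two dbSNP entries from the table above.
silent_143 = is_synonymous("GCA", "GCG")  # Ala -> Ala
silent_112 = is_synonymous("CGC", "CGT")  # Arg -> Arg

# Toy coding sequence (hypothetical): ATG GCA CGC TTT GGA
reference = "ATGGCACGCTTTGGA"
substituted = "ATGGCGCGCTTTGGA"                # GCA -> GCG, third codon base
frameshifted = reference[:4] + reference[5:]   # delete one base in codon 2

ref_protein = translate(reference)
sub_protein = translate(substituted)
shift_protein = translate(frameshifted)
```

In this toy case the substitution leaves the protein unchanged, while the deletion alters every residue after the first one, which is why such mutations cannot be mapped position by position.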

Mutation Map

The results from the database searches were mapped to the sequence and highlighted with different colors. Because many HGMD mutations occur at the same amino acid position, there are fewer red or blue letters than mutations in the table in the HGMD section.

  • red: missense/nonsense mutation (HGMD)
  • green: silent mutation (dbSNP)
  • blue: both missense/nonsense and silent mutations

MQLRNPELHL GCALALRFLA LVSWDIPGAR ALDNGLARTP TMGWLHWERF
MCNLDCQEEP DSCISEKLFM EMAELMVSEG WKDAGYEYLC IDDCWMAPQR
DSEGRLQADP QRFPHGIRQL ANYVHSKGLK LGIYADVGNK TCAGFPGSFG
YYDIDAQTFA DWGVDLLKFD GCYCDSLENL ADGYKHMSLA LNRTGRSIVY
SCEWPLYMWP FQKPNYTEIR QYCNHWRNFA DIDDSWKSIK SILDWTSFNQ
ERIVDVAGPG GWNDPDMLVI GNFGLSWNQQ VTQMALWAIM AAPLFMSNDL
RHISPQAKAL LQDKDVIAIN QDPLGKQGYQ LRQGDNFEVW ERPLSGLAWA
VAMINRQEIG GPRSYTIAVA SLGKGVACNP ACFITQLLPV KRKLGFYEWT
SRLRSHINPT GTVLLQLENT MQMSLKDLL
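
The color assignment described above amounts to a simple set comparison per sequence position. The following Python sketch shows one way to express it; the function name `classify_positions` is hypothetical, the HGMD positions are only a small subset of the table in the HGMD section, and "gray" is used for positions without any annotated mutation.

```python
def classify_positions(length, hgmd_positions, dbsnp_positions):
    """Assign a color to every sequence position (1-based)."""
    colors = {}
    for pos in range(1, length + 1):
        in_hgmd = pos in hgmd_positions
        in_dbsnp = pos in dbsnp_positions
        if in_hgmd and in_dbsnp:
            colors[pos] = "blue"    # missense/nonsense and silent
        elif in_hgmd:
            colors[pos] = "red"     # missense/nonsense only
        elif in_dbsnp:
            colors[pos] = "green"   # silent only
        else:
            colors[pos] = "gray"    # no annotated mutation
    return colors


# Subset of HGMD positions for illustration; dbSNP positions from the table.
hgmd_positions = {1, 14, 112, 143}
dbsnp_positions = {112, 143}

# The sequence shown above has 429 residues.
colors = classify_positions(429, hgmd_positions, dbsnp_positions)
```

Both dbSNP positions, 112 and 143, also appear in the HGMD table, so under this scheme they are colored blue rather than green.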


Additionally, we mapped the mutations to the structure of α-galactosidase (see figure 1). For this, we used the PDB structure 1R47. We applied the same color scheme as above. It should be noted that this structure does not include the signal peptide (residues 1 to 31, see UniProt).

Figure 1: The PDB structure 1R47 in cartoon representation. The residues are colored by the following scheme. Gray: no annotated mutation in HGMD or dbSNP; red: a missense/nonsense mutation according to HGMD; green: a silent mutation according to dbSNP; blue: both a missense/nonsense mutation and a silent mutation are annotated for this residue.
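
Because the structure lacks the signal peptide, mutated positions 1 to 31 have to be dropped before coloring. The Python sketch below shows one possible way to do this and to emit a coloring command in PyMOL syntax. It rests on several assumptions: that a viewer such as PyMOL is used (the text does not say which program produced the figure), that the loaded object is named 1R47, and that the residue numbering in the PDB file follows the UniProt sequence. The function name is hypothetical.

```python
SIGNAL_PEPTIDE_END = 31  # residues 1 to 31 are not part of the structure


def pymol_color_command(color, positions, obj="1R47"):
    """Build a PyMOL 'color' command for positions present in the structure.

    Assumes PDB residue numbers equal UniProt sequence positions.
    Returns None if no position lies beyond the signal peptide.
    """
    resolved = sorted(p for p in positions if p > SIGNAL_PEPTIDE_END)
    if not resolved:
        return None
    selection = "+".join(str(p) for p in resolved)
    return f"color {color}, {obj} and resi {selection}"


# Example: positions 1 and 14 fall inside the signal peptide and are skipped.
command = pymol_color_command("red", {1, 14, 40, 41})
print(command)
```

Other residues may be missing from the structure as well (for example disordered regions), so the resulting selection should be checked against the actual file.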
