Difference between revisions of "Mapping SNPs GLA"
Revision as of 16:08, 29 June 2011
by Benjamin Drexler and Fabian Grandke
Introduction
This task is about point mutations in our protein, α-galactosidase (GLA). We searched public databases for SNP data and mapped the results onto the sequence to get an overview of their distribution. The different properties of the databases were taken into account.
Databases
Two databases were used to research SNPs in our protein: HGMD and dbSNP.
HGMD
Human Gene Mutation Database (HGMD) is a database maintained by the Institute of Medical Genetics in Cardiff.
The table below shows the different types of mutations that are listed in HGMD:
Mutation Type | Number of Mutations | Description |
---|---|---|
Missense/nonsense | 344 | different amino acid/stop codon |
Splicing | 26 | different splice site |
Regulatory | 1 | substitutions cause regulatory abnormalities |
Small deletions | 69 | micro-deletions (20 bp or less) |
Small insertions | 28 | micro-insertions (20 bp or less) |
Small indels | 8 | micro-indels (20 bp or less) |
Gross deletions | 14 | deletions in regions of data with variable quality |
Gross insertions/duplications | 1 | insertions/duplications in regions of data with variable quality |
Complex rearrangements | 3 | sequence is not continuous |
Repeat variations | 0 | variations of the sequence are repeated |
The professional version provides many more mutations. The mutations in the free version are at least 2.5 years old (date of publication).
The table below shows all missense and nonsense mutations from HGMD. To gather them, we searched HGMD for gla, chose the drop-down option "Gene symbol", copied and pasted the text from the website into a file, and parsed out the required information.
Position | Amino Acids | Codons |
---|---|---|
1 | Met->Thr | ATG-ACG |
1 | Met->Arg | ATG-AGG |
1 | Met->Ile | ATGc-ATA |
14 | Leu->Pro | CTT-CCT |
16 | Leu->Pro | CTT-CCT |
18 | Phe->Ser | TTC-TCC |
19 | Leu->Pro | CTG-CCG |
20 | Ala->Pro | gGCC-CCC |
31 | Ala->Val | GCA-GTA |
32 | Leu->Pro | CTG-CCG |
34 | Asn->Ser | AAT-AGT |
34 | Asn->Lys | AATg-AAG |
35 | Gly->Arg | tGGA-AGA |
40 | Pro->Leu | CCT-CTT |
40 | Pro->Ser | gCCT-TCT |
41 | Thr->Ile | ACC-ATC |
42 | Met->Thr | ATG-ACG |
42 | Met->Leu | cATG-CTG |
42 | Met->Val | cATG-GTG |
43 | Gly->Asp | GGC-GAC |
43 | Gly->Val | GGC-GTC |
43 | Gly->Arg | gGGC-CGC |
44 | Trp->Ter | TGG-TAG |
46 | His->Arg | CAC-CGC |
46 | His->Tyr | gCAC-TAC |
47 | Trp->Gly | cTGG-GGG |
47 | Trp->Leu | TGG-TTG |
48 | Glu->Lys | gGAG-AAG |
49 | Arg->Pro | CGC-CCC |
49 | Arg->Leu | CGC-CTC |
49 | Arg->Ser | gCGC-AGC |
49 | Arg->Gly | gCGC-GGC |
50 | Phe->Cys | TTC-TGC |
51 | Met->Lys | ATG-AAG |
51 | Met->Ile | ATGt-ATA |
52 | Cys->Arg | gTGC-CGC |
52 | Cys->Ser | TGC-TCC |
52 | Cys->Ter | TGCa-TGA |
54 | Leu->Pro | CTT-CCT |
56 | Cys->Gly | cTGC-GGC |
56 | Cys->Tyr | TGC-TAC |
56 | Cys->Phe | TGC-TTC |
56 | Cys->Ter | TGCc-TGA |
59 | Glu->Lys | aGAG-AAG |
63 | Cys->Tyr | TGC-TAC |
66 | Glu->Gly | GAG-GGG |
66 | Glu->Lys | tGAG-AAG |
66 | Glu->Gln | tGAG-CAG |
68 | Leu->Phe | gCTC-TTC |
72 | Met->Arg | ATG-AGG |
72 | Met->Ile | ATGg-ATA |
72 | Met->Val | gATG-GTG |
73 | Ala->Val | GCA-GTA |
76 | Met->Thr | ATG-ACG |
78 | Ser->Ter | TCA-TGA |
79 | Glu->Ter | aGAA-TAA |
81 | Trp->Ter | TGG-TAG |
81 | Trp->Ser | TGG-TCG |
85 | Gly->Asp | GGT-GAT |
86 | Tyr->Cys | TAT-TGT |
86 | Tyr->Ter | TATg-TAG |
88 | Tyr->Asp | gTAC-GAC |
89 | Leu->Pro | CTC-CCC |
89 | Leu->Arg | CTC-CGC |
91 | Ile->Thr | ATT-ACT |
92 | Asp->Asn | tGAT-AAT |
92 | Asp->His | tGAT-CAT |
92 | Asp->Tyr | tGAT-TAT |
93 | Asp->Gly | GAC-GGC |
93 | Asp->Val | GAC-GTC |
93 | Asp->Asn | tGAC-AAC |
94 | Cys->Tyr | TGT-TAT |
94 | Cys->Ser | TGT-TCT |
95 | Trp->Ter | TGG-TAG |
95 | Trp->Ser | TGG-TCG |
97 | Ala->Val | GCT-GTT |
97 | Ala->Pro | gGCT-CCT |
99 | Gln->Ter | cCAA-TAA |
100 | Arg->Lys | AGA-AAA |
100 | Arg->Thr | AGA-ACA |
102 | Ser->Ter | TCA-TGA |
103 | Glu->Gln | aGAA-CAA |
106 | Leu->Arg | CTT-CGT |
107 | Gln->Ter | tCAG-TAG |
112 | Arg->His | CGC-CAC |
112 | Arg->Ser | gCGC-AGC |
112 | Arg->Cys | gCGC-TGC |
113 | Phe->Leu | cTTT-CTT |
113 | Phe->Ser | TTT-TCT |
118 | Arg->Cys | tCGC-TGC |
119 | Gln->Ter | cCAG-TAG |
121 | Ala->Pro | aGCT-CCT |
125 | His->Pro | CAC-CCC |
126 | Ser->Gly | cAGC-GGC |
127 | Lys->Ter | cAAA-TAA |
128 | Gly->Glu | GGA-GAA |
129 | Leu->Pro | CTG-CCG |
131 | Leu->Pro | CTA-CCA |
132 | Gly->Arg | aGGG-AGG |
134 | Tyr->Ser | TAT-TCT |
134 | Tyr->Ter | TATg-TAG |
135 | Ala->Val | GCA-GTA |
136 | Asp->His | aGAT-CAT |
138 | Gly->Glu | GGA-GAA |
138 | Gly->Arg | tGGA-AGA |
141 | Thr->Ile | ACC-ATC |
142 | Cys->Arg | cTGC-CGC |
142 | Cys->Tyr | TGC-TAC |
142 | Cys->Ter | TGCg-TGA |
142 | Cys->Trp | TGCg-TGG |
143 | Ala->Thr | cGCA-ACA |
143 | Ala->Pro | cGCA-CCA |
144 | Gly->Val | GGC-GTC |
146 | Pro->Ser | cCCT-TCT |
147 | Gly->Arg | tGGG-AGG |
148 | Ser->Asn | AGT-AAT |
148 | Ser->Arg | AGTt-AGG |
151 | Tyr->Ter | TACt-TAG |
152 | Tyr->Ter | TACg-TAA |
155 | Asp->His | tGAT-CAT |
156 | Ala->Val | GCC-GTC |
156 | Ala->Thr | tGCC-ACC |
157 | Gln->Ter | cCAG-TAG |
162 | Trp->Arg | cTGG-CGG |
162 | Trp->Ter | TGG-TAG |
162 | Trp->Cys | TGGg-TGC |
163 | Gly->Val | GGA-GTA |
165 | Asp->His | aGAT-CAT |
165 | Asp->Val | GAT-GTT |
166 | Leu->Val | tCTG-GTG |
167 | Leu->Pro | CTA-CCA |
168 | Lys->Arg | AAA-AGA |
169 | Phe->Ser | TTT-TCT |
170 | Asp->Val | GAT-GTT |
170 | Asp->His | tGAT-CAT |
171 | Gly->Asp | GGT-GAT |
171 | Gly->Arg | tGGT-CGT |
172 | Cys->Tyr | TGT-TAT |
172 | Cys->Phe | TGT-TTT |
172 | Cys->Trp | TGTt-TGG |
172 | Cys->Arg | tTGT-CGT |
172 | Cys->Gly | tTGT-GGT |
173 | Tyr->Ter | TACt-TAA |
177 | Leu->Ter | TTG-TAG |
183 | Gly->Asp | GGT-GAT |
183 | Gly->Ser | tGGT-AGT |
187 | Met->Thr | ATG-ACG |
187 | Met->Val | cATG-GTG |
191 | Leu->Gln | CTG-CAG |
191 | Leu->Pro | CTG-CCG |
194 | Thr->Ile | ACT-ATT |
199 | Val->Met | tGTG-ATG |
201 | Ser->Tyr | TCC-TAC |
201 | Ser->Phe | TCC-TTC |
202 | Cys->Tyr | TGT-TAT |
202 | Cys->Trp | TGTg-TGG |
204 | Trp->Ter | TGG-TAG |
205 | Pro->Arg | CCT-CGT |
205 | Pro->Leu | CCT-CTT |
205 | Pro->Thr | gCCT-ACT |
207 | Tyr->Ser | TAT-TCT |
215 | Asn->Ser | AAT-AGT |
216 | Tyr->Asp | tTAT-GAT |
220 | Arg->Ter | cCGA-TGA |
221 | Gln->Ter | aCAG-TAG |
222 | Tyr->Ter | TACt-TAA |
223 | Cys->Arg | cTGC-CGC |
223 | Cys->Gly | cTGC-GGC |
223 | Cys->Tyr | TGC-TAC |
224 | Asn->Ser | AAT-AGT |
224 | Asn->Asp | cAAT-GAT |
225 | His->Arg | CAC-CGC |
226 | Trp->Arg | cTGG-CGG |
226 | Trp->Ter | TGG-TAG |
226 | Trp->Cys | TGGc-TGT |
227 | Arg->Gln | CGA-CAA |
227 | Arg->Ter | gCGA-TGA |
230 | Ala->Thr | tGCT-ACT |
231 | Asp->Val | GAC-GTC |
231 | Asp->Asn | tGAC-AAC |
234 | Asp->Glu | GATt-GAG |
234 | Asp->Tyr | tGAT-TAT |
235 | Ser->Cys | TCC-TGC |
235 | Ser->Phe | TCC-TTC |
236 | Trp->Arg | cTGG-CGG |
236 | Trp->Ter | TGG-TAG |
236 | Trp->Leu | TGG-TTG |
236 | Trp->Cys | TGGa-TGC |
239 | Ile->Thr | ATA-ACA |
242 | Ile->Asn | ATC-AAC |
243 | Leu->Phe | TTGg-TTC |
244 | Asp->Asn | gGAC-AAC |
244 | Asp->His | gGAC-CAC |
245 | Trp->Ter | TGG-TAG |
247 | Ser->Pro | aTCT-CCT |
247 | Ser->Cys | TCT-TGT |
250 | Gln->Ter | cCAG-TAG |
251 | Glu->Ter | gGAG-TAG |
257 | Ala->Pro | tGCT-CCT |
258 | Gly->Val | GGA-GTA |
258 | Gly->Arg | tGGA-CGA |
259 | Pro->Arg | CCA-CGA |
259 | Pro->Leu | CCA-CTA |
260 | Gly->Ala | GGG-GCG |
261 | Gly->Asp | GGT-GAT |
262 | Trp->Ter | TGG-TAG |
262 | Trp->Cys | TGGa-TGC |
263 | Asn->Ser | AAT-AGT |
264 | Asp->Val | GAC-GTC |
264 | Asp->Tyr | tGAC-TAC |
265 | Pro->Arg | CCA-CGA |
265 | Pro->Leu | CCA-CTA |
266 | Asp->Asn | aGAT-AAT |
266 | Asp->His | aGAT-CAT |
266 | Asp->Val | GAT-GTT |
266 | Asp->Glu | GATa-GAA |
267 | Met->Arg | ATG-AGG |
267 | Met->Ile | ATGg-ATA |
268 | Leu->Ser | TTA-TCA |
269 | Val->Met | aGTG-ATG |
269 | Val->Ala | GTG-GCG |
270 | Ile->Thr | ATT-ACT |
271 | Gly->Val | GGC-GTC |
271 | Gly->Ser | tGGC-AGC |
271 | Gly->Cys | tGGC-TGC |
272 | Asn->Lys | AACt-AAA |
276 | Ser->Asn | AGC-AAC |
276 | Ser->Gly | cAGC-GGC |
277 | Trp->Ter | TGG-TAG |
279 | Gln->Arg | CAG-CGG |
279 | Gln->His | CAGc-CAC |
279 | Gln->Glu | tCAG-GAG |
280 | Gln->His | CAAg-CAT |
280 | Gln->Lys | gCAA-AAA |
282 | Thr->Ala | aACT-GCT |
282 | Thr->Asn | ACT-AAT |
283 | Gln->Pro | CAG-CCG |
284 | Met->Thr | ATG-ACG |
285 | Ala->Asp | GCC-GAC |
285 | Ala->Pro | gGCC-CCC |
287 | Trp->Gly | cTGG-GGG |
287 | Trp->Ter | TGGg-TGA |
287 | Trp->Cys | TGGg-TGT |
288 | Ala->Asp | GCT-GAT |
288 | Ala->Pro | gGCT-CCT |
289 | Ile->Ser | ATC-AGC |
289 | Ile->Phe | tATC-TTC |
290 | Met->Ile | ATGg-ATA |
292 | Ala->Thr | tGCT-ACT |
293 | Pro->Thr | tCCT-ACT |
293 | Pro->Ala | tCCT-GCT |
293 | Pro->Ser | tCCT-TCT |
294 | Leu->Ter | TTA-TGA |
296 | Met->Ile | ATGt-ATA |
296 | Met->Val | cATG-GTG |
297 | Ser->Cys | TCT-TGT |
297 | Ser->Phe | TCT-TTT |
298 | Asn->Ser | AAT-AGT |
298 | Asn->Lys | AATg-AAG |
298 | Asn->His | tAAT-CAT |
300 | Leu->Phe | cCTC-TTC |
300 | Leu->His | CTC-CAC |
301 | Arg->Gly | cCGA-GGA |
301 | Arg->Ter | cCGA-TGA |
301 | Arg->Gln | CGA-CAA |
301 | Arg->Pro | CGA-CCA |
303 | Ile->Asn | ATC-AAC |
306 | Gln->Ter | tCAA-TAA |
308 | Lys->Asn | AAAg-AAT |
310 | Leu->Phe | tCTC-TTC |
312 | Gln->Arg | CAG-CGG |
312 | Gln->His | CAGg-CAT |
313 | Asp->Tyr | gGAT-TAT |
316 | Val->Glu | GTA-GAA |
317 | Ile->Asn | ATT-AAT |
317 | Ile->Thr | ATT-ACT |
320 | Asn->Ile | AAT-ATT |
320 | Asn->Lys | AATc-AAG |
320 | Asn->Tyr | cAAT-TAT |
321 | Gln->Arg | CAG-CGG |
321 | Gln->Leu | CAG-CTG |
321 | Gln->Glu | tCAG-GAG |
321 | Gln->Ter | tCAG-TAG |
325 | Gly->Asp | GGC-GAC |
327 | Gln->Lys | gCAA-AAA |
327 | Gln->Glu | gCAA-GAA |
328 | Gly->Arg | aGGG-AGG |
328 | Gly->Ala | GGG-GCG |
328 | Gly->Val | GGG-GTG |
330 | Gln->Ter | cCAG-TAG |
333 | Gln->Ter | aCAG-TAG |
338 | Glu->Lys | tGAA-AAA |
338 | Glu->Ter | tGAA-TAA |
340 | Trp->Arg | gTGG-CGG |
340 | Trp->Ter | TGGg-TGA |
341 | Glu->Asp | GAAc-GAC |
341 | Glu->Lys | gGAA-AAA |
342 | Arg->Ter | aCGA-TGA |
342 | Arg->Gln | CGA-CAA |
344 | Leu->Pro | CTC-CCC |
345 | Ser->Pro | cTCA-CCA |
348 | Ala->Pro | aGCC-CCC |
349 | Trp->Ter | TGG-TAG |
350 | Ala->Pro | gGCT-CCT |
352 | Ala->Asp | GCT-GAT |
355 | Asn->Lys | AACc-AAA |
356 | Arg->Trp | cCGG-TGG |
357 | Gln->Ter | gCAG-TAG |
358 | Glu->Ala | GAG-GCG |
358 | Glu->Gly | GAG-GGG |
358 | Glu->Lys | gGAG-AAG |
360 | Gly->Ser | tGGT-AGT |
360 | Gly->Cys | tGGT-TGT |
361 | Gly->Arg | tGGA-AGA |
362 | Pro->Leu | CCT-CTT |
363 | Arg->His | CGC-CAC |
363 | Arg->Cys | tCGC-TGC |
365 | Tyr->Ter | TATa-TAA |
373 | Gly->Ser | gGGT-AGT |
373 | Gly->Asp | GGT-GAT |
377 | Ala->Asp | GCC-GAC |
378 | Cys->Arg | cTGT-CGT |
378 | Cys->Tyr | TGT-TAT |
382 | Cys->Tyr | TGC-TAC |
382 | Cys->Trp | TGCt-TGG |
384 | Ile->Asn | ATC-AAC |
385 | Thr->Pro | cACA-CCA |
386 | Gln->Ter | aCAG-TAG |
386 | Gln->Pro | CAG-CCG |
398 | Glu->Lys | tGAA-AAA |
398 | Glu->Ter | tGAA-TAA |
399 | Trp->Ter | TGG-TAG |
401 | Ser->Ter | TCA-TGA |
403 | Leu->Ser | TTA-TCA |
407 | Ile->Lys | ATA-AAA |
409 | Pro->Thr | tCCC-ACC |
409 | Pro->Ala | tCCC-GCC |
409 | Pro->Ser | tCCC-TCC |
410 | Thr->Lys | ACA-AAA |
410 | Thr->Pro | cACA-CCA |
410 | Thr->Ala | cACA-GCA |
411 | Gly->Asp | GGC-GAC |
414 | Leu->Ser | TTG-TCG |
415 | Leu->Pro | CTT-CCT |
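The parsing step mentioned before the table can be sketched roughly as follows. This is a minimal sketch and not necessarily the script used for this page: the names `Mutation`, `strip_flanking` and `parse_hgmd_rows` are hypothetical, the input is assumed to look like the rows of the table above, and lowercase letters in the codon column are assumed to be flanking bases that are not part of the codon.

```python
from typing import NamedTuple


class Mutation(NamedTuple):
    position: int    # amino acid position in the protein
    wild_type: str   # three-letter code of the original residue
    mutant: str      # three-letter code of the new residue ("Ter" = stop)
    wt_codon: str    # original codon with flanking bases removed
    mut_codon: str   # mutated codon


def strip_flanking(codon: str) -> str:
    """Drop lowercase letters (assumed here to be flanking bases)."""
    return "".join(base for base in codon if base.isupper())


def parse_hgmd_rows(lines):
    """Parse rows such as '1 | Met->Ile | ATGc-ATA |' into Mutation records."""
    mutations = []
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        # Skip the header, the separator row and empty lines.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        wild_type, mutant = fields[1].split("->")
        wt_codon, mut_codon = fields[2].split("-")
        mutations.append(Mutation(int(fields[0]), wild_type, mutant,
                                  strip_flanking(wt_codon), mut_codon))
    return mutations


# A few rows copied from the table above.
rows = [
    "Position | Amino Acids | Codons |",
    "---|---|---|",
    "1 | Met->Thr | ATG-ACG |",
    "1 | Met->Ile | ATGc-ATA |",
    "20 | Ala->Pro | gGCC-CCC |",
    "44 | Trp->Ter | TGG-TAG |",
]
mutations = parse_hgmd_rows(rows)
nonsense = [m for m in mutations if m.mutant == "Ter"]
positions = sorted({m.position for m in mutations})
print(len(mutations), "mutations at", len(positions), "positions,",
      len(nonsense), "nonsense")
```

Collecting the distinct positions separately from the mutation records is useful later on, because several mutations can share one position in the sequence.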
dbSNP
We used dbSNP to search for silent mutations, as they are not listed in HGMD (because they do not lead to a disease). The query "synonymous-codon"[Function_Class] AND gla[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] returned only two results, which are listed in the table below. Insertions and deletions have not been taken into account because they usually cause frameshifts, meaning that the reading frame used for translation is changed. Such a mutation does not just change a single amino acid; it most probably changes the whole downstream protein sequence. Additionally, the new reading frame may contain a premature stop codon, so that translation terminates too early, or the normal stop codon is missed and the sequence becomes too long. In all of these cases the mutation cannot be mapped to the initial sequence and is therefore excluded from this task.
Identifier | AA-Position | Triplet | Allele | Residue |
---|---|---|---|---|
rs77934640 | 143 | GCA -> GCG | A -> G | A -> A |
rs74795363 | 112 | CGC -> CGT | C -> T | R -> R |
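The two points made in this section, that both dbSNP entries are silent and that a frameshift changes everything downstream, can be illustrated with a small sketch. It uses the standard genetic code; the helper names `translate` and `is_synonymous` are hypothetical, and the `toy` sequence is made up for illustration and is NOT the GLA coding sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_synonymous(ref_codon: str, alt_codon: str) -> bool:
    """True if both codons encode the same amino acid."""
    return CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]


# The two dbSNP entries from the table above.
print(is_synonymous("GCA", "GCG"))  # True (Ala -> Ala)
print(is_synonymous("CGC", "CGT"))  # True (Arg -> Arg)

# Toy example of a frameshift (made-up sequence, not GLA).
toy = "ATGGCACGCTTTGGA"
deleted = toy[:4] + toy[5:]         # delete a single base from the second codon
print(translate(toy))               # MARFG
print(translate(deleted))           # MDAL
```

After the single-base deletion every residue following the start codon differs from the original, which is why such mutations cannot be assigned to one position of the reference sequence.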
Mutation Map
The results from the database searches were mapped to the sequence and highlighted with different colors. As many mutations from HGMD occur at the same amino acid position in the sequence, there are fewer red or blue letters than mutations in the table in the HGMD section.
- red: missense/nonsense mutation (HGMD)
- blue: both a missense/nonsense mutation (HGMD) and a silent mutation (dbSNP)
MQLRNPELHL
GCALALRFLA
LVSWDIPGAR
ALDNGLARTP
TMGWLHWERF
MCNLDCQEEP
DSCISEKLFM
EMAELMVSEG
WKDAGYEYLC
IDDCWMAPQR
DSEGRLQADP
QRFPHGIRQL
ANYVHSKGLK
LGIYADVGNK
TCAGFPGSFG
YYDIDAQTFA
DWGVDLLKFD
GCYCDSLENL
ADGYKHMSLA
LNRTGRSIVY
SCEWPLYMWP
FQKPNYTEIR
QYCNHWRNFA
DIDDSWKSIK
SILDWTSFNQ
ERIVDVAGPG
GWNDPDMLVI
GNFGLSWNQQ
VTQMALWAIM
AAPLFMSNDL
RHISPQAKAL
LQDKDVIAIN
QDPLGKQGYQ
LRQGDNFEVW
ERPLSGLAWA
VAMINRQEIG
GPRSYTIAVA
SLGKGVACNP
ACFITQLLPV
KRKLGFYEWT
SRLRSHINPT
GTVLLQLENT
MQMSLKDLL
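A mapping like the one above could be produced along the following lines. This is only a sketch under stated assumptions: the function names `position_colors` and `render_map` are hypothetical, the colors are replaced by a marker line (`r` = red, `b` = blue, `.` = no mutation) because plain text has no colors, and only the first 20 residues are used to keep the example short.

```python
# First 20 residues of the sequence shown above.
SEQ_START = "MQLRNPELHLGCALALRFLA"


def position_colors(hgmd_positions, dbsnp_positions):
    """Map 1-based positions to 'r' (HGMD only) or 'b' (HGMD and dbSNP).

    Positions that appear only in dbSNP stay unmarked, because the
    legend above defines no color for them (an assumption of this sketch).
    """
    colors = {}
    for pos in hgmd_positions:
        colors[pos] = "b" if pos in dbsnp_positions else "r"
    return colors


def render_map(sequence, colors, width=10):
    """Return (residues, markers) pairs with `width` residues per line."""
    blocks = []
    for start in range(0, len(sequence), width):
        residues = sequence[start:start + width]
        markers = "".join(colors.get(start + offset + 1, ".")
                          for offset in range(len(residues)))
        blocks.append((residues, markers))
    return blocks


hgmd = {1, 14, 16, 18, 19, 20}  # HGMD positions within the first 20 residues
dbsnp = set()                   # both dbSNP hits (112, 143) lie further downstream
for residues, markers in render_map(SEQ_START, position_colors(hgmd, dbsnp)):
    print(residues)
    print(markers)
```

Because the colors are stored per position rather than per mutation, positions with several HGMD entries are marked only once, which matches the remark that there are fewer colored letters than table rows.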
Additionally, we mapped the mutations to the structure of α-galactosidase (see figure 1). For this, we used the PDB structure 1R47. We applied the same color scheme as above. It should be noted that this structure does not include the signal peptide (residues 1 to 31, see UniProt).
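If the structure is viewed in PyMOL, the selection and coloring commands could be generated from the position lists as sketched below. Several things here are assumptions rather than facts from this page: the use of PyMOL itself, the helper name `pymol_commands`, and in particular that the residue numbers in the PDB file follow the UniProt numbering (this should be checked before use).

```python
SIGNAL_PEPTIDE_END = 31  # residues 1 to 31 are not part of the structure


def pymol_commands(positions, name, color):
    """Build PyMOL commands that select and color the given residues.

    Positions inside the signal peptide are dropped, since they cannot
    be shown on the structure.
    """
    in_structure = sorted(p for p in positions if p > SIGNAL_PEPTIDE_END)
    if not in_structure:
        return []
    resi = "+".join(str(p) for p in in_structure)
    return [f"select {name}, resi {resi}", f"color {color}, {name}"]


# Example with a handful of HGMD positions from the table in the HGMD section.
for command in pymol_commands({1, 20, 34, 40, 112}, "hgmd", "red"):
    print(command)
```

The printed lines can be pasted into the PyMOL command line after loading the structure; positions 1 and 20 are skipped because they belong to the signal peptide.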