Difference between revisions of "Mapping SNPs"

From Bioinformatikpedia
Revision as of 18:26, 20 June 2011

HGMD

The Human Gene Mutation Database (HGMD) is a collection of disease-related genes and their mutations provided by Cardiff University.
Mutation types:

  • Missense/nonsense: a codon change that codes for a different amino acid (missense) or a premature stop codon (nonsense).
  • Splicing: mutations that affect splicing and thereby change the number or composition of exons.
  • Regulatory: mutations that affect the regulation of gene expression.
  • Small deletions: deletion of a few nucleotides.
  • Small insertions: insertion of a few nucleotides.
  • Small indels: replacement of a few nucleotides (a deletion combined with an insertion).
  • Gross deletions: large deletions caused by changes in DNA structure.
  • Gross insertions/duplications: large insertions and duplications caused by changes in DNA structure.
  • Complex rearrangements: changes of gene position.
  • Repeat variations: variation in the number of microsatellite repeats, which can differ in homologous proteins, splicing variants and expression types.


HFE

Main article: Hemochromatosis

Mutation type | # of mutations
Missense/nonsense | 24
Splicing | 3
Regulatory | 1
Small deletions | 4
Small insertions | 1
Small indels | 0
Gross deletions | 1
Gross insertions/duplications | 0
Complex rearrangements | 1
Repeat variations | 0
Public total | 35

The sequence used by HGMD is the DNA sequence that encodes the HFE protein. We used the 'DNA-RNA-protein' translator (somewhat buggy and functional only in Internet Explorer) to translate the DNA sequence into RNA and then into the protein sequence. Afterwards we used Jalview 2.6.1 to align the translated protein sequence against the UniProt FASTA entry HFE_HUMAN. The alignment shows a 100% match, so we use the UniProt sequence to visualize the SNPs.
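The translation and comparison step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the 'DNA-RNA-protein' tool or Jalview: the function names are made up for this example, the standard genetic code is assumed, and the identity check only handles sequences of equal length (no gaps). The example DNA is the start of the HFE coding sequence as it appears in the flanking sequence of rs114758821.

```python
# Standard genetic code, built in the usual U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Turn a coding-strand DNA sequence into mRNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate mRNA codon by codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODONS[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def percent_identity(a, b):
    """Percentage of identical residues in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# First seven codons of HFE, taken from the rs114758821 flanking sequence.
dna = "ATGGGCCCGCGAGCCAGGCCA"
translated = translate(transcribe(dna))
uniprot_start = "MGPRARP"  # first seven residues of HFE_HUMAN
identity = percent_identity(translated, uniprot_start)
```

For full-length sequences that may differ in length, a proper alignment (as done with Jalview) is still needed; the identity function here is only a sanity check.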

dbSNP

synonymous

'Synonymous' or 'silent' mutations are mutations that affect only the nucleotide sequence and not the amino acid sequence. The protein stays the same because the mutated codon encodes the same amino acid as the original codon. In special cases, however, a silent mutation can alter an intron/exon splice site. This changes the amino acid sequence of the product even though the affected codon itself still encodes the same amino acid.

We found 4 SNPs in dbSNP for Homo sapiens using the following query:

  • "synonymous-codon"[Function_Class] AND HFE[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS]
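A query of this form could also be assembled programmatically. The sketch below only builds the search term and an NCBI E-utilities ESearch URL; it does not send a request. The helper names are hypothetical, and the field tags are copied from the query above, so they reflect dbSNP as it was queried here and may not be accepted unchanged by the current service.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def dbsnp_term(function_class, gene="HFE", organism="human"):
    """Build a dbSNP search term in the same shape as the query above."""
    parts = [
        f'"{function_class}"[Function_Class]',
        f"{gene}[GENE]",
        f'"{organism}"[ORGN]',
        '"snp"[SNP_CLASS]',
    ]
    return " AND ".join(parts)


def esearch_url(term, retmax=100):
    """Return an ESearch URL for the dbSNP database (no request is made)."""
    return ESEARCH + "?" + urlencode({"db": "snp", "term": term, "retmax": retmax})


synonymous_term = dbsnp_term("synonymous-codon")
synonymous_url = esearch_url(synonymous_term)
```

Passing "missense" as the function class yields the corresponding term for non-synonymous SNPs.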

ID | Sequence | Mutation | Position
rs114758821 | GGGGAAATGGGCCCGCGAGCCAGGCC[A/G]GCGCTTCTCCTCCTGATGCTTTTGC | A/G | 7
rs114038675 | CTTTAACTTGCTTTTTCTGTTTTAGA[A/G]CCCTCACCGTCTGGCACCCTAGTCA | A/G | 26
rs62625342 | GTGGAGCCCCGAACTCCATGGGTTTC[C/T]AGTAGAATTTCAAGCCAGATGTGGC | C/T | 76
rs35201683 | TCCACAGGAGGAGCCATGGGGCACTA[C/T]GTCTTAGCTGAACGTGAGTGACACG | C/T | 70

non-synonymous

'Non-synonymous' or 'missense' mutations are mutations that affect both the nucleotide and the amino acid sequence. This happens when a codon is changed into a codon that encodes a different amino acid, so the sequence of the protein changes. This can alter the protein's function or abolish it entirely.
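The distinction between synonymous and non-synonymous changes can be made concrete with a small classifier. This is a minimal sketch assuming the standard genetic code. The function name is made up for this example, and the codons are read off the dbSNP flanking sequences of rs114758821, rs1800562 and rs1799945, taking the allele that matches the UniProt residue as the reference.

```python
# Standard genetic code for DNA codons, in T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify(ref_codon, alt_codon):
    """Return (kind, residue change) for a single codon substitution."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"  # premature stop codon
    else:
        kind = "missense"
    return kind, f"{ref_aa}/{alt_aa}"


silent = classify("CCA", "CCG")   # codon 7, rs114758821
c282y = classify("TGC", "TAC")    # codon 282, rs1800562
h63d = classify("CAT", "GAT")     # codon 63, rs1799945
```

The classifier looks only at the codon, so it cannot detect the splice-site effects that a silent mutation may have.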


We found 13 SNPs in dbSNP for Homo sapiens using the following query:

  • "missense"[Function_Class] AND HFE[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS]

ID | Sequence | Mutation | Residue change | Position
rs111033563 | TGGGGAAGAGCAGAGATATACGTGCC[A/C]GGTGGAGCACCCAGGCCTGGATCAG | A/C | Q/P | 283
rs111033558 | CATTGGAATTTTGTTCATAATATTAA[G/T]GAAGAGGCAGGGTTCAAGTGAGTAG | G/T | R/M | 58
rs111033557 | TGGGCTACGTGGATGACCAGCTGTTC[A/G]TGTTCTATGATCATGAGAGTCGCCG | A/G | V/M | 59
rs62625346 | TGTGACCTCTTCAGTGACCACTCTAC[A/G]GTGTCGGGCCTTGAACTACTACCCC | A/G | R/Q | 224
rs28934889 | TTTCCTTGTTTGAAGCTTTGGGCTAC[A/G]TGGATGACCAGCTGTTCGTGTTCTA | A/G | V/M | 53
rs28934597 | GGCTGCAGCTGAGTCAGAGTCTGAAA[C/G]GGTGGGATCACATGTTCACTGTTGA | C/G | G/R | 93
rs28934596 | CATGTTCACTGTTGACTTCTGGACTA[C/T]TATGGAAAATCACAACCACAGCAAG | C/T | I/T | 105
rs28934595 | CAGGTCATCCTGGGCTGTGAAATGCA[A/C]GAAGACAACAGTACCGAGGGCTACT | A/C | Q/H | 127
rs4986950 | TTTGGTGAAGGTGACACATCATGTGA[C/T]CTCTTCAGTGACCACTCTACGGTGT | C/T | T/I | 217
rs2242956 | TTCACACTCTCTGCACTACCTCTTCA[C/T]GGGTGCCTCAGAGCAGGACCTTGGT | C/T | M/T | 35
rs1800730 | AGCTGTTCGTGTTCTATGATCATGAG[A/T]GTCGCCGTGTGGAGCCCCGAACTCC | A/T | S/C | 65
rs1800562 | CCCTGGGGAAGAGCAGAGATATACGT[A/G]CCAGGTGGAGCACCCAGGCCTGGAT | A/G | C/Y | 282
rs1799945 | ATGACCAGCTGTTCGTGTTCTATGAT[C/G]ATGAGAGTCGCCGTGTGGAGCCCCG | C/G | H/D | 63

Residue exchanges marked in red are not annotated in dbSNP yet. We determined these changes by translating the nucleotide sequence in each reading frame and mapping the translations onto the protein sequence with a global alignment method (Needleman-Wunsch). If a frame matches the sequence, we assume that this frame shows the amino acid exchange. Mutations are annotated with the UniProt sequence as reference: G/R means that G occurs in the UniProt sequence and R in the mutated sequence.
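The frame search can be illustrated with a simplified sketch. Note the swap: instead of a Needleman-Wunsch alignment, the code below uses an exact substring search, which works only when the flanking sequence lies entirely within one exon and carries the reference allele. The function names are made up for this example, and the protein string is the first 120 residues of the UniProt HFE_HUMAN sequence.

```python
# Standard genetic code for DNA codons, in T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 120 residues of HFE_HUMAN (UniProt Q30201).
HFE_START = (
    "MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF"
    "YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV"
)


def translate(dna, offset=0):
    """Translate complete codons starting at the given offset; '*' marks stops."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


def locate_variant(flank5, allele, flank3, protein):
    """Return (frame offset, 1-based residue position, residue) or None."""
    dna = flank5 + allele + flank3
    snp_index = len(flank5)  # 0-based index of the variant base
    for offset in range(3):
        peptide = translate(dna, offset)
        start = protein.find(peptide)  # exact match stands in for alignment
        if start == -1 or snp_index < offset:
            continue
        codon_index = (snp_index - offset) // 3
        if codon_index < len(peptide):
            return offset, start + codon_index + 1, peptide[codon_index]
    return None


# rs1799945 with the allele that matches the UniProt sequence (C).
hit = locate_variant(
    "ATGACCAGCTGTTCGTGTTCTATGAT", "C", "ATGAGAGTCGCCGTGTGGAGCCCCG", HFE_START
)
```

A flanking sequence that runs into an intron, or one that carries the mutated allele, produces no exact match, which is where a real alignment is required.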

Mutation map

3 of the 24 missense mutations listed in HGMD are at the same codon, therefore we marked only the remaining 21 positions below. There are 5 annotated silent SNPs in dbSNP, but rs62625348 is cross-referenced to rs35201683 and the annotation of rs62625342 is incomplete. Therefore it was not possible to map all 5 references onto the sequence.

>sp|Q30201|HFE_HUMAN Hereditary hemochromatosis protein OS=Homo sapiens GN=HFE PE=1 SV=1
MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF 
YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV 
ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR 
AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL 
KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS 
PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE

HGMD (missense/nonsense)
dbSNP (synonymous)
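A map like the one above can be produced by marking residues at given positions and checking two position lists for overlap. The sketch below is a plain-text stand-in for the colored markup. The function names are made up for this example, and the HGMD positions are placeholder values, since the individual HGMD positions are not listed in this text.

```python
def mark_positions(sequence, positions, left="[", right="]"):
    """Wrap the residues at the given 1-based positions in markers."""
    wanted = set(positions)
    marked = []
    for index, residue in enumerate(sequence, start=1):
        if index in wanted:
            marked.append(f"{left}{residue}{right}")
        else:
            marked.append(residue)
    return "".join(marked)


def shared_positions(first, second):
    """Return the sorted positions that occur in both lists."""
    return sorted(set(first) & set(second))


# First 80 residues of HFE_HUMAN.
fragment = (
    "MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF"
    "YDHESRRVEPRTPWVSSRIS"
)
dbsnp_synonymous = [7, 76]   # two positions from the synonymous table
hgmd_missense = [63, 282]    # placeholder values, not taken from HGMD

marked = mark_positions(fragment, dbsnp_synonymous)
overlap = shared_positions(hgmd_missense, dbsnp_synonymous)
```

Different marker pairs, for example parentheses for HGMD and brackets for dbSNP, keep the two sources distinguishable in plain text.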

We found no overlap between HGMD and dbSNP, but at some positions we did not find the expected amino acids. We therefore assume that more than one amino acid can occur at these positions.

At the moment, we have no information about functional residues, but I-TASSER predicted a binding site in a beta-sheet/helix region. Most of the SNPs annotated in HGMD and dbSNP lie in this region.