Difference between revisions of "Mapping SNPs"

Latest revision as of 15:08, 30 August 2011

HGMD

The Human Gene Mutation Database (HGMD)<ref>http://link.springer.de/link/service/journals/00439/papers/6098005/60980629.pdf</ref> is a collection of disease-related genes and mutations maintained at Cardiff University.
Mutation types:

Missense/nonsense: codon change that encodes a different amino acid / a premature stop codon
Splicing: modification of the number and composition of exons
Regulatory: mutations that affect the regulation of gene expression
Small deletions: deletion of amino acids
Small insertions: insertion of amino acids
Small indels: replacement of amino acids
Gross deletions: large deletions caused by changes in DNA structure
Gross insertions/duplications: large insertions and duplications caused by changes in DNA structure
Complex rearrangements: change of gene position
Repeat variations: number of microsatellite repeats, which can differ in homologous proteins, splicing variants and expression types

HFE

Main article: Hemochromatosis

Mutation type	# of mutations
Missense/nonsense	24
Splicing	3
Regulatory	1
Small deletions	4
Small insertions	1
Small indels	0
Gross deletions	1
Gross insertions/duplications	0
Complex rearrangements	1
Repeat variations	0
Public total	35

The sequence used by HGMD is the DNA sequence encoding the HFE protein. We used the 'DNA-RNA-protein' translator (which is somewhat buggy and works only in Internet Explorer) to translate the DNA sequence into RNA and then into the protein sequence. Afterwards we used Jalview 2.6.1 to align the translated protein sequence against the UniProt FASTA: HFE_HUMAN amino acid sequence. The alignment shows a 100% match. We will use the UniProt sequence for the visualization of the SNPs.
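The translation and comparison step can be sketched in a few lines of Python. This is a minimal illustration, not the tool we used: the function names are our own, the input is only the first 45 coding nucleotides (taken from the rs114758821 flanking sequence in the dbSNP section, with allele A), and the identity check assumes two ungapped sequences of equal length instead of a Jalview alignment.

```python
# Standard genetic code over the RNA alphabet, in the conventional UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def percent_identity(a, b):
    """Identity in percent of two ungapped sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Illustrative input: start of the HFE coding sequence and of HFE_HUMAN.
cds_start = "ATGGGCCCGCGAGCCAGGCCAGCGCTTCTCCTCCTGATGCTTTTG"
uniprot_start = "MGPRARPALLLLMLL"

protein = translate(transcribe(cds_start))
print(protein, percent_identity(protein, uniprot_start))
```

For the full comparison, the complete coding sequence and the complete UniProt entry would take the place of the two short strings.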

dbSNP<ref>http://www.ncbi.nlm.nih.gov/pmc/articles/PMC29783/?tool=pubmed</ref>

synonymous

'Synonymous' or 'silent' mutations are mutations that affect only the nucleotide sequence, not the amino acid sequence. The protein stays the same because the mutated codon still encodes the same amino acid. In special cases, however, a silent mutation can alter an intron/exon splice site and thereby change the protein sequence, although this effect cannot be seen from the codon change itself.

We found 4 SNPs in dbSNP for Homo sapiens by using

"synonymous-codon"[Function_Class] AND HFE[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS]

as query.
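The same search can in principle be sent to dbSNP programmatically through the NCBI E-utilities. The sketch below only builds the ESearch URL and sends no request. The helper names `dbsnp_term` and `esearch_url` are our own, and we assume that the field tags of the 2011 query are still accepted by the current dbSNP index, which may no longer be the case.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def dbsnp_term(function_class, gene="HFE"):
    """Build the dbSNP search term used in this section."""
    return (f'"{function_class}"[Function_Class] AND {gene}[GENE] '
            'AND "human"[ORGN] AND "snp"[SNP_CLASS]')


def esearch_url(term, retmax=100):
    """Return an ESearch URL for the snp database; no request is sent here."""
    return ESEARCH + "?" + urlencode({"db": "snp", "term": term, "retmax": retmax})


# The two function classes queried in this section.
for function_class in ("synonymous-codon", "missense"):
    print(esearch_url(dbsnp_term(function_class)))
```

Opening one of the printed URLs returns an XML list of matching record IDs, provided the field tags are still valid.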

ID	Sequence	Mutation	Position
rs114758821	GGGGAAATGGGCCCGCGAGCCAGGCC[A/G]GCGCTTCTCCTCCTGATGCTTTTGC	A/G	7
rs114038675	CTTTAACTTGCTTTTTCTGTTTTAGA[A/G]CCCTCACCGTCTGGCACCCTAGTCA	A/G	26
rs62625342	GTGGAGCCCCGAACTCCATGGGTTTC[C/T]AGTAGAATTTCAAGCCAGATGTGGC	C/T	76
rs35201683	TCCACAGGAGGAGCCATGGGGCACTA[C/T]GTCTTAGCTGAACGTGAGTGACACG	C/T	70

For rs62625342, dbSNP assigns no position. We translated the nucleotide sequence into an amino acid sequence and aligned it with the reference sequence to find the correct position with respect to the reading frame.

non synonymous

'Non-synonymous' or 'missense' mutations are mutations that affect both the nucleotide and the amino acid sequence. This happens if a codon is changed into a codon that encodes a different amino acid, so the sequence of the protein changes. This can lead to functional changes or to a complete loss of function.
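The distinction between the two classes comes down to comparing the amino acids encoded by the original and the mutated codon. The following sketch shows this with the standard genetic code. The function `classify` and its category labels are our own, and the codons for the dbSNP entries were read off the flanking sequences listed in this section.

```python
# Standard genetic code over the DNA alphabet, in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify(ref_codon, alt_codon):
    """Classify a single codon change by its effect on the encoded residue."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"    # premature stop codon
    if ref_aa == "*":
        return "stop-lost"   # stop codon replaced by an amino acid
    return "missense"


print(classify("CCA", "CCG"))  # rs114758821: P -> P
print(classify("CAT", "GAT"))  # rs1799945: H -> D
print(classify("TGC", "TAC"))  # rs1800562: C -> Y
print(classify("TGG", "TGA"))  # hypothetical example of a premature stop
```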

We found 13 SNPs in dbSNP for Homo sapiens by using

"missense"[Function_Class] AND HFE[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS]

as query.

ID	Sequence	Mutation	Residue change	Position
rs111033563	TGGGGAAGAGCAGAGATATACGTGCC[A/C]GGTGGAGCACCCAGGCCTGGATCAG	A/C	Q/P	283
rs111033558	CATTGGAATTTTGTTCATAATATTAA[G/T]GAAGAGGCAGGGTTCAAGTGAGTAG	G/T	R/M	58
rs111033557	TGGGCTACGTGGATGACCAGCTGTTC[A/G]TGTTCTATGATCATGAGAGTCGCCG	A/G	V/M	59
rs62625346	TGTGACCTCTTCAGTGACCACTCTAC[A/G]GTGTCGGGCCTTGAACTACTACCCC	A/G	R/Q	224
rs28934889	TTTCCTTGTTTGAAGCTTTGGGCTAC[A/G]TGGATGACCAGCTGTTCGTGTTCTA	A/G	V/M	53
rs28934597	GGCTGCAGCTGAGTCAGAGTCTGAAA[C/G]GGTGGGATCACATGTTCACTGTTGA	C/G	G/R	93
rs28934596	CATGTTCACTGTTGACTTCTGGACTA[C/T]TATGGAAAATCACAACCACAGCAAG	C/T	I/T	105
rs28934595	CAGGTCATCCTGGGCTGTGAAATGCA[A/C]GAAGACAACAGTACCGAGGGCTACT	A/C	Q/H	127
rs4986950	TTTGGTGAAGGTGACACATCATGTGA[C/T]CTCTTCAGTGACCACTCTACGGTGT	C/T	T/I	217
rs2242956	TTCACACTCTCTGCACTACCTCTTCA[C/T]GGGTGCCTCAGAGCAGGACCTTGGT	C/T	M/T	35
rs1800730	AGCTGTTCGTGTTCTATGATCATGAG[A/T]GTCGCCGTGTGGAGCCCCGAACTCC	A/T	S/C	65
rs1800562	CCCTGGGGAAGAGCAGAGATATACGT[A/G]CCAGGTGGAGCACCCAGGCCTGGAT	A/G	C/Y	282
rs1799945	ATGACCAGCTGTTCGTGTTCTATGAT[C/G]ATGAGAGTCGCCGTGTGGAGCCCCG	C/G	H/D	63

Residue exchanges marked in red are not annotated in dbSNP yet. We determined these changes by translating the nucleotide sequence in the different reading frames and mapping each translated frame onto the reference sequence with a global alignment method (Needleman-Wunsch). If a frame matches the sequence, we assume that this frame shows the amino acid exchange. We annotated each mutation with the UniProt sequence as reference: G/R means that G is the residue in the UniProt sequence and R is the residue in the mutated sequence.
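The frame search can be illustrated with a short sketch. Note that it swaps in a simpler technique: instead of a Needleman-Wunsch global alignment it looks for an exact substring match of each translated frame in the reference, which only works when the flanking sequence is entirely coding and matches the reference perfectly. The function `map_snp` is our own, and `REFERENCE` holds only the first 92 residues of HFE_HUMAN to keep the example short.

```python
# Standard genetic code over the DNA alphabet, in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 92 residues of HFE_HUMAN (UniProt Q30201).
REFERENCE = ("MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEP"
             "RTPWVSSRISSQMWLQLSQSLK")


def translate(dna):
    """Translate all complete codons of a DNA string; stop codons become '*'."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def map_snp(left, alleles, right, reference):
    """Return (reference residue, variant residue, 1-based position) or None.

    Each allele is tried as the reference allele in all three forward frames;
    the first frame whose translation occurs in the reference is accepted.
    """
    snp_index = len(left)
    for ref_allele, alt_allele in (alleles, alleles[::-1]):
        ref_seq = left + ref_allele + right
        alt_seq = left + alt_allele + right
        for offset in range(3):
            peptide = translate(ref_seq[offset:])
            start = reference.find(peptide) if peptide else -1
            codon_index = (snp_index - offset) // 3
            if start == -1 or codon_index >= len(peptide):
                continue
            alt_peptide = translate(alt_seq[offset:])
            return peptide[codon_index], alt_peptide[codon_index], start + codon_index + 1
    return None


# rs1799945, flanking sequence as listed in the table above.
print(map_snp("ATGACCAGCTGTTCGTGTTCTATGAT", ("C", "G"),
              "ATGAGAGTCGCCGTGTGGAGCCCCG", REFERENCE))
```

For flanking sequences that run into an intron, such as those near exon boundaries, the exact match fails and an alignment that tolerates mismatches is needed, which is why we used a global alignment in practice.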

Mutation map

Figure 1: Overlap of SNPs annotated in HGMD and dbSNP

3 of the 24 missense mutations listed in HGMD are at the same codon, therefore we marked only the remaining 21 positions below. There are 5 annotated silent SNPs in dbSNP, but rs62625348 refers to rs35201683 and the annotation of rs62625342 is incomplete. For rs62625342 we therefore translated the sequence into an amino acid sequence and mapped the SNP onto the protein sequence according to the reading frame.

>sp|Q30201|HFE_HUMAN Hereditary hemochromatosis protein OS=Homo sapiens GN=HFE PE=1 SV=1
MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLK
GWDHMFTVDFWTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLE
RDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGE
EQRYTCQVEHPGLDQPLIVIWEPSPSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE

Color legend for the highlighted sequence: HGMD (green), dbSNP (red), dbSNP/HGMD (light blue)
The overlap between the two databases is shown in Figure 1.

At the moment we have no information about functional residues, but I-TASSER predicted a binding site in a beta-sheet/helix region. Most of the SNPs annotated in HGMD and dbSNP are located in this region. Because HGMD contains only SNPs with a functional influence, the accumulation of these SNPs in the alpha/beta region indicates a functional region. The PDB structure 1DE4 also shows that the HFE protein binds the transferrin receptor in this region. The beta region seems to interact with B2M. The damaging mutations could therefore affect the binding affinity.

References

@@ Line 1: / Line 1: @@
+<sup>by [[User:Greil|Robert Greil]] and [[User:Landerer|Cedric Landerer]]</sup>
 ==HGMD==
-The '''H'''uman '''G'''enom '''M'''utation '''D'''atabase is a collection of disease related proteins and mutations provided by the Cardiff University.<br>
+The '''H'''uman '''G'''enom '''M'''utation '''D'''atabase<ref>http://link.springer.de/link/service/journals/00439/papers/6098005/60980629.pdf</ref> is a collection of disease related proteins and mutations provided by the Cardiff University.<br>
 Mutation types:
 *Missense/nonsense: codon change to code for a different amino acid / premature stop codon
@@ Line 47: / Line 48: @@
 The sequence used by HGMD is the DNA sequence of the HFE protein. We used the [http://www.attotron.com/cybertory/analysis/trans.htm 'DNA-RNA-protein'] (the translator is somewhat buggy and only functional with the IE) to translate the DNA sequence into RNA into the protein sequence. Afterwards we used Jalview 2.6.1 to align the translated protein sequence against the [[HFE_HUMAN_NM|UniProt FASTA: HFE_HUMAN Amino acid sequence]]. The alignment shows a 100% match. We will use the UniProt sequence for the visualization of the SNPs.
+==dbSNP<ref>http://www.ncbi.nlm.nih.gov/pmc/articles/PMC29783/?tool=pubmed</ref>==
-==dbSNP==
 ===synonymous===
 'Synonymous' or 'silent' mutation are mutation, that only affect the nucleotide but not the amino acid sequence. Therefore the protein stays the same, because the mutation does not change the encoding codon or the mutated codon does encode the same amino acid. But in special cases, a silent mutation can lead to a change in the intron/exon splice site which can also change the amino acid, but will not be seen in the nucleotide sequence.
@@ Line 55: / Line 56: @@
 as query.
-{| border layout = 1
+{| border layout = 1 class="sortable"
 !ID
 !Sequence
@@ Line 71: / Line 72: @@
 |}
-For rs62625342, no position is assigned. We translated the nucleotide sequence into an amino acid sequence and aligned it with the reference sequence to find the correct postion with respect to the reading frame.
+For rs62625342, no position is assigned. We translated the nucleotide sequence into an amino acid sequence and aligned it with the reference sequence to find the correct position with respect to the reading frame.
 ===non synonymous===
@@ Line 81: / Line 82: @@
 as query.
-{| border layout = 1
+{| border layout = 1 class="sortable"
 !ID
 !Sequence
@@ Line 115: / Line 116: @@
 |-
 |}
-Residue exchanges marked in <font color=ff0000>red</font> are not annotated in dbSNP yet. We determined these changes by translating the nucleotide sequence into the amino acid sequence and mapped the different reading frames onto the sequence by using a global alignment methode (Needlemann-Wunsch). If a frame matches the sequence, we assume that this this frame shows the amino acid exchange. We annotated a mutation based on the UniProt sequence as reference. G/R means that the G is placed in the UniProt sequence and R is placed in the mutated sequence.
+Residue exchanges marked in <font color=ff0000>red</font> are not annotated in dbSNP yet. We determined these changes by translating the nucleotide sequence into the amino acid sequence and mapped the different reading frames onto the sequence by using a global alignment method (Needlemann-Wunsch). If a frame matches the sequence, we assume that this this frame shows the amino acid exchange. We annotated a mutation based on the UniProt sequence as reference. G/R means that the G is placed in the UniProt sequence and R is placed in the mutated sequence.
 ==Mutation map==
-[[File:Hgmd_dbSNP_venn.png|thumb|200px|right|Overlap of SNP's annotated in HGMD and dbSNP]]
+[[File:Hgmd_dbSNP_venn.png|thumb|200px|right|Figure 1: Overlap of SNP's annotated in HGMD and dbSNP]]
-of the 24 missense mutations listed in the HMGD are at the same codon, therefore we marked only the remaining 21 positions below. There are 5 annotated silent SNP's in dbSNP, but rs62625348 is referenced to rs35201683 and the annotation of rs62625342 is incomplete. Therefore it was not possible to map all 5 references onto the sequence.
+of the 24 missense mutations listed in the HMGD are at the same codon, therefore we marked only the remaining 21 positions below. There are 5 annotated silent SNP's in dbSNP, but rs62625348 is referenced to rs35201683 and the annotation of rs62625342 is incomplete. Therefore we translated the sequence into a amino acid sequence and according to the reading frame, we mapped the SNP onto the sequence.
  >sp|Q30201|HFE_HUMAN Hereditary hemochromatosis protein OS=Homo sapiens GN=HFE PE=1 SV=1
-MGPRA<span style="background:#00FF00">R</span><span style="background:#FF0000">P</span>ALLLLMLLQTAVLQGRLL<span style="background:#FF0000">R</span>SHSLHYLF<span style="background:#FF0000">M</span>
+ MGPRA<span style="background:#00FF00">R</span><span style="background:#FF0000">P</span>ALLLLMLLQTAVLQGRLL<span style="background:#FF0000">R</span>SHSLHYLF<span style="background:#FF0000">M</span>GASEQDLGLSLFEALGY<span style="background:#AAAAFF">V</span>DDQL<span style="background:#FF0000">F</span><span style="background:#AAAAFF">V</span>FYD<span style="background:#AAAAFF">H</span>E<span style="background:#AAAAFF">S</span><span style="background:#00FF00">R</span>RVE<span style="background:#FF0000">P</span><span style="background:#00FF00">R</span>TPWV<span style="background:#FF0000">S</span>SRISSQMWLQLSQSLK
- GASEQDLGLSLFEALGY<span style="background:#AAAAFF">V</span>DDQL<span style="background:#FF0000">F</span><span style="background:#AAAAFF">V</span>FYD<span style="background:#AAAAFF">H</span>E<span style="background:#AAAAFF">S</span><span style="background:#00FF00">R</span>RVE<span style="background:#FF0000">P</span><span style="background:#00FF00">R</span>TPWV<span style="background:#FF0000">S</span>
+ <span style="background:#AAAAFF">G</span>WDHMFTVDFWT<span style="background:#AAAAFF">I</span>MENHNHSKESHTLQVILGCEM<span style="background:#AAAAFF">Q</span>EDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKL<span style="background:#00FF00">E</span><span style="background:#00FF00">W</span>ERHKIR<span style="background:#00FF00">A</span>RQNRAY<span style="background:#00FF00">L</span>E
- SRISSQMWLQLSQSLK<span style="background:#AAAAFF">G</span>WDHMFTVDFWT<span style="background:#AAAAFF">I</span>
+ RDCPAQLQQLLELGRGVLDQQVPPLVKVTHHV<span style="background:#FF0000">T</span>SSVTTL<span style="background:#AAAAFF">R</span>CRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLA<span style="background:#00FF00">V</span>PPGE
- MENHNHSKESHTLQVILGCEM<span style="background:#AAAAFF">Q</span>EDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKL<span style="background:#00FF00">E</span><span style="background:#00FF00">W</span>ERHKIR<span style="background:#00FF00">A</span>RQNRAY<span style="background:#00FF00">L</span>ERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHV<span style="background:#FF0000">T</span>SSVTTL<span style="background:#AAAAFF">R</span>
+ EQRYT<span style="background:#AAAAFF">C</span><span style="background:#AAAAFF">Q</span>VEHPGLDQPLI<span style="background:#00FF00">V</span>IWEPSPSGTLVIGVISGIAVFVVILFIGILFIIL<span style="background:#00FF00">R</span>KRQGSRGAMGHYVLAERE
- CRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLA<span style="background:#00FF00">V</span>PPGEEQRYT<span style="background:#AAAAFF">C</span><span style="background:#AAAAFF">Q</span>VEHPGLDQPLI<span style="background:#00FF00">V</span>IWEPSPSGTLVIGVISGIAVFVVILFIGILFIIL<span style="background:#00FF00">R</span>KRQGSRGAMGHYVLAERE
-<span style="background:#00FF00">HGMD (missense/nonsense)</span><br>
+<span style="background:#00FF00">HGMD</span><br>
-<span style="background:#FF0000">dbSNP </span><br>
+<span style="background:#FF0000">dbSNP</span><br>
 <span style="background:#AAAAFF">dbSNP/HGMD</span><br>
+The overlap between the two databases is shown in Figure 1.
-We found no overlap between HGMD and dbSNP, but we found at some positions not the assumed amino acid's. Therefore we assume that at this positions more than one amino acid can take place.
-At the moment, we have no informations about functional residues, but I-Tasser predicted a binding site at a beta-sheet/helix region. We saw that most of the SNP's annotated in HGMD and dbSNP are placed in this region.
+At the moment, we have no informations about functional residues, but I-Tasser predicted a binding site at a beta-sheet/helix region. We saw that most of the SNP's annotated in HGMD and dbSNP are placed in this region. Because HGMD contains just SNPs with a functional influence, and the accumulation of these SNPs in the Alpha/Beta region indicates a functional region. We also see in the 1DE4 pdb file, that the HFE protein binds in this region at transferin. The beta-region seems to interact with B2M. So, the damaging mutations could affect the binding affinity.
+== References ==
+<references />
+[[Category : Hemochromatosis]]
