Task 7: Research SNPs
Latest revision as of 23:23, 1 September 2013
<css> table.colBasic2 { margin-left: auto; margin-right: auto; border: 1px solid black; border-collapse:collapse; width: 60%; }
.colBasic2 th,td { padding: 3px; border: 1px solid black; }
.colBasic2 td { text-align:left; }
/* for orange try #ff7f00 and #ffaa56 for blue try #005fbf and #aad4ff
maria's style blue: #adceff grey: #efefef
*/
.colBasic2 tr th { background-color:#efefef; color: black;} .colBasic2 tr:first-child th { background-color:#adceff; color:black;} </css>
<css> table.colBasic3 { margin-left: auto; margin-right: auto; border: 1px solid black; border-collapse:collapse; width: 30%; }
.colBasic3 th,td { padding: 3px; border: 1px solid black; }
.colBasic3 td { text-align:left; }
/* for orange try #ff7f00 and #ffaa56 for blue try #005fbf and #aad4ff
maria's style blue: #adceff grey: #efefef
*/
.colBasic3 tr th { background-color:#efefef; color: black;} .colBasic3 tr:first-child th { background-color:#adceff; color:black;}
</css>
HGMD (The Human Gene Mutation Database)
The search results for the HFE gene contain the different types of mutations that are specified in <xr id="hgmd"/>:
<figtable id="hgmd">
mutation type | definition | number |
---|---|---|
missense, nonsense | mutation that leads to a change of amino acid or a stop codon | 28 |
splicing | mutation that affects mRNA splicing | 3 |
regulatory | substitution causing abnormal regulation | 1 |
small deletion | micro deletion (<= 20 bp) | 4 |
small insertion | micro insertion (<= 20 bp) | 1 |
small indel | micro indel (<= 20 bp) | 0 |
gross deletion | deletion > 20 bp | 2 |
gross insertions/duplications | insertion > 20 bp | 0 |
complex rearrangements | rearrangements of stretches of the DNA sequence | 1 |
repeat variations | differences in repeat length | 0 |
</figtable>
In total, we found 40 mutations in the public version of the database and 49 in the non-public version.
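The total for the public version can be checked directly against the table. The following is a minimal sketch in Python; the dictionary simply restates the counts from <xr id="hgmd"/>, and the variable names are chosen here for illustration.

```python
# Counts per mutation type, copied from the HGMD table above.
hgmd_counts = {
    "missense, nonsense": 28,
    "splicing": 3,
    "regulatory": 1,
    "small deletion": 4,
    "small insertion": 1,
    "small indel": 0,
    "gross deletion": 2,
    "gross insertions/duplications": 0,
    "complex rearrangements": 1,
    "repeat variations": 0,
}

# Sum over all categories and compute the share of point mutations.
total = sum(hgmd_counts.values())
point_mutation_share = hgmd_counts["missense, nonsense"] / total

print(f"{total} entries, {point_mutation_share:.0%} missense/nonsense")
```

The categories add up to the 40 public entries, with missense and nonsense mutations making up the majority.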
<figtable id="hgmd missense">
accession number | codon change | aa change | codon number |
---|---|---|---|
CM032270 | AGGc-AGC | Arg-Ser | 6 |
CM091838 | TTG-TGG | Leu-Trp | 46 |
CM994469 | cGTG-ATG | Val-Met | 53 |
CM994470 | cGTG-ATG | Val-Met | 59 |
HM971246 | CATg-CAC | His-His | 63 |
CM960827 | tCAT-GAT | His-Asp | 63 |
CM990718 | gAGT-TGT | Ser-Cys | 65 |
CM033969 | tCGC-TGC | Arg-Cys | 66 |
CM020721 | cCGA-TGA | Arg-Term | 71 |
CM990719 | aGGG-CGG | Gly-Arg | 93 |
CM990720 | ATT-ACT | Ile-Thr | 105 |
CM990721 | CAAg-CAC | Gln-His | 127 |
CM091839 | aGAC-AAC | Asp-Asn | 129 |
CM091840 | TACg-TAG | Tyr-Term | 138 |
CM004810 | gGAG-CAG | Glu-Gln | 168 |
CM004106 | gGAG-TAG | Glu-Term | 168 |
CM004107 | TGG-TAG | Trp-Term | 169 |
CM015326 | GCC-GTC | Ala-Val | 176 |
CM081301 | CTG-CCG | Leu-Pro | 183 |
CM034097 | CGG-CAG | Arg-Gln | 224 |
CM101181 | cCAG-TAG | Gln-Term | 233 |
CM024530 | tGTA-TTA | Val-Leu | 272 |
CM994771 | aGAG-AAG | Glu-Lys | 277 |
CM960828 | TGC-TAC | Cys-Tyr | 282 |
CM004391 | TGC-TCC | Cys-Ser | 282 |
CM032271 | CAG-CCG | Gln-Pro | 283 |
HM030028 | GTG-GCG | Val-Ala | 295 |
CM990722 | AGG-ATG | Arg-Met | 330 |
</figtable>
The 28 missense and nonsense mutations for HFE are listed in <xr id="hgmd missense"/> together with the amino acid (aa) change and the codon number. They are all connected with the hemochromatosis phenotype.
dbSNP
dbSNP was searched for non-synonymous and silent (synonymous) mutations of the HFE gene. Silent mutations are mutations in the nucleotide sequence that do not lead to a change in the amino acid sequence of the protein.
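The distinction between synonymous, missense and nonsense changes can be illustrated with a short Python sketch. It assumes the standard genetic code and whole-codon input; the function name `classify` and the constants are illustrative and not part of any of the databases used here.

```python
from itertools import product

# Standard genetic code in the usual TCAG ordering; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(ref_codon, alt_codon):
    """Classify a codon change as synonymous, nonsense or missense.

    Changes that remove a stop codon are not treated separately in this
    sketch; they would be reported as missense.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


# Codon changes taken from the HGMD table (codons 63, 282 and 71).
for ref, alt in [("CAT", "CAC"), ("TGC", "TAC"), ("CGA", "TGA")]:
    print(ref, "->", alt, classify(ref, alt))
```

The three examples cover one case of each class: His-His (synonymous), Cys-Tyr (missense) and Arg-Term (nonsense).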
<figtable id="dbSNP all">
cluster ID | Function | codon number | codon pos | nucleotide change | aa change |
---|---|---|---|---|---|
rs149342416 | missense | 6 | 3 | G -> C | Arg -> Ser |
rs114758821 | synonymous | 7 | 3 | G -> A | Pro -> Pro |
rs368895240 | synonymous | 10 | 3 | C -> T | Leu -> Leu |
rs201657128 | missense | 14 | 1 | C -> G | Leu -> Val |
rs143662783 | missense | 17 | 2 | C -> T | Thr -> Ile |
rs148161858 | missense | 23 | 2 | G -> A | Arg -> His |
rs2242956 | missense | 35 | 2 | T -> C | Met -> Thr |
rs377254261 | missense | 37 | 2 | C -> T | Ala -> Val |
rs147297176 | synonymous | 58 | 3 | C -> T | Phe -> Phe |
rs147426902 | synonymous | 63 | 3 | T -> C | His -> His |
rs139523708 | missense | 67 | 2 | G -> A | Arg -> His |
rs62625342 | synonymous | 76 | 3 | C -> T | Ser -> Ser |
rs376650371 | missense | 97 | 3 | G -> A | Met -> Ile |
rs199988202 | missense | 106 | 2 | T -> C | Met -> Thr |
rs200706856 | missense | 129 | 1 | G -> A | Asp -> Asn |
rs201885016 | missense | 130 | 2 | A -> G | Asn -> Ser |
rs369790080 | synonymous | 132 | 3 | C -> T | Thr -> Thr |
rs372789940 | missense | 141 | 1 | G -> A | Asp -> Asn |
rs199879669 | missense | 157 | 1 | G -> C | Ala -> Pro |
rs145475682 | missense | 162 | 1 | G -> T | Ala -> Ser |
rs148480830 | synonymous | 162 | 3 | C -> G | Ala -> Ala |
rs144170531 | missense | 166 | 1 | A -> G | Lys -> Glu |
rs146519482 | missense | 168 | 1 | G -> C | Glu -> Gln |
rs199916850 | missense | 183 | 2 | T -> C | Leu -> Pro |
rs140957442 | nonsense | 192 | 1 | C -> T | Gln -> Ter |
rs4986950 | missense | 217 | 2 | C -> T | Thr -> Ile |
rs144797937 | missense | 224 | 1 | C -> T | Arg -> Trp |
rs62625346 | missense | 224 | 2 | G -> A | Arg -> Gln |
rs140515012 | missense | 245 | 1 | C -> G | Pro -> Ala |
rs150402693 | missense | 251 | 3 | C -> A | Phe -> Leu |
rs182920795 | synonymous | 253 | 3 | T -> A | Pro -> Pro |
rs202068193 | missense | 256 | 1 | G -> A | Val -> Ile |
rs143846467 | missense | 259 | 2 | A -> G | Asn -> Ser |
rs140080192 | missense | 277 | 1 | G -> A | Glu -> Lys |
rs369354634 | synonymous | 281 | 3 | G -> A | Thr -> Thr |
rs201310322 | synonymous | 292 | 3 | C -> T | Pro -> Pro |
rs143175221 | missense | 295 | 2 | T -> C | Val -> Ala |
rs114038675 | synonymous | 298 | 3 | G -> A | Glu -> Glu |
rs372856303 | synonymous | 301 | 3 | G -> A | Pro -> Pro |
rs147519426 | missense | 315 | 2 | T -> G | Val -> Gly |
rs148632352 | synonymous | 315 | 3 | T -> C | Val -> Val |
rs371192232 | synonymous | 317 | 3 | C -> T | Val -> Val |
rs141229562 | missense | 318 | 1 | G -> A | Val -> Ile |
rs150716212 | missense | 322 | 2 | T -> C | Ile -> Thr |
rs138993448 | missense | 327 | 2 | T -> C | Ile -> Thr |
rs368122334 | missense | 340 | 2 | G -> C | Gly -> Ala |
rs35201683 | synonymous | 342 | 3 | C -> T | Tyr -> Tyr |
rs370285936 | missense | 343 | 2 | T -> A | Val -> Asp |
rs146508927 | missense | 347 | 2 | G -> A | Arg -> His |
</figtable>
In total, we found 49 SNPs in transcript variant 1 of the HFE gene. They are listed in <xr id="dbSNP all"/>. The column "Function" states whether the mutation is synonymous or non-synonymous.
SNPdbe
35 mutations that are associated with the human HFE protein were found in SNPdbe.
<figtable id="exp evidence">
exp. evidence | count |
---|---|
1000Genome,freq,cluster | 6 |
by cluster | 6 |
by cluster,freq | 2 |
Not validated | 13 |
by freq | 8 |
</figtable>
Not all mutations have experimental evidence: over one third (13 of 35) are not validated (see <xr id="exp evidence"/>).
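The share of unvalidated entries follows from the counts in the table. A minimal Python sketch, with the dictionary restating the table and the names chosen for illustration:

```python
# Experimental evidence categories and counts, copied from the table above.
evidence_counts = {
    "1000Genome,freq,cluster": 6,
    "by cluster": 6,
    "by cluster,freq": 2,
    "Not validated": 13,
    "by freq": 8,
}

total = sum(evidence_counts.values())
not_validated_fraction = evidence_counts["Not validated"] / total

print(f"{evidence_counts['Not validated']} of {total} "
      f"({not_validated_fraction:.1%}) are not validated")
```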
<figtable id="snpdbe">
dbSNP | Mutation | Disease association | Experimental evidence |
---|---|---|---|
rs1799945 | H63D | In hereditary haemochromatosis (HH) (PMD) | 1000Genome,freq,cluster |
rs1800562 | C282Y | hemochromatosis (SwissVar) | 1000Genome,freq,cluster |
rs1800730 | S65C | hemochromatosis (SwissVar). In hereditary hemochromatosis patient who had resulted positive to screening for iron overload (PMD) | 1000Genome,freq,cluster |
rs2242956 | M35T | N/A | by cluster |
rs4986950 | T217I | N/A | by cluster,freq |
rs28934595 | Q127H | hemochromatosis (SwissVar). In variegate porphyria (VP) (PMD) | by cluster |
rs28934596 | I105T | hemochromatosis (SwissVar). In hemochromatosis (PMD) | by cluster |
rs28934597 | G93R | hemochromatosis (SwissVar). In hemochromatosis (PMD) | by cluster |
rs28934889 | V53M | N/A | 1000Genome,freq,cluster |
rs62625346 | R224Q | N/A | by cluster,freq |
rs28934890 | V59M | N/A | Not validated |
rs111033558 | R330M | hemochromatosis (SwissVar). In hereditary haemochromatosis (HH) (PMD) | by cluster |
rs111033563 | Q283P | hemochromatosis (SwissVar) | by cluster |
rs149342416 | R6S | hemochromatosis (SwissVar) | by freq |
rs140080192 | E277K | N/A | 1000Genome,freq,cluster |
rs143175221 | V295A | hemochromatosis (SwissVar) | by freq |
rs148161858 | R23H | N/A | 1000Genome,freq,cluster |
N/A | M106T | N/A | Not validated |
rs146519482 | E168Q | N/A | Not validated |
N/A | L183P | N/A | Not validated |
rs138176635 | E252G | N/A | Not validated |
rs138993448 | I327T | N/A | Not validated |
rs139523708 | R67H | N/A | Not validated |
rs140515012 | P245A | N/A | Not validated |
rs141229562 | V318I | N/A | Not validated |
rs143662783 | T17I | N/A | by freq |
rs143846467 | N259S | N/A | Not validated |
rs144170531 | K166E | N/A | by freq |
rs144797937 | R224W | N/A | by freq |
rs145475682 | A162S | N/A | by freq |
rs146508927 | R347H | N/A | by freq |
rs147519426 | V315G | N/A | by freq |
rs149662565 | P160T | N/A | Not validated |
rs150402693 | F251L | N/A | Not validated |
rs150716212 | I322T | N/A | Not validated |
</figtable>
A list of the 35 mutations and their experimental evidence can be found in <xr id="snpdbe"/>.
OMIM
OMIM (Online Mendelian Inheritance in Man) also contains information about HFE, but only a small fraction of the known mutations can be found there.
<figtable id="omim">
dbSNP accession | Phenotype | Mutation |
---|---|---|
rs1799945 | HEMOCHROMATOSIS | HIS63ASP |
rs1800730 | HEMOCHROMATOSIS | SER65CYS |
rs28934889 | HFE POLYMORPHISM | VAL53MET |
rs28934595 | HEMOCHROMATOSIS | GLN127HIS |
rs111033558 | HEMOCHROMATOSIS | ARG330MET |
rs28934596 | HEMOCHROMATOSIS | ILE105THR |
rs28934597 | HEMOCHROMATOSIS | GLY93ARG |
rs111033563 | HEMOCHROMATOSIS | GLN283PRO |
</figtable>
A complete list of the mutations can be found in <xr id="omim"/>. Two of the mutations are polymorphisms of the HFE protein and eight cause hemochromatosis.
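OMIM writes mutations with three-letter amino acid codes (e.g. HIS63ASP), whereas SNPdbe uses the one-letter form (H63D). To compare entries across the two tables, the notation has to be unified. The following Python sketch shows one possible conversion; the helper name `to_one_letter` is hypothetical and not part of either database.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(mutation):
    """Convert e.g. 'HIS63ASP' to 'H63D'; raise ValueError if malformed."""
    match = re.fullmatch(r"([A-Za-z]{3})(\d+)([A-Za-z]{3})", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation: {mutation!r}")
    ref, position, alt = match.groups()
    try:
        return f"{THREE_TO_ONE[ref.upper()]}{position}{THREE_TO_ONE[alt.upper()]}"
    except KeyError as err:
        raise ValueError(f"unknown amino acid code: {err}") from None


# Mutations as listed in the OMIM table above.
omim = ["HIS63ASP", "SER65CYS", "VAL53MET", "GLN127HIS",
        "ARG330MET", "ILE105THR", "GLY93ARG", "GLN283PRO"]
omim_short = [to_one_letter(m) for m in omim]
print(omim_short)
```

In the one-letter form, each of the listed OMIM mutations can be looked up directly in the "Mutation" column of the SNPdbe table.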
Databases comparison
<figtable id="comparison">
database | last update | version | what information | where from | # entries homo sapiens | # HFE mutations |
---|---|---|---|---|---|---|
HGMD | spring 2013 | public 2013.1 (mainly 3 year old data) | Collection of published gene lesions in the human genome that cause inherited diseases. | Only from publications. Journals are searched manually and by computational means each week. | 99869 | 28 missense/nonsense |
dbSNP | 26.06.2012 | Build 137 | Short nucleotide sequence variations in different organisms (common and rare) | Submissions from laboratories but also private research companies. | 192,678,553 | 10 synonymous, 41 non-synonymous, 10 disease causing SNPs and 162 SNPs in the UTR |
SNPdbe | 05.03.2012 | - | Annotations for single amino acid substitutions (SAASs), e.g. functional effect (experimental, predicted), associated disease, evol. conservation,... | Based on entries from SwissProt, dbSNP, 1000 Genomes, PMD | 967879 | 10 disease associated, 25 other |
OMIM | daily | - | Compendium of human genes, genetic phenotypes and diseases. 3,035 genes with phenotype-causing mutations known. | Information from publications and databases is reviewed and summed up in texts by scientists. | 21,934 | 8 disease causing mutations, 2 other |
</figtable>
<xr id="comparison"/> contains a comparison of the four databases used in this task. OMIM is the most up to date (it is updated daily), and the private version of HGMD also dates from 2013. In contrast, dbSNP, SNPdbe and especially the public HGMD have not been updated for over a year. dbSNP is by far the largest of the four and is the only database that also contains synonymous SNPs.
Mutation map
In total, we collected a set of 72 point mutations from the different databases, comprising missense and nonsense mutations as well as synonymous SNPs. The complete list can be viewed in the Lab journal Task 7.
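Because the same mutation often appears in several databases, the lists have to be merged without counting duplicates. A minimal Python sketch of such a merge is shown below. It uses only three entries per database as an excerpt from the tables above, so it does not reproduce the full set of 72, and the actual merging procedure used for this task may have differed.

```python
# Small excerpts from the tables above (not the full lists),
# written in one-letter notation so that entries are comparable.
hgmd = {"R6S", "H63D", "C282Y"}
dbsnp = {"R6S", "L14V", "E168Q"}
snpdbe = {"R6S", "H63D", "C282Y", "E168Q"}

# A set union keeps each mutation once, however many databases list it.
merged = hgmd | dbsnp | snpdbe

# Mutations reported by every database in this excerpt.
shared = hgmd & dbsnp & snpdbe

print(sorted(merged))
print(sorted(shared))
```

For synonymous SNPs the amino acid notation is not sufficient as a key, since different nucleotide changes can produce the same unchanged residue; a nucleotide-level identifier would be needed in that case.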
<figure id="mutation map">
</figure>
<xr id="mutation map"/> shows the mutation map for HFE. Disease causing mutations (red) and non disease causing mutations (blue) are distributed nearly equally over the protein sequence. Nevertheless, there are some regions where only disease causing (dc) mutations are located. Those regions are probably especially relevant for the protein function and cannot be mutated without a decrease in function.
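One way to look for such regions computationally is to sort all mutated positions and report stretches in which consecutive mutated positions are all disease associated. The Python sketch below does this for the SNPdbe entries only, so it is an illustration under that assumption and not the procedure used to draw the mutation map; the function name and the minimum stretch size of two positions are arbitrary choices.

```python
# Residue positions from the SNPdbe table above, split by disease association.
DISEASE = [6, 63, 65, 93, 105, 127, 282, 283, 295, 330]
OTHER = [17, 23, 35, 53, 59, 67, 106, 160, 162, 166, 168, 183, 217, 224,
         245, 251, 252, 259, 277, 315, 318, 322, 327, 347]


def disease_only_stretches(disease, other, min_size=2):
    """Return (start, end) of runs of mutated positions that are all disease
    associated, with no non-disease position in between."""
    labelled = sorted(
        [(pos, True) for pos in set(disease)]
        + [(pos, False) for pos in set(other)]
    )
    stretches, current = [], []
    for pos, is_disease in labelled:
        if is_disease:
            current.append(pos)
            continue
        # A non-disease position ends the current run.
        if len(current) >= min_size:
            stretches.append((current[0], current[-1]))
        current = []
    if len(current) >= min_size:
        stretches.append((current[0], current[-1]))
    return stretches


print(disease_only_stretches(DISEASE, OTHER))
```

The result depends strongly on which databases are included and on how a region is defined, so the reported stretches should be read as candidates rather than as confirmed functional regions.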
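<figtable id="mutations struc">
Figure 2: Visualisation of the location of the different mutations in the PDB structure for 1A6Z chain A (HFE). The structure is shown in three different orientations. The protein domains are colored according to <xr id="mutation map"/> with the MHC antigen-recognition domain in green and the Immunoglobulin domain in blue. Disease causing mutations are marked in red and non disease causing mutations in grey.
</figtable>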
In addition to the mutation map, <xr id="mutations struc"/> shows a 3D visualisation of the location of the mutated amino acids in the protein structure (1A6Z, chain A). Most disease causing mutations are located in secondary structure regions, but some are also located in loops. Therefore, it is difficult to tell which regions of the structure are important for the function when looking only at the location of the mutations.
References
Stenson et al (2009), The Human Gene Mutation Database (HGMD®): 2008 Update. Genome Med 1(1):13
http://www.ncbi.nlm.nih.gov/books/NBK3848/
ftp://ftp.ncbi.nih.gov/pub/factsheets/Factsheet_SNP.pdf
https://www.rostlab.org/services/snpdbe/