Difference between revisions of "Lab journal Task 7"

From Bioinformatikpedia
Latest revision as of 20:56, 27 September 2013

HGMD

The HFE accession number in HGMD is NM_000410.3 (transcript variant 1, http://www.ncbi.nlm.nih.gov/nuccore/NM_000410.3). The nucleotide sequence was downloaded in FASTA format and translated into the protein sequence with http://molbiol.ru/eng/scripts/01_13.html, using the first reading frame.
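The translation step can be illustrated with a minimal base R sketch. It is not the implementation of the web tool named above; the function name `translate_frame1` is hypothetical, and the sketch assumes the standard genetic code and a DNA-alphabet (T, not U) input string.

```r
# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varies fastest, then the second, then the first).
bases <- c("T", "C", "A", "G")
grid <- expand.grid(b3 = bases, b2 = bases, b1 = bases, stringsAsFactors = FALSE)
genetic_code <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
names(genetic_code) <- paste0(grid$b1, grid$b2, grid$b3)

# Translate a nucleotide string in reading frame 1 (hypothetical helper).
# Incomplete trailing codons are ignored; unknown codons become "X".
translate_frame1 <- function(nt) {
  nt <- toupper(nt)
  n_codons <- nchar(nt) %/% 3
  starts <- seq(1, by = 3, length.out = n_codons)
  codons <- substring(nt, starts, starts + 2)
  aa <- unname(genetic_code[codons])
  aa[is.na(aa)] <- "X"
  paste(aa, collapse = "")
}

print(translate_frame1("ATGGGCCCG"))  # "MGP"
```

Stop codons are written as `*` in this sketch, so a full transcript translated this way still has to be cut at the first stop.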

dbSNP

The HFE protein has several isoforms. We chose isoform 1 precursor (http://www.ncbi.nlm.nih.gov/protein/NP_000401.1) because it has the most data available and because it is the isoform used in HGMD. The corresponding mRNA sequence for this protein is transcript variant 1 (http://www.ncbi.nlm.nih.gov/nuccore/NM_000410.3).

The dbSNP web server was searched for "SNP" with the following query: "synonymous-codon"[Function_Class] AND HFE[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS]. We used the graphic summary display of the results. Below the sequence line of each hit are several links, including one to the "Gene view". The Gene view lists all types of SNPs in the HFE gene. We used this table to extract all mutations for the protein NP_000401, in particular the synonymous ones, and to count the different SNPs in the HFE gene.
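The query above was entered in the web interface. As a rough sketch, the same search term could be encoded as an NCBI E-utilities `esearch` URL. This is an assumption for illustration only: the journal does not say that E-utilities were used, and whether the field tags of the 2013 query are still accepted by the current dbSNP index is not verified here. The sketch only builds the URL and does not send a request.

```r
# Search term exactly as given in the text.
term <- paste(
  '"synonymous-codon"[Function_Class]',
  "HFE[GENE]",
  '"human"[ORGN]',
  '"snp"[SNP_CLASS]',
  sep = " AND "
)

# Percent-encode the term and append it to the esearch endpoint (db=snp).
esearch_url <- paste0(
  "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
  "?db=snp&term=", utils::URLencode(term, reserved = TRUE)
)

cat(esearch_url, "\n")
```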

SNPdbe

SNPdbe was searched for the protein "NP_000401". The results were downloaded as a text file and analysed with R.

OMIM

OMIM was searched for "HFE". The first result links to the page "* 613609. HFE GENE; HFE". It contains a section "Allelic variants" listing several published mutations, which can also be viewed as a table (http://omim.org/allelicVariant/613609). We only included results for the protein sequence "NP_000401", which was also used for SNPdbe.

Mutation map

The reference sequence for the HFE protein is the same in all four databases because we only collected mutations from one mRNA transcript. Therefore, we did not have to map codon numbers between databases. The two domains were selected using the SCOP annotation: MHC antigen-recognition domain (4-181) and Immunoglobulin domain (182-275). Since the reference sequence contains 22 residues before the start of the PDB structure sequence (1A6Z, chain A), 22 was added to the domain positions to obtain the corresponding indices in the reference sequence.
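The index shift described above can be checked with a few lines of R. The variable names (`pdb_offset`, `scop_domains`, `ref_domains`) are chosen here for illustration and do not appear in the original script.

```r
# Number of residues in the reference sequence before the PDB chain starts.
pdb_offset <- 22L

# Domain boundaries in PDB (SCOP) numbering, as given in the text.
scop_domains <- data.frame(
  domain = c("MHC antigen-recognition domain", "Immunoglobulin domain"),
  start = c(4L, 182L),
  end = c(181L, 275L)
)

# Shift both boundaries into reference-sequence numbering.
ref_domains <- transform(
  scop_domains,
  start = start + pdb_offset,
  end = end + pdb_offset
)

print(ref_domains)
# MHC antigen-recognition domain: 26-203
# Immunoglobulin domain:          204-297
```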

<css>
table.colBasic2 {
    margin-left: auto;
    margin-right: auto;
    border: 1px solid black;
    border-collapse: collapse;
    width: 60%;
}

.colBasic2 th, .colBasic2 td {
    padding: 3px;
    border: 1px solid black;
}

.colBasic2 td {
    text-align: left;
}

/* for orange try #ff7f00 and #ffaa56
   for blue try #005fbf and #aad4ff

   maria's style
   blue: #adceff
   grey: #efefef
*/
.colBasic2 tr th { background-color: #efefef; color: black; }
.colBasic2 tr:first-child th { background-color: #adceff; color: black; }
</css>


<figtable id="geneview">

{| class="colBasic2"
!accession || codon position || aa change || disease association || mutation type
|-
| rs149342416, CM032270 || 6 || Arg -> Ser || hemochromatosis || missense
|-
| rs114758821 || 7 || Pro -> Pro || N/A || synonymous
|-
| rs368895240 || 10 || Leu -> Leu || N/A || synonymous
|-
| rs201657128 || 14 || Leu -> Val || N/A || missense
|-
| rs143662783 || 17 || Thr -> Ile || N/A || missense
|-
| rs148161858 || 23 || Arg -> His || N/A || missense
|-
| rs2242956 || 35 || Met -> Thr || N/A || missense
|-
| rs377254261 || 37 || Ala -> Val || N/A || missense
|-
| CM091838 || 46 || Leu -> Trp || hemochromatosis || missense
|-
| rs28934889, CM994469 || 53 || Val -> Met || hemochromatosis || missense
|-
| rs147297176 || 58 || Phe -> Phe || N/A || synonymous
|-
| rs28934890, rs111033557, CM994470 || 59 || Val -> Met || hemochromatosis || missense
|-
| rs1799945, CM960827 || 63 || His -> Asp || In hereditary haemochromatosis (HH) (PMD) || missense
|-
| rs147426902, HM971246 || 63 || His -> His || N/A || synonymous
|-
| rs1800730, CM990718 || 65 || Ser -> Cys || hemochromatosis || missense
|-
| CM033969 || 66 || Arg -> Cys || hemochromatosis || missense
|-
| rs139523708 || 67 || Arg -> His || N/A || missense
|-
| CM020721 || 71 || Arg -> Term || hemochromatosis || nonsense
|-
| rs62625342 || 76 || Ser -> Ser || N/A || synonymous
|-
| rs28934597, CM990719 || 93 || Gly -> Arg || hemochromatosis (SwissVar). In hemochromatosis (PMD) || missense
|-
| rs376650371 || 97 || Met -> Ile || N/A || missense
|-
| rs28934596, CM990720 || 105 || Ile -> Thr || hemochromatosis (SwissVar). In hemochromatosis (PMD) || missense
|-
| rs199988202 || 106 || Met -> Thr || N/A || missense
|-
| rs28934595, CM990721 || 127 || Gln -> His || hemochromatosis (SwissVar). In variegate porphyria (VP) (PMD) || missense
|-
| rs200706856, CM091839 || 129 || Asp -> Asn || hemochromatosis || missense
|-
| rs201885016 || 130 || Asn -> Ser || N/A || missense
|-
| rs369790080 || 132 || Thr -> Thr || N/A || synonymous
|-
| CM091840 || 138 || Tyr -> Term || hemochromatosis || nonsense
|-
| rs372789940 || 141 || Asp -> Asn || N/A || missense
|-
| rs199879669 || 157 || Ala -> Pro || N/A || missense
|-
| rs149662565 || 160 || Pro -> Thr || N/A || missense
|-
| rs145475682 || 162 || Ala -> Ser || N/A || missense
|-
| rs148480830 || 162 || Ala -> Ala || N/A || synonymous
|-
| rs144170531 || 166 || Lys -> Glu || N/A || missense
|-
| rs146519482, CM004810 || 168 || Glu -> Gln || hemochromatosis || missense
|-
| CM004106 || 168 || Glu -> Term || hemochromatosis || nonsense
|-
| CM004107 || 169 || Trp -> Term || hemochromatosis || nonsense
|-
| CM015326 || 176 || Ala -> Val || hemochromatosis || missense
|-
| rs199916850, CM081301 || 183 || Leu -> Pro || hemochromatosis || missense
|-
| rs140957442 || 192 || Gln -> Term || N/A || nonsense
|-
| rs4986950 || 217 || Thr -> Ile || N/A || missense
|-
| rs144797937 || 224 || Arg -> Trp || N/A || missense
|-
| rs62625346, CM034097 || 224 || Arg -> Gln || hemochromatosis || missense
|-
| CM101181 || 233 || Gln -> Term || hemochromatosis || nonsense
|-
| rs140515012 || 245 || Pro -> Ala || N/A || missense
|-
| rs150402693 || 251 || Phe -> Leu || N/A || missense
|-
| rs138176635 || 252 || Glu -> Gly || N/A || missense
|-
| rs182920795 || 253 || Pro -> Pro || N/A || synonymous
|-
| rs202068193 || 256 || Val -> Ile || N/A || missense
|-
| rs143846467 || 259 || Asn -> Ser || N/A || missense
|-
| CM024530 || 272 || Val -> Leu || hemochromatosis || missense
|-
| rs140080192, CM994771 || 277 || Glu -> Lys || hemochromatosis || missense
|-
| rs369354634 || 281 || Thr -> Thr || N/A || synonymous
|-
| rs1800562, CM960828 || 282 || Cys -> Tyr || hemochromatosis || missense
|-
| CM004391 || 282 || Cys -> Ser || hemochromatosis || missense
|-
| rs111033563, CM032271 || 283 || Gln -> Pro || hemochromatosis (SwissVar) || missense
|-
| rs201310322 || 292 || Pro -> Pro || N/A || synonymous
|-
| rs143175221, HM030028 || 295 || Val -> Ala || hemochromatosis (SwissVar) || missense
|-
| rs114038675 || 298 || Glu -> Glu || N/A || synonymous
|-
| rs372856303 || 301 || Pro -> Pro || N/A || synonymous
|-
| rs147519426 || 315 || Val -> Gly || N/A || missense
|-
| rs148632352 || 315 || Val -> Val || N/A || synonymous
|-
| rs371192232 || 317 || Val -> Val || N/A || synonymous
|-
| rs141229562 || 318 || Val -> Ile || N/A || missense
|-
| rs150716212 || 322 || Ile -> Thr || N/A || missense
|-
| rs138993448 || 327 || Ile -> Thr || N/A || missense
|-
| rs111033558, CM990722 || 330 || Arg -> Met || hemochromatosis (SwissVar). In hereditary haemochromatosis (HH) (PMD) || missense
|-
| rs368122334 || 340 || Gly -> Ala || N/A || missense
|-
| rs35201683 || 342 || Tyr -> Tyr || N/A || synonymous
|-
| rs370285936 || 343 || Val -> Asp || N/A || missense
|-
| rs146508927 || 347 || Arg -> His || N/A || missense
|-
|+ style="caption-side: bottom; text-align: left" |<font size=2>'''Table 1:''' Summary of all 71 mutations found.</font>
|}

</figtable>

<xr id="geneview"/> summarizes all 71 mutations found. These mutations were used to create the mutation map. We used the following R script for plotting:

<source lang="php">
# Load the filtered mutation table (tab-separated, with header).
data <- read.table("/home/kathi/Dokumente/SS13/masterPractical/task7/task7_table_all_filtered.txt", header=TRUE, sep="\t")

xmax <- 348

# dc = disease-causing: rows with a disease association.
# non dc: rows whose association field is " N/A".
d <- data[data$association != " N/A", ]
n <- data[data$association == " N/A", ]

# MHC antigen-recognition domain: 4-181
# immunoglobulin: 182-275
# pdb structure is shorter: +22
#-> 22+4 - 22+181
#--> 22+182 - 22+275

png("/home/kathi/Dokumente/SS13/masterPractical/task7/mutation_map.png", width=2000, height=500)

# Save only the settable graphical parameters so they can be restored.
oldpar <- par(no.readonly=TRUE)

# Three stacked tracks sharing one x axis.
par(mfrow=c(3,1), mar=c(0.1, 12.1, 0.1, 0.1), oma=c(10, 0.1, 7, 0.1), mgp=c(0,2,0))

# Track 1: mutations without disease association.
plot(-1, -1, xlim=c(0,xmax), ylim=c(0,1), axes=FALSE, xlab=NA, ylab=NA)
segments(x0=n$codon.pos, y0=rep(0, length(n$codon.pos)), x1=n$codon.pos, y1=rep(1, length(n$codon.pos)), col="darkblue", lwd=2)
mtext("non dc\nmutations", side=2, line=-0.5, cex=2.1, las=1)

# Track 2: disease-causing mutations.
plot(-1, -1, xlim=c(0,xmax), ylim=c(0,1), axes=FALSE, xlab=NA, ylab=NA)
segments(x0=d$codon.pos, y0=rep(0, length(d$codon.pos)), x1=d$codon.pos, y1=rep(1, length(d$codon.pos)), col="firebrick2", lwd=2)
mtext("dc\nmutations", side=2, line=-0.2, cex=2.1, las=1)

# Track 3: protein domains in reference-sequence numbering (PDB position + 22).
plot(-1, -1, xlim=c(0,xmax), ylim=c(0,1), axes=FALSE, xlab=NA, ylab=NA, cex=1.5)
axis(1, at=seq(0, xmax, 20), line=0.7, lwd=3, lwd.ticks=3, cex.axis=3)
rect(0, 0, xmax, 1, col="lavender")
rect(22+4, 0, 22+181, 1, col="palegreen3")
rect(22+182, 0, 22+275, 1, col="steelblue3")
text(22+4+70, 0.5, "MHC antigen-recognition domain", cex=3.7, col="white")
text(22+182+50, 0.5, "Immunoglobulin domain", cex=3.7, col="white")
mtext("protein\ndomains", side=2, line=-0.2, cex=2.1, las=1)

# Overall title and shared x-axis label in the outer margins.
mtext("mutation map for HFE", side=3, line=2, cex=3, outer=TRUE, adj=0.5)
mtext("codon position", side=1, line=6.4, cex=2.5, outer=TRUE)

# Restore the graphical parameters, then close the PNG device.
par(oldpar)
dev.off()
</source>
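The script above separates the two mutation groups by comparing the association column with the literal string " N/A", including a leading space. A slightly more robust variant, sketched here on three rows taken from Table 1, trims whitespace before comparing. The column names `codon.pos` and `association` follow the script; the toy data frame and the names `is_dc`, `dc` and `non_dc` are assumptions made for this illustration.

```r
# Three rows from Table 1, with inconsistent whitespace in the
# association column to mimic a hand-edited text file.
mutations <- data.frame(
  accession = c("rs149342416, CM032270", "rs114758821", "CM091838"),
  codon.pos = c(6, 7, 46),
  association = c(" hemochromatosis", " N/A", "hemochromatosis"),
  stringsAsFactors = FALSE
)

# A mutation counts as disease-causing (dc) if its association,
# after trimming surrounding whitespace, is anything other than "N/A".
is_dc <- trimws(mutations$association) != "N/A"

dc <- mutations[is_dc, ]
non_dc <- mutations[!is_dc, ]

print(dc$codon.pos)      # 6 46
print(non_dc$codon.pos)  # 7
```

With a logical vector such as `is_dc`, both groups come from one comparison, so they cannot overlap or miss a row.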