Difference between revisions of "Fabry:Mapping point mutations/Journal"

Revision as of 14:37, 11 June 2012


HGMD

We searched HGMD for our gene GLA, followed the button to the missense/nonsense mutations, and saved the resulting HTML page to a local file, e.g. hgmd_fabry_mutations.html. Afterwards, the file was reformatted to a wiki table with

$ bash hgmd2wikitable.sh hgmd_fabry_mutations.html > hgmd.wikitable

and the SNPs were extracted to a generic parseable format. To convert the three-letter representation of the amino acids to one-letter codes, the Perl script three2one.pl was applied.

$ bash hgmd_extract_mutations.sh hgmd_fabry_mutations.html > hgmd_snps_3.txt
$ perl three2one.pl hgmd_snps_3.txt > hgmd_snps.txt
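
The conversion step can be sketched as follows. This is not the source of three2one.pl, which is not shown here; it is a minimal Python illustration that assumes mutations are written as tokens such as Ala143Thr and that nonsense mutations use the label "Term". The function name and the input strings are hypothetical.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Term": "*",  # assumption: stop codons are labelled "Term"
}

# Assumed token layout: <wild type><position><mutant>, e.g. Ala143Thr.
MUTATION = re.compile(r"([A-Z][a-z]{2,3})(\d+)([A-Z][a-z]{2,3})")


def three_to_one(text):
    """Rewrite every recognised mutation token in `text` with one-letter codes."""
    def convert(match):
        wild, position, mutant = match.groups()
        if wild not in THREE_TO_ONE or mutant not in THREE_TO_ONE:
            return match.group(0)  # leave unknown tokens untouched
        return THREE_TO_ONE[wild] + position + THREE_TO_ONE[mutant]
    return MUTATION.sub(convert, text)


if __name__ == "__main__":
    print(three_to_one("Ala143Thr"))
    print(three_to_one("Arg227Term"))
```

Tokens whose residue names are not in the table are passed through unchanged, so unexpected input stays visible in the output instead of being silently altered.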

dbSNP

dbSNP was searched for silent mutations using the following search string:

"synonymous-codon"[Function_Class] AND GLA[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS]

and the results page was saved locally as "synonymous-codon"[Function_Class] AND GLA[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] - SNP Results.html.

Finally, the SNPs were extracted to a generic parseable format. To convert the three-letter representation of the amino acids to one-letter codes, the Perl script three2one.pl was applied.

$ bash parse_dbSNP.sh '"synonymous-codon"[Function_Class] AND GLA[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] - SNP Results.html' > dbSNP_snps_3.txt
$ perl three2one.pl dbSNP_snps_3.txt > dbSNP_snps.txt

OMIM

In this database we searched for the gene GLA and downloaded the Table View table of the Allelic Variants (Allelic_variants.txt), which can be found in the Table of Contents. This table was parsed with the Perl scripts readOMIM.pl and Omim2table.pl, which created all the output we needed for the tables and statistics.

$ perl readOMIM.pl
$ perl Omim2table.pl > omim.wiki

SNPedia

Since there is no dedicated page for the gene GLA, we performed a query with the search term "Gene=GLA". The resulting dbSNP identifiers were downloaded and mapped onto a list containing all information on every identifier (201) found when searching dbSNP for ""snp"[SNP_CLASS] AND GLA[GENE] AND "human"[ORGN]" (Flat File display). This was done with the Perl script parse_dbSNP.pl, which needs the input files snp_result.txt and rsnumber.txt. We realised that the amino acid positions in this representation ignore the initial methionine of the sequence, so we had to add 1 to each position read.
A wiki table was created with SNPedia2table.pl.

$ perl parse_dbSNP.pl
$ perl SNPedia2table.pl > SNPedia.wiki
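
The position correction described above can be illustrated with a small sketch. It is not taken from parse_dbSNP.pl; the function names are hypothetical, and the short sequence is a toy example rather than the GLA protein. The check against a reference sequence is our own suggestion for confirming that an offset of 1 is the right one.

```python
def shift_position(read_position, offset=1):
    """Convert a position that skips the initial methionine to full-sequence numbering."""
    return read_position + offset


def matches_reference(sequence, position, residue):
    """Return True if the 1-based `position` in `sequence` holds `residue`."""
    return 1 <= position <= len(sequence) and sequence[position - 1] == residue


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQR"  # hypothetical sequence, for illustration only
    read_position, wild_type = 1, "K"  # as it would be read from the flat file
    corrected = shift_position(read_position)
    print(corrected, matches_reference(toy_sequence, corrected, wild_type))
```

If the wild-type residue of most records only matches the reference after the shift, that supports the offset; residues that match at neither position point to a different problem, such as another isoform.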

SNPdbe

The text file containing all information gathered in SNPdbe (snps.txt) was downloaded after searching for the identifier "NP_000160" in the organism "human" (see the SNPdbe search results). The text file was parsed with the Perl scripts readSNPdbe.pl and SNPdbe2table.pl and visualised with the R script Hotspots_SNPdbe.R. Initially, a SNP was considered disease-causing if a disease was listed for it; if the entry was "N/A", it was assigned "non-disease". Later we added those SNPs that were also found in HGMD to the disease-causing SNPs (see SNPdbe_manuallyCur.Rtable).

$ perl readSNPdbe.pl
$ perl SNPdbe2table.pl > SNPdbe.wiki
$ R CMD BATCH Hotspots_SNPdbe.R
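
The labelling rule described above can be written down compactly. The sketch below is an assumption about how such a rule might look, not an excerpt from readSNPdbe.pl; the function name, the argument names and the example mutations are hypothetical.

```python
def classify_snp(disease_field, mutation, hgmd_mutations=frozenset()):
    """Label a SNP as "disease" or "non-disease".

    disease_field  -- the disease annotation of the SNP, "N/A" if none is listed
    mutation       -- the mutation in one-letter notation, e.g. "A143T"
    hgmd_mutations -- mutations also found in HGMD (the later manual curation step)
    """
    if mutation in hgmd_mutations:
        return "disease"
    if disease_field.strip() == "N/A":
        return "non-disease"
    return "disease"


if __name__ == "__main__":
    hgmd = {"A143T"}  # hypothetical set of HGMD mutations
    print(classify_snp("Fabry disease", "R227Q"))
    print(classify_snp("N/A", "D313Y"))
    print(classify_snp("N/A", "A143T", hgmd))
```

Passing an empty HGMD set reproduces the first, automatic labelling; passing the HGMD mutations reproduces the curated labelling in which an "N/A" entry can still count as disease-causing.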

Mapping

First, we created the plots showing the distribution of SNPs along the sequence of the gene GLA with the R script snp_distr.R, after preparing the data with map.pl. The non-redundant data were then used to build a histogram with Histogramm.R and to map the different groups of SNPs onto the structure 1R46 (PyMOL script created with pymol_mark_positions.sh).

$ perl map.pl
$ R CMD BATCH snp_distr.R
$ R CMD BATCH Histogramm.R
$ pymol <(bash pymol_mark_positions.sh)
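
To show what a generated PyMOL script for this mapping might contain, here is a minimal sketch. It does not reproduce pymol_mark_positions.sh; the function name, the group names, the colours and the residue numbers are hypothetical, and it assumes that the residue numbering of 1R46 matches the sequence numbering used for the SNPs.

```python
def pymol_script(groups, pdb_id="1R46"):
    """Build PyMOL commands that highlight groups of residue positions.

    groups -- dict mapping a selection name to (colour, list of positions)
    """
    lines = [
        f"fetch {pdb_id}",
        "hide everything",
        "show cartoon",
        "color white",
    ]
    for name, (colour, positions) in groups.items():
        if not positions:
            continue  # an empty selection would be a PyMOL syntax error
        # Duplicates are dropped so that each residue is listed once.
        resi = "+".join(str(p) for p in sorted(set(positions)))
        lines.append(f"select {name}, resi {resi}")
        lines.append(f"show sticks, {name}")
        lines.append(f"color {colour}, {name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {
        "disease": ("red", [215, 40, 40]),   # hypothetical positions
        "non_disease": ("blue", [313]),
    }
    print(pymol_script(example), end="")
```

One way to use the output is to write it to a file with a .pml extension and open that file with PyMOL. Because the selections use residue numbers only, they apply to every chain in the structure.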

Afterwards we tried to visualise disease-causing and non-disease-causing point mutations in one plot with Hotspots_all.R. This required R tables for all databases, so the data were first prepared with readHGMDforR.pl (input hgmd_snps.txt) and readdbSNPforR.pl (input dbSNP_snps.txt).

$ perl readHGMDforR.pl
$ perl readdbSNPforR.pl
$ R CMD BATCH Hotspots_all.R
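
One simple way to compare the two groups along the sequence is to count mutations per fixed-size window. The sketch below illustrates that idea only; we do not claim that Hotspots_all.R works this way, and the function names, the window size of 10 and the example positions are assumptions.

```python
from collections import Counter


def window_counts(positions, window=10):
    """Count positions per window; window k covers residues k*window+1 to (k+1)*window."""
    return Counter((p - 1) // window for p in positions)


def hotspot_table(disease, non_disease, length, window=10):
    """Return rows of (start, end, disease count, non-disease count)."""
    d_counts = window_counts(disease, window)
    n_counts = window_counts(non_disease, window)
    rows = []
    for k in range((length + window - 1) // window):
        start = k * window + 1
        end = min((k + 1) * window, length)
        rows.append((start, end, d_counts.get(k, 0), n_counts.get(k, 0)))
    return rows


if __name__ == "__main__":
    # Hypothetical positions on a hypothetical 30-residue sequence.
    for row in hotspot_table([1, 10, 11], [25], length=30):
        print(*row, sep="\t")
```

The tab-separated rows can be read into R with read.table and plotted as two series over the same axis.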

@@ Line 6: / Line 6: @@
 We search for our gene [http://www.hgmd.cf.ac.uk/ac/gene.php?gene=GLA GLA], followed the button to missense/nonsense mutations, and saved to HTML to a local file, e.g. hgmd_fabry_mutations.html. Afterwards, the file was reformatted to a [[Fabry:Mapping_point_mutations/HGMD|wiki table]] with
- $ bash <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/hgmd2wikitable.sh.html hgmd2wikitable.sh]</span> hgmd_fabry_mutations.html > hgmd.wikitable
+ $ bash <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/hgmd2wikitable.sh.html hgmd2wikitable.sh]</span> hgmd_fabry_mutations.html > hgmd.wikitable
-and the SNPs were extracted to a generic parseable format. To convert the three letter representation of the amino acids, the perl script [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl] was applied.
+and the SNPs were extracted to a generic parseable format. To convert the three letter representation of the amino acids, the perl script [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl] was applied.
- $ bash <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/hgmd_extract_mutations.sh.html hgmd_extract_mutations.sh]</span> hgmd_fabry_mutations.html > hgmd_snps_3.txt
+ $ bash <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/hgmd_extract_mutations.sh.html hgmd_extract_mutations.sh]</span> hgmd_fabry_mutations.html > hgmd_snps_3.txt
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl]</span> hgmd_snps_3.txt > hgmd_snps.txt
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl]</span> hgmd_snps_3.txt > hgmd_snps.txt
 == dbSNP ==
@@ Line 21: / Line 21: @@
 and the results page was saved as locally as "synonymous-codon"[Function_Class] AND GLA[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] - SNP Results.html.
-Eventually, the SNPs were extracted to a generic parseable format. To convert the three letter representation of the amino acids, the perl script [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl] was applied.
+Eventually, the SNPs were extracted to a generic parseable format. To convert the three letter representation of the amino acids, the perl script [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl] was applied.
- $ bash <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/parse_dbSNP.sh.html parse_dbSNP.sh]</span> '"synonymous-codon"[Function_Class] AND GLA[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] - SNP Results.html' > dbSNP_snps_3.txt
+ $ bash <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/parse_dbSNP.sh.html parse_dbSNP.sh]</span> '"synonymous-codon"[Function_Class] AND GLA[GENE] AND "human"[ORGN] AND "snp"[SNP_CLASS] - SNP Results.html' > dbSNP_snps_3.txt
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl]</span> dbSNP_snps_3.txt > dbSNP_snps.txt
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/three2one.pl.html three2one.pl]</span> dbSNP_snps_3.txt > dbSNP_snps.txt
 == OMIM ==
-In this database we searched for the gene [http://omim.org/entry/300644 GLA] and downloaded the Table View table of the Allelic Variants ([https://www.dropbox.com/s/xpehn5e0l9bz1jz/Allelic_variants.txt Allelic_variants.txt]), which can be found in the Table of Contents. This table was parsed with the perl scripts [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readOMIM.pl.html readOMIM.pl] and [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Omim2table.pl.html Omim2table.pl], which created all the output we needed for creation of the tables and statistics.
+In this database we searched for the gene [http://omim.org/entry/300644 GLA] and downloaded the Table View table of the Allelic Variants ([https://www.dropbox.com/s/xpehn5e0l9bz1jz/Allelic_variants.txt Allelic_variants.txt]), which can be found in the Table of Contents. This table was parsed with the perl scripts [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readOMIM.pl.html readOMIM.pl] and [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Omim2table.pl.html Omim2table.pl], which created all the output we needed for creation of the tables and statistics.
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readOMIM.pl.html readOMIM.pl]</span>
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readOMIM.pl.html readOMIM.pl]</span>
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Omim2table.pl.html Omim2table.pl]</span> > omim.wiki
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Omim2table.pl.html Omim2table.pl]</span> > omim.wiki
 == SNPedia ==
-Since there is no special site for the gene GLA, we performed a query with the search term "Gene=GLA". These dbSNP identifiers were downloaded and mapped onto a list of all informations on all identifiers (201) found, when searching for ""snp"[SNP_CLASS] AND GLA[GENE] AND "human"[ORGN]" in the dbSNP (Flat File display). This was done with the perl script [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/parse_dbSNP.pl.html parse_dbSNP.pl], which needed the input files [https://www.dropbox.com/s/94h35owfykiivzd/snp_result.txt snp_result.txt] and [https://www.dropbox.com/s/q8pyhshb6e38un7/rsnumber.txt rsnumber.txt]. We realised that the amino acid position in this representation ignores the initial Methionine in the sequence, so we had to add 1 to the read aa positions.
+Since there is no special site for the gene GLA, we performed a query with the search term "Gene=GLA". These dbSNP identifiers were downloaded and mapped onto a list of all informations on all identifiers (201) found, when searching for ""snp"[SNP_CLASS] AND GLA[GENE] AND "human"[ORGN]" in the dbSNP (Flat File display). This was done with the perl script [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/parse_dbSNP.pl.html parse_dbSNP.pl], which needed the input files [https://www.dropbox.com/s/94h35owfykiivzd/snp_result.txt snp_result.txt] and [https://www.dropbox.com/s/q8pyhshb6e38un7/rsnumber.txt rsnumber.txt]. We realised that the amino acid position in this representation ignores the initial Methionine in the sequence, so we had to add 1 to the read aa positions.
 <br>
-A wiki table was created with [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPedia2table.pl.html SNPedia2table.pl].
+A wiki table was created with [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPedia2table.pl.html SNPedia2table.pl].
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/parse_dbSNP.pl.html parse_dbSNP.pl]</span>
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/parse_dbSNP.pl.html parse_dbSNP.pl]</span>
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPedia2table.pl.html SNPedia2table.pl]</span> > SNPedia.wiki
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPedia2table.pl.html SNPedia2table.pl]</span> > SNPedia.wiki
 == SNPdbe ==
-The textfile containing all informations gathered in SNPdbe ([https://www.dropbox.com/s/eeboc9nratgeuj5/snps.txt snps.txt]) was downloaded after searching for the gene identifier "NP_000160" in the organism "human" ([http://www.rostlab.org/services/snpdbe/dosearch.php?id=name&val=NP_000160&organism2=human&organism1= see]). The textfile was parsed with the perl scripts [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readSNPdbe.pl.html readSNPdbe.pl] and[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPdbe2table.pl.html SNPdbe2table.pl] and vizualised with the R script [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_SNPdbe.R.html Hotspots_SNPdbe.R]. For now, a SNP was considered disease causing if there was a disease listed, if it was a "N/A" it was assigned "non-disease". Later we added those SNPs that were also found in HGMD to the disease causing SNPs (see [https://www.dropbox.com/s/ca1hkcr1ighozx3/SNPdbe_manuallyCur.Rtable SNPdbe_manuallyCur.Rtable]).
+The textfile containing all informations gathered in SNPdbe ([https://www.dropbox.com/s/eeboc9nratgeuj5/snps.txt snps.txt]) was downloaded after searching for the gene identifier "NP_000160" in the organism "human" ([http://www.rostlab.org/services/snpdbe/dosearch.php?id=name&val=NP_000160&organism2=human&organism1= see]). The textfile was parsed with the perl scripts [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readSNPdbe.pl.html readSNPdbe.pl] and[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPdbe2table.pl.html SNPdbe2table.pl] and vizualised with the R script [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_SNPdbe.R.html Hotspots_SNPdbe.R]. For now, a SNP was considered disease causing if there was a disease listed, if it was a "N/A" it was assigned "non-disease". Later we added those SNPs that were also found in HGMD to the disease causing SNPs (see [https://www.dropbox.com/s/ca1hkcr1ighozx3/SNPdbe_manuallyCur.Rtable SNPdbe_manuallyCur.Rtable]).
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readSNPdbe.pl.html readSNPdbe.pl]</span>
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readSNPdbe.pl.html readSNPdbe.pl]</span>
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPdbe2table.pl.html SNPdbe2table.pl]</span> > SNPdbe.wiki
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/SNPdbe2table.pl.html SNPdbe2table.pl]</span> > SNPdbe.wiki
- $ R CMD BATCH <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_SNPdbe.R.html Hotspots_SNPdbe.R]</span>
+ $ R CMD BATCH <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_SNPdbe.R.html Hotspots_SNPdbe.R]</span>
 == Mapping ==
-First of all, we created the plots showing the distribution of SNPs along the sequence of the gene GLA with the R Script [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/snp_distr.R.html snp_distr.R]. For this we prepared the data with [https://www.dropbox.com/s/7gmegsynrbxj56x/map.pl map.pl]. The non-redundant data was then used to build a histogram with [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Histogramm.R.html Histogramm.R] and to map the different groups of SNPs onto the structure 1R46 (Pymol script created with [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/pymol_mark_positions.sh.html pymol_mark_positions.sh]).
+First of all, we created the plots showing the distribution of SNPs along the sequence of the gene GLA with the R Script [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/snp_distr.R.html snp_distr.R]. For this we prepared the data with [https://www.dropbox.com/s/7gmegsynrbxj56x/map.pl map.pl]. The non-redundant data was then used to build a histogram with [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Histogramm.R.html Histogramm.R] and to map the different groups of SNPs onto the structure 1R46 (Pymol script created with [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/pymol_mark_positions.sh.html pymol_mark_positions.sh]).
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/map.pl.html map.pl]</span>
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/map.pl.html map.pl]</span>
- $ R CMD BATCH <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/snp_distr.R.html snp_distr.R]</span>
+ $ R CMD BATCH <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/snp_distr.R.html snp_distr.R]</span>
- $ R CMD BATCH <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Histogramm.R.html Histogramm.R]</span>
+ $ R CMD BATCH <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Histogramm.R.html Histogramm.R]</span>
- $ pymol <(bash <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/pymol_mark_positions.sh.html pymol_mark_positions.sh]</span>)
+ $ pymol <(bash <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/pymol_mark_positions.sh.html pymol_mark_positions.sh]</span>)
-Afterwards we tried to vizualise disease and non-disease causing point mutations in one plot with [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_all.R.html Hotspots_all.R]. We needed R tables of all databases for this, so some preparation had to be done with [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readHGMDforR.pl.html readHGMDforR.pl] (input [https://www.dropbox.com/s/i2vvunjz1kbxmfy/hgmd_snps.txt hgmd_snps.txt]) and [http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readdbSNPforR.pl.html readdbSNPforR.pl] (input [https://www.dropbox.com/s/i96c1wscdxqyb01/dbSNP_snps.txt dbSNP_snps.txt]).
+Afterwards we tried to vizualise disease and non-disease causing point mutations in one plot with [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_all.R.html Hotspots_all.R]. We needed R tables of all databases for this, so some preparation had to be done with [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readHGMDforR.pl.html readHGMDforR.pl] (input [https://www.dropbox.com/s/i2vvunjz1kbxmfy/hgmd_snps.txt hgmd_snps.txt]) and [https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readdbSNPforR.pl.html readdbSNPforR.pl] (input [https://www.dropbox.com/s/i96c1wscdxqyb01/dbSNP_snps.txt dbSNP_snps.txt]).
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readHGMDforR.pl.html readHGMDforR.pl]</span>
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readHGMDforR.pl.html readHGMDforR.pl]</span>
- $ perl <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/readdbSNPforR.pl.html readdbSNPforR.pl]</span>
+ $ perl <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/readdbSNPforR.pl.html readdbSNPforR.pl]</span>
- $ R CMD BATCH <span style="plain">[http://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_all.R.html Hotspots_all.R]</span>
+ $ R CMD BATCH <span style="plain">[https://dl.dropbox.com/u/13796643/fabry/snp_scripts/Hotspots_all.R.html Hotspots_all.R]</span>
