Difference between revisions of "Fabry:Sequence-based mutation analysis/Journal"

From Bioinformatikpedia
Latest revision as of 22:24, 20 June 2012



Amino acid properties

IN: aa_properties.txt<ref>Wikipedia, Amino Acid (June 11th, 2012), http://en.wikipedia.org/wiki/Amino_acid#Table_of_standard_amino_acid_abbreviations_and_properties. June 11th, 2012</ref>, resMass.txt<ref>ExPASy, The amino acid masses, http://education.expasy.org/student_projects/isotopident/htdocs/aa-list.html. June 12th, 2012</ref>, pI.txt<ref>Wikipedia, Proteinogenic amino acid (May 20th, 2012), http://en.wikipedia.org/wiki/Proteinogenic_amino_acid#Chemical_properties. June 12th, 2012</ref>, pickedSNPsNOINFO.txt

OUT: SNP_aaProps.txt, SNP_aaProps.wiki

perl read_AAProp.pl

Secondary Structure

IN: pickedSNPsNOINFO.txt, P06280.dssp.ss, P06280.reprof.ss, P06280.psipred.ss (from Task 3)
OUT: SNP_secStruc.txt, SNP_secStruc.wiki

perl compare_seq_struc.pl

Substitution matrices

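IN: pickedSNPsNOINFO.txt, BLOSUM62.csv<ref>Koders, BLOSUM 62, http://www.koders.com/noncode/fidCE20DD761F1B6E8647B7EA1E8F4FCFF64C96F3B4.aspx. June 12th, 2012</ref>, PAM250.csv<ref>Koders, PAM250, http://www.koders.com/noncode/fid383C7F1CDF7804C56F879E5725F36292EA24DEDE.aspx. June 12th, 2012</ref>, PAM1.csv (courtesy of the Canavan Disease Group)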
OUT: SNP_substMatr.txt, SNP_substMatr.wiki

perl subst.pl
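
The lookup step can be illustrated with a small bash sketch. This is not subst.pl; the function name matrix_score and the CSV layout (a header row of residue letters, then one row per residue) are assumptions, since the journal does not show the file format. The demo input is a four-residue excerpt of BLOSUM62, and the R to D substitution is a placeholder rather than one of the selected SNPs.

```bash
#!/usr/bin/env bash
# Sketch only: look up one substitution score in a comma-separated matrix.
# Assumed layout (not confirmed by this journal): a header row of residue
# letters, then one row per residue that starts with its own letter.

# $1 = matrix file, $2 = wild-type residue, $3 = mutant residue
# Prints the score; returns non-zero if either residue is missing.
matrix_score() {
    awk -F, -v wt="$2" -v mut="$3" '
        NR == 1 { for (i = 2; i <= NF; i++) col[$i] = i; next }
        $1 == wt && (mut in col) { print $(col[mut]); found = 1; exit }
        END { if (!found) exit 1 }
    ' "$1"
}

# Four-residue excerpt of BLOSUM62 as demo input.
demo=$(mktemp)
cat > "$demo" <<'EOF'
,A,R,N,D
A,4,-1,-2,-2
R,-1,5,0,-2
N,-2,0,6,1
D,-2,-2,1,6
EOF

matrix_score "$demo" R D   # placeholder R -> D substitution
rm -f "$demo"
```

The same function would work for PAM250.csv or PAM1.csv if those files share the assumed layout.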

PSSM

bash run_psi_blast.sh

Multiple sequence alignment

Blast result: http://www.uniprot.org/blast/uniprot/2012061770B0OSJCOA

Scoring methods

SIFT

IN: Prediction.txt
OUT: SNP_Sift.txt, SNP_Sift.wiki

 perl readSift.pl

Polyphen2

IN: pickedSNPsNOINFO.txt
OUT: Polyphen.batch

 perl prepBatchPolyphen2.pl
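
As a rough illustration of this preparation step (not the original prepBatchPolyphen2.pl), the bash sketch below converts mutations written as wild type, position, mutant (for example A10C) into whitespace-separated batch lines of the form accession, position, wild type, mutant. The input notation, the function name to_pph2_batch, and the two example mutations are assumptions; the actual layout of pickedSNPsNOINFO.txt is not shown in this journal.

```bash
#!/usr/bin/env bash
# Sketch only: turn mutations such as "A10C" into batch query lines
# "<accession> <position> <wild type> <mutant>".

ACCESSION="P06280"   # UniProt accession used throughout this journal

# Reads one mutation per line from stdin, writes batch lines to stdout.
to_pph2_batch() {
    local line
    while read -r line; do
        [[ -z $line ]] && continue          # ignore empty lines
        if [[ $line =~ ^([A-Z])([0-9]+)([A-Z])$ ]]; then
            echo "$ACCESSION ${BASH_REMATCH[2]} ${BASH_REMATCH[1]} ${BASH_REMATCH[3]}"
        else
            echo "skipping unparsable line: $line" >&2
        fi
    done
}

# Placeholder mutations, not the SNPs selected in this project.
printf 'A10C\nR25H\n' | to_pph2_batch
```

Lines that do not match the assumed notation are reported on stderr and left out of the batch file rather than silently passed on.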

IN: pph2-full.txt
OUT: SNP_pph2.txt, SNP_pph2.wiki

 perl readPolyphen.pl

SNAP

To obtain predictions for every (non-silent) mutation at the positions of the selected SNPs, a new mutation file had to be generated. Since we are only interested in ten positions, SNAP's "all" keyword was not an option. The mutations file was generated by the script snap_generate_mutations.sh.

bash snap_generate_mutations.sh
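
The idea behind the mutation file can be sketched as follows. This is not the original snap_generate_mutations.sh: the function names, the input of residue and position pairs, and the example residues are placeholders, and the one-mutation-per-line notation (wild type, position, mutant, e.g. A10C) is an assumption about what snap2 expects.

```bash
#!/usr/bin/env bash
# Sketch only: list every amino acid substitution at a set of positions.
# Residues and positions below are placeholders, not the selected SNPs.

AMINO_ACIDS="A C D E F G H I K L M N P Q R S T V W Y"

# Print all 19 substitutions for one residue, e.g. "A10C".
# $1 = wild-type residue (one-letter code), $2 = 1-based sequence position
all_substitutions() {
    local wt=$1 pos=$2 aa
    for aa in $AMINO_ACIDS; do
        if [ "$aa" != "$wt" ]; then
            echo "${wt}${pos}${aa}"
        fi
    done
}

# Read "<residue> <position>" pairs from stdin, one pair per line.
generate_mutations() {
    local wt pos
    while read -r wt pos; do
        if [ -n "$wt" ] && [ -n "$pos" ]; then
            all_substitutions "$wt" "$pos"
        fi
    done
}

# Two placeholder positions yield 2 x 19 mutation lines.
printf 'A 10\nR 25\n' | generate_mutations
```

Skipping the wild-type residue itself is what restricts the list to non-silent changes at the protein level.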

We used the following command to obtain the SNAP2 predictions (see run_snap2.sh).

snap2 --tolerate -i P06280.fasta -m snap_mutations.txt -o P06280_snap2_result.txt |& tee snap2.log

snap2 exited with an error when run without the --tolerate option, so we added it to the command.

--tolerate
    Tolerate failures from external programs. Failures will trigger snap2 to
    switch into fallback mode (predictions will have lower accuracy)

Results and Conclusion

IN: SNP_aaProps_final.wiki, SNP_secStruc.wiki, SNP_substMatr.wiki, SNP_Sift.wiki, SNP_pph2.wiki

OUT: overview.wiki, overview.txt

perl createOverviewTable.pl

References

<references/>