Difference between revisions of "Fabry:Sequence-based mutation analysis/Journal"
Rackersederj (talk | contribs) |
Staniewski (talk | contribs) (→Multiple sequence alignment) |
||
(13 intermediate revisions by 2 users not shown) | |||
Line 4: | Line 4: | ||
== Amino acid properties == |
== Amino acid properties == |
||
+ | IN: |
||
− | IN: aa_properties.txt<ref>Wikipedia, Amino Acid (June 11th, 2012), [http://en.wikipedia.org/wiki/Amino_acid#Table_of_standard_amino_acid_abbreviations_and_properties http://en.wikipedia.org/wiki/Amino_acid#Table_of_standard_amino_acid_abbreviations_and_properties]. June 11th, 2012</ref>, resMass.txt<ref>ExPASy. The amino acid masses [http://education.expasy.org/student_projects/isotopident/htdocs/aa-list.html http://education.expasy.org/student_projects/isotopident/htdocs/aa-list.html]. June 12th, 2012</ref>, ip.txt<ref>Wikipedia, Proteinogenic amino acid (May 20th, 2012) [http://en.wikipedia.org/wiki/Proteinogenic_amino_acid#Chemical_properties http://en.wikipedia.org/wiki/Proteinogenic_amino_acid#Chemical_properties]. June 12th, 2012</ref>, [https://www.dropbox.com/s/qxaqmw62hkyfv3d/pickedSNPsNOINFO.txt pickedSNPsNOINFO.txt]<br> |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/aa_properties.txt aa_properties.txt]<ref>Wikipedia, Amino Acid (June 11th, 2012), [http://en.wikipedia.org/wiki/Amino_acid#Table_of_standard_amino_acid_abbreviations_and_properties http://en.wikipedia.org/wiki/Amino_acid#Table_of_standard_amino_acid_abbreviations_and_properties]. June 11th, 2012</ref>, [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/resMass.txt resMass.txt]<ref>ExPASy. The amino acid masses [http://education.expasy.org/student_projects/isotopident/htdocs/aa-list.html http://education.expasy.org/student_projects/isotopident/htdocs/aa-list.html]. June 12th, 2012</ref>, |
||
− | OUT: SNP_aaProps.txt, SNP_aaProps.wiki |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/pI.txt pI.txt]<ref>Wikipedia, Proteinogenic amino acid (May 20th, 2012) [http://en.wikipedia.org/wiki/Proteinogenic_amino_acid#Chemical_properties http://en.wikipedia.org/wiki/Proteinogenic_amino_acid#Chemical_properties]. June 12th, 2012</ref>, |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/pickedSNPsNOINFO.txt pickedSNPsNOINFO.txt] |
||
+ | |||
+ | OUT: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_aaProps.txt SNP_aaProps.txt], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_aaProps.wiki SNP_aaProps.wiki] |
||
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/read_AAProp.pl.html read_AAProp.pl] |
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/read_AAProp.pl.html read_AAProp.pl] |
||
== Secondary Structure == |
== Secondary Structure == |
||
− | IN: [https:// |
+ | IN: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/pickedSNPsNOINFO.txt pickedSNPsNOINFO.txt], |
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/P06280.dssp.ss P06280.dssp.ss], |
||
− | OUT: SNP_secStruc.txt, SNP_secStruc.wiki |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/P06280.reprof.ss P06280.reprof.ss], |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/P06280.psipred.ss P06280.psipred.ss] (from [[Fabry:Sequence-based_analyses#Alpha-galactosidase_A_.28P06280.29 |Task 3]])<br> |
||
+ | OUT: |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_secStruc.txt SNP_secStruc.txt], |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_secStruc.wiki SNP_secStruc.wiki] |
||
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/compare_seq_struc.pl.html compare_seq_struc.pl] |
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/compare_seq_struc.pl.html compare_seq_struc.pl] |
||
== Substitution matrices == |
== Substitution matrices == |
||
+ | IN: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/pickedSNPsNOINFO.txt pickedSNPsNOINFO.txt], |
||
− | IN: [https://www.dropbox.com/s/qxaqmw62hkyfv3d/pickedSNPsNOINFO.txt pickedSNPsNOINFO.txt], BLOSUM62.csv<ref>Koders BLOSUM 62 [http://www.koders.com/noncode/fidCE20DD761F1B6E8647B7EA1E8F4FCFF64C96F3B4.aspx]. June 12th. 2012</ref>, PAM250.csv<ref>Koders PAM250 [http://www.koders.com/noncode/fid383C7F1CDF7804C56F879E5725F36292EA24DEDE.aspx]. June 12th. 2012</ref><br> |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/BLOSUM62.csv BLOSUM62.csv]<ref>Koders BLOSUM 62 [http://www.koders.com/noncode/fidCE20DD761F1B6E8647B7EA1E8F4FCFF64C96F3B4.aspx]. June 12th. 2012</ref>, |
||
− | OUT: SNP_substMatr.txt, SNP_substMatr.wiki |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/PAM250.csv PAM250.csv]<ref>Koders PAM250 [http://www.koders.com/noncode/fid383C7F1CDF7804C56F879E5725F36292EA24DEDE.aspx]. June 12th. 2012</ref>, |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/PAM1.csv PAM1.csv] (thank you [[CD_task6_protocol#PAM1 | Canavan Disease Group]])<br> |
||
+ | OUT: |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_substMatr.txt SNP_substMatr.txt], |
||
+ | [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_substMatr.wiki SNP_substMatr.wiki] |
||
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/subst.pl.html subst.pl] |
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/subst.pl.html subst.pl] |
||
== PSSM == |
== PSSM == |
||
+ | bash [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/run_psi_blast.sh.html run_psi_blast.sh] |
||
+ | |||
== Multiple sequence alignment == |
== Multiple sequence alignment == |
||
+ | [http://www.uniprot.org/blast/uniprot/2012061770B0OSJCOA Blast result] |
||
+ | |||
== Scoring methods == |
== Scoring methods == |
||
=== SIFT === |
=== SIFT === |
||
− | IN: Prediction.txt<br> |
+ | IN: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/Prediction.txt Prediction.txt]<br> |
+ | OUT: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_Sift.txt SNP_Sift.txt], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_Sift.wiki SNP_Sift.wiki] |
||
− | OUT: SNP_Sift.txt, SNP_Sift.wiki |
||
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/readSift.pl.html readSift.pl] |
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/readSift.pl.html readSift.pl] |
||
=== Polyphen2 === |
=== Polyphen2 === |
||
− | IN: [https:// |
+ | IN: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/pickedSNPsNOINFO.txt pickedSNPsNOINFO.txt]<br> |
− | OUT: Polyphen.batch |
+ | OUT: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/Polyphen.batch Polyphen.batch] |
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/prepBatchPolyphen2.pl.html prepBatchPolyphen2.pl] |
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/prepBatchPolyphen2.pl.html prepBatchPolyphen2.pl] |
||
− | IN: pph2-full.txt<br> |
+ | IN: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/pph2-full.txt pph2-full.txt]<br> |
+ | OUT: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_pph2.txt SNP_pph2.txt], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_pph2.wiki SNP_pph2.wiki] |
||
− | OUT: SNP_pph2.txt, SNP_pph2.wiki |
||
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/readPolyphen.pl.html readPolyphen.pl] |
perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/readPolyphen.pl.html readPolyphen.pl] |
||
=== SNAP === |
=== SNAP === |
||
+ | To obtain predictions for every (non-silent) mutation at the positions of the selected SNPs, a new mutation file had to be generated. Since we are only interested in ten positions, SNAP's ''all'' keyword was not an option. The [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/snap_mutations.txt mutations file] was generated by the script [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/snap_generate_mutations.sh.html snap_generate_mutations.sh].
||
+ | |||
+ | bash [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/snap_generate_mutations.sh.html snap_generate_mutations.sh] |
||
+ | |||
+ | We used the following command to obtain the SNAP2 predictions (see [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/run_snap2.sh.html run_snap2.sh]). |
||
+ | |||
+ | snap2 --tolerate -i P06280.fasta -m [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/snap_mutations.txt snap_mutations.txt] -o [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/P06280_snap2_result.txt P06280_snap2_result.txt] |& tee snap2.log |
||
+ | |||
+ | Since snap2 exited with an error without the ''tolerate'' option, we had to add it to the command. |
||
+ | |||
+ | --tolerate |
||
+ | Tolerate failures from external programs. Failures will trigger snap2 to |
||
+ | switch into fallback mode (predictions will have lower accuracy) |
||
+ | |||
+ | == Results and Conclusion == |
||
+ | IN: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_aaProps_final.wiki SNP_aaProps_final.wiki], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_secStruc.wiki SNP_secStruc.wiki], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_substMatr.wiki SNP_substMatr.wiki], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_Sift.wiki SNP_Sift.wiki], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/SNP_pph2.wiki SNP_pph2.wiki] |
||
+ | |||
+ | OUT: [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/overview.wiki overview.wiki], [https://dl.dropbox.com/u/13796643/fabry/seqbased/data/overview.txt overview.txt] |
||
+ | |||
+ | perl [https://dl.dropbox.com/u/13796643/fabry/seqbased/scripts/createOverviewTable.pl.html createOverviewTable.pl] |
||
== References == |
== References == |
Latest revision as of 22:24, 20 June 2012
Amino acid properties
IN: aa_properties.txt<ref>Wikipedia, Amino Acid (June 11th, 2012), http://en.wikipedia.org/wiki/Amino_acid#Table_of_standard_amino_acid_abbreviations_and_properties. June 11th, 2012</ref>, resMass.txt<ref>ExPASy. The amino acid masses http://education.expasy.org/student_projects/isotopident/htdocs/aa-list.html. June 12th, 2012</ref>, pI.txt<ref>Wikipedia, Proteinogenic amino acid (May 20th, 2012) http://en.wikipedia.org/wiki/Proteinogenic_amino_acid#Chemical_properties. June 12th, 2012</ref>, pickedSNPsNOINFO.txt
OUT: SNP_aaProps.txt, SNP_aaProps.wiki
perl read_AAProp.pl
Secondary Structure
IN: pickedSNPsNOINFO.txt,
P06280.dssp.ss,
P06280.reprof.ss,
P06280.psipred.ss (from Task 3)
OUT:
SNP_secStruc.txt,
SNP_secStruc.wiki
perl compare_seq_struc.pl
Substitution matrices
IN: pickedSNPsNOINFO.txt,
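BLOSUM62.csv<ref>Koders BLOSUM 62, http://www.koders.com/noncode/fidCE20DD761F1B6E8647B7EA1E8F4FCFF64C96F3B4.aspx. June 12th, 2012</ref>,
PAM250.csv<ref>Koders PAM250, http://www.koders.com/noncode/fid383C7F1CDF7804C56F879E5725F36292EA24DEDE.aspx. June 12th, 2012</ref>,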
PAM1.csv (thank you Canavan Disease Group)
OUT:
SNP_substMatr.txt,
SNP_substMatr.wiki
perl subst.pl
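A score lookup in such a matrix file can be sketched in bash and awk. This is not the original subst.pl: the CSV layout (a header row of residue letters, with the first column naming the row residue), the function name matrix_score, and the temporary file are assumptions made for illustration. The three-by-three excerpt holds the BLOSUM62 values for A, R, and N only.

```bash
#!/usr/bin/env bash
# Sketch only: not the original subst.pl.
# Assumed CSV layout: first row is ",A,R,N,...", each later row is "A,4,-1,-2,...".

# matrix_score FILE WILDTYPE MUTANT
# Prints the matrix entry for the substitution; returns 1 if it is not found.
matrix_score() {
    local file=$1 wt=$2 mut=$3
    awk -F',' -v wt="$wt" -v mut="$mut" '
        # Header row: remember which column belongs to which residue.
        NR == 1 { for (i = 2; i <= NF; i++) col[$i] = i; next }
        # Row of the wild-type residue: print the mutant column if it exists.
        $1 == wt && (mut in col) { print $(col[mut]); found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Small excerpt of BLOSUM62 (residues A, R, N) written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
,A,R,N
A,4,-1,-2
R,-1,5,0
N,-2,0,6
EOF

# Score for the substitution A -> N.
matrix_score "$tmp" A N
rm -f "$tmp"
```

Running the same lookup against each matrix file for every picked SNP would give one score column per matrix, if the files follow the assumed layout.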
PSSM
bash run_psi_blast.sh
Multiple sequence alignment
Scoring methods
SIFT
IN: Prediction.txt
OUT: SNP_Sift.txt, SNP_Sift.wiki
perl readSift.pl
Polyphen2
IN: pickedSNPsNOINFO.txt
OUT: Polyphen.batch
perl prepBatchPolyphen2.pl
IN: pph2-full.txt
OUT: SNP_pph2.txt, SNP_pph2.wiki
perl readPolyphen.pl
SNAP
To obtain predictions for every (non-silent) mutation at the positions of the selected SNPs, a new mutation file had to be generated. Since we are only interested in ten positions, SNAP's all keyword was not an option. The mutations file was generated by the script snap_generate_mutations.sh.
bash snap_generate_mutations.sh
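The idea behind the mutation file can be sketched as follows. This is not the original snap_generate_mutations.sh: the function name generate_mutations, the example peptide, and the assumed file format (one mutation per line, written as wild-type residue, position, mutant residue) are illustrative assumptions.

```bash
#!/usr/bin/env bash
# Sketch only: not the original snap_generate_mutations.sh.
# Assumes one mutation per line in the form <wild type><position><mutant>.

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS="ACDEFGHIKLMNPQRSTVWY"

# generate_mutations SEQUENCE POS [POS ...]
# Prints every non-silent substitution at each 1-based position.
generate_mutations() {
    local seq=$1
    shift
    local pos wt mut i
    for pos in "$@"; do
        # Skip positions that lie outside the sequence.
        if (( pos < 1 || pos > ${#seq} )); then
            echo "position $pos is outside the sequence" >&2
            continue
        fi
        wt=${seq:pos-1:1}
        for (( i = 0; i < ${#AMINO_ACIDS}; i++ )); do
            mut=${AMINO_ACIDS:i:1}
            # Replacing a residue by itself is silent, so skip it.
            [[ $mut == "$wt" ]] && continue
            echo "${wt}${pos}${mut}"
        done
    done
}

# Example with a made-up peptide (not the P06280 sequence).
generate_mutations "MKTAYIAK" 1 4
```

Each position yields 19 lines, so ten selected positions would give 190 mutations instead of a prediction for every residue of the protein.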
We used the following command to obtain the SNAP2 predictions (see run_snap2.sh).
snap2 --tolerate -i P06280.fasta -m snap_mutations.txt -o P06280_snap2_result.txt |& tee snap2.log
Without the tolerate option, snap2 exited with an error, so we had to add it to the command.
--tolerate Tolerate failures from external programs. Failures will trigger snap2 to switch into fallback mode (predictions will have lower accuracy)
Results and Conclusion
IN: SNP_aaProps_final.wiki, SNP_secStruc.wiki, SNP_substMatr.wiki, SNP_Sift.wiki, SNP_pph2.wiki
OUT: overview.wiki, overview.txt
perl createOverviewTable.pl
References
<references/>