Fabry:Sequence-based analyses/Journal
Edited by Staniewski
Latest revision as of 15:02, 11 June 2012
Quidquid latine dictum sit, altum sonatur.
- - Whatever is said in Latin sounds profound.
Secondary structure
ReProf
bash run_reprof.sh
bash parse_reprof.sh *.reprof
The *.reprof files contain the raw output from ReProf, whereas the *.reprof.ss files contain only the extracted predicted secondary structure of the protein.
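The extraction step can be illustrated with a minimal Python sketch. It is not the authors' parse_reprof.sh; it assumes that the raw ReProf output is a whitespace-separated table with `#` comment lines, one header row, and a column named `PHEL` holding the predicted state per residue. The function name `extract_reprof_ss` is hypothetical.

```python
def extract_reprof_ss(text, column="PHEL"):
    """Return the predicted secondary structure as one string.

    Assumed layout (not taken from the original scripts): comment lines
    start with '#', the first non-comment line is the header row, and
    every following line holds one residue.
    """
    column_index = None
    states = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if column_index is None:
            # First data-like line is the header; locate the wanted column.
            column_index = fields.index(column)
            continue
        states.append(fields[column_index])
    return "".join(states)


if __name__ == "__main__":
    example = "# comment\nNo\tAA\tPHEL\tRI_S\n1\tM\tL\t9\n2\tK\tH\t7\n"
    print(extract_reprof_ss(example))
```

Looking the column up by name rather than by position keeps the sketch usable if the column order differs between ReProf versions.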
PsiPred
For the PsiPred prediction, a PsiPred server was used. The results were parsed with the scripts parse_psipred_ss2.sh for the PSIPRED HFORMAT and parse_psipred_txt.sh for the PSIPRED VFORMAT. The resulting secondary structures were consistent with each other.
bash parse_psipred_ss2.sh *.ss2
bash parse_psipred_txt.sh *_psipass2.txt
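A consistency check between two parsed predictions can be sketched as follows. This is an illustration in Python, not part of the original scripts; it assumes both predictions are already available as equal-length strings of state letters (for example H, E, C), and the function name `ss_agreement` is hypothetical.

```python
def ss_agreement(ss_a, ss_b):
    """Fraction of positions at which two secondary structure strings agree."""
    if len(ss_a) != len(ss_b):
        raise ValueError("predictions must cover the same number of residues")
    if not ss_a:
        raise ValueError("predictions must not be empty")
    matches = sum(1 for a, b in zip(ss_a, ss_b) if a == b)
    return matches / len(ss_a)


if __name__ == "__main__":
    # Two made-up five-residue predictions that differ at one position.
    print(ss_agreement("HHHCC", "HHCCC"))
```

A value of 1.0 means that the two parsed outputs are identical at every residue.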
DSSP
For the DSSP analysis, the following PDB files were obtained from pdb.org:
UniProt-AC | PDB-ID |
---|---|
P06280 | 1R46 |
P10775 | 2BNH |
Q08209 | 1AUI |
Q9X0E6 | 1KR4 |
After the DSSP results were fetched from the DSSP server, the secondary structure was extracted from the output:
perl parse_dssp.pl *.dssp
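What such an extraction involves can be sketched in Python. This is not the authors' parse_dssp.pl; it assumes the classic fixed-width DSSP layout, in which the residue table starts after a line beginning with `  #  RESIDUE`, the amino acid sits in column 14 and the secondary structure code in column 17 (1-based). The function name `extract_dssp_ss` is hypothetical.

```python
def extract_dssp_ss(text):
    """Return (sequence, secondary structure) from classic DSSP output.

    Assumptions: fixed-width columns as described above, '!' marks a
    chain break, and a blank structure code is written as '-'.
    """
    in_residue_table = False
    residues = []
    states = []
    for line in text.splitlines():
        if line.startswith("  #  RESIDUE"):
            in_residue_table = True  # the per-residue table starts here
            continue
        if not in_residue_table or len(line) < 17:
            continue
        if line[13] == "!":
            continue  # chain break, no residue on this line
        residues.append(line[13])
        states.append(line[16] if line[16] != " " else "-")
    return "".join(residues), "".join(states)
```

Returning the sequence alongside the structure string makes it easy to check that the PDB chain matches the UniProt sequence used for the predictions.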
Disorder
IUPred was run on every protein once per prediction type, and each output is stored in a file named <UniProt-AC>.<type>, e.g. P06280.long.
bash run_iupred.sh
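The naming scheme can be illustrated with a small Python sketch that only builds the list of runs without executing anything. It is not the authors' run_iupred.sh. The binary name `iupred`, the argument order, and the input file name `<UniProt-AC>.fasta` are assumptions; `long`, `short` and `glob` are the prediction types IUPred offers.

```python
IUPRED_TYPES = ("long", "short", "glob")


def iupred_jobs(accessions, types=IUPRED_TYPES, binary="iupred"):
    """Build (command, output file) pairs, one per protein and type.

    The command layout is an assumption for illustration; nothing is run.
    """
    jobs = []
    for accession in accessions:
        for prediction_type in types:
            command = [binary, f"{accession}.fasta", prediction_type]
            output_file = f"{accession}.{prediction_type}"
            jobs.append((command, output_file))
    return jobs


if __name__ == "__main__":
    for command, output_file in iupred_jobs(["P06280"]):
        print(" ".join(command), ">", output_file)
```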
Transmembrane helices
First, we obtained the FASTA sequences with the bash script get_sequences.sh:
./get_sequences.sh
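To check that the downloaded files contain what is expected, the FASTA text can be parsed with a few lines of Python. This sketch is independent of get_sequences.sh and makes no assumption about where the sequences come from; the function name `parse_fasta` is hypothetical.

```python
def parse_fasta(text):
    """Map each FASTA header (without '>') to its full sequence."""
    chunks = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            chunks[header] = []
        elif header is not None:
            chunks[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in chunks.items()}


if __name__ == "__main__":
    # Made-up record, not a real UniProt entry.
    example = ">example protein\nMKVL\nAAGT\n"
    for name, sequence in parse_fasta(example).items():
        print(name, len(sequence))
```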
With these sequences we ran a BLAST search using PolyPhobius' blastget. The hits were aligned with Kalign, and the resulting multiple sequence alignment (MSA) served as input for PolyPhobius. All of this is done by the script call_polyphobius.sh:
./call_polyphobius.sh
Most of the pictures were generated automatically by the databases and programs. We also ran an extra PolyPhobius online search for each protein to obtain additional data and plots.
For plotting the comparison of the length distribution, we used the R script length_distr.R
R CMD BATCH length_distr.R
Signal peptides
Again, we obtained the FASTA files for the given proteins and our own disease-causing AGAL with a script called get_sequences_sp.sh:
./get_sequences_sp.sh
Since version 4.0 of SignalP was not yet working locally at the time of this analysis, we used the SignalP 4.0 web server instead.
To gain additional information, we performed another PolyPhobius prediction with all proteins.
./call_polyphobius_sp.sh
For illustration, we depicted Serum Albumin's signal peptide in PyMOL using the following commands:
orient
hide everything, all
show cartoon, all
color cyan, ss h
color yellow, ss s
color purple, ss ""
select signPep, resi 1-18
color red, signPep
bg_color gray70
ray
(see create_pic_pdb.txt)
Apart from that, all pictures were obtained from databases and programs.
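If the same picture is needed for other proteins, the command list can be generated for any residue range. The following Python sketch only produces the text of a PyMOL script; it does not call PyMOL, and the function name `signal_peptide_pml` and its parameters are hypothetical.

```python
def signal_peptide_pml(first=1, last=18, selection="signPep"):
    """Return PyMOL commands that highlight residues first..last in red."""
    commands = [
        "orient",
        "hide everything, all",
        "show cartoon, all",
        "color cyan, ss h",      # helices
        "color yellow, ss s",    # strands
        'color purple, ss ""',   # everything without assigned structure
        f"select {selection}, resi {first}-{last}",
        f"color red, {selection}",
        "bg_color gray70",
        "ray",
    ]
    return "\n".join(commands)


if __name__ == "__main__":
    print(signal_peptide_pml())
```

With the default arguments the output reproduces the commands listed above, so it can be saved to a file and loaded into PyMOL as a script.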
GO terms
Since all predictions were performed online, no script was needed to run them. We used the R script protfun_category_barplots.R to illustrate the results:
R CMD BATCH protfun_category_barplots.R