Fabry:Sequence-based analyses/Journal

From Bioinformatikpedia



Quidquid latine dictum sit, altum sonatur.

- Whatever is said in Latin sounds profound.


Secondary structure

ReProf

bash run_reprof.sh
bash parse_reprof.sh *.reprof

The *.reprof files contain the raw ReProf output, whereas the *.reprof.ss files contain only the extracted predicted secondary structure of each protein.
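The extraction step could look roughly like the sketch below. This is not the journal's parse_reprof.sh. It assumes that the ReProf output is a whitespace-separated table with `#` comment lines, one header row, and a column named PHEL that holds the predicted state of each residue. The function name extract_reprof_ss is made up for illustration.

```python
def extract_reprof_ss(lines, column="PHEL"):
    """Return the predicted secondary structure string from ReProf-style rows.

    Assumed layout: '#' comment lines, one header row that contains
    `column`, then one whitespace-separated row per residue.
    """
    column_index = None
    states = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if column_index is None:
            # The first non-comment row is taken to be the header.
            column_index = fields.index(column)
            continue
        states.append(fields[column_index])
    return "".join(states)


# Tiny made-up example, not real ReProf output.
example = [
    "# comment line",
    "No\tAA\tPHEL\tRI_S",
    "1\tM\tL\t9",
    "2\tQ\tH\t7",
    "3\tL\tH\t8",
]
print(extract_reprof_ss(example))
```

Looking the column up by name rather than by position keeps the sketch working if the column order differs between versions.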

PsiPred

For the PsiPred prediction, we used the PsiPred server. The results were parsed with the scripts parse_psipred_ss2.sh (for the PSIPRED HFORMAT) and parse_psipred_txt.sh (for the PSIPRED VFORMAT). The resulting secondary structures were consistent with each other.

bash parse_psipred_ss2.sh *.ss2
bash parse_psipred_txt.sh *_psipass2.txt
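As a rough illustration of the parsing and of the consistency check, the sketch below reads a per-residue PSIPRED listing and compares two state strings. It is not one of the journal's scripts. It assumes rows of six fields (residue index, amino acid, predicted state, and three probabilities); the function names parse_psipred_vertical and agreement are hypothetical.

```python
def parse_psipred_vertical(lines):
    """Return (sequence, states) from per-residue PSIPRED-style rows."""
    sequence, states = [], []
    for line in lines:
        fields = line.split()
        # Data rows have six fields and start with the residue index;
        # header and blank lines are skipped.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        sequence.append(fields[1])
        states.append(fields[2])
    return "".join(sequence), "".join(states)


def agreement(ss_a, ss_b):
    """Fraction of positions at which two equally long state strings agree."""
    if len(ss_a) != len(ss_b):
        raise ValueError("predictions differ in length")
    if not ss_a:
        return 1.0
    return sum(a == b for a, b in zip(ss_a, ss_b)) / len(ss_a)


# Tiny made-up example, not real server output.
example = [
    "# header line",
    "",
    "   1 M C   0.999  0.000  0.001",
    "   2 Q H   0.100  0.850  0.050",
    "   3 L H   0.050  0.900  0.050",
]
sequence, states = parse_psipred_vertical(example)
print(sequence, states, agreement(states, states))
```

An agreement of 1.0 between the two parsed outputs would correspond to the consistency reported above.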

DSSP

For the DSSP analysis, the following PDB files were obtained from pdb.org:

UniProt-AC PDB-ID
P06280 1R46
P10775 2BNH
Q08209 1AUI
Q9X0E6 1KR4

After the DSSP results were fetched from the DSSP server, the secondary structure was extracted from the output:

perl parse_dssp.pl *.dssp
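For readers without access to parse_dssp.pl, the following sketch shows one way the extraction can be done. It is not the journal's Perl script. It relies on the fixed-width DSSP layout, in which the residue table starts after the line beginning with "  #  RESIDUE", and the chain ID, amino acid and secondary structure state sit at 0-based character positions 11, 13 and 16. The function name extract_dssp_ss and the use of '-' for a blank state are choices made for this illustration.

```python
def extract_dssp_ss(lines, chain=None):
    """Return (sequence, states) from DSSP output; blank states become '-'."""
    sequence, states = [], []
    in_table = False
    for line in lines:
        if line.startswith("  #  RESIDUE"):
            in_table = True  # the residue table starts after this header row
            continue
        if not in_table or len(line) < 17:
            continue
        if line[13] == "!":
            continue  # chain break marker, not a residue
        if chain is not None and line[11] != chain:
            continue  # keep only the requested chain
        sequence.append(line[13])
        states.append(line[16] if line[16] != " " else "-")
    return "".join(sequence), "".join(states)


def _row(number, chain_id, residue, state):
    """Build a minimal made-up residue row with the fixed-width layout."""
    return f"{number:5d}{number:5d} {chain_id} {residue}  {state}"


# Tiny made-up example, not real DSSP output.
example = [
    "HEADER    made-up example",
    "  #  RESIDUE AA STRUCTURE BP1 BP2  ACC",
    _row(1, "A", "M", " "),
    _row(2, "A", "K", "H"),
    _row(3, "B", "E", "E"),
]
print(extract_dssp_ss(example))
print(extract_dssp_ss(example, chain="A"))
```

The optional chain argument is useful when a PDB entry holds several chains and only one of them corresponds to the UniProt sequence of interest.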

Disorder

IUPred was run on every protein in each of its prediction types. The output is stored in files named <UniProt-AC>.<type>, e.g. P06280.long.

bash run_iupred.sh
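To summarise such an output file, a sketch like the one below can be used. It is not part of run_iupred.sh. It assumes `#` comment lines followed by rows of position, residue and disorder score, and it uses the conventional cutoff of 0.5 to call a residue disordered. The function names parse_iupred and disordered_fraction are hypothetical.

```python
def parse_iupred(lines):
    """Return a list of (position, residue, score) from IUPred-style rows."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        position, residue, score = line.split()[:3]
        rows.append((int(position), residue, float(score)))
    return rows


def disordered_fraction(rows, cutoff=0.5):
    """Fraction of residues whose score exceeds the cutoff."""
    if not rows:
        return 0.0
    return sum(score > cutoff for _, _, score in rows) / len(rows)


# Tiny made-up example, not real IUPred output.
example = [
    "# comment line",
    "    1 M     0.9000",
    "    2 Q     0.7000",
    "    3 L     0.2000",
    "    4 R     0.1000",
]
print(disordered_fraction(parse_iupred(example)))
```

Running this on the files of each type for one protein makes it easy to see how the result types differ.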

Transmembrane helices

First, we obtained the FASTA sequences with the bash script get_sequences.sh:

./get_sequences.sh

With these sequences, we ran a BLAST search using PolyPhobius' blastget. We built a multiple sequence alignment from the BLAST output with Kalign and used this MSA as input for PolyPhobius. All of these steps are carried out by the script call_polyphobius.sh:

./call_polyphobius.sh

Most of the pictures were generated automatically by the databases and programs. We also ran an additional PolyPhobius online search for each protein to obtain further data and plots.
To plot the comparison of the length distributions, we used the R script length_distr.R:

R CMD BATCH length_distr.R

Signal peptides

Again, we obtained the FASTA files for the given proteins and for our own disease-causing protein, AGAL, with a script called get_sequences_sp.sh:

./get_sequences_sp.sh

Because a local installation of SignalP 4.0 was not yet working when we did this analysis, we used the SignalP 4.0 server instead.
To gain additional information, we performed another PolyPhobius prediction for all proteins:

./call_polyphobius_sp.sh

For illustration, we depicted the signal peptide of serum albumin in PyMOL using the following commands:

orient
hide everything,all
show cartoon, all

color cyan, ss h
color yellow, ss s
color purple, ss "" 

select signPep, resi 1-18
color red, signPep

bg_color gray70
ray 
(see create_pic_pdb.txt)

Apart from that, all pictures were obtained from databases and programs.

GO terms

Since all predictions were performed online, no script was needed for the predictions themselves. To illustrate the results, we used the R script protfun_category_barplots.R:

R CMD BATCH protfun_category_barplots.R