Lab Journal - Task 3 (PAH)

From Bioinformatikpedia
Revision as of 13:40, 17 August 2013 by Waldraffs (talk | contribs) (Secondary structure)

Secondary structure

For P10775, ReProf was called three times. The first call takes a FASTA file as input, whereas the other two take PSSMs: one created with PSI-BLAST against the big80 database, the other against Swiss-Prot. PSI-BLAST was run with the same parameters as in Task 2, i.e. two iterations and an E-value cutoff of 10e-10. The call for big80 was:

blastpgp -i /mnt/home/student/waldraffs/Masterpraktikum/Task3/secondary_structure/<UniprotID>.fasta -d /mnt/project/rost_db/data/big/big_80 -j 2 -h 10e-10 -b 2000 -v 2000 -o check_out_files/<UniprotID>.out -Q swiss_matrix_<UniprotID>.pssm

For Swiss-Prot only the database is changed:

-d /mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot

ReProf itself is called as follows:

  • reprof -i <query>.fasta
  • reprof -i <query>.pssm
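The three ReProf inputs described above could be prepared with a small wrapper. The following is only a sketch under assumptions: the function names `pssm_cmd` and `reprof_plan`, the label-based file names (`big80_matrix_<ID>.pssm`, `swiss_matrix_<ID>.pssm`) and the relative location of `<ID>.fasta` are hypothetical; the blastpgp flags and database paths are the ones quoted above. The functions only print the commands, so they can be inspected before anything is run.

```shell
#!/bin/sh
# Database paths as quoted in the journal entry above.
BIG80=/mnt/project/rost_db/data/big/big_80
SWISS=/mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot

# Print the PSI-BLAST call that writes a PSSM for one query and one database.
# $1 = UniProt ID (expects <ID>.fasta in the current directory, an assumption)
# $2 = database path
# $3 = label used in the output file names (hypothetical naming scheme)
pssm_cmd() {
    id=$1; db=$2; label=$3
    printf 'blastpgp -i %s.fasta -d %s -j 2 -h 10e-10 -b 2000 -v 2000 -o check_out_files/%s_%s.out -Q %s_matrix_%s.pssm\n' \
        "$id" "$db" "$id" "$label" "$label" "$id"
}

# Print all five calls for one protein: two PSSM builds, three ReProf runs.
reprof_plan() {
    id=$1
    pssm_cmd "$id" "$BIG80" big80
    pssm_cmd "$id" "$SWISS" swiss
    printf 'reprof -i %s.fasta\n' "$id"
    printf 'reprof -i big80_matrix_%s.pssm\n' "$id"
    printf 'reprof -i swiss_matrix_%s.pssm\n' "$id"
}
```

Calling `reprof_plan P10775` prints the five commands; piping them into a shell (`reprof_plan P10775 | sh`) would execute them in order, provided the directory check_out_files/ exists.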

Scripts created for this task:

  1. filter_secStruc.pl
  2. SecStrucComparison.jar

Disorder

IUPred

Before using IUPred for the prediction, we had to compile it with the following command:

cc /opt/iupred/iupred.c -o /mnt/home/student/.../iupred

Afterwards, the program can be invoked as shown here:

iupred sequence.fasta long|short|glob > output.txt

Since IUPred writes its results only to standard output, we redirected the output into a file.
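Because each of the three modes (long, short, glob) needs its own redirect, the calls can be wrapped in a loop. This is a minimal sketch, not the script used for the task: the function name `run_iupred_modes`, the `IUPRED` override variable and the output file naming are assumptions made for illustration.

```shell
#!/bin/sh
# Run IUPred in all three modes and store each result in its own file.
# IUPRED is an assumed override for the binary location; default: "iupred" on PATH.
# $1 = FASTA file, $2 = output directory
run_iupred_modes() {
    fasta=$1; outdir=$2
    base=$(basename "$fasta" .fasta)
    mkdir -p "$outdir" || return 1
    for mode in long short glob; do
        # IUPred prints to standard output, so redirect once per mode.
        "${IUPRED:-iupred}" "$fasta" "$mode" > "$outdir/${base}_iupred_${mode}.txt" || return 1
    done
}
```

For example, `run_iupred_modes sequence.fasta iupred_out` would leave sequence_iupred_long.txt, sequence_iupred_short.txt and sequence_iupred_glob.txt in iupred_out/.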

MD (MetaDisorder)

The program can be invoked with the following command:

predictprotein --seqfile sequence.fasta --target metadisorder -p output_name -o output-directory

Transmembrane helices

PolyPhobius

Before using PolyPhobius, some preparatory steps were necessary:

  1. We generated a FASTA file containing all sequences homologous to the query sequence. For this, we used the blastget Perl script and the Swiss-Prot database as follows:
     /mnt/project/pracstrucfunc13/polyphobius/blastget -db /mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot 
     -ix /mnt/project/pracstrucfunc13/data/index_pp/uniprot_sprot.idx sequence.fasta > sequence-blast.fasta 
  2. Afterwards, we used kalign to generate the MSA as shown here:
     /mnt/opt/T-Coffee/bin/kalign -i sequence-blast.fasta -o sequence-kalign.fasta -f fasta
     
  3. Finally, we ran PolyPhobius with the following command:
     /mnt/project/pracstrucfunc13/polyphobius/jphobius -poly sequence-kalign.fasta > sequence-polyphobius.txt
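The three steps above can be chained into one function, so that a failure in an early step stops the run before PolyPhobius is called. This is a sketch under assumptions: the function name `polyphobius_pipeline` and the `*_BIN` override variables are hypothetical; the default tool paths, flags and output file names are the ones listed in the steps.

```shell
#!/bin/sh
# Paths as quoted in the steps above.
PP_DIR=/mnt/project/pracstrucfunc13/polyphobius
SPROT=/mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot
SPROT_IDX=/mnt/project/pracstrucfunc13/data/index_pp/uniprot_sprot.idx

# Chain homolog search, MSA and PolyPhobius for one query.
# $1 = query FASTA file, e.g. sequence.fasta
# BLASTGET_BIN, KALIGN_BIN and JPHOBIUS_BIN are assumed overrides for the tool paths.
polyphobius_pipeline() {
    fasta=$1
    base=${fasta%.fasta}
    # 1. Collect homologous sequences from Swiss-Prot.
    "${BLASTGET_BIN:-$PP_DIR/blastget}" -db "$SPROT" -ix "$SPROT_IDX" "$fasta" \
        > "${base}-blast.fasta" || return 1
    # 2. Build the multiple sequence alignment.
    "${KALIGN_BIN:-/mnt/opt/T-Coffee/bin/kalign}" -i "${base}-blast.fasta" \
        -o "${base}-kalign.fasta" -f fasta || return 1
    # 3. Predict transmembrane helices from the alignment.
    "${JPHOBIUS_BIN:-$PP_DIR/jphobius}" -poly "${base}-kalign.fasta" \
        > "${base}-polyphobius.txt"
}
```

A call such as `polyphobius_pipeline sequence.fasta` would produce sequence-blast.fasta, sequence-kalign.fasta and sequence-polyphobius.txt next to the query file.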
     

Signal peptides

We ran the predictions with two different parameter settings:
First, we simply ran SignalP without any constraints. The only option that has to be set is -t euk, as all four sequences are eukaryotic; the alternatives accepted by SignalP are gram+ and gram-. The output can either be written to a file with -o (e.g. output.txt) or be redirected with '>'.

  • signalp -t euk <UniprotID>.fasta > <UniprotID>_output.out

In the second run we used only the 70 N-terminal residues, as recommended in the SignalP manual page to avoid false positives.

  • signalp -trunc 70 -t euk <UniprotID>.fasta > <UniprotID>_trunc.out
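Both SignalP runs can be repeated for several proteins with a short loop. The sketch below is an illustration only: the function name `run_signalp_both` and the `SIGNALP` override variable are assumptions, and each `<UniprotID>.fasta` is expected in the current directory; the flags and output file names follow the two calls shown above.

```shell
#!/bin/sh
# Run SignalP on the full sequence and on the first 70 residues for each ID.
# SIGNALP is an assumed override for the binary; default: "signalp" on PATH.
# Arguments: one or more UniProt IDs; <ID>.fasta must be in the current directory.
run_signalp_both() {
    for id in "$@"; do
        # Run 1: whole sequence, eukaryotic networks.
        "${SIGNALP:-signalp}" -t euk "$id.fasta" > "${id}_output.out" || return 1
        # Run 2: input truncated to the 70 N-terminal residues.
        "${SIGNALP:-signalp}" -trunc 70 -t euk "$id.fasta" > "${id}_trunc.out" || return 1
    done
}
```

After `run_signalp_both P47863`, the two runs can be compared directly, for instance with `diff P47863_output.out P47863_trunc.out`.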

In our case there are only a few differences between the runs on the whole sequence and on the N-terminus only. For example, for the whole sequence the NN result of P47863 also gives a YES for C and not only for max. S. Table 15 shows the results of the N-terminal run only.