Difference between revisions of "Sequence-based predictions Protocol TSD"

From Bioinformatikpedia

Revision as of 16:41, 15 May 2012


Secondary Structure

Get the sequences
<source lang="bash">
#!/bin/bash
cd ../input
wget http://www.uniprot.org/uniprot/P10775.fasta
wget http://www.uniprot.org/uniprot/Q9X0E6.fasta
wget http://www.uniprot.org/uniprot/Q08209.fasta
wget http://www.uniprot.org/uniprot/P06865.fasta
</source>
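The four downloads differ only in the accession, so they can also be driven from a list. The following is a minimal sketch that assumes the same URL pattern as the commands above; `uniprot_url` and `fetch_sequences` are hypothetical helper names chosen for illustration.

```bash
#!/bin/bash
# Hypothetical helper: build the download URL for one UniProt accession,
# following the URL pattern used in the commands above.
uniprot_url() {
    printf 'http://www.uniprot.org/uniprot/%s.fasta\n' "$1"
}

# Hypothetical helper: download every accession given as an argument into
# the current directory, skipping files that already exist and are non-empty.
fetch_sequences() {
    local acc
    for acc in "$@"; do
        [ -s "$acc.fasta" ] || wget -q "$(uniprot_url "$acc")"
    done
}
```

It could be used as `cd ../input && fetch_sequences P10775 Q9X0E6 Q08209 P06865`.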

For PSIPred, use the web server.

For DSSP, first get the executable
<source lang="bash">
wget ftp://ftp.cmbi.ru.nl/pub/software/dssp/dssp-2.0.4-linux-amd64
chmod +x dssp-2.0.4-linux-amd64
</source>

Get the PDB files for the corresponding UniProt entries
<source lang="bash">
#!/bin/bash
cd ../input
wget http://www.pdb.org/pdb/files/2BNH.pdb
wget http://www.pdb.org/pdb/files/1KR4.pdb
wget http://www.pdb.org/pdb/files/1AUI.pdb
wget http://www.pdb.org/pdb/files/2GJX.pdb
</source>

Start the predictions
<source lang="bash">
#!/bin/bash
cd ../input/

# Predict secondary structure from each sequence with ReProf
for file in *.fasta; do
    reprof -i "$file" -o ../prediction/
done

# Assign secondary structure from each PDB structure with DSSP
for file in *.pdb; do
    ./../bin/dssp-2.0.4-linux-amd64 -i "$file" -o "../prediction/$file.dssp"
done
</source>
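After the run it can be useful to check that DSSP produced an output file for every structure. The sketch below assumes the `<name>.pdb.dssp` naming that results from the loop above; `missing_dssp` is a hypothetical helper, not part of DSSP.

```bash
#!/bin/bash
# Hypothetical helper: print every PDB file in directory $1 that has no
# non-empty <name>.dssp counterpart in directory $2.
missing_dssp() {
    local in_dir="$1" out_dir="$2" pdb name
    for pdb in "$in_dir"/*.pdb; do
        [ -e "$pdb" ] || continue          # the glob matched nothing
        name=${pdb##*/}                    # file name without the directory
        [ -s "$out_dir/$name.dssp" ] || echo "$name"
    done
}
```

Called as `missing_dssp ../input ../prediction`, it prints nothing when all outputs are present.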

Disorder

Get the required sequences
<source lang="bash">
#!/bin/bash
cd ../input
wget http://www.uniprot.org/uniprot/P10775.fasta
wget http://www.uniprot.org/uniprot/Q9X0E6.fasta
wget http://www.uniprot.org/uniprot/Q08209.fasta
wget http://www.uniprot.org/uniprot/P06865.fasta
</source>

Start the predictions
<source lang="bash">
#!/bin/bash
cd /opt/iupred/
END=.iupred
IN=/mnt/home/student/reeb/3_SeqBasedPred/2_DISO/input
OUT=/mnt/home/student/reeb/3_SeqBasedPred/2_DISO/predictions

for path in "$IN"/*.fasta; do
    file=${path##*/}
    # Name the output after the part of the file name before the first "."
    ./iupred "$path" long > "$OUT/${file%%.*}$END"
done
</source>
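To compare the proteins, the per-residue scores can be condensed into a single number. The following sketch is an assumption-laden illustration: it assumes that the IUPred output consists of `#` comment lines followed by rows of the form "position residue score", and it uses a cutoff of 0.5 by default; `disorder_fraction` is a hypothetical helper.

```bash
#!/bin/bash
# Hypothetical helper: print the fraction of residues in an IUPred output
# file ($1) whose score is above a cutoff ($2, default 0.5).
# Assumed format: '#' comment lines, then "position residue score" rows.
disorder_fraction() {
    awk -v cutoff="${2:-0.5}" '
        /^#/ || NF < 3 { next }                      # skip comments and blank lines
        { total++; if ($3 > cutoff) disordered++ }   # count residues above the cutoff
        END { if (total) printf "%.2f\n", disordered / total }
    ' "$1"
}
```

For example, `disorder_fraction ../predictions/Q08209.iupred` prints one value between 0.00 and 1.00.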

Transmembrane helices

Get the required sequence and our reference sequence
<source lang="bash">
cd ../input/
wget http://www.uniprot.org/uniprot/P35462.fasta
wget http://www.uniprot.org/uniprot/Q9YDF8.fasta
wget http://www.uniprot.org/uniprot/P47863.fasta
wget http://www.uniprot.org/uniprot/P06865.fasta
</source>

Script for running PolyPhobius and creating the input files it needs in advance
<source lang="bash">
#!/bin/bash
#$ -S /bin/sh

BLASTDB=$1      # /mnt/project/pracstrucfunc12/data/swissprot/uniprot_sprot
BLASTINDEX=$2   # /mnt/project/pracstrucfunc12/data/index_pp/uniprot_sprot.idx
WD=$3
OUT=$4
EXEC=/mnt/project/pracstrucfunc12/polyphobius/jphobius
EXECBG=/mnt/project/pracstrucfunc12/polyphobius/blastget
EXECKA=/mnt/opt/T-Coffee/bin/kalign
END=.pred
ENDBG=.bg
ENDKA=.msa
PARAMS=-poly
PARAMSKA="-f fasta"
PARAMSBG="-db $BLASTDB -ix $BLASTINDEX"

PATH=$PATH:/mnt/project/pracstrucfunc12/polyphobius/
export PATH

mkdir -p $OUT
cd $WD
pwd
rm $OUT/log &> /dev/null

for file in `ls | grep ".fasta"`; do
    echo "Processing $file" &>> $OUT/log
    IFS="."
    array=($file)
    unset IFS

    # Run blastget on the query and store its output as <name>.bg
    perl $EXECBG $PARAMSBG $file > $OUT/${array[0]}$ENDBG

    if [ `grep "^>" $OUT/${array[0]}$ENDBG | wc -l` -gt 1 ]; then
        # More than one sequence: align with kalign, then predict on the alignment
        $EXECKA $PARAMSKA -input $OUT/${array[0]}$ENDBG -output $OUT/${array[0]}$ENDKA
        perl $EXEC $PARAMS $OUT/${array[0]}$ENDKA &> $OUT/${array[0]}$END
    else
        # At most one sequence: predict directly on the blastget output
        perl $EXEC $PARAMS $OUT/${array[0]}$ENDBG &> $OUT/${array[0]}$END
    fi
done
</source>
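The script branches on the number of sequences in the blastget output: an alignment is only built when more than one FASTA header is present. That decision can be isolated as follows; this is a minimal sketch, and `alignment_mode` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: print "align" if the FASTA file $1 contains more than
# one sequence header, "single" otherwise, mirroring the branch in the script.
alignment_mode() {
    if [ "$(grep -c '^>' "$1")" -gt 1 ]; then
        echo align
    else
        echo single
    fi
}
```

`grep -c` counts the matching lines directly, which avoids the extra `wc -l` process.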

Start the predictions

<source lang="bash"> ./callPolyPhobius.sh /mnt/project/pracstrucfunc12/data/swissprot/uniprot_sprot /mnt/project/pracstrucfunc12/data/index_pp/uniprot_sprot.idx ../input/ ../prediction/sp/ </source>

Signal peptides

<source lang="bash">
#!/bin/bash
for file in /mnt/home/student/reeb/3_SeqBasedPred/4_SIGP/input/*fasta; do
    prot=${file##*/}
    protein=${prot%.*}
    signalp -t euk -graphics gif -d /mnt/home/student/reeb/3_SeqBasedPred/4_SIGP/prediction_v3/gif_$protein -trunc 70 "$file" > /mnt/home/student/reeb/3_SeqBasedPred/4_SIGP/prediction_v3/$protein.out
done
</source>
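The two parameter expansions in the loop derive the protein name from the full path. The short sketch below shows what each step yields; the file name `P06865.fasta` is only an illustrative assumption about the contents of the input directory.

```bash
#!/bin/bash
# Illustrative path; the file name is an assumption made for this example.
file=/mnt/home/student/reeb/3_SeqBasedPred/4_SIGP/input/P06865.fasta

prot=${file##*/}      # remove the longest prefix ending in "/"  -> P06865.fasta
protein=${prot%.*}    # remove the shortest suffix starting at "." -> P06865

echo "$prot"
echo "$protein"
```

Because `%` removes the shortest matching suffix, only the last extension is stripped from a name with several dots.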

GO terms

Run the predictions on the web servers of the respective methods. For GOPet, we used the most recent model, program version and database, and increased the number of reported GO terms to the maximum of 100.