
Task 3 Working Log

Secondary structure prediction

Creation of PSSM files via PSI-BLAST:

blastpgp -i /mnt/home/student/.../data/P45381.fasta -o /mnt/home/student/.../aspa_big80.out \
-d /mnt/project/pracstrucfunc13/data/big/big_80 -C /mnt/home/student/.../aspa_big80.chk -Q /mnt/home/student/.../aspa_big80.pssm \
-h 10e-10 -j 3

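   where:
-i input file (query sequence in FASTA format)
-o output file (PSI-BLAST report)
-d database to search against
-C checkpoint file
-Q PSSM file (ASCII)
-h E-value cutoff for including a hit in the profile
-j number of iterations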

PSSMs were generated with PSI-BLAST using 3 iterations and an E-value cutoff of 10e-10.
Input combinations tested:
- PSSM from big_80
- PSSM from SwissProt
- without PSSM
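The two PSSM runs differ only in the database and the output prefix, so the call can be generated rather than retyped. A minimal sketch, assuming the same blastpgp options as in the command above; the helper name blastpgp_command, the output prefix aspa_swissprot and the use of the SwissProt path from the TMH section as PSI-BLAST database are assumptions made for illustration, not part of the original log.

```python
import shlex
from pathlib import Path


def blastpgp_command(query, database, out_prefix, evalue="10e-10", iterations=3):
    """Build the blastpgp argument list for one query/database pair.

    Hypothetical helper: it only mirrors the options listed in the log
    (-i, -o, -d, -C, -Q, -h, -j) and does not run anything itself.
    """
    prefix = Path(out_prefix)
    return [
        "blastpgp",
        "-i", str(query),
        "-o", f"{prefix}.out",    # PSI-BLAST report
        "-d", str(database),
        "-C", f"{prefix}.chk",    # checkpoint file
        "-Q", f"{prefix}.pssm",   # ASCII PSSM
        "-h", evalue,             # E-value cutoff for profile inclusion
        "-j", str(iterations),    # number of iterations
    ]


# Assumed database locations; the SwissProt path is taken from the TMH section.
databases = {
    "aspa_big80": "/mnt/project/pracstrucfunc13/data/big/big_80",
    "aspa_swissprot": "/mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot",
}

for out_prefix, database in databases.items():
    # Print the command; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(blastpgp_command("P45381.fasta", database, out_prefix)))
```

Printing the commands first makes it easy to check paths before starting a run that may take a while on big_80.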
DSSP output and PSIPRED output were generated via the respective web servers.
DSSP shows the following problem for ASPA: 2I3C/P45381 is present as a homodimer in the crystal structure. As a result, the PDB file used to generate the DSSP output contains the sequence twice (once per chain). In addition, DSSP starts assigning secondary structure at position 10 of the amino acid sequence and stops at position 301. The DSSP output therefore has to be aligned to the output of the other tools before any comparison.
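The alignment step can be sketched as follows. This is not the code actually used; it assumes that only one chain of the dimer is kept, that the resolved residues are contiguous, and that DSSP's eight states are reduced to three with the common mapping H/G/I to H, E/B to E and everything else to C. The function names and the '*' filler for unassigned positions are chosen for illustration.

```python
# Common 8-to-3 state reduction (an assumption; other mappings exist).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_to_three_states(dssp_states):
    """Map a DSSP state string onto H (helix), E (strand) and C (other)."""
    return "".join(DSSP_TO_3.get(state, "C") for state in dssp_states)


def place_on_sequence(chain_states, first_pos, seq_len, gap="*"):
    """Pad one chain's states so that index i corresponds to sequence position i + 1.

    first_pos is the 1-based sequence position of the first assigned residue
    (10 for ASPA according to the log). Positions without assignment get `gap`.
    """
    aligned = [gap] * seq_len
    for offset, state in enumerate(chain_states):
        pos = first_pos + offset
        if 1 <= pos <= seq_len:
            aligned[pos - 1] = state
    return "".join(aligned)


# Toy example: three assigned residues starting at position 3 of a 6-residue sequence.
print(place_on_sequence(reduce_to_three_states("HGE"), first_pos=3, seq_len=6))
```

After this step the DSSP string has the same length as the predictor output, so both can be compared position by position.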
Statistics were generated via sec_struc_pred_statistics.py.
The statistics for aspartoacylase:
- DSSP was taken as the "truth", since DSSP assigns secondary structure from the atomic coordinates rather than predicting it
- the precision of each prediction method was calculated
- PSIPRED shows the best precision overall
- among the ReProf runs, however, ReProf with the sequence profile from Big_80 shows the best result
- ReProf with the Big_80 PSSM was used for further predictions; for comparison, PSIPRED runs were made as well
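For reference, the precision per state can be sketched as below. This is not the content of sec_struc_pred_statistics.py; it assumes that prediction and DSSP reference are three-state strings of equal length, that precision for a state is the fraction of positions predicted as that state which DSSP also assigns to it, and that positions without a DSSP assignment (marked '*') are skipped. The function name is hypothetical.

```python
def per_state_precision(predicted, reference, states="HEC", gap="*"):
    """Return {state: precision} for a prediction against a DSSP reference.

    A state that is never predicted at an assigned position gets None,
    because its precision is undefined.
    """
    if len(predicted) != len(reference):
        raise ValueError("prediction and reference must have the same length")

    result = {}
    for state in states:
        # Reference states at all positions where this state was predicted.
        refs = [r for p, r in zip(predicted, reference) if p == state and r != gap]
        result[state] = sum(r == state for r in refs) / len(refs) if refs else None
    return result


# Toy example: the last position has no DSSP assignment and is ignored.
print(per_state_precision("HHHEEC", "HHCEE*"))
```

ReProf writes loops as L while PSIPRED writes C, so one of the two has to be renamed before the strings are compared.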
P10775 -> 1DFJ (attention: chain I only): the DSSP assignment was taken from the PDB file of 1DFJ; only the assignment for chain I is valid for the comparison.
Q08209 -> 1AUI (attention: chain A only): the DSSP assignment was taken from the PDB file of 1AUI; only the assignment for chain A is valid for the comparison. In addition, parts of the sequence are not resolved in the crystal structure, so a number of positions had to be inserted manually and are marked with '*'.
Q9X0E6 -> 1O5J: the DSSP assignment from the PDB file of 1O5J spans the complete protein, so it should be usable as is (except for the first 3 residues).

Disorder

IUPred predictions were created via run_iupred.sh.
The analysis was created via disorder_statistics.py, which is applicable to both IUPred and MetaDisorder.
To find the right match for P10775 in DisProt, a sequence search had to be run (Smith-Waterman and PSI-BLAST on the DisProt website).
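The kind of analysis done on the IUPred output can be sketched as follows. This is not the content of disorder_statistics.py; it assumes the plain IUPred output layout (comment lines starting with '#', then position, residue and score per line) and the conventional cutoff of 0.5, above which a residue counts as disordered. Function names and the toy scores are made up for illustration.

```python
def parse_iupred(text):
    """Return the per-residue disorder scores from IUPred text output."""
    scores = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split()
        scores.append(float(fields[2]))  # columns: position, residue, score
    return scores


def disordered_regions(scores, threshold=0.5, min_len=1):
    """Return 1-based (start, end) pairs of consecutive residues above the threshold."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos
        elif start is not None:
            if pos - start >= min_len:
                regions.append((start, pos - 1))
            start = None
    if start is not None and len(scores) - start + 1 >= min_len:
        regions.append((start, len(scores)))
    return regions


example = """# toy IUPred output, scores invented for illustration
    1 M     0.7000
    2 A     0.6000
    3 K     0.2000
    4 L     0.9000
"""
print(disordered_regions(parse_iupred(example)))
```

Raising min_len filters out very short stretches, which is useful when the regions are compared with DisProt annotations.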

TMH prediction

PolyPhobius

blastget index file creation

/mnt/project/pracstrucfunc13/polyphobius/blastget -ix swiss_p.idx -create /mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot.fasta

blastget for the individual files

/mnt/project/pracstrucfunc13/polyphobius/blastget -db /mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot -ix swiss_p.idx ../data/query_seqs/TMH/P45381.fasta >> P45381.blastget.out

kalign for the individual files

/mnt/opt/T-Coffee/bin/kalign -i P45381.blastget.out -o P45381.kalign.out

PolyPhobius

/mnt/project/pracstrucfunc13/polyphobius/jphobius -poly P45381.kalign.out >> P45381.polyph.out
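Since the same three steps (blastget, kalign, jphobius) are repeated for every TMH query, they can be chained per accession. A minimal sketch, assuming the paths and options from the commands above and an index file that has already been created; the function names are hypothetical, and the output files are overwritten here instead of appended to as with ">>".

```python
import subprocess

POLY_DIR = "/mnt/project/pracstrucfunc13/polyphobius"
SPROT = "/mnt/project/pracstrucfunc13/data/swissprot/uniprot_sprot"
KALIGN = "/mnt/opt/T-Coffee/bin/kalign"


def polyphobius_steps(accession, fasta, index="swiss_p.idx"):
    """Return (argv, stdout_file) pairs for one query; stdout_file is None if unused."""
    hits = f"{accession}.blastget.out"
    alignment = f"{accession}.kalign.out"
    prediction = f"{accession}.polyph.out"
    return [
        # 1. collect homologous sequences from SwissProt
        ([f"{POLY_DIR}/blastget", "-db", SPROT, "-ix", index, fasta], hits),
        # 2. align them; kalign writes its own output file via -o
        ([KALIGN, "-i", hits, "-o", alignment], None),
        # 3. run PolyPhobius on the alignment
        ([f"{POLY_DIR}/jphobius", "-poly", alignment], prediction),
    ]


def run_steps(steps):
    """Execute the steps in order, redirecting stdout where a file is given."""
    for argv, stdout_file in steps:
        if stdout_file is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_file, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


# Inspect the commands for one query; call run_steps(steps) to execute them.
steps = polyphobius_steps("P45381", "../data/query_seqs/TMH/P45381.fasta")
for argv, stdout_file in steps:
    print(" ".join(argv), f"> {stdout_file}" if stdout_file else "")
```

With check=True the chain stops at the first failing tool, so an empty blastget result is not silently passed on to kalign.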

MEMSAT
- P45381: job ID cc3e3788-c45a-11e2-add6-00163e110593
- P35462: job ID 4cf4b7b0-c3c7-11e2-840f-00163e110593
- P47863: job ID 6799d884-c3c7-11e2-8b61-00163e110593
- Q9YDF8: job ID 81e7ddda-c3c7-11e2-8b61-00163e110593
- result page for Q9YDF8: http://bioinf.cs.ucl.ac.uk/psipred/result/81e7ddda-c3c7-11e2-8b61-00163e110593
- It was difficult to find the right protein to compare Q9YDF8 against (in OPM/PDBTM); how this was achieved is explained in the wiki

SignalP

The SignalP output files were created with SignalP version 4.1 (web server).

GO terms:

GOPET: see XML file
ProtFun: used via the web server
Pfam: see http://pfam.sanger.ac.uk/protein/P45381

Canavan Disease: Task 03 - Journal
