Task 3 - Sequence-based predictions

From Bioinformatikpedia

Revision as of 02:32, 8 May 2012

In contrast to the vast number of known protein sequences, information about structure and function is available for only very few proteins. Sequence-based predictions of protein features aim to narrow this gap. Many sequence-based prediction methods use evolutionary information, so sequence alignments are often a prerequisite for the predictions.

Theoretical background talks

The talks will give an introduction to sequence-based protein predictions. In particular:

  • secondary structure
  • disorder
  • transmembrane helices
  • GO terms

Where to run the jobs

  • You can log in to the student computer pool: i12k-biolab??.informatik.tu-muenchen.de, where ?? goes from 01 to 10.
  • Work in the student computer pool.
  • You can also install the programs on your own computer.

Secondary structure

Use ReProf (available as a Debian package on rostlab.org) to predict secondary structure for your protein. Also apply ReProf to these proteins (UniProt IDs given):

  • P10775
  • Q9X0E6
  • Q08209

Use FASTA sequences for the prediction. You can find out about ReProf usage by running reprof or reading the man page (man reprof). Peter Hoenigschmid (hoenigschmid@rostlab.org) would like to hear about anything that would improve the description or that seems unclear. For help, you can always ask us first.
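
Before submitting a sequence it can help to check that the FASTA file contains what you expect. The following is a minimal Python sketch, not part of ReProf; the function name read_fasta and the toy record are made up for illustration.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical toy record, not one of the example proteins.
example = """>toy_protein hypothetical example record
MKTAYIAKQR
QISFVKSHFS
"""

records = read_fasta(example)
for header, sequence in records:
    print(header, len(sequence))
```

A file meant for a single-protein prediction should yield exactly one record.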

Compare the ReProf results to PsiPred and to DSSP (DSSP server: http://mrs.cmbi.ru.nl/hsspsoap/). Before you use DSSP, find out more about the example proteins (and yours) using UniProt and the PDB.
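
One possible way to make this comparison quantitative is a per-residue agreement score over three states (helix H, strand E, other L). The Python sketch below assumes that you have already extracted the state strings yourself and that they cover the same residues. The reduction of the eight DSSP states to three is a common convention, not the only one, and the function names are made up for illustration.

```python
# One common reduction of the eight DSSP states to three classes:
# H, G, I -> helix (H); E, B -> strand (E); everything else -> other (L).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(states):
    """Map a string of DSSP states (or PsiPred H/E/C) to H/E/L."""
    return "".join(DSSP_TO_3.get(s, "L") for s in states)


def agreement(pred, ref):
    """Fraction of positions at which two equal-length state strings agree."""
    if len(pred) != len(ref):
        raise ValueError("state strings must have the same length")
    if not pred:
        return 0.0
    matches = sum(p == r for p, r in zip(pred, ref))
    return matches / len(pred)


# Hypothetical toy strings, not real predictions.
predicted = "HHHHLLEEEL"
dssp_states = "HHGGTSEEBE"

reference = reduce_dssp(dssp_states)
print(reference, agreement(predicted, reference))
```

Note that DSSP only assigns states to residues resolved in the PDB structure, so the DSSP string usually has to be aligned to the full UniProt sequence before the strings can be compared position by position.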

Disorder

Use IUPred to predict disorder for your protein. Apply IUPred to the example proteins given above, too (run iupred). You can find a README here: /opt/iupred/README.

Compare the results to the information in the DisProt database.
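
For the comparison it is convenient to turn the per-residue scores into disordered regions. The sketch below is an illustration only: it assumes that the output consists of comment lines starting with "#" followed by lines of the form "position residue score", and it uses the commonly applied cutoff of 0.5. Check both assumptions against /opt/iupred/README. The function names and the toy numbers are made up.

```python
def parse_scores(text):
    """Read per-residue scores from 'position residue score' lines."""
    scores = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines and anything that is too short.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        scores.append(float(fields[2]))
    return scores


def disordered_segments(scores, threshold=0.5):
    """Return 1-based (start, end) ranges where the score exceeds the threshold."""
    segments = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position  # a new disordered stretch begins
        elif start is not None:
            segments.append((start, position - 1))
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # stretch runs to the C-terminus
    return segments


# Hypothetical toy output, not a real prediction.
toy_output = """# toy output, not a real prediction
    1 M     0.82
    2 K     0.61
    3 T     0.40
    4 A     0.12
    5 Y     0.55
    6 I     0.90
"""

scores = parse_scores(toy_output)
segments = disordered_segments(scores)
print(segments)
```

The resulting ranges can then be compared with the disordered regions annotated in DisProt.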

Transmembrane helices

Use PolyPhobius to predict transmembrane helices for your protein and for the following proteins (UniProt IDs given):

  • P35462
  • Q9YDF8

PolyPhobius is installed in /mnt/project/pracstrucfunc12/polyphobius/.

In contrast to its precursor Phobius, PolyPhobius uses homology information for the prediction. First, you have to run a BLAST search. PolyPhobius is distributed with its own Perl script for this purpose: blastget (/mnt/project/pracstrucfunc12/polyphobius/blastget). Usage: blastget -h. Use only the -db and -ix parameters. The input is the FASTA sequence of each protein listed above. Use SwissProt (/mnt/project/pracstrucfunc12/data/swissprot/uniprot_sprot) as the database and /mnt/project/pracstrucfunc12/data/index_pp/uniprot_sprot.idx as the index.

Use the blastget output to create a multiple sequence alignment (MSA) using Kalign (/mnt/opt/T-Coffee/bin/kalign).

Use the MSA as input for PolyPhobius (/mnt/project/pracstrucfunc12/polyphobius/jphobius). Usage: jphobius -h. Do not forget the -poly parameter.
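
The three steps can be laid out as a dry run before you execute anything. The Python sketch below only assembles and prints the command lines. The paths and the -db, -ix and -poly parameters come from this page. Everything else is an assumption to check against blastget -h, the Kalign help and jphobius -h: that the input file is passed as the last argument, that Kalign takes -i and -o for input and output, and the file names themselves, which are hypothetical.

```python
POLYPHOBIUS_DIR = "/mnt/project/pracstrucfunc12/polyphobius"
SWISSPROT_DB = "/mnt/project/pracstrucfunc12/data/swissprot/uniprot_sprot"
SWISSPROT_IX = "/mnt/project/pracstrucfunc12/data/index_pp/uniprot_sprot.idx"
KALIGN = "/mnt/opt/T-Coffee/bin/kalign"


def polyphobius_commands(query_fasta, hits_fasta, msa_file):
    """Return the three command lines as argument lists (nothing is executed)."""
    # Step 1: BLAST search with the script shipped with PolyPhobius.
    # Assumption: the query file is the last argument. How the hits end up
    # in hits_fasta (redirect or option) must be checked with blastget -h.
    blastget = [POLYPHOBIUS_DIR + "/blastget",
                "-db", SWISSPROT_DB, "-ix", SWISSPROT_IX, query_fasta]
    # Step 2: align the hits. Assumption: -i/-o name the input and output file.
    kalign = [KALIGN, "-i", hits_fasta, "-o", msa_file]
    # Step 3: prediction on the MSA; -poly switches to PolyPhobius mode.
    # Assumption: the MSA file is the last argument.
    jphobius = [POLYPHOBIUS_DIR + "/jphobius", "-poly", msa_file]
    return [blastget, kalign, jphobius]


# Hypothetical file names for the first example protein.
commands = polyphobius_commands("P35462.fasta", "P35462.hits.fasta", "P35462.msa")
for command in commands:
    print(" ".join(command))
```

Printing the commands first makes it easy to compare them with the help output of each tool before running them by hand.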

Some questions and what to put into the Wiki

  • What features are predicted?
  • Discuss the results for your protein and the example proteins.
  • Briefly characterize the example proteins.
  • Look for other methods to get an idea of how many different ones are available. (You can also try out more methods.)
  • What else could be predicted from protein sequence alone?