Difference between revisions of "Sequence-based predictions (Phenylketonuria)"

From Bioinformatikpedia

Revision as of 20:24, 12 May 2013

This page is still under construction.

Summary

Sequence-based methods can predict a variety of structural and functional properties of proteins. Here, we applied several such methods to our protein, phenylalanine hydroxylase (PAH, P00439), and in some cases also to other given proteins (listed in parentheses):

  • ReProf for secondary structure prediction (P10775, Q9X0E6, Q08209)
  • IUPred and MD (MetaDisorder) for disorder prediction (P10775, Q9X0E6, Q08209)
  • PolyPhobius and MEMSAT-SVM to predict transmembrane helices (P35462, Q9YDF8, P47863)
  • SignalP to predict signal peptides (P02768, P47863, P11279)
  • GOPET and ProtFun2.0 to predict GO terms
  • Pfam sequence search to learn more about the Pfam family of our protein

The results are presented and discussed in detail below.

Secondary structure

We wrote a script, filter_seqStruc.pl, that extracts the secondary structure from the ReProf, PsiPred and DSSP outputs.

DSSP requires PDB files as input. Empty positions in its output are converted to '-'. The PDB IDs are:

  • P10775: 2BNH
  • Q9X0E6: 1VHF
  • Q08209: ...
  • P00439: 1PAH
Secondary structure letters used by each method

Type                    ReProf  PsiPred  DSSP
Helix (alpha)           H       H        G, H, I
Extended strand (beta)  E       E        B, E
Loops/Turns             L       C        S, T
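
The letter groups in the table can be reduced to a common three-state alphabet (H, E, L) before the methods are compared. The following Python sketch illustrates one way to do this; it is not the filter_seqStruc.pl script itself. The dictionary and function names are hypothetical, and mapping any letter that is not listed in the table to '-' is an assumption that extends the rule for empty positions.

```python
# Hypothetical mappings built from the table above (not taken from filter_seqStruc.pl).
REPROF_TO_3 = {"H": "H", "E": "E", "L": "L"}
PSIPRED_TO_3 = {"H": "H", "E": "E", "C": "L"}
DSSP_TO_3 = {
    "G": "H", "H": "H", "I": "H",  # helix types
    "B": "E", "E": "E",            # bridge and extended strand
    "S": "L", "T": "L",            # bend and turn
}


def to_three_state(ss, mapping):
    """Translate a secondary structure string letter by letter.

    Empty positions (blanks) become '-'. Assumption: any other letter
    missing from the mapping is also written as '-'.
    """
    return "".join(mapping.get(letter, "-") for letter in ss)


if __name__ == "__main__":
    # Toy strings for illustration only, not real DSSP or PsiPred output.
    print(to_three_state("GHI BE ST", DSSP_TO_3))
    print(to_three_state("CCHHEE", PSIPRED_TO_3))
```

Keeping one mapping per method means that a change in how a single tool's letters are grouped does not affect the others.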




P10775 (RNH1)

Sensitivity of predicted secondary structures against DSSP structures.

Letter  FASTA  PSSM-Big  PSSM-Swissprot  PsiPred  Uniprot
E       21.0   63.0      81.0            84.0     100.0
H       72.0   95.0      92.0            83.0     100.0
L       72.0   85.0      79.0            95.0     0.0
total   63.0   87.0      87.0            86.0     96.0

Sensitivity of predicted secondary structures against Uniprot structures.

Letter  FASTA  PSSM-Big  PSSM-Swissprot  PsiPred  Uniprot
E       22.0   56.0      71.0            73.0     80.0
H       74.0   97.0      95.0            89.0     100.0
L       0.0    0.0       0.0             0.0      0.0
total   64.0   89.0      90.0            86.0     96.0
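
The page does not state how the sensitivity values were computed. As a minimal sketch, assume that the sensitivity for a letter is the percentage of reference positions carrying that letter that the prediction assigns the same letter, and that "total" is the same ratio over all assigned reference positions. The function name, the handling of '-' positions and the value 0.0 for letters absent from the reference are assumptions made for illustration.

```python
from collections import Counter


def sensitivity(predicted, reference, states="EHL"):
    """Per-letter and total sensitivity (in percent) of a prediction.

    Assumptions: both strings are aligned position by position, reference
    positions outside `states` (for example '-') are ignored, and a letter
    that never occurs in the reference is reported as 0.0.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    total = Counter()    # reference positions per letter
    correct = Counter()  # correctly predicted positions per letter
    for pred, ref in zip(predicted, reference):
        if ref not in states:
            continue
        total[ref] += 1
        if pred == ref:
            correct[ref] += 1

    result = {}
    for state in states:
        result[state] = 100.0 * correct[state] / total[state] if total[state] else 0.0
    assigned = sum(total.values())
    result["total"] = 100.0 * sum(correct.values()) / assigned if assigned else 0.0
    return result


if __name__ == "__main__":
    # Toy strings for illustration only, not data from the tables above.
    print(sensitivity("HHHLEELL", "HHHHEELL"))
```

Under this definition the total is weighted by how often each letter occurs in the reference, so it is not the plain average of the three per-letter values.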


Q9X0E6 (CUTA)

...

Q08209 (PPP3CA)

...

P00439 (PAH)

...

Disorder

...

IUPred

...

MD (MetaDisorder)

...

Transmembrane helices

...

PolyPhobius

...

P00439 (PAH)

...

P35462 (DRD3)

...

Q9YDF8 (KVAP)

...

P47863 (AQP4)

...

MEMSAT-SVM

...

Signal peptides

...

P00439 (PAH)

...

P02768 (ALB)

...

P47863 (AQP4)

...

P11279 (LAMP1)

...

GO terms

...

Pfam

...

Discussion

Questions:

  • What features are predicted?
  • Discuss the results for your protein and the example proteins. Using the predictions, what could you learn about your protein and the example proteins? Compare to the available knowledge in UniProt, PDB, DisProt, OPM, PDBTM, Pfam...
  • Look for other methods to get an idea of how many different tools are available to predict secondary structure, disorder, transmembrane helices, signal peptides and GO terms. You should be able to name several more methods in the discussion. (You can also try out more methods.)
  • What else can be (or already is) predicted from protein sequence alone?
  • Which predictions can be improved considerably by structure-based approaches?