Canavan Disease: Task 03 - Sequence-based Predictions
Secondary Structure
To decide which approach to follow, we examined the proposed run combinations for Reprof, comparing prediction from the FASTA sequence alone with prediction from a PSSM generated by PSI-BLAST. The PSSM-based Reprof predictions were further divided into those using a PSSM generated from big_80 and those using a PSSM generated from SwissProt. For further comparison, we also ran a secondary structure prediction with Psi-Pred and a secondary structure assignment with DSSP. Because DSSP assigns the secondary structure from the atom coordinates stored in the PDB, we assume that the DSSP assignment can serve as the "true secondary structure" and evaluate the performance of the prediction methods against it as reference. The results for the secondary structure prediction of Aspartoacylase (P45381|ACY2_HUMAN) are displayed in Table 1. Psi-Pred scores slightly higher overall than the best Reprof run (0.815 vs. 0.782), but because Psi-Pred predictions run via the official web server take much longer than running Reprof locally in the student lab, we decided to continue with Reprof. More specifically, we chose Reprof with a position-specific scoring matrix derived from big_80 (PSSM created with PSI-BLAST, cut-off e-10 and 3 iterations).
<figtable id="ACY_2 statistics">
Secondary Structure Prediction Statistics for ACY_2

Method | Precision (H) | Precision (E) | Precision (L) | Precision (all) | Recall (H) | Recall (E) | Recall (L) | Recall (all) | F-Measure (H) | F-Measure (E) | F-Measure (L) | F-Measure (all) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Reprof (FASTA) | 0.773 | 0.822 | 0.562 | 0.689 | 0.829 | 0.446 | 0.808 | 0.689 | 0.800 | 0.578 | 0.663 | 0.689 |
Reprof (big_80) | 0.878 | 0.889 | 0.644 | 0.782 | 0.793 | 0.675 | 0.890 | 0.782 | 0.833 | 0.767 | 0.747 | 0.782 |
Reprof (SwissProt) | 0.853 | 0.937 | 0.620 | 0.777 | 0.780 | 0.711 | 0.849 | 0.777 | 0.815 | 0.809 | 0.717 | 0.777 |
Psi-Pred | 0.914 | 0.970 | 0.647 | 0.815 | 0.780 | 0.771 | 0.904 | 0.815 | 0.842 | 0.859 | 0.754 | 0.815 |
</figtable>
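The per-class statistics in the table above can be reproduced with a few lines of code. The following is a minimal sketch and not the script used for this task. It makes several assumptions: the prediction and the DSSP assignment are available as equal-length strings, the eight DSSP states are reduced to three classes with a common mapping (H, G, I to H; E, B to E; everything else to L), and the "all" column is the fraction of correctly predicted residues. That last reading fits the identical "all" values for precision, recall and F-measure in the table. The function names `reduce_dssp` and `per_class_scores` are hypothetical.

```python
from typing import Dict

CLASSES = ("H", "E", "L")

# Assumed 8-state to 3-state reduction; the mapping actually used
# for the table is not stated in the text.
DSSP_TO_3STATE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(dssp: str) -> str:
    """Map a DSSP 8-state string to H/E/L; unknown states become L."""
    return "".join(DSSP_TO_3STATE.get(state, "L") for state in dssp)


def per_class_scores(predicted: str, reference: str) -> Dict[str, Dict[str, float]]:
    """Precision, recall and F-measure per class, plus overall accuracy."""
    if len(predicted) != len(reference):
        raise ValueError("prediction and reference must have the same length")

    pairs = list(zip(predicted, reference))
    scores: Dict[str, Dict[str, float]] = {}

    for cls in CLASSES:
        tp = sum(1 for p, r in pairs if p == cls and r == cls)
        fp = sum(1 for p, r in pairs if p == cls and r != cls)
        fn = sum(1 for p, r in pairs if p != cls and r == cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        scores[cls] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }

    # Assumption: "all" is the fraction of residues predicted correctly.
    accuracy = sum(1 for p, r in pairs if p == r) / len(pairs) if pairs else 0.0
    scores["all"] = {
        "precision": accuracy,
        "recall": accuracy,
        "f_measure": accuracy,
    }
    return scores


if __name__ == "__main__":
    # Toy strings for illustration only, not ACY2 data.
    reference = reduce_dssp("HHTEE  G")
    predicted = "HHHEELLL"
    for cls, values in per_class_scores(predicted, reference).items():
        print(cls, {name: round(value, 3) for name, value in values.items()})
```

Treating DSSP as the reference means a false positive for class H is a residue predicted as H that DSSP assigns to E or L, and correspondingly for the other classes.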