Difference between revisions of "Task 3 (MSUD)"

From Bioinformatikpedia
Revision as of 13:46, 16 May 2013

Secondary structure

Lab journal

Result

The results for ReProf and PsiPred predictions and the DSSP assignments are in the following folders:

/mnt/home/student/schillerl/MasterPractical/task3/reprof/

/mnt/home/student/schillerl/MasterPractical/task3/psipred/

/mnt/home/student/schillerl/MasterPractical/task3/dssp/


For P10775, ReProf was run with the protein sequence fasta file and with position-specific scoring matrices (PSSMs) derived from big_80 and SwissProt (see /mnt/home/student/schillerl/MasterPractical/task3/pssm/) as input. The following tables compare the prediction results to the secondary structure assignment of DSSP. Recall is TP / (TP + FN) and precision is TP / (TP + FP), where TP, FP and FN are the numbers of true positives, false positives and false negatives. The f-measure, 2 * recall * precision / (recall + precision), is the harmonic mean of recall and precision and gives a good indication of the quality of a classifier.


Comparison of ReProf prediction (fasta input) to DSSP assignment
secondary structure element   recall   precision   f-measure
H                             0.719    0.585       0.645
E                             0.211    0.500       0.296
L                             0.616    0.654       0.635


Comparison of ReProf prediction (big_80 PSSM input) to DSSP assignment
secondary structure element   recall   precision   f-measure
H                             0.944    0.889       0.916
E                             0.649    0.685       0.667
L                             0.826    0.866       0.846


Comparison of ReProf prediction (SwissProt PSSM input) to DSSP assignment
secondary structure element   recall   precision   f-measure
H                             0.923    0.914       0.919
E                             0.807    0.523       0.634
L                             0.719    0.859       0.782


Predictions using a PSSM instead of the plain sequence are of considerably better quality: every f-measure rises, most markedly for beta sheets (E), from 0.296 to 0.667 (big_80) and 0.634 (SwissProt). All three runs predict helices better than loops, and loops better than beta sheets. Compared with the SwissProt PSSM run, the big_80 PSSM run has better f-measures for E and L and is only slightly worse for H (0.916 versus 0.919).

The percentages of residues with a correctly identified secondary structure state (H, E or L) are 61 % for the fasta input, 86 % for the big_80 PSSM and 82 % for the SwissProt PSSM. For the remaining sequences, ReProf is therefore run with the best-performing input, the PSSM derived from big_80.
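
The per-state scores in the tables above and the overall percentage of correct residues can be computed along the following lines. This is only a minimal sketch, not the script that was actually used: the function names per_state_scores and accuracy are hypothetical, and it assumes that the prediction and the DSSP assignment are already available as equally long strings over the three states H, E and L (so any reduction of the DSSP classes to three states has been done beforehand). The two example strings are toy data, not taken from P10775.

```python
from collections import Counter

STATES = ("H", "E", "L")  # helix, beta sheet, loop


def per_state_scores(predicted, observed, states=STATES):
    """Return {state: (recall, precision, f_measure)} for aligned strings.

    predicted: predicted secondary structure, one letter per residue
    observed:  reference assignment (e.g. from DSSP), same length
    """
    if len(predicted) != len(observed):
        raise ValueError("prediction and reference must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for p, o in zip(predicted, observed):
        if p == o:
            tp[p] += 1      # correct residue: true positive for this state
        else:
            fp[p] += 1      # predicted state was wrongly assigned
            fn[o] += 1      # observed state was missed

    scores = {}
    for s in states:
        # Guard against empty classes to avoid division by zero.
        recall = tp[s] / (tp[s] + fn[s]) if tp[s] + fn[s] else 0.0
        precision = tp[s] / (tp[s] + fp[s]) if tp[s] + fp[s] else 0.0
        if recall + precision:
            f_measure = 2 * recall * precision / (recall + precision)
        else:
            f_measure = 0.0
        scores[s] = (recall, precision, f_measure)
    return scores


def accuracy(predicted, observed):
    """Fraction of residues whose state (H, E or L) is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty strings of the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Toy example (made-up strings, for illustration only).
observed = "HHHHEELLLL"
predicted = "HHHLEELLLH"

for state, (rec, prec, f) in per_state_scores(predicted, observed).items():
    print(f"{state} {rec:.3f} {prec:.3f} {f:.3f}")
print(f"correct: {100 * accuracy(predicted, observed):.0f} %")
```

Counting a mismatch as a false positive for the predicted state and a false negative for the observed state keeps the three per-state evaluations consistent with each other, so one pass over the sequence is enough.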

Discussion

Disordered protein

Lab journal

Result

Discussion

Transmembrane helices

Lab journal

Result

Discussion

Signal peptides

Lab journal

Result

Discussion

GO terms

Lab journal

Result

Discussion