Difference between revisions of "Task 6 - EVfold"
From Bioinformatikpedia
(Blanked the page) |
|||
Line 1: | Line 1: | ||
− | For the proteins used in this practical, structures have been determined. In real-life projects, however, you often have only sequences and no protein structures. Yet structure often provides crucial information and furthers understanding of the protein's function. |
||
− | During this practical, you already predicted secondary structure elements from protein sequence and generated homology models from protein structures with sequence similarity to your protein. This week, we will use evolutionary couplings or correlated mutations to predict structures from protein sequence alignments. |
||
− | |||
− | == Theoretical background talk == |
||
− | The talk will give an introduction to structure prediction from correlated mutations. In particular, EVcouplings and EVfold are introduced. |
||
− | |||
− | |||
− | |||
− | |||
− | correlated mutations |
||
− | |||
− | local method: mutual information |
||
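As an illustration of the local method, mutual information between two alignment columns can be sketched as follows. This is a minimal sketch only: the function name `mutual_information` and the toy alignment are assumptions made for illustration, and no sequence weighting or gap handling is applied.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j hold one character per sequence, in the same order.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must cover the same sequences")
    n = len(col_i)
    count_i = Counter(col_i)                 # single-site counts, column i
    count_j = Counter(col_j)                 # single-site counts, column j
    count_ij = Counter(zip(col_i, col_j))    # pair counts
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (hypothetical data): columns 0 and 1 co-vary perfectly,
# column 2 is fully conserved.
alignment = ["AKG", "AKG", "DEG", "DEG"]
columns = list(zip(*alignment))
mi_01 = mutual_information(columns[0], columns[1])
mi_02 = mutual_information(columns[0], columns[2])
```

In the toy alignment the co-varying pair of columns gets a high value, while the pair that includes the conserved column gets zero, since a column without variation carries no coupling signal.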
− | |||
− | |||
− | your protein |
||
− | + example P01112 (RASH_HUMAN) |
||
− | http://pfam.sanger.ac.uk/family/Ras |
||
− | |||
− | |||
− | calculating the evolutionary couplings |
||
− | |||
− | 0. alignment (clustalw.... the Pfam alignment is a good choice). Many sequences are needed. |
||
− | |||
− | 1. a2m2lm.... => alignment |
||
− | 2. freecontact -> standard (installed on student computers) |
||
− | |||
− | Output -> all couplings + evolutionary coupling score (last column) |
||
− | |||
− | rank by score => look at distribution, values, range |
||
− | |||
− | Meaning of score unclear |
||
− | |||
− | Take only scores for pairs from i+6 onwards, i.e. neighboring residues are neglected, with a minimum of 5 residues between coupled residues |
||
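The ranking and the sequence-separation filter could be sketched like this. The column layout is an assumption: the sketch takes whitespace-separated lines with residue position i in the first column, position j in the third column, and the coupling score in the last column. Check this against your actual output file and adjust the hypothetical parameters `i_col`, `j_col` and `score_col` if needed.

```python
def parse_couplings(lines, i_col=0, j_col=2, score_col=-1):
    """Parse coupling lines into (i, j, score) tuples.

    Column indices are assumptions about the file layout, not a
    documented format; the score is taken from the last column.
    """
    couplings = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        couplings.append(
            (int(fields[i_col]), int(fields[j_col]), float(fields[score_col]))
        )
    return couplings


def rank_long_range(couplings, min_separation=6):
    """Drop pairs closer than min_separation in sequence, sort by score."""
    kept = [c for c in couplings if abs(c[0] - c[1]) >= min_separation]
    return sorted(kept, key=lambda c: c[2], reverse=True)


# Hypothetical example lines, not real output.
example_lines = [
    "1 M 2 T 0.10 0.90",
    "1 M 7 E 0.05 0.40",
    "3 Y 12 V 0.20 0.75",
    "5 K 10 G 0.01 -0.10",
]
ranked = rank_long_range(parse_couplings(example_lines))
```

With `min_separation=6`, the pair (5, 10) is dropped because only 4 residues lie between the two positions, while (1, 7) is kept.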
− | |||
− | Take the ranking and check for each coupled pair the actual distance in the structure. TP: distance <= 5 Å (minimal distance over all pairs of atoms of both residues) |
||
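The true-positive check can be sketched as below. The sketch assumes that atom coordinates have already been read from the structure into a dictionary `residue_atoms` (a hypothetical name) mapping residue number to a list of (x, y, z) tuples, and that the cutoff of 5 is in ångström, the unit of PDB coordinates. Reading the PDB file and mapping alignment positions to structure numbering are left out.

```python
import math


def min_atom_distance(atoms_a, atoms_b):
    """Smallest distance between any atom of one residue and any atom of the other."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def precision(ranked_pairs, residue_atoms, cutoff=5.0):
    """Fraction of predicted pairs whose minimal atom distance is <= cutoff.

    Pairs with a residue missing from the structure are skipped.
    """
    hits = 0
    evaluated = 0
    for i, j in ranked_pairs:
        if i not in residue_atoms or j not in residue_atoms:
            continue  # residue not resolved in the structure
        evaluated += 1
        if min_atom_distance(residue_atoms[i], residue_atoms[j]) <= cutoff:
            hits += 1
    return hits / evaluated if evaluated else 0.0


# Hypothetical coordinates for three residues.
residue_atoms = {
    1: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    8: [(4.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    20: [(30.0, 0.0, 0.0)],
}
pairs = [(1, 8), (1, 20), (8, 20)]
d_1_8 = min_atom_distance(residue_atoms[1], residue_atoms[8])
prec = precision(pairs, residue_atoms)
```

Evaluating the precision for growing numbers of top-ranked pairs shows how quickly the quality of the predictions drops along the ranking.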
− | |||
− | |||
− | EVcoupling |
||
− | Check evolutionary hot spots, i.e. relevant residues, functionally important sites. |
||
− | Take L couplings (L=length of protein sequence), sum scores for each residue. Analyze. |
||
− | (Cell paper) |
||
− | |||
− | - compare to conservation, single site conservation |
||
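The per-residue sum over the top L couplings can be sketched as follows. The function name `residue_coupling_strength` and the toy values are assumptions for illustration; the input is expected to be a list of (i, j, score) tuples already sorted by decreasing score.

```python
from collections import defaultdict


def residue_coupling_strength(ranked_couplings, seq_length):
    """Sum the scores of the top seq_length couplings for each residue.

    Returns (residue, summed score) pairs, strongest first.
    """
    top = ranked_couplings[:seq_length]
    strength = defaultdict(float)
    for i, j, score in top:
        strength[i] += score  # each coupling counts for both partners
        strength[j] += score
    return sorted(strength.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ranked couplings; seq_length is set to 3 only to keep the toy small.
ranked = [(3, 12, 0.75), (3, 20, 0.60), (1, 7, 0.40), (12, 30, 0.10)]
hot_spots = residue_coupling_strength(ranked, seq_length=3)
```

Residues at the top of this list take part in several strong couplings and are candidates for comparison with single-site conservation.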
− | |||
− | |||
− | EVfold.org |
||
− | |||
− | create model |
||
− | |||
− | choose number of contacts: |
||
− | optimum ~ 60-70% of L |
||
− | 40% of L |
||
− | 100% of L |
||
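To turn the percentages above into absolute numbers of contacts, a small helper such as the following may be used. The sequence length of 200 is a hypothetical value, and 0.65 is taken here as the midpoint of the 60-70% range.

```python
def contacts_for_fractions(seq_length, fractions=(0.4, 0.65, 1.0)):
    """Number of top-ranked couplings to use for each fraction of L."""
    return {f: round(f * seq_length) for f in fractions}


# Hypothetical protein length L = 200.
counts = contacts_for_fractions(200)
```

Each of the resulting counts corresponds to one modelling run with that many top-ranked couplings as contacts.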
− | |||
− | => RMSD will be calculated by server, if you give PDB ID |
||
− | |||
− | PLM |