Difference between revisions of "Task Structural Alignments"
From Bioinformatikpedia
Revision as of 04:44, 27 May 2013
In order to evaluate the similarity between protein structures, the structures have to be superimposed in 3D. A multitude of methods are available to achieve this task. Also, there are many different measures to quantify structural similarity. In this task we will explore different methods and compare different measures to get a feeling for the structural similarity they imply. We will then apply structural alignment to evaluate some sequence-based alignments generated in Task 2 (Run sequence searches on the disease gene product and produce alignments).
Theoretical background talk
The introductory talks should give an overview of:
- short review of SCOP and CATH
- Alignment methods:
  - the one used by CATH (CATHEDRAL)
  - Topmatch
  - SAP or CE
  - LGA (see http://proteinmodel.org/AS2TS/LGA/lga_format.html for documentation)
- Modelling scores:
  - RMSD
  - Topmatch scores
  - GDT
  - LCS
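To get a feeling for how RMSD and GDT react to the same deviations, the following minimal sketch computes both from paired coordinates that are assumed to be already superimposed. The function names and the toy coordinates are illustrative only. Note that the real GDT, as computed by LGA, searches over many superpositions to maximise the number of residues under each cutoff; this sketch only scores one fixed superposition, so it gives a lower bound.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over paired, already superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def gdt_ts_fixed(coords_a, coords_b, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style score (in percent) for ONE fixed superposition.

    Assumption: no search over alternative superpositions is done,
    unlike in a full GDT calculation.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    dists = [math.dist(a, b) for a, b in zip(coords_a, coords_b)]
    # Fraction of atom pairs within each distance cutoff (in Angstrom).
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy data (hypothetical): four C-alpha positions, one of them far off.
reference = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
model = [(0, 0, 0), (1, 0.5, 0), (2, 0, 3), (3, 0, 10)]

print(f"RMSD   = {rmsd(reference, model):.2f} A")
print(f"GDT_TS = {gdt_ts_fixed(reference, model):.1f}")
```

In the toy data a single atom that is 10 Å off dominates the RMSD, while the GDT-style score still credits the three atoms that fit reasonably well. This is one reason the two measures can rank the same pair of structures differently.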
Explore structural alignments
- Assemble a set of 8 to 9 structures related to your protein. These structures should span the range of similarity from almost identical to completely unrelated. You can take structures found in the sequence search, and you can look in CATH. E.g.:
  - one reference structure of your protein
  - one or two structures with identical sequence (ideally one with a filled binding site and one with an unfilled binding site, so you can make one pair with the same binding site status and one pair with different status)
  - one similar sequence (>60% seq. identity)
  - one rather unrelated sequence (<30% seq. identity)
  - one arbitrary structure for each of these levels, with a CATH code identical to that of your protein up to that level:
    - CAT
    - CA
    - C
  - one arbitrary structure from a different CATH class
- Apply different structural alignment methods to these structures (only superimpose onto your reference structure, not all against all):
  - use superimposition in PyMOL (will only work on more closely related structures)
    - if you have a defined binding site, see what changes if you use all atoms / only C-alpha atoms / only binding site atoms
  - LGA
  - the one used by CATH (CATHEDRAL)
  - Topmatch
  - SAP or CE
- List the alignment scores the methods give you (e.g. RMSD)
- If scores that should be numerically comparable (e.g. RMSD) differ between methods, think about why, e.g. different sets of atoms used for superimposition.
- Qualitatively evaluate which methods give you the best feeling for structural relatedness. This might depend on the level of relatedness of the structures. In order to do this, look at some of the alignments in 3D.
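The PyMOL comparison of atom sets can be scripted. The sketch below is one possible way to do it, not a prescribed workflow: the helper name `compare_atom_sets`, the object names and the residue numbers are hypothetical. It assumes that `cmd.super` returns a sequence whose first two entries are the RMSD and the number of aligned atoms after refinement, as documented for PyMOL's `align` and `super` commands, and that the binding site residues carry the same numbers in both structures. PyMOL's command module is passed in as an argument so the function can also be run outside a PyMOL session.

```python
def compare_atom_sets(cmd, mobile, target, site_resi=None):
    """Superimpose `mobile` onto `target` using different atom selections.

    cmd       -- PyMOL's command module (in PyMOL: from pymol import cmd)
    mobile    -- name of the loaded object that is moved
    target    -- name of the loaded reference object
    site_resi -- optional list of binding site residue numbers
                 (assumed to be numbered identically in both objects)

    Returns a dict: label -> (RMSD, number of aligned atoms).
    """
    suffixes = {
        "all atoms": "",
        "C-alpha only": " and name CA",
    }
    if site_resi:
        resi = "+".join(str(r) for r in site_resi)
        suffixes["binding site"] = f" and resi {resi}"

    results = {}
    for label, suffix in suffixes.items():
        # super is PyMOL's sequence-independent superposition command;
        # by default it refines the fit by rejecting outlier atom pairs.
        out = cmd.super(f"({mobile}){suffix}", f"({target}){suffix}")
        results[label] = (out[0], out[1])
    return results
```

Inside PyMOL, after loading both structures, this could be called as `compare_atom_sets(cmd, "my_homolog", "my_reference", site_resi=[10, 25, 30])` with your own object names and residue numbers. Always report the number of aligned atoms next to the RMSD: because of the outlier rejection, two RMSD values computed over different numbers of atoms are not directly comparable.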