Difference between revisions of "Task 4: Structural Alignments"
| align="right" | [[File:pymol_1qvo_ca.png|thumb|200px|1QVO_A (red) aligned to 1A6Z_A (green). The two proteins share 39% sequence identity and could be aligned quite well. ]] |
| align="right" | [[File:pymol_1s7x_ca.png|thumb|200px|1S7X_A (red) aligned to 1A6Z_A (green). Although the two sequences share only 29% sequence identity, they could be aligned very well. This can be explained by the fact that the proteins are orthologs from two different species.]] |
| align="right" | [[File:pymol_2ia1_ca.png|thumb|200px|2IA1_A (red) aligned to 1A6Z_A (green). The two proteins could not be aligned well despite the fact that they share the same CATH class, architecture, and topology (CAT) numbers.]] |
| align="right" | [[File:pymol_3nci_ca.png|thumb|200px|3NCI_A (red) aligned to 1A6Z_A (green). The alignment was not successful.]] |
| align="right" | [[File:pymol_1vzy_ca.png|thumb|200px|1VZY_A (red) aligned to 1A6Z_A (green). Because the proteins share only the same CATH class (C) number, the alignment is poor.]] |
| align="right" | [[File:pymol_1mus_ca.png|thumb|200px|1MUS_A (red) aligned to 1A6Z_A (green). The proteins have completely different CATH annotations and therefore different structures that cannot be aligned.]] |
|- |
|+ style="caption-side: bottom; text-align: left" |<font size=2>'''Table 3:''' Visualisation of the pairwise structural alignments of all selected proteins to the template 1A6Z_A. The template is shown in green and the target in red. |
Revision as of 13:35, 12 August 2013
<css> table.colBasic2 { margin-left: auto; margin-right: auto; border: 2px solid black; border-collapse:collapse; width: 70%; }
.colBasic2 th,td { padding: 3px; border: 2px solid black; }
.colBasic2 td { text-align:left; }
.colBasic2 tr th { background-color:#efefef; color: black;} .colBasic2 tr:first-child th { background-color:#adceff; color:black;}
</css>
PDB structures selection
We first selected a set of structures that span different ranges of sequence identity to the reference structure (1A6Z). Domain A of the reference structure has the CATH annotation 3.30.500.10.9 (Murine Class I Major Histocompatibility Complex H2-DB subunit A domain 1), and domain B has 2.60.40.10 (immunoglobulins). We took domain A as the template and searched only for structures with an annotation similar to 3.30.500.10.9, because the immunoglobulin domain is only bound to the protein and not directly connected to it, and because the disease-causing mutations are all located in the MHC domain. <xr id="selected structures"/> lists the structures, their CATH numbers, and their percent sequence identity to the reference. Unfortunately, apart from 1DE4, which has an identical sequence, we could not find a structure with a sequence identity over 60%. The most similar structure we could find was 1QVO with 39% identity.
<figtable id="selected structures">
category | ID | chain | domain | CATH number | Sequence identity (%) | protein (organism) |
---|---|---|---|---|---|---|
reference | 1A6Z | A | 1 | 3.30.500.10 | - | HFE (Homo sapiens) |
identical sequence | 1DE4 | A | 1 | 3.30.500.10 | 100 | HFE (Homo sapiens) |
> 30% SeqID | 1QVO | A | 01 | 3.30.500.10 | 39 | HLA class I histocompatibility antigen, A-11 alpha chain (Homo sapiens) |
< 30% SeqID | 1S7X | A | 00 | 3.30.500.10 | 29 | H-2 class I histocompatibility antigen, D-B alpha chain (Mus musculus) |
CAT | 2IA1 | A | 01 | 3.30.500.20 | 11.1 | BH3703 protein (Bacillus halodurans) |
CA | 3NCI | A | 01 | 3.30.342.10 | 5.8 | DNA polymerase (Enterobacteria phage RB69) |
C | 1VZY | A | 01 | 3.55.30.10 | 2.8 | 33 KDA CHAPERONIN (Bacillus subtilis) |
different CATH | 1MUS | A | 01 | 1.10.246.40 | 12.6 | Tn5 transposase (Escherichia coli) |
</figtable>
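The category labels in the table (CAT, CA, C, different CATH) follow from how many leading levels of the CATH number a structure shares with the reference. The following is a minimal sketch of that bookkeeping, not part of the original workflow; the function name `shared_cath_levels` and the `LABELS` mapping are hypothetical helpers introduced only for illustration, and the CATH numbers are copied from the table above.

```python
def shared_cath_levels(cath_a: str, cath_b: str) -> int:
    """Count how many leading CATH levels (C, A, T, H) two numbers share."""
    shared = 0
    # Compare level by level and stop at the first mismatch.
    for a, b in zip(cath_a.split(".")[:4], cath_b.split(".")[:4]):
        if a != b:
            break
        shared += 1
    return shared


# Hypothetical mapping from the number of shared levels to the table's category label.
LABELS = {0: "different CATH", 1: "C", 2: "CA", 3: "CAT", 4: "CATH"}

reference = "3.30.500.10"  # 1A6Z, chain A
targets = {
    "2IA1": "3.30.500.20",
    "3NCI": "3.30.342.10",
    "1VZY": "3.55.30.10",
    "1MUS": "1.10.246.40",
}

for pdb_id, cath in targets.items():
    print(pdb_id, LABELS[shared_cath_levels(reference, cath)])
```

Structures that share all four levels with the reference (1DE4, 1QVO, 1S7X) are instead grouped by sequence identity in the table.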
Results
In PyMOL, each structure from <xr id="selected structures"/> was aligned to the reference 1A6Z_A twice: once using only the C_alpha atoms and once using all atoms. The resulting RMSD values are given in <xr id="score results"/>.
<figtable id="score results">
PDB ID | Seq. identity (%) | PyMOL RMSD (only C_alpha) | PyMOL RMSD (all atom) | LGA RMSD | LGA_S | SSAP RMSD | SSAP_Score | TopMatch RMSD | TopMatch S | TopMatch S_r | CE RMSD | CE Score |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1DE4_A | 100 | 0.675 (237) | 0.767 (1836) | 1.14 (267) | 95.77 | 1.60 (272) | 93.07 | 1.08 | 260 | 1.03 | 1.19 (267) | 543 |
1QVO_A | 39 | 2.165 (233) | 2.279 (1565) | 2.29 (259) | 67.86 | 2.58 (268) | 86.39 | 2.62 | 228 | 2.50 | 2.44 (266) | 432 |
1S7X_A | 29 | 1.889 (233) | 2.049 (1557) | 2.12 (256) | 71.90 | 2.36 (267) | 86.25 | 2.66 | 227 | 2.56 | 2.29 (265) | 342 |
2IA1_A | 11.1 | 18.132 (74) | 18.283 (501) | 2.83 (86) | 19.44 | 15.85 (140) | 56.19 | 2.91 | 76 | 2.82 | 3.93 (93) | 300 |
3NCI_A | 5.8 | 16.561 (26) | 17.329 (178) | 3.11 (84) | 17.19 | 14.54 (168) | 30.18 | 3.05 | 53 | 2.94 | 4.47 (75) | 333 |
1VZY_A | 2.8 | 6.260 (29) | 6.951 (168) | 3.25 (63) | 13.44 | 26.34 (208) | 58.01 | 2.61 | 68 | 2.53 | 5.80 (91) | 245 |
1MUS_A | 12.6 | 23.521 (180) | 23.891 (1143) | 2.82 (69) | 16.02 | 18.53 (215) | 46.30 | 3.58 | 69 | 3.43 | 6.61 (78) | 379 |
</figtable>
Images of the pairwise alignments using the C_alpha atoms are shown in <xr id="pymol str. al."/>.
<figtable id="pymol str. al.">
</figtable>
Different structural alignment methods were applied to superimpose all structures onto the reference structure. The resulting alignment scores are given in <xr id="score results"/>. The numbers in brackets after the RMSD values indicate the number of aligned residues (or atoms, for the all-atom PyMOL alignment) that were used to compute the corresponding values.
- PyMOL uses only a subset of atoms for the computation of the RMSD.
- LGA computes the RMSD from all atoms under a distance cutoff. It therefore uses more atoms than PyMOL if the proteins are similar, but fewer if the structures are more divergent.
- SSAP
- TopMatch
- CE
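Because each method computes its RMSD over a different set of atom pairs, the values in the table are not directly comparable. The sketch below illustrates the effect on toy coordinates that are assumed to be already superimposed. It does not reproduce the PyMOL or LGA algorithms, which also optimise the superposition itself. The function names and the 5.0 Å cutoff are assumptions chosen for illustration.

```python
import math


def rmsd(pairs):
    """RMSD over (target_xyz, reference_xyz) pairs of already superimposed atoms."""
    if not pairs:
        raise ValueError("no atom pairs to compare")
    squared = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(squared / len(pairs))


def rmsd_under_cutoff(pairs, cutoff):
    """RMSD restricted to pairs closer than `cutoff`, plus the number of pairs kept."""
    kept = [(a, b) for a, b in pairs if math.dist(a, b) <= cutoff]
    return rmsd(kept), len(kept)


# Toy data: two pairs superimpose well, one pair is far apart.
pairs = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 2.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 0.0, 10.0)),
]

print(round(rmsd(pairs), 3))           # all three pairs
value, n_kept = rmsd_under_cutoff(pairs, cutoff=5.0)
print(round(value, 3), n_kept)         # only the pairs within the assumed 5.0 Å cutoff
```

Dropping the distant pair lowers the RMSD considerably while reducing the number of pairs it is computed over. An RMSD should therefore always be read together with the count given in brackets.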