Difference between revisions of "Task 4: Structural Alignments"
Revision as of 12:11, 12 August 2013
<css> table.colBasic2 { margin-left: auto; margin-right: auto; border: 2px solid black; border-collapse:collapse; width: 70%; }
.colBasic2 th,td { padding: 3px; border: 2px solid black; }
.colBasic2 td { text-align:left; }
.colBasic2 tr th { background-color:#efefef; color: black;} .colBasic2 tr:first-child th { background-color:#adceff; color:black;}
</css>
PDB structures selection
We first selected a set of structures that span different ranges of sequence identity to the reference structure (1A6Z). Domain A of the reference structure has the CATH annotation 3.30.500.10.9 (Murine Class I Major Histocompatibility Complex H2-DB subunit A domain 1) and domain B has 2.60.40.10 (immunoglobulins). We took domain A as the template and only searched for structures with an annotation similar to 3.30.500.10.9, because the immunoglobulin domain is only bound to the protein and not directly connected to it, and because the disease-causing mutations are all located in the MHC domain. <xr id="selected structures"/> lists the structures, their CATH numbers and their percent sequence identity to the reference. Unfortunately, apart from 1DE4, which has an identical sequence, we could not find a structure with a sequence identity over 60%. The most similar other structure we could find was 1QVO with 39% identity.
<figtable id="selected structures">
category | ID | chain | domain | CATH number | Sequence identity (%) | protein (organism) |
---|---|---|---|---|---|---|
reference | 1A6Z | A | 1 | 3.30.500.10 | - | HFE (Homo sapiens) |
identical sequence | 1DE4 | A | 1 | 3.30.500.10 | 100 | HFE (Homo sapiens) |
> 30% SeqID | 1QVO | A | 01 | 3.30.500.10 | 39 | HLA class I histocompatibility antigen, A-11 alpha chain (Homo sapiens) |
< 30% SeqID | 1S7X | A | 00 | 3.30.500.10 | 29 | H-2 class I histocompatibility antigen, D-B alpha chain (Mus musculus) |
CAT | 2IA1 | A | 01 | 3.30.500.20 | 11.1 | BH3703 protein (Bacillus halodurans) |
CA | 3NCI | A | 01 | 3.30.342.10 | 5.8 | DNA polymerase (Enterobacteria phage RB69) |
C | 1VZY | A | 01 | 3.55.30.10 | 2.8 | 33 KDA CHAPERONIN (Bacillus subtilis) |
different CATH | 1MUS | A | 01 | 1.10.246.40 | 12.6 | Tn5 transposase (Escherichia coli) |
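Table 1: Selected PDB structures with their chain, CATH annotation, sequence identity to the reference 1A6Z_A and protein type.
</figtable>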
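The category labels CAT, CA and C in the table appear to name the CATH levels (Class, Architecture, Topology, Homologous superfamily) that a structure shares with the reference. The following is a minimal sketch of that reading, not the procedure actually used for the selection. The function name `shared_cath_levels` is hypothetical, and the CATH numbers are copied from the table.

```python
def shared_cath_levels(cath_a, cath_b):
    """Return the leading CATH levels (C, A, T, H) on which two annotations agree."""
    shared = ""
    # zip stops after four levels, so longer annotations are compared on C.A.T.H only
    for level, a, b in zip("CATH", cath_a.split("."), cath_b.split(".")):
        if a != b:
            break
        shared += level
    return shared


reference = "3.30.500.10"  # CATH number of 1A6Z, domain 1, as listed in the table
candidates = [
    ("2IA1", "3.30.500.20"),
    ("3NCI", "3.30.342.10"),
    ("1VZY", "3.55.30.10"),
    ("1MUS", "1.10.246.40"),
]
for pdb_id, cath in candidates:
    # An empty result means that not even the class is shared
    print(pdb_id, shared_cath_levels(reference, cath) or "different CATH")
```

Under this reading, the computed labels agree with the category column for the four structures that are not MHC class I related.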
The pairwise sequence alignments used to compute the sequence identities were constructed with EMBOSS Needle (http://www.ebi.ac.uk/Tools/psa/emboss_needle/) using default parameters.
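As an illustration of how a percent identity can be derived from a pairwise alignment, here is a minimal sketch. It assumes that identity is counted as identical, non-gap columns divided by the full alignment length (gap columns included), which is our understanding of the value Needle reports. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the alignments used here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Identical non-gap columns are divided by the total number of
    alignment columns, including columns that contain a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 5 identical columns out of 8
print(percent_identity("ACDE-FGH", "ACDQWFG-"))
```

Other conventions divide by the length of the shorter sequence or by the number of gap-free columns, which yields higher values for the same alignment.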
Results
Different structural alignment methods were applied to superimpose the reference structure onto all other structures. The resulting alignment scores are specified in table ????. The numbers in brackets after the RMSD values indicate the number of aligned residues (or atoms, in the all-atom case) that were used to compute the corresponding value.
PDB ID | Seq. identity (%) | PyMOL RMSD (only C_alpha) | PyMOL RMSD (all atom) | LGA RMSD | LGA_S | SSAP RMSD | SSAP_Score | TopMatch RMSD | TopMatch S | TopMatch S_r | CE RMSD | CE Score |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1DE4_A | 100 | 0.675 (237) | 0.767 (1836) | 1.14 (267) | 95.77 | 1.60 (272) | 93.07 | 1.08 | 260 | 1.03 | 1.19 (267) | 543 |
1QVO_A | 39 | 2.165 (233) | 2.279 (1565) | 2.29 (259) | 67.86 | 2.58 (268) | 86.39 | 2.62 | 228 | 2.50 | 2.44 (266) | 432 |
1S7X_A | 29 | 1.889 (233) | 2.049 (1557) | 2.12 (256) | 71.90 | 2.36 (267) | 86.25 | 2.66 | 227 | 2.56 | 2.29 (265) | 342 |
2IA1_A | 11.1 | 18.132 (74) | 18.283 (501) | 2.83 (86) | 19.44 | 15.85 (140) | 56.19 | 2.91 | 76 | 2.82 | 3.93 (93) | 300 |
3NCI_A | 5.8 | 16.561 (26) | 17.329 (178) | 3.11 (84) | 17.19 | 14.54 (168) | 30.18 | 3.05 | 53 | 2.94 | 4.47 (75) | 333 |
1VZY_A | 2.8 | 6.260 (29) | 6.951 (168) | 3.25 (63) | 13.44 | 26.34 (208) | 58.01 | 2.61 | 68 | 2.53 | 5.80 (91) | 245 |
1MUS_A | 12.6 | 23.521 (180) | 23.891 (1143) | 2.82 (69) | 16.02 | 18.53 (215) | 46.30 | 3.58 | 69 | 3.43 | 6.61 (78) | 379 |
<figtable id="pymol str. al.">
</figtable>
- PyMOL only uses a subset of atoms for the computation of the RMSD.
- LGA computes the RMSD from all atoms under a distance cutoff. It therefore uses more atoms than PyMOL if the proteins are similar, but only a few if the structures are more divergent.
- SSAP
- TopMatch
- CE
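The notes on PyMOL and LGA above both come down to which atom pairs enter the RMSD. The sketch below shows the general effect of a distance cutoff on the RMSD and on the number of pairs used. It assumes that the two coordinate sets are already superimposed and paired one to one. It is not the actual algorithm of either program, and the function name `rmsd`, the cutoff value and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b, cutoff=None):
    """RMSD over paired 3D points, optionally restricted to close pairs.

    Returns (rmsd, number_of_pairs_used). If a cutoff is given, only
    pairs whose distance does not exceed it enter the calculation.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = []
    for p, q in zip(coords_a, coords_b):
        d2 = sum((x - y) ** 2 for x, y in zip(p, q))
        if cutoff is None or d2 <= cutoff ** 2:
            squared.append(d2)
    if not squared:
        raise ValueError("no atom pairs within the cutoff")
    return math.sqrt(sum(squared) / len(squared)), len(squared)


# Toy example: three well-matched pairs and one outlier pair
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0), (3.0, 0.0, 10.0)]

print(rmsd(a, b))              # all four pairs, dominated by the outlier
print(rmsd(a, b, cutoff=5.0))  # the outlier pair is excluded
```

Because the RMSD depends this strongly on the set of pairs, each value should be read together with the number in brackets next to it.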