Task 4: Structural Alignments

From Bioinformatikpedia
Revision as of 15:35, 11 August 2013 by Betza (talk | contribs) (PDB structures selection)


lab journal task 4

PDB structures selection

We first selected a set of structures that span different ranges of sequence identity to the reference structure (1A6Z, chain A). The reference structure has the CATH numbers 3.30.500.10.9 (Murine Class I Major Histocompatibility Complex H2-DB subunit A domain 1) and 2.60.40.10 (immunoglobulins). We decided to search for structures with an annotation similar to 3.30.500.10.9, because the immunoglobulin domain is only bound to the protein and the disease-causing mutations are all located in the MHC domain. Table 1 lists the structures, their CATH numbers and their percent sequence identity to the reference. Unfortunately, apart from 1DE4, which has an identical sequence, we could not find a structure with a sequence identity over 60%. The most similar non-identical structure we could find was 1QVO with 39% identity.

| category | ID | chain | domain | CATH number | Sequence identity (%) |
|---|---|---|---|---|---|
| reference | 1A6Z | A | 1 | 3.30.500.10 | |
| identical sequence | 1DE4 | A | 1 | 3.30.500.10 | 100 |
| > 30% SeqID | 1QVO | A | 01 | 3.30.500.10 | 39 |
| < 30% SeqID | 1S7X | A | 00 | 3.30.500.10 | 29 |
| CAT | 2IA1 | A | 01 | 3.30.500.20 | 11.1 |
| CA | 3NCI | A | 01 | 3.30.342.10 | 5.8 |
| C | 1VZY | A | 01 | 3.55.30.10 | 2.8 |
| different CATH | 1MUS | A | 01 | 1.10.246.40 | 12.6 |

Table 1: The selected structures with their sequence identity to the reference 1A6Z_A.
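
The categories CAT, CA and C in Table 1 can be read as the CATH levels a structure still shares with the reference, counted from the left of the CATH number. The following is a minimal sketch of that reading, not part of the original workflow; the function name `shared_cath_levels` is made up for illustration.

```python
# Letters for the four CATH levels: Class, Architecture, Topology, Homologous superfamily.
LEVELS = "CATH"


def shared_cath_levels(cath_a: str, cath_b: str) -> str:
    """Return the leading CATH levels on which two CATH numbers agree."""
    shared = ""
    for level, part_a, part_b in zip(LEVELS, cath_a.split("."), cath_b.split(".")):
        if part_a != part_b:
            break  # the hierarchy is nested, so stop at the first mismatch
        shared += level
    return shared


reference = "3.30.500.10"  # CATH number of 1A6Z_A as listed in Table 1
for pdb_id, cath in [("2IA1", "3.30.500.20"), ("3NCI", "3.30.342.10"),
                     ("1VZY", "3.55.30.10"), ("1MUS", "1.10.246.40")]:
    print(pdb_id, shared_cath_levels(reference, cath) or "different CATH")
```

With the CATH numbers from Table 1 this prints CAT, CA, C and "different CATH" for 2IA1, 3NCI, 1VZY and 1MUS respectively, matching the category column.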

The pairwise sequence alignments used to compute the sequence identities were constructed with EMBOSS Needle (http://www.ebi.ac.uk/Tools/psa/emboss_needle/) using default parameters.
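
As an illustration of how a percent identity can be derived from a finished pairwise alignment, here is a small sketch. It assumes that identity is counted as identical columns divided by the full alignment length, gaps included, which is our understanding of the value Needle reports. The function name and the toy sequences are invented for this example and are not taken from the structures above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all alignment columns (gap columns count in the denominator)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column is a match only if both residues are equal and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 6 identical columns out of 8.
print(percent_identity("MKT-LLVA", "MKTALIVA"))  # 75.0
```

Other tools may divide by the length of the shorter sequence or by the number of ungapped columns, so identities from different programs are not always directly comparable.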

Results

Different structural alignment methods were applied to superimpose the reference structure onto all other structures. The resulting alignment scores are listed in Table 2. The numbers in brackets after the RMSD values indicate the number of aligned residues or atoms that were used to compute the corresponding value.

| PDB ID | Seq. identity (%) | Pymol RMSD (only C_alpha) | Pymol RMSD (all atom) | LGA RMSD | LGA_S | SSAP RMSD | SSAP_Score | TopMatch RMSD | TopMatch S | TopMatch S_r | CE RMSD | CE Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1DE4_A | 100 | 0.675 (237) | 0.767 (1836) | 1.14 (267) | 95.77 | 1.60 (272) | 93.07 | 1.08 | 260 | 1.03 | 1.19 (267) | 543 |
| 1QVO_A | 39 | 2.165 (233) | 2.279 (1565) | 2.29 (259) | 67.86 | 2.58 (268) | 86.39 | 2.62 | 228 | 2.50 | 2.44 (266) | 432 |
| 1S7X_A | 29 | 1.889 (233) | 2.049 (1557) | 2.12 (256) | 71.90 | 2.36 (267) | 86.25 | 2.66 | 227 | 2.56 | 2.29 (265) | 342 |
| 2IA1_A | 11.1 | 18.132 (74) | 18.283 (501) | 2.83 (86) | 19.44 | 15.85 (140) | 56.19 | 2.91 | 76 | 2.82 | 3.93 (93) | 300 |
| 3NCI_A | 5.8 | 16.561 (26) | 17.329 (178) | 3.11 (84) | 17.19 | 14.54 (168) | 30.18 | 3.05 | 53 | 2.94 | 4.47 (75) | 333 |
| 1VZY_A | 2.8 | 6.260 (29) | 6.951 (168) | 3.25 (63) | 13.44 | 26.34 (208) | 58.01 | 2.61 | 68 | 2.53 | 5.80 (91) | 245 |
| 1MUS_A | 12.6 | 23.521 (180) | 23.891 (1143) | 2.82 (69) | 16.02 | 18.53 (215) | 46.30 | 3.58 | 69 | 3.43 | 6.61 (78) | 379 |

Table 2: Results of the structural alignments of the selected proteins to the template 1A6Z_A. The alignment scores are listed for each method, and the numbers of equivalent residues or atoms are stated in brackets after the RMSD.


<figtable id="Structural alignments with Pymol to 1A6Z_A">

1DE4_A
1QVO_A
1S7X_A
2IA1_A
3NCI_A
1VZY_A
1MUS_A
Table 3: Visualisation of the pairwise structural alignments of all selected proteins to the template 1A6Z_A. The template is shown in green and the target in red.

</figtable>


  • Pymol only uses a subset of atoms for the computation of the RMSD.
  • LGA computes the RMSD from all atoms that lie within a distance cutoff. It therefore uses more atoms than Pymol if the proteins are similar, but fewer if the structures are more divergent.
  • SSAP
  • TopMatch
  • CE
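
The effect of a distance cutoff on the RMSD and on the number of atoms entering it can be illustrated with a small sketch. It assumes the two structures are already superimposed and that atom pairs are given explicitly; the coordinates and the cutoff of 5.0 Å are invented for illustration and are not the parameters used by LGA or any other program above.

```python
import math


def rmsd(pairs):
    """RMSD over a list of ((x, y, z), (x, y, z)) coordinate pairs."""
    if not pairs:
        raise ValueError("no atom pairs to compare")
    total = sum(math.dist(p, q) ** 2 for p, q in pairs)
    return math.sqrt(total / len(pairs))


def rmsd_within_cutoff(pairs, cutoff):
    """RMSD restricted to pairs no farther apart than `cutoff`, plus the number of pairs kept."""
    kept = [(p, q) for p, q in pairs if math.dist(p, q) <= cutoff]
    return rmsd(kept), len(kept)


# Toy pairs with distances 1, 1, 2 and 20 (hypothetical, already superimposed).
origin = (0.0, 0.0, 0.0)
pairs = [(origin, (1.0, 0.0, 0.0)), (origin, (0.0, 1.0, 0.0)),
         (origin, (0.0, 0.0, 2.0)), (origin, (20.0, 0.0, 0.0))]

print(round(rmsd(pairs), 2), len(pairs))   # 10.07 4
value, n_kept = rmsd_within_cutoff(pairs, cutoff=5.0)
print(round(value, 2), n_kept)             # 1.41 3
```

A single distant pair dominates the unrestricted RMSD, whereas the cutoff version reports a small RMSD over fewer pairs. This is why an RMSD should always be read together with the number of residues or atoms in brackets.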

Structural alignments for evaluating sequence alignments