Difference between revisions of "Canavan Disease: Task 04 - Structural Alignments"

From Bioinformatikpedia

Revision as of 15:01, 5 August 2013

LabJournal

Dataset

To assemble the required dataset, a reference structure (2I3C) was chosen first. The remaining structures were then selected relative to this reference so that each fulfils one of the required criteria. The full composition and additional information can be found in <xr id="dataset"> Table </xr>. <figtable id="dataset">

{|
|+ Dataset composition
|-
! PDB-id
! Description
! Criterion
|-
| 2I3C
| ASPA from human
| reference structure
|-
| 2O4H
| ASPA from human with bound N-phosphonomethyl-L-aspartate
| sequence identity 100% & bound active centre
|-
| 2Q51
| ASPA from human (ensemble refinement)
| sequence identity 100% & unbound active centre
|-
| 2GU2
| ASPA from rat
| sequence identity >60%
|-
| 2QJ8
| ASPA family protein from Mesorhizobium loti
| sequence identity <30%
|-
| 1AYE
| Procarboxypeptidase from human
| same CATH classification at the C, A and T levels
|-
| 1BKJ
| FMN oxidoreductase from Vibrio harveyi
| same CATH classification at the C and A levels
|-
| 1BD0
| Alanine racemase
| same CATH classification at the C level
|-
| 1B3U
| Regulatory domain of human PP2A
| completely different CATH classification
|}
Overview of the dataset composition for Task 04, containing a brief description of the chosen structures. Sequence identity and CATH classification similarities are given with respect to the reference structure 2I3C.

</figtable>
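The sequence identity criteria of the dataset can be written down as a small helper. This is only a sketch: the function name `identity_class` and the label strings are made up for illustration, and the example values are the CE sequence identities listed in the comparison table of this journal. The dataset criteria do not define a class for identities between 30% and 60%, so that range is reported as unassigned.

```python
def identity_class(seq_identity_percent):
    """Map a sequence identity (in percent, relative to 2I3C) to a dataset criterion."""
    if not 0.0 <= seq_identity_percent <= 100.0:
        raise ValueError("sequence identity must lie between 0 and 100")
    if seq_identity_percent == 100.0:
        return "identical (100%)"
    if seq_identity_percent > 60.0:
        return "high (>60%)"
    if seq_identity_percent < 30.0:
        return "low (<30%)"
    # Assumption: the dataset criteria name no class for 30-60%.
    return "unassigned (30-60%)"


# Sequence identities to 2I3C as reported by CE in the comparison table.
examples = {"2Q51": 100.0, "2GU2": 84.59, "2QJ8": 13.86}
classes = {pdb_id: identity_class(value) for pdb_id, value in examples.items()}

for pdb_id, label in classes.items():
    print(pdb_id, label)
```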


Structural Alignment Exploration

Pymol

2O4H vs. 2I3C

2O4H was found via the sequence search tab for the reference structure 2I3C. It was chosen because it belongs to the 100% sequence identity cluster. In addition, it has a compound bound at the active site; this is not N-acetyl-L-aspartate but N-Hydroxy(methyl)phosphoric-L-aspartate, which binds to the same active center. The enzyme does not degrade this compound; instead it "blocks" the active center, so that a potential conformational change of the protein can be captured by X-ray crystallography.

Because 2O4H and 2I3C have 100% sequence identity, the structural alignment with Pymol is very accurate. Within the accuracy of X-ray crystallography, both structures are the same. The RMSD between 2O4H and 2I3C, calculated by the alignment procedure of Pymol, is 0.445Å. As this deviation is smaller than the achievable resolution of the structures, they can safely be considered identical. The structural alignment is displayed in <xr id="2O4H_pymol">Figure </xr>. Consequently, the possible conformational change of the protein caused by the compound bound at the active site cannot be observed. <figure id="2O4H_pymol">

Representation of 2O4H aligned to 2I3C. Both structures are displayed as cartoon, 2O4H in black, 2I3C in orange. The zinc atom at the active site is represented as a gray sphere, and the N-Hydroxy(methyl)phosphoric-L-aspartate is represented as balls and sticks at the active site. With a calculated RMSD of 0.445Å, both structures can be considered the same, as the deviation is even lower than the achievable resolution of the crystal structures.

</figure>
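For orientation, the RMSD quoted above is the square root of the mean squared distance between paired atoms after superposition. The following is a minimal sketch of that last step only. It assumes the two coordinate lists are already superposed and paired atom by atom. It is not Pymol's implementation, and the toy coordinates are hypothetical, not taken from 2O4H or 2I3C.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superposed and that atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates (in Å): every atom is shifted by 0.3 Å along z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3), (2.0, 0.0, 0.3)]
toy_rmsd = rmsd(structure_a, structure_b)
print(round(toy_rmsd, 3))
```

A real comparison additionally needs the optimal superposition and the residue pairing, which the alignment program determines before it reports the RMSD.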

2Q51 vs 2I3C

2Q51 was chosen to complement 2O4H, as it is annotated with the same sequence (100% sequence identity to 2I3C) but has no compound bound in the active center. Since 2Q51 and 2I3C share the same sequence and both were crystallized without a compound bound at the active site, the two should also be identical in 3D structure (within the limits of resolution). However, comparing both structures with Pymol (see <xr id="2Q51_pymol">Figure </xr>) shows that they do differ, at least slightly. Their RMSD of 0.223Å is smaller than the RMSD between 2O4H and 2I3C reported above; nevertheless, on visual inspection they show beta-strands of different lengths and small variations in conformation. Checking the experimental origin of the PDB structure 2Q51 revealed that the atom coordinates and the conformation of the secondary structure elements were derived as a mean of multiple experimental 3D structure assignments from X-ray crystallography. This is most likely the reason why the RMSD between the C-alpha atoms is so small although the visual difference between 2Q51 and 2I3C is larger than that between 2O4H and 2I3C (see above). <figure id="2Q51_pymol">

Representation of 2Q51 aligned to 2I3C. Both structures are displayed as cartoon, 2Q51 in blue, 2I3C in orange. The zinc atom at the active site is represented as a gray sphere. The calculated RMSD between the C-alpha atoms of the two structures is as small as 0.223Å; however, the displayed secondary structure elements vary in length and steric conformation (see the beta-strands in the left loop region of the protein). The reason may be that the atom coordinates of 2Q51 represent a mean of multiple X-ray crystallography experiments to determine the structure of ASPA.

</figure>

2GU2 vs 2I3C

2GU2 is the ASPA ortholog in rat. Because of its sequence similarity of 84% to the human ASPA protein (2I3C), this protein was chosen to represent the group of protein structures with a sequence similarity between 60% and 100%. The structural alignment with Pymol reveals that the difference in sequence between the two proteins results from an extension of the N- and C-terminal ends, which form a beta sheet in 2GU2 that is not present in 2I3C (see <xr id="2GU2_pymol">Figure </xr>). Otherwise the two structures are identical within the limits of resolution. This is also reflected in the RMSD of 0.493Å between the two aligned structures. This example illustrates an important principle very well, namely that structure is better conserved than sequence: despite a sequence similarity of only about 84%, the three-dimensional structure of the two proteins is nearly identical. <figure id="2GU2_pymol">

Representation of 2GU2 aligned to 2I3C. Both structures are displayed as cartoon, 2GU2 in turquoise, 2I3C in orange. The zinc atom at the active site is represented as a gray sphere. Apart from the N- and C-terminal ends of the 2GU2 peptide chain, which form a beta-sheet, both structures are identical. The calculated RMSD between the two peptides is 0.493Å.

</figure>

2QJ8 vs 2I3C

<figure id="2QJ8_pymol">

Representation of 2QJ8 aligned to 2I3C. Both structures are displayed as cartoon, 2QJ8 in green, 2I3C in orange. The zinc atom at the active site is represented as a gray sphere.

</figure>

Remaining Proteins

<figure id="Remaining_pymol">

a) 1AYE vs 2I3C
b) 1BKJ vs 2I3C
c) 1BD0 vs 2I3C
d) 1B3U vs 2I3C
Foo Bar

</figure>

Comparison of SSAP, Topmatch, CE & LGA

<figtable id="comp">

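{|
|+ Comparison of LGA, SSAP, Topmatch & CE
|-
! style="background:#BFBFBF;" |
! colspan="2" style="background:#BFBFBF;" | LGA
! colspan="2" style="background:#BFBFBF;" | SSAP (CATH)
! colspan="2" style="background:#BFBFBF;" | Topmatch
! colspan="2" style="background:#BFBFBF;" | CE
|-
! style="background:#E5E5E5;" align="center" | protein
! style="background:#E5E5E5;" align="center" width="40"| RMSD
! style="background:#E5E5E5;" align="center" width="40"| SeqId
! style="background:#E5E5E5;" align="center" width="40"| RMSD
! style="background:#E5E5E5;" align="center" width="40"| SeqId
! style="background:#E5E5E5;" align="center" width="40"| RMSD
! style="background:#E5E5E5;" align="center" width="40"| SeqId
! style="background:#E5E5E5;" align="center" width="40"| RMSD
! style="background:#E5E5E5;" align="center" width="40"| SeqId
|-
| align="center" | 2O4H
| align="center" | 3.33
| align="center" | 9.62
| align="center" | 1.04
| align="center" | 99
| align="center" | 0.65
| align="center" | 100
| align="center" | 1.02
| align="center" | 100
|-
| align="center" | 2Q51
| align="center" | 1.04
| align="center" | 100
| align="center" | 1.04
| align="center" | 100
| align="center" | 1.04
| align="center" | 100
| align="center" | 1.00
| align="center" | 100
|-
| align="center" | 2GU2
| align="center" | 0.97
| align="center" | 86.29
| align="center" | 1.23
| align="center" | 86
| align="center" | 0.91
| align="center" | 87
| align="center" | 0.97
| align="center" | 84.59
|-
| align="center" | 2QJ8
| align="center" | 2.57
| align="center" | 21.53
| align="center" | 8.39
| align="center" | 9
| align="center" | 3.51
| align="center" | 17
| align="center" | 3.14
| align="center" | 13.86
|-
| align="center" | 1AYE
| align="center" | 2.58
| align="center" | 15.24
| align="center" | 4.19
| align="center" | 8
| align="center" | 2.85
| align="center" | 13
| align="center" | 3.83
| align="center" | 7.14
|-
| align="center" | 1BKJ
| align="center" | 3.26
| align="center" | 1.64
| align="center" | 18.59
| align="center" | 3
| align="center" | 2.87
| align="center" | 6
| align="center" | 4.27
| align="center" | 4.04
|-
| align="center" | 1BD0
| align="center" | 3.36
| align="center" | 5.48
| align="center" | 20.38
| align="center" | 3
| align="center" | 2.91
| align="center" | 11
| align="center" | 5.74
| align="center" | 1.71
|-
| align="center" | 1B3U
| align="center" | 3.57
| align="center" | 11.11
| align="center" | 28.05
| align="center" | 6
| align="center" | 3.34
| align="center" | 9
| align="center" | 6.05
| align="center" | 6.30
|}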
text

</figtable>
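As a rough aid for reading the table, the RMSD columns can be summarised per protein. The sketch below copies the RMSD values from the table and reports, for each structure, which method gave the lowest RMSD and how far the four methods spread. The variable names are illustrative only. Note that the methods may align different numbers of residues, so a lower RMSD here does not by itself mean a better alignment.

```python
# RMSD values (in Å) against 2I3C, copied from the comparison table.
rmsd_by_protein = {
    "2O4H": {"LGA": 3.33, "SSAP": 1.04, "Topmatch": 0.65, "CE": 1.02},
    "2Q51": {"LGA": 1.04, "SSAP": 1.04, "Topmatch": 1.04, "CE": 1.00},
    "2GU2": {"LGA": 0.97, "SSAP": 1.23, "Topmatch": 0.91, "CE": 0.97},
    "2QJ8": {"LGA": 2.57, "SSAP": 8.39, "Topmatch": 3.51, "CE": 3.14},
    "1AYE": {"LGA": 2.58, "SSAP": 4.19, "Topmatch": 2.85, "CE": 3.83},
    "1BKJ": {"LGA": 3.26, "SSAP": 18.59, "Topmatch": 2.87, "CE": 4.27},
    "1BD0": {"LGA": 3.36, "SSAP": 20.38, "Topmatch": 2.91, "CE": 5.74},
    "1B3U": {"LGA": 3.57, "SSAP": 28.05, "Topmatch": 3.34, "CE": 6.05},
}

# Method with the lowest RMSD for each protein.
lowest_method = {
    pdb_id: min(values, key=values.get)
    for pdb_id, values in rmsd_by_protein.items()
}

# Spread between the highest and lowest RMSD across the four methods.
rmsd_spread = {
    pdb_id: max(values.values()) - min(values.values())
    for pdb_id, values in rmsd_by_protein.items()
}

for pdb_id in rmsd_by_protein:
    print(f"{pdb_id}: lowest RMSD from {lowest_method[pdb_id]}, "
          f"spread {rmsd_spread[pdb_id]:.2f} Å")
```

In this summary the spread grows sharply for the structures with unrelated CATH classification, mainly because of the high SSAP values.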

Structural Alignment Evaluation

Tasks