Gaucher Disease: Task 05 - Lab Journal

From Bioinformatikpedia
Revision as of 20:46, 3 September 2013 by Kalemanovm (talk | contribs) (All atom RMSD calculation near the binding sites)

Calculation of models

Structures set

We assembled the following structures in task 4. We now divide them into two groups: those with > 60% and those with < 30% sequence identity to our target protein, P04062 (536 aa long). The PIDE was calculated as follows:

  • first we aligned the two fasta sequences with ClustalW
  • then calculated the PIDE (pairwise sequence identity) using SIAS with default options
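The identity calculation can be sketched in a few lines of Python. This is only a minimal illustration: it assumes that identity is counted over the alignment columns in which both sequences have a residue, which is one common convention and not necessarily the normalisation SIAS applies with its default options. The function name `percent_identity` and the toy sequences are made up for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: gap columns are ignored in numerator and denominator.
    SIAS offers several normalisations, so its default may differ.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    identical = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned pair (made up, not taken from the P04062 alignments).
print(round(percent_identity("ACDEFG-H", "ACNEF-KH"), 2))
```

With a different denominator (for example the length of the shorter sequence) the same alignment gives a different PIDE, so values are only comparable when the same convention is used throughout.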

Selected structures for single-template modelling are written in bold.

Homologous structures to P04062
PDB ID    PIDE with target (%)    Length (aa)
2XWD_A    91.51                   505
2NSX_A    92.53                   497
2NT1_A    92.53                   497
2GEP_A    10.74                   497
2F7K_A    8.95                    327
2QGU      5.59                    211
2ISB_A    5.59                    192
2DJF_A    3.17                    119
2DJF_B    2.98                    164
2DJF_C    4.47                    69
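The split into the two groups can be reproduced with a short Python sketch. The PIDE values are copied from the table above; the function name `split_by_pide` and its parameters are hypothetical and only serve this illustration.

```python
# PIDE with the target (%), copied from the table above.
PIDE_TO_TARGET = {
    "2XWD_A": 91.51,
    "2NSX_A": 92.53,
    "2NT1_A": 92.53,
    "2GEP_A": 10.74,
    "2F7K_A": 8.95,
    "2QGU": 5.59,
    "2ISB_A": 5.59,
    "2DJF_A": 3.17,
    "2DJF_B": 2.98,
    "2DJF_C": 4.47,
}


def split_by_pide(pide, high=60.0, low=30.0):
    """Return (high_group, low_group) using the > 60% and < 30% thresholds."""
    high_group = [pdb_id for pdb_id, value in pide.items() if value > high]
    low_group = [pdb_id for pdb_id, value in pide.items() if value < low]
    return high_group, low_group


high_group, low_group = split_by_pide(PIDE_TO_TARGET)
print("high PIDE:", high_group)
print("low PIDE:", low_group)
```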

The PIDE values of the low-PIDE group selected in task 4 turned out to be too low: even the 10.74% PIDE of 2GEP_A was not enough for Swiss-Model to align the sequences (with either BLAST or HHsearch). We therefore went back to the hit list from task 2 (HHblits, 2 iterations against Uniprot20 followed by one iteration against pdb_full, with an E-value cutoff of 10E-10). The set of structures finally used is:

Homologous structures to P04062
PDB ID    PIDE of alignment with target (%)    Length (aa)    Aligned columns (aa)    Query coverage (residues)
3KE0_A    100                                  497            496                     41-536
2XWD_A    100                                  505            497                     40-536
2WKL_A    100                                  497            496                     41-536
2NSX_A    100                                  497            496                     41-536
2WNW_A    29                                   447            440                     75-534
1VFF_A    22                                   423            98                      151-228
3II1_A    20                                   535            84                      452-514


We used the command line executable to run Modeller. The necessary scripts for the alignments (pairwise and MSAs) and the modelling (with single and multiple templates) were prepared using this tutorial.


We ran Swiss-Model online using the 'Automatic Modelling Mode'. In the advanced options, we specified the template to use.


We also used iTasser via its web server. In "Option I" we specified the template to use.

Evaluation of models

RMSD and GDT-score calculation

We used the TM-score tool from the Zhang lab to calculate the RMSD and GDT-score. We also looked at the TM-score itself. Execution:

TMscore <model> <native>
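For many models it is convenient to collect the three values automatically. The following Python sketch wraps the call above and extracts RMSD, TM-score and GDT-TS from the program output. It assumes that the output contains lines beginning with "RMSD of  the common residues=", "TM-score    =" and "GDT-TS-score="; this layout should be checked against the installed TMscore version. The names `parse_tmscore_output` and `run_tmscore` are hypothetical helpers, not part of the tool.

```python
import re
import subprocess

# Assumed layout of the result lines in the TMscore output.
PATTERNS = {
    "rmsd": re.compile(r"^[ \t]*RMSD of\s+the common residues=\s*([0-9.]+)", re.MULTILINE),
    "tm_score": re.compile(r"^[ \t]*TM-score\s*=\s*([0-9.]+)", re.MULTILINE),
    "gdt_ts": re.compile(r"^[ \t]*GDT-TS-score=\s*([0-9.]+)", re.MULTILINE),
}


def parse_tmscore_output(text):
    """Extract the scores that could be found in the TMscore output text."""
    scores = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            scores[name] = float(match.group(1))
    return scores


def run_tmscore(model_pdb, native_pdb, executable="TMscore"):
    """Run `TMscore <model> <native>` and return the parsed scores."""
    result = subprocess.run(
        [executable, model_pdb, native_pdb],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tmscore_output(result.stdout)
```

A call such as `run_tmscore("model.pdb", "native.pdb")` (file names are placeholders) would then return a dictionary with the keys `rmsd`, `tm_score` and `gdt_ts`, as far as the corresponding lines are present in the output.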

C_alpha RMSD calculation

We visualized all created models with PyMOL. For this we aligned each model with the reference structures, 1OGS_A and 2V3E_B, and calculated the RMSD between the corresponding C_alpha atoms as in task 4:

align <native> and resi 1-497 and name ca, <model> and resi <from>-<to> and name ca

where "native" is the reference structure (1OGS_A or 2V3E_B) and "model" is the name of the model, "from" is its first residue and "to" its last residue.
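Since the same command has to be typed for every model and both references, it can be generated with a small Python helper. This is only a sketch: the function name `ca_align_command` is hypothetical, and the residue range used for the model in the example is a placeholder, not a value from our models.

```python
def ca_align_command(native, model, first, last, native_range="1-497"):
    """Build the PyMOL align command for the C_alpha atoms of two objects.

    `first` and `last` are the first and last residue number of the model.
    """
    if first > last:
        raise ValueError("first residue must not exceed last residue")
    native_sel = f"{native} and resi {native_range} and name ca"
    model_sel = f"{model} and resi {first}-{last} and name ca"
    return f"align {native_sel}, {model_sel}"


# Placeholder residue range for the model, for illustration only.
for reference in ("1OGS_A", "2V3E_B"):
    print(ca_align_command(reference, "P04062.model", 1, 497))
```

The printed lines can be pasted into the PyMOL command line or collected in a script file.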

All atom RMSD calculation near the binding sites

We continued working in the PyMOL session in which the two references and the best model were aligned to each other (and represented as cartoons).

First, we selected the ligands and renamed the selections to:

  • NAG_1OGS for the ligand of 1OGS_A
  • ligands_2V3E for the ligands of 2V3E_B (3 NAG molecules, NAG, FUC and NND)

Second, we removed the waters from all structures.

Then, in each reference structure, we selected all atoms within a distance of 6 Angstrom from the respective ligands, excluding the atoms of the ligands themselves, with the following commands:

select 1OGS_within6ofNAG, 1OGS_A and (not NAG_1OGS) within 6 of NAG_1OGS

-> 42 atoms selected

select 2V3E_within6ofligands, 2V3E_B and (not ligands_2V3E) within 6 of ligands_2V3E

-> 174 atoms selected

The same was done for the model:

select model_within6ofNAG, P04062.model and (not NAG_1OGS) within 6 of NAG_1OGS

-> 40 atoms selected

select model_within6ofligands, P04062.model and (not ligands_2V3E) within 6 of ligands_2V3E

-> 178 atoms selected

(We represented the selections as sticks and colored them differently.)
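The distance criterion behind these selections can be illustrated outside of PyMOL with a minimal Python sketch. It assumes that an atom is selected when it lies at most 6 Angstrom from at least one ligand atom; whether PyMOL treats the cutoff as inclusive is an assumption here. The function name `atoms_within` and the toy coordinates are made up for this example.

```python
import math


def atoms_within(structure_atoms, ligand_atoms, cutoff=6.0):
    """Return the names of structure atoms within `cutoff` of any ligand atom.

    Both arguments are lists of (name, (x, y, z)) tuples; ligand atoms are
    expected to be absent from `structure_atoms`, as in the selections above.
    """
    selected = []
    for name, coord in structure_atoms:
        if any(math.dist(coord, lig_coord) <= cutoff for _, lig_coord in ligand_atoms):
            selected.append(name)
    return selected


# Toy coordinates in Angstrom (made up, not taken from 1OGS_A or 2V3E_B).
protein = [("CA1", (0.0, 0.0, 0.0)), ("CB1", (5.0, 0.0, 0.0)), ("CA2", (10.0, 0.0, 0.0))]
ligand = [("C1", (1.0, 0.0, 0.0))]
print(atoms_within(protein, ligand))
```

In the toy example the third atom is 9 Angstrom away from the only ligand atom and is therefore not selected.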

Next, we aligned all atoms in the respective selections and noted the RMS values: