Gaucher Disease: Task 05 - Lab Journal
Calculation of models
In task 4 we assembled the following structures; we now divide them into two groups: those with > 60% and those with < 30% sequence identity to our target protein, P04062 (536 aa long). The PIDE was calculated as follows:
- first we aligned the two fasta sequences with ClustalW
- then calculated the PIDE (pairwise sequence identity) using SIAS with default options
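As an illustration of what a PIDE value measures, here is a minimal Python sketch that counts identical residues in two already aligned sequences. It assumes one common normalization (identical positions divided by the number of columns in which both sequences have a residue); SIAS offers several normalizations, and the one behind its default options may differ. The function name and the two example rows are hypothetical and are not taken from our alignments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are ignored,
    and the identity count is divided by the number of remaining columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns where both sequences have a residue
    identical = 0  # columns where those residues are the same
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a.upper() == b.upper():
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned rows, only to show the call.
row_a = "MKT-AYIAKQR"
row_b = "MRTLAY-AKHR"
print(round(percent_identity(row_a, row_b), 2))
```

Dividing by the full alignment length or by the length of the shorter sequence instead would give lower values for the same alignment, so the normalization should always be stated next to a PIDE.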
Selected structures for single-template modelling are written in bold.
|Homologous structures to P04062|
|PDB ID||PIDE with target (%)||Length (aa)|
The PIDE to P04062 of the low-PIDE group selected in task 4 turned out to be too low, and the 10.74% PIDE with 2GEP_A was not enough for Swiss-Model to align the sequences (with either BLAST or HHsearch). We therefore went back to the hit list found in task 2 (HHblits, 2 iterations against Uniprot20 followed by one iteration against pdb_full with an E-value cutoff of 10E-10). The set of structures finally used is:
|Homologous structures to P04062|
|PDB ID||PIDE of alignment with target (%)||Length (aa)||Aligned columns (aa)||Query coverage|
We used the command-line executable to run Modeller. The necessary scripts for the alignments (pairwise and MSAs) and the modelling (with single and multiple templates) were prepared using this tutorial.
We executed Swiss-Model online using the 'Automatic Modelling Mode'. In the advanced options, the specific template was specified.
We also used I-TASSER via its web server. Under "Option I", a specific template was specified.
Evaluation of models
RMSD and GDT-score calculation
We used the TM-score tool from the Zhang lab to calculate the RMSD and the GDT score. We also looked at the TM-score. Execution:
TMscore <model> <native>
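To make the two reported numbers more concrete, the following Python sketch computes an RMSD and a GDT_TS-style score from a list of per-residue C_alpha distances between a superimposed model and the native structure. This is a simplification and not the TMscore program itself: the real tool searches for a separate optimal superposition per distance cutoff, whereas this sketch assumes one fixed superposition has already been done. The function names and the distances are hypothetical.

```python
import math

# Distance cutoffs (in Angstrom) used by the GDT_TS definition.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def rmsd(distances):
    """Root-mean-square deviation over the given pairwise distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt_ts(distances, n_native=None):
    """Mean fraction of residues within 1, 2, 4 and 8 Angstrom.

    n_native is the number of residues in the native structure; it
    defaults to the number of distances (assumption: all residues are
    aligned).
    """
    n = n_native if n_native is not None else len(distances)
    fractions = [
        sum(1 for d in distances if d <= cutoff) / n
        for cutoff in GDT_TS_CUTOFFS
    ]
    return sum(fractions) / len(fractions)


# Hypothetical distances for four aligned residues.
example = [0.5, 1.5, 3.0, 9.0]
print(rmsd(example), gdt_ts(example))
```

The example shows why the two scores can disagree: a single badly placed residue dominates the RMSD, while it only lowers each GDT fraction by one residue.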
C_alpha RMSD calculation
We visualized all created models with PyMOL. For this, we aligned each model with the reference structures, 1OGS_A and 2V3E_B, and calculated the RMSD between the corresponding C_alpha atoms as in task 4:
align <native> and resi 1-497 and name ca, <model> and resi <from>-<to> and name ca
where "native" is the reference structure (1OGS_A or 2V3E_B) and "model" is the name of the model, "from" is its first residue and "to" its last residue.
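Since the same command has to be repeated for every model and both references, the command lines can be generated rather than typed. The following Python sketch only builds the command strings in the form shown above; it does not call PyMOL. The helper name, the model names and their residue ranges are hypothetical placeholders.

```python
def build_align_command(native, model, first, last, native_range="1-497"):
    """Return the PyMOL align command for one model/reference pair."""
    return (
        f"align {native} and resi {native_range} and name ca, "
        f"{model} and resi {first}-{last} and name ca"
    )


# Hypothetical models with their first and last residue numbers.
models = [("model_a", 40, 536), ("model_b", 1, 497)]

for reference in ("1OGS_A", "2V3E_B"):
    for name, first, last in models:
        print(build_align_command(reference, name, first, last))
```

The printed lines can be pasted into the PyMOL command line or saved as a script file and run there.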
All atom RMSD calculation near the binding sites
We continued working in the PyMOL session in which the two references and the best model were aligned to each other (and represented as cartoons).
First, we selected the ligands and renamed the selections to:
- NAG_1OGS for the ligand of 1OGS_A
- ligands_2V3E for the ligands of 2V3E_B (3 NAG molecules, NAG, FUC and NND)
Second, we removed the waters from all structures.
Then, in each reference structure, we selected all atoms within 6 Angstrom of the respective ligands, excluding the atoms of the ligands themselves, with the following commands:
select 1OGS_within6ofNAG, 1OGS_A and (not NAG_1OGS) within 6 of NAG_1OGS
-> 42 atoms selected
select 2V3E_within6ofligands, 2V3E_B and (not ligands_2V3E) within 6 of ligands_2V3E
-> 174 atoms selected
The same was done for the model:
select model_within6ofNAG, P04062.model and (not NAG_1OGS) within 6 of NAG_1OGS
-> 40 atoms selected
select model_within6ofligands, P04062.model and (not ligands_2V3E) within 6 of ligands_2V3E
-> 178 atoms selected
(We displayed the selections as sticks and colored them differently.)
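The intended meaning of these distance selections can be sketched in plain Python: keep every atom of the structure that lies within the cutoff of at least one ligand atom. This is only an illustration of the idea and not PyMOL's implementation; the function name and coordinates are hypothetical, and treating a distance of exactly 6 Angstrom as inside the cutoff is an assumption.

```python
import math


def atoms_within(candidates, ligand_atoms, cutoff=6.0):
    """Return the candidate atoms within `cutoff` of any ligand atom.

    Atoms are (x, y, z) tuples in Angstrom. The candidates are assumed
    to exclude the ligand atoms already, as in the selections above.
    """
    selected = []
    for atom in candidates:
        if any(math.dist(atom, lig) <= cutoff for lig in ligand_atoms):
            selected.append(atom)
    return selected


# Hypothetical coordinates: one ligand atom at the origin.
ligand = [(0.0, 0.0, 0.0)]
protein = [(3.0, 0.0, 0.0), (6.1, 0.0, 0.0), (0.0, 4.0, 4.0)]
print(atoms_within(protein, ligand))
```

Because the model and the reference place their atoms slightly differently, the same cutoff around the same ligand can return different atom counts, as seen above (42 vs. 40 and 174 vs. 178 atoms).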
Next, we aligned all atoms in the respective selections and noted the RMS values: