Molecular Dynamics Simulations Gaucher Disease

From Bioinformatikpedia

In this task, we carried out molecular dynamics simulations on the LRZ supercomputer, which were analysed in the subsequent task. Simulating the motions of proteins under different conditions can give insights into their function and is used to study diseases that are caused by misfolded proteins. Here, we employed molecular dynamics simulations to investigate the impact of two point mutations in glycosylceramidase.

Input structures

We ran molecular dynamics simulations for three structures of glycosylceramidase: the wildtype 2nt0_A, which served as reference, and the two mutants W209R and L470P, which were generated by FoldX. W209R is a disease-causing mutation which might severely change the protein structure since the site is part of an alpha-helix and the positively charged arginine is likely to interfere with the hydrophobic environment. L470P is not listed in the HGMD and is therefore not known to be disease-causing. However, both the sequence-based mutation analysis and the structure-based mutation analysis suggested that this mutation might be deleterious. By simulating the structural changes caused by these mutations, we hope to get a better understanding of their impact on the phenotype.

Preparation

The input structures and output files are stored in a repository which can be checked out by:

git clone /mnt/home/student/angermue/mp/tasks/task08

We followed this description for preparing the molecular dynamics simulation:

  1. We checked out AGroS from https://github.com/offmarc/AGroS.
  2. We downloaded Scwrl4 from http://dunbrack.fccc.edu/scwrl4 and installed it into the same directory as AGroS.
  3. We prepared the job scripts.
  4. We submitted the scripts via sbatch SCRIPT-FILE on host lxa1, which runs the SLURM scheduler.

Each job performs a molecular dynamics simulation via AGroS.

The AGroS molecular dynamics simulation pipeline

AGroS is a program that controls the three major steps of a molecular dynamics simulation via GROMACS:

  1. Preparation
    • Cleaning the PDB
    • Adding missing side-chains
    • Adding the solvent: H2O + NaCl
  2. Equilibration
    • Minimizing the potential energy by varying
      • the solvent
      • the solvent and side-chains
    • Short NVT run (50 ps)
    • Short NPT run (50 ps)
  3. Production MD simulation
    • The main MD simulation (10 ns)

In the preparation steps, the structure is modified such that it is accepted by GROMACS. In the equilibration stage, the potential energy is minimized by slightly changing the positions of solvent molecules and side-chains. The short NVT and NPT runs bring the system into a state which meets certain input conditions, e.g. the starting temperature. In essence, equilibrating the system should avoid strong forces acting on particles at the beginning of the simulation. The equilibrated system is then used to carry out the main simulation.
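The stage durations listed above translate into integration step counts once a time step is fixed. The following minimal sketch works through that arithmetic. The time step of 2 fs is an assumption chosen for illustration; the actual value used by AGroS is not stated here.

```python
# Durations taken from the pipeline description above, in picoseconds.
STAGE_DURATION_PS = {
    "NVT equilibration": 50,
    "NPT equilibration": 50,
    "production MD": 10_000,  # 10 ns
}

# ASSUMPTION: an integration time step of 2 fs; not stated in the text.
ASSUMED_TIMESTEP_FS = 2


def steps_for(duration_ps, timestep_fs=ASSUMED_TIMESTEP_FS):
    """Return the number of integration steps needed to cover duration_ps."""
    steps, remainder = divmod(duration_ps * 1000, timestep_fs)  # 1 ps = 1000 fs
    if remainder:
        raise ValueError("duration is not a whole multiple of the time step")
    return steps


if __name__ == "__main__":
    for stage, duration_ps in STAGE_DURATION_PS.items():
        print(f"{stage}: {steps_for(duration_ps):,} steps")
```

Under this assumption, each 50 ps equilibration run corresponds to 25,000 steps and the 10 ns production run to 5,000,000 steps.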

Intermediate PDB files

AGroS creates several intermediate PDB files in the following order:

1. NAME.pdb The input structure
2. NAME_repair.pdb NAME.pdb with common PDB problems repaired by repairPDB
3. NAME_br.pdb Just the protein in NAME.pdb
4. NAME_water.pdb Just the water in NAME.pdb
5. NAME_dna.pdb Just the DNA in NAME.pdb
6. NAME_sc.pdb NAME.pdb with side-chains optimized by SCWRL
7. NAME_nh.pdb NAME_sc.pdb without hydrogen atoms with the water as well as DNA molecules of NAME.pdb included
8. NAME_solv.pdb NAME_nh.pdb with the solvent H2O and 0.1 NaCl included
9. NAME_solv_min.pdb NAME_solv.pdb with the potential energy of the solvent minimized
10. NAME_solv_min2.pdb NAME_solv_min.pdb with the potential energy of the solvent and side-chains minimized
11. NAME_solv_min3.pdb NAME_solv_min2.pdb with the potential energy of the solvent and side-chains minimized again
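To inspect these files after a run, it can help to generate the expected file names from the structure name and check which of them exist. The sketch below does this under two assumptions: all files are in one directory, and "empty" means a size of zero bytes. The helper names are hypothetical and not part of AGroS.

```python
from pathlib import Path

# Suffixes in the order of the list above ("" is the input structure).
INTERMEDIATE_SUFFIXES = [
    "", "_repair", "_br", "_water", "_dna", "_sc",
    "_nh", "_solv", "_solv_min", "_solv_min2", "_solv_min3",
]


def intermediate_files(name):
    """Return the expected intermediate PDB file names for a structure name."""
    return [f"{name}{suffix}.pdb" for suffix in INTERMEDIATE_SUFFIXES]


def file_status(name, directory="."):
    """Return (file name, status) pairs; status is 'missing', 'empty' or 'ok'.

    ASSUMPTION: 'empty' is interpreted as a file of zero bytes.
    """
    statuses = []
    for file_name in intermediate_files(name):
        path = Path(directory) / file_name
        if not path.exists():
            status = "missing"
        elif path.stat().st_size == 0:
            status = "empty"
        else:
            status = "ok"
        statuses.append((file_name, status))
    return statuses


if __name__ == "__main__":
    for file_name, status in file_status("2nt0"):
        print(f"{file_name}: {status}")
```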

Our input structure 2nt0 was not modified by repairPDB, so 2nt0.pdb and 2nt0_repair.pdb were identical. Since 2nt0 contained neither water nor DNA, 2nt0_water.pdb and 2nt0_dna.pdb were empty. The side-chain optimization by SCWRL led to slightly different secondary structure assignments (cf. <xr id="fig:pdb_repair_sc"/>). <xr id="fig:pdb_solv"/> depicts 2nt0_solv.pdb and the added Na+ and Cl- ions without water. The minimization of the solvent changed the positions of the Na+ and Cl- ions but not the protein (cf. <xr id="fig:pdb_solv_min"/>). The subsequent second minimization of the solvent and the side-chains did not alter the positions of the Na+ and Cl- ions any more. However, minimal deviations between the side-chains of 2nt0_solv_min.pdb and 2nt0_solv_min2.pdb could be observed (cf. <xr id="fig:pdb_min_min2"/>). <xr id="fig:pdb_min2_min3"/> shows that the third minimization run marginally changed some side-chains.

<figure id="fig:pdb_repair_sc">
2nt0_repair.pdb (green) vs. 2nt0_sc.pdb (blue).
</figure>
<figure id="fig:pdb_solv">
2nt0_solv.pdb (yellow) along with Na+ (blue) and Cl- (cyan).
</figure>
<figure id="fig:pdb_solv_min">
2nt0_solv.pdb (yellow) vs. 2nt0_solv_min.pdb (orange).
</figure>
<figure id="fig:pdb_min_min2">
2nt0_solv_min.pdb (orange) vs. 2nt0_solv_min2.pdb (red).
</figure>
<figure id="fig:pdb_min2_min3">
2nt0_solv_min2.pdb (red) vs. 2nt0_solv_min3.pdb (purple).
</figure>

Runtime

The jobs were queued for about 48h. The actual runtime is listed in <xr id="tab:runtime"/>. <figtable id="tab:runtime">

Wildtype 16h 51min
L470P 9h 36min
W209R 9h 35min

Runtime of the MD simulations. </figtable>
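The runtimes can be put on a comparable scale by converting them into simulated nanoseconds per day of wall-clock time. The sketch below does this for the 10 ns production run. It assumes that the listed runtime covers the whole job, so the figures are a lower bound for the throughput of the production run alone.

```python
import re

SIMULATED_NS = 10  # length of the production run stated above

# Runtimes from the table; labels follow the structure names used
# in the input structures section.
RUNTIMES = {
    "Wildtype": "16h 51min",
    "L470P": "9h 36min",
    "W209R": "9h 35min",
}


def parse_hours(text):
    """Convert a string such as '16h 51min' into fractional hours."""
    match = re.fullmatch(r"(\d+)h (\d+)min", text)
    if match is None:
        raise ValueError(f"unexpected runtime format: {text!r}")
    hours, minutes = map(int, match.groups())
    return hours + minutes / 60


def ns_per_day(runtime, simulated_ns=SIMULATED_NS):
    """Return simulated nanoseconds per 24 h of wall-clock time."""
    return simulated_ns * 24 / parse_hours(runtime)


if __name__ == "__main__":
    for structure, runtime in RUNTIMES.items():
        print(f"{structure}: {ns_per_day(runtime):.1f} ns/day")
```

This gives roughly 14 ns/day for the wildtype and about 25 ns/day for each mutant.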

Validation

We performed a coarse validation by looking at the output files and calling the tool gmxcheck -f 2nt0_A_md.xtc. According to the output, all simulations terminated successfully. All simulations produced 2001 frames.
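The frame count can be cross-checked against the simulation length with a short calculation. The sketch below assumes that the first frame is written at t = 0 and the last at the end of the 10 ns run. The output interval itself is not stated in the text and is derived here.

```python
def frame_spacing_ps(n_frames, simulated_ps):
    """Time between consecutive frames in ps.

    ASSUMPTION: the first frame is at t = 0 and the last at simulated_ps.
    """
    if n_frames < 2:
        raise ValueError("at least two frames are needed")
    return simulated_ps / (n_frames - 1)


def expected_frames(simulated_ps, spacing_ps):
    """Number of frames for a given output spacing, counting the frame at t = 0."""
    return int(simulated_ps // spacing_ps) + 1


if __name__ == "__main__":
    spacing = frame_spacing_ps(2001, 10_000)  # 10 ns = 10,000 ps
    print(f"frame spacing: {spacing} ps")
    print(f"expected frames: {expected_frames(10_000, spacing)}")
```

Under this assumption, 2001 frames over 10 ns correspond to one frame every 5 ps, which is consistent with a complete trajectory.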