Difference between revisions of "Gaucher Disease: Task 09 - Structure-based mutation analysis"

From Bioinformatikpedia

Revision as of 23:40, 29 September 2013

<css>

table.colBasic2 { margin-left: auto; margin-right: auto; border: 1px solid black; border-collapse:collapse; }

.colBasic2 th, .colBasic2 td { padding: 3px; border: 1px solid black; }

.colBasic2 td { text-align:left; }

/* for orange try #ff7f00 and #ffaa56 for blue try #005fbf and #aad4ff

maria's style blue: #adceff grey: #efefef

*/

.colBasic2 tr th { background-color:#efefef; color: black;} .colBasic2 tr:first-child th { background-color:#adceff; color:black;}

</css>

Preparation

1. Choice of a structure to work with

We chose the structure 2V3E, chain B, which has the following properties:

<figtable id="2V3E">

{|class="colBasic2"
! PDB-ID || Resolution (Å) || Chain || Covered residues (UniProt seq.) || Missing residues (ATOM seq.) || Covered residues (ATOM seq.) || R-Value (obs.) || R-Free || pH || Temperature (K)
|-
| 2V3E || 2.0 || A/B || 40-536 (92.7%) || A: 31, (498-503), B: (-1), (498-503) || A: -1-30, 32-497, B: 0-497 || 0.163 || 0.220 || 7.5 || 100
|}
<center><small>'''<caption>''' Properties of 2V3E, chain B, the chosen reference structure of GBA sequence P04062. </caption></small></center>

</figtable>

For more information about other candidates and the missing residues, see the lab journal.

2. Visualization of the mutations to work with

We selected the following five mutations from the set of mutations chosen for this task:

<figtable id="mutations">

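{|class="colBasic2"
! Reference || Codon Number (UniProt) || Codon Number (PDB) || Codon change || Amino Acid change || Polarity || Charge (pH) || Disease causing?
|-
| [http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=368786234 rs368786234]|| 77 || 38 || AG<span style="color:#FF0040">'''C'''</span> ⇒ AG<span style="color:#FF0040">'''A'''</span> || Ser ⇒ Arg (S77R) || polar ⇒ polar || neutral ⇒ positive || FALSE
|-
| [http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=374003673 rs374003673]|| 141 || 102 || A<span style="color:#FF0040">'''A'''</span>T ⇒ A<span style="color:#FF0040">'''G'''</span>T || Asn ⇒ Ser (N141S) || polar ⇒ polar || neutral ⇒ neutral || FALSE
|-
| CM992894|| 241 || 202 || G<span style="color:#FF0040">'''G'''</span>A ⇒ G<span style="color:#FF0040">'''A'''</span>A || Gly ⇒ Glu (G241E) || nonpolar ⇒ polar || neutral ⇒ negative || TRUE
|-
| CM880036|| 409 || 370 || A<span style="color:#FF0040">'''A'''</span>C ⇒ A<span style="color:#FF0040">'''G'''</span>C || Asn ⇒ Ser (N409S) || polar ⇒ polar || neutral ⇒ neutral || TRUE
|-
| CM870010|| 483 || 444 || C<span style="color:#FF0040">'''T'''</span>G ⇒ C<span style="color:#FF0040">'''C'''</span>G || Leu ⇒ Pro (L483P) || nonpolar ⇒ nonpolar || neutral ⇒ neutral || TRUE
|}
<center><small>'''<caption>''' Selected mutations of GBA sequence P04062. Mapping of the UniProt positions onto the PDB ATOM sequence is given. </caption></small></center>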

</figtable>
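The mapping in the table can be reproduced with a few lines of Python. The following is only a minimal sketch: the constant offset of 39 between UniProt and PDB ATOM numbering is read off the five rows above and is assumed to hold only for these positions, and the names `uniprot_to_pdb` and `describe` are hypothetical helpers introduced for illustration. The codon dictionary is the subset of the standard genetic code needed for the listed codons.

```python
# Offset between UniProt (P04062) and PDB ATOM numbering, read off the table:
# every listed pair differs by 39 (e.g. 77 -> 38). Assumption: constant for
# the five positions used here; gaps elsewhere are not handled.
UNIPROT_TO_PDB_OFFSET = 39

# Subset of the standard genetic code covering the codons in the table.
CODON_TABLE = {
    "AGC": "Ser", "AGA": "Arg",
    "AAT": "Asn", "AGT": "Ser",
    "GGA": "Gly", "GAA": "Glu",
    "AAC": "Asn",
    "CTG": "Leu", "CCG": "Pro",
}

# Three-letter to one-letter codes for the amino acids involved.
ONE_LETTER = {
    "Ser": "S", "Arg": "R", "Asn": "N", "Gly": "G",
    "Glu": "E", "Leu": "L", "Pro": "P",
}

# (UniProt position, wild-type codon, mutant codon) as listed in the table.
MUTATIONS = [
    (77, "AGC", "AGA"),
    (141, "AAT", "AGT"),
    (241, "GGA", "GAA"),
    (409, "AAC", "AGC"),
    (483, "CTG", "CCG"),
]


def uniprot_to_pdb(position):
    """Convert a UniProt residue number to the PDB ATOM residue number."""
    return position - UNIPROT_TO_PDB_OFFSET


def describe(position, wt_codon, mut_codon):
    """Return the mutation label (e.g. 'S77R') and the PDB residue number."""
    wt = CODON_TABLE[wt_codon]
    mut = CODON_TABLE[mut_codon]
    label = f"{ONE_LETTER[wt]}{position}{ONE_LETTER[mut]}"
    return label, uniprot_to_pdb(position)


for entry in MUTATIONS:
    label, pdb_position = describe(*entry)
    print(f"{label}: PDB residue {pdb_position}")
```

Translating the codons independently of the amino acid column is a cheap consistency check on the table before the residues are mutated in the structure.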

The following figure visualizes the five residues we are going to mutate on the reference structure, 2V3E, chain B.

<figure id="pymol_mutations" >

PyMol visualization of five residues of the structure 2V3E, chain B, we are going to mutate as described in Table 2. </figure>

As the image shows, none of the residues to be mutated lies close to any of the three binding sites. (See also the second subfigures of the figure galleries in the next subsection for close-ups of the residues and their hydrogen bonds.) However, four of the residues lie within a secondary structure element (beta sheet or helix), and one, Glycine 241, lies in a turn near a helix. Replacing these residues with residues that differ in functional groups, polarity, or charge could therefore break hydrogen bonds within or between secondary structure elements (e.g. at Asparagine 141). This might cause structural changes, up to the disruption of individual secondary structure elements or of important blocks of them. Moreover, a larger side chain might clash with neighbouring residues (e.g. with the loop near Serine 77).

3. Creation of mutated structures

We used SCWRL4 to create the five mutated structures (see the lab journal). The following figures show the mutated residues in comparison to the native residues, the hydrogen bonds of the mutants, and possible clashes.

S77R

<figure id="S77R" >

PyMol visualization of the mutation S77R (PDB 38) of the structure 2V3E, chain B, created with SCWRL. </figure>

Based on these images, we would likewise predict that the mutation S77R is not disease causing, in agreement with its annotation.

N141S

<figure id="N141S" >

PyMol visualization of the mutation N141S (PDB 102) of the structure 2V3E, chain B, created with SCWRL. </figure>

The mutation N141S is benign, so the stabilizing contact between the two helices that is lost in the mutant is probably not essential.

G241E

<figure id="G241E">

PyMol visualization of the mutation G241E (PDB 202) of the structure 2V3E, chain B, created with SCWRL. </figure>

Since no changes in the hydrogen bonds and no clashes can be seen, we would predict the mutation G241E to be benign. However, it is a disease-causing mutation. The effect may be caused by the polarity and negative charge of the mutant Glutamate, which replaces the nonpolar, neutral Glycine.

N409S

<figure id="N409S" >

PyMol visualization of the mutation N409S (PDB 370) of the structure 2V3E, chain B, created with SCWRL. </figure>

Because all hydrogen bonds are conserved in the mutant and no clashes could be detected, we would predict that the mutation N409S has no effect. Moreover, both amino acids, Asparagine and Serine, are polar and neutral. Nevertheless, this mutation is annotated as disease causing.

L483P

<figure id="L483P" >

PyMol visualization of the mutation L483P (PDB 444) of the structure 2V3E, chain B, created with SCWRL. </figure>

We would predict that the mutation L483P has an effect, which agrees with its annotation as disease causing.

Energy comparisons

Lab journal

foldX

We applied foldX to the five mutations to predict the mutant structures and their energies.

<figtable id="scwrl_foldx">

{|class="colBasic2"
! rowspan="2" | Mutant
! colspan="3" | SCWRL
! colspan="4" | FoldX
! rowspan="2" | Observed effect
|-
! Energy || Mutant-WT || Predicted effect || Energy Mutant || Energy WT || Mutant-WT || Predicted effect
|-
| WT || 386.356 || || || || || || ||
|-
| S77R || 386.706 || 0.350 || FALSE || 20.90 || 23.25 || -2.35 || FALSE || FALSE
|-
| N141S || 390.966 || 4.610 || TRUE || 28.81 || 28.84 || -0.03 || FALSE || FALSE
|-
| G241E || 394.319 || 7.963 || TRUE || 32.30 || 29.31 || 2.99 || TRUE || TRUE
|-
| N409S || 391.73 || 5.374 || TRUE || 29.08 || 27.77 || 1.31 || TRUE || TRUE
|-
| L483P || 424.715 || 38.359 || TRUE || 38.40 || 35.55 || 2.85 || TRUE || TRUE
|}
<center><small>'''<caption>''' Energies of the WT structure and the five mutant structures calculated by SCWRL and foldX. FoldX minimizes the energy of the WT structure separately for each mutation. The energy difference between each mutant and the respective WT structure was calculated. If it is (significantly) positive, the mutation is regarded as effect causing. The observed effect of the mutation is given for comparison. </caption></small></center>

</figtable>

We decided to regard every difference between mutant and WT energy greater than 1 as significant. SCWRL identifies the three disease-causing mutations as such. However, it also predicts one mutation that is not disease causing (N141S) as effect causing, though with the smallest difference above the threshold (4.61). Interestingly, the most prevalent Gaucher disease mutation according to OMIM, L483P (L444P in OMIM), has the highest energy change, 38.359. FoldX makes a correct prediction in all cases; the two benign mutations even have a small negative energy change. However, the disease-causing mutations all show only a small energy increase (1.31 - 2.99). In contrast to SCWRL, foldX does not assign the highest energy increase to L483P.
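The threshold rule described above can be written down as a short Python sketch. The energies are copied from the table; the cut-off of 1 is the one chosen in the text, while the names `predicts_effect` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
# Energy difference (mutant - WT) above which a mutation is called
# effect causing; the value 1 is the cut-off chosen in the text.
THRESHOLD = 1.0

# SCWRL uses a single WT energy; foldX reports one WT energy per mutation.
SCWRL_WT_ENERGY = 386.356

# mutation: (SCWRL mutant energy, foldX mutant energy, foldX WT energy,
#            observed effect), copied from the table.
ENERGIES = {
    "S77R": (386.706, 20.90, 23.25, False),
    "N141S": (390.966, 28.81, 28.84, False),
    "G241E": (394.319, 32.30, 29.31, True),
    "N409S": (391.73, 29.08, 27.77, True),
    "L483P": (424.715, 38.40, 35.55, True),
}


def predicts_effect(mutant_energy, wt_energy, threshold=THRESHOLD):
    """Return True if the energy increase exceeds the threshold."""
    return (mutant_energy - wt_energy) > threshold


def evaluate():
    """Classify every mutation with both methods and keep the annotation."""
    results = {}
    for name, (scwrl, foldx_mut, foldx_wt, observed) in ENERGIES.items():
        results[name] = {
            "scwrl": predicts_effect(scwrl, SCWRL_WT_ENERGY),
            "foldx": predicts_effect(foldx_mut, foldx_wt),
            "observed": observed,
        }
    return results


results = evaluate()
for method in ("scwrl", "foldx"):
    correct = sum(r[method] == r["observed"] for r in results.values())
    print(f"{method}: {correct} of {len(results)} predictions match the annotation")
```

With only five mutations, these counts illustrate the rule and are not a benchmark of the two programs.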

Next, we superimposed the SCWRL and foldX structures in Pymol and compared them in the following figures.

<figure id="scwrl_vs_foldx" >

Comparison between the mutants created with SCWRL (gray protein, orange mutated residue, yellow polar contacts) and foldX (lime protein, cyan mutated residue, dark cyan polar contacts). </figure>

Minimise

We applied minimise to the WT structure and to all mutant structures produced by SCWRL and foldX. For each structure we ran minimise five times in a row, using the output of each run as the input of the next. <xr id="minimise"/> summarizes the resulting energies:


<figtable id="minimise">

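{|class="colBasic2"
! Method || Mutation || Round 1 || Round 2 || Round 3 || Round 4 || Round 5
|-
| - || WT || style="background-color:gold" | -12360.0 || -11661.9 || -12079.7 || -11636.9 || -11630.6
|-
| rowspan="5" | SCWRL || S77R || -3482.3 || style="background-color:gold" | -12111.1 || -11724.2 || -11509.3 || -11260.2
|-
| N141S || -3593.4 || style="background-color:gold" | -12199.7 || -11787.7 || -11568.1 || -11328.3
|-
| G241E || -3548.4 || style="background-color:gold" | -12163.9 || -11792.4 || -11593.1 || -11381.2
|-
| N409S || -3631.0 || style="background-color:gold" | -12214.2 || -11801.9 || -11594.4 || -11347.0
|-
| L483P || -3469.4 || style="background-color:gold" | -11900.1 || -11610.9 || -11665.8 || -11330.6
|-
| rowspan="5" | foldX || S77R || style="background-color:gold" | -12297.0 || -11892.1 || -12037.2 || -11707.4 || -11703.1
|-
| N141S || style="background-color:gold" | -12359.7 || -11961.2 || -11983.4 || -11550.9 || -11515.9
|-
| G241E || style="background-color:gold" | -12356.6 || -11944.8 || -12103.6 || -11782.5 || -11764.0
|-
| N409S || style="background-color:gold" | -12389.9 || -11966.0 || -12117.7 || -11845.4 || -11808.9
|-
| L483P || style="background-color:gold" | -12335.4 || -11925.2 || -12096.9 || -11771.4 || -11771.3
|}
<center><small>'''<caption>''' Energies produced by each of five consecutive runs of the program minimise, applied to the WT structure and the five mutant structures calculated by SCWRL and foldX. The lowest energy for each structure is marked gold. </caption></small></center>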

</figtable>

For the WT, the lowest energy is reached after the first iteration of minimise, presumably because the experimental structure is already close to optimal. Interestingly, the lowest energy of the SCWRL mutants is reached after two minimise iterations, whereas the foldX mutants reach their minimal energy already after the first iteration. This may be explained by the fact that foldX itself already performs a minimisation step. Moreover, the minimal energies of the foldX mutants are always lower than those of the SCWRL mutants.
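The round in which each structure reaches its lowest energy can be read from the table programmatically. The following Python sketch copies the energies from the table; the function name `best_round` is a hypothetical helper chosen for this example.

```python
# Energies of the five consecutive minimise rounds, copied from the table.
# Keys are (method, structure); list index 0 corresponds to round 1.
ENERGIES = {
    ("-", "WT"): [-12360.0, -11661.9, -12079.7, -11636.9, -11630.6],
    ("SCWRL", "S77R"): [-3482.3, -12111.1, -11724.2, -11509.3, -11260.2],
    ("SCWRL", "N141S"): [-3593.4, -12199.7, -11787.7, -11568.1, -11328.3],
    ("SCWRL", "G241E"): [-3548.4, -12163.9, -11792.4, -11593.1, -11381.2],
    ("SCWRL", "N409S"): [-3631.0, -12214.2, -11801.9, -11594.4, -11347.0],
    ("SCWRL", "L483P"): [-3469.4, -11900.1, -11610.9, -11665.8, -11330.6],
    ("foldX", "S77R"): [-12297.0, -11892.1, -12037.2, -11707.4, -11703.1],
    ("foldX", "N141S"): [-12359.7, -11961.2, -11983.4, -11550.9, -11515.9],
    ("foldX", "G241E"): [-12356.6, -11944.8, -12103.6, -11782.5, -11764.0],
    ("foldX", "N409S"): [-12389.9, -11966.0, -12117.7, -11845.4, -11808.9],
    ("foldX", "L483P"): [-12335.4, -11925.2, -12096.9, -11771.4, -11771.3],
}


def best_round(energies):
    """Return (round number starting at 1, energy) of the lowest energy."""
    index = min(range(len(energies)), key=energies.__getitem__)
    return index + 1, energies[index]


for (method, structure), energies in ENERGIES.items():
    round_number, energy = best_round(energies)
    print(f"{method:5s} {structure:6s} lowest energy {energy} in round {round_number}")
```

Keeping all five rounds per structure, and not only the minimum, also makes it easy to inspect how the energy develops after the best round.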

Another interesting observation concerns the runs after the lowest energy has been reached. For the SCWRL mutants, the minimum is reached in the second minimise iteration and the energy then rises in the subsequent runs (only L483P shows a small dip in the fourth round). This probably means that a local energy minimum is reached in the second iteration and that further gradient steps from this point can only lead to a higher energy value, perhaps to another local minimum with a higher energy. The WT and the foldX mutants show a different pattern: after the local minimum is reached in the first iteration, the energy rises in the second iteration and falls a little again in the third, though not back to the level of the first iteration. This indicates that a second, "suboptimal" local minimum is found. In the fourth and fifth iterations the energy rises again.

We compared the mutant structures in Pymol again, using for each method the minimise round with the best energy: the second round for SCWRL and the first round for foldX (<xr id="minimise"/>).

<figure id="minimise" >

Comparison between the mutants created with SCWRL after two minimise iterations (gray protein, orange mutated residue) and foldX after one minimise iteration (lime protein, cyan mutated residue). There is almost no difference in the conformations of all the mutations. </figure>

To conclude, SCWRL and foldX are both suitable tools for estimating the energy difference and the structural change of a protein after a point mutation. The structures produced by the two programs barely differ; in particular, the predicted conformations of the mutated residues and their environments are very similar.