Difference between revisions of "Structure-based mutation analysis ARSA"

From Bioinformatikpedia
Latest revision as of 18:34, 23 January 2012

Preparation

For the upcoming analyses in this TASK, a structure fulfilling the following requirements should be selected:

  • The resolution (in Å) should be sufficiently high, i.e. have a small value.
  • The structure should ideally have been resolved at physiological pH (7.4).
  • It should have a small R-factor. The R-factor measures how well the refined model agrees with the experimental diffraction data and thus indicates the reliability of a crystal structure.

The PDB contains the following nine entries for ARSA:


ID | exp. method | resolution in Å | positions | R-factor
1AUK | X-ray | 2.10 | 19-507 | 0.248
1E1Z | X-ray | 2.40 | 19-507 | 0.196
1E2S | X-ray | 2.35 | 19-507 | 0.194
1E33 | X-ray | 2.50 | 19-507 | 0.187
1E3C | X-ray | 2.65 | 19-507 | 0.174
1N2K | X-ray | 2.75 | 19-507 | 0.202
1N2L | X-ray | 3.20 | 19-507 | 0.182
2AIJ | X-ray | 1.55 | 69-73 | 0.146
2AIK | X-ray | 1.73 | 68-74 | 0.142


The structures 2AIJ and 2AIK can be eliminated immediately, as they resolve only a very small part of the enzyme (residues 69-73 and 68-74). Structures 1E1Z, 1E2S, 1E33, 1E3C, 1N2K and 1N2L have lower R-factors than 1AUK, at resolutions between 2.35 and 3.20 Å. However, these are all mutant structures and therefore not applicable to our analysis, as we need the wild-type structure.
The only structure left is 1AUK, which we also used in previous TASKs. Its resolution of 2.10 Å is the best among the full-length entries, and its R-factor of 0.248 is acceptable. As we already noticed in previous TASKs, this structure unfortunately has six missing residues in the middle of the protein, so we could not use the original structure for the subsequent analyses. It was therefore first modified with programs written by Marc Offman that model the missing residues. We use this completed model for the subsequent analyses.
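
The selection logic described above can be written down as a small filter. This is only a sketch: the entries are copied from the table, the set of mutant structures is taken from the statement in the text, and the coverage threshold `MIN_COVERED_RESIDUES` is a hypothetical value chosen here to separate full-length entries from short fragments.

```python
# (ID, resolution in Å, first residue, last residue, R-factor), copied from the table above
ENTRIES = [
    ("1AUK", 2.10, 19, 507, 0.248),
    ("1E1Z", 2.40, 19, 507, 0.196),
    ("1E2S", 2.35, 19, 507, 0.194),
    ("1E33", 2.50, 19, 507, 0.187),
    ("1E3C", 2.65, 19, 507, 0.174),
    ("1N2K", 2.75, 19, 507, 0.202),
    ("1N2L", 3.20, 19, 507, 0.182),
    ("2AIJ", 1.55, 69, 73, 0.146),
    ("2AIK", 1.73, 68, 74, 0.142),
]

# According to the text, these full-length entries are mutant structures.
MUTANT_IDS = {"1E1Z", "1E2S", "1E33", "1E3C", "1N2K", "1N2L"}

# Hypothetical threshold: anything shorter is treated as a fragment.
MIN_COVERED_RESIDUES = 400


def candidate_structures(entries, mutant_ids, min_residues):
    """Keep wild-type entries that cover enough residues, best resolution first."""
    kept = []
    for pdb_id, resolution, start, end, r_factor in entries:
        if end - start + 1 < min_residues:
            continue  # fragment, e.g. 2AIJ and 2AIK
        if pdb_id in mutant_ids:
            continue  # mutant structure
        kept.append((pdb_id, resolution, r_factor))
    return sorted(kept, key=lambda entry: (entry[1], entry[2]))


print(candidate_structures(ENTRIES, MUTANT_IDS, MIN_COVERED_RESIDUES))
```

With the table values, only 1AUK passes both filters.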

Visualization with PyMOL

The following image shows a PyMOL visualization of arylsulfatase A (ARSA), together with all known active and binding sites.

PyMOL visualization of ARSA (with closed gaps). The active site is depicted in yellow, the metal-binding site in blue, substrate binding sites in green and missense mutations in red.

One can see that the mutations are spread throughout the protein. Some lie near functional sites, others are very distant from them. The table below again shows PyMOL visualizations of all mutations, but each one separately. With this, we want to investigate how the location of a mutation relative to the functional sites correlates with its effect on protein function.

Nr. | mutation | position | PyMOL image | Description | Effect on function
1 | Asp-Asn | 29 | Arsa29.png | The mutation is located at the position of a metal-binding site. | harmful
2 | Pro-Ala | 136 | Arsa136.png | The mutation is located near the active site and a substrate binding site in sequence as well as in structure. | harmful
3 | Gln-His | 153 | Arsa153.png | The mutation is located near the active site and a substrate binding site in sequence as well as in structure. | harmful
4 | Trp-Cys | 193 | Arsa193.png | The mutation is at moderate distance to all important functional sites of the protein. | neutral
5 | Thr-Met | 274 | Arsa274.png | The mutation is located very distant from the active site and a substrate binding site in sequence as well as in structure. It is located within a beta sheet. | harmful
6 | Phe-Val | 356 | Arsa356.png | The mutation is at moderate distance to all important functional sites of the protein. | neutral
7 | Thr-Ile | 409 | Arsa409.png | The mutation is not close to important functional sites. | harmful
8 | Asn-Ser | 440 | Arsa440.png | The mutation is very distant from all functional sites. | neutral
9 | Cys-Gly | 489 | Arsa489.png | The mutation is very distant from all functional sites. | harmful
10 | Arg-His | 496 | Arsa496.png | The mutation is very distant from all functional sites. | harmful

The visualizations above indicate that mutations near functionally important sites of the protein are likely to be harmful. For distant mutations, however, no trend can be observed.

SCWRL

First, we extracted the amino acid sequence from our pdb file and converted it to lower case.


repairPDB arsa_model.pdb -seq > arsa.model.seq
tr '[:upper:]' '[:lower:]' < arsa.model.seq > arsa.model.lower.seq

Next, we wrote one sequence file per mutation, with the mutated residue entered as a capital letter, and executed SCWRL with the following command:


scwrl cmd
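
Writing the per-mutation sequences could be sketched as follows. The function name, the `first_residue` offset and the toy sequence are assumptions for illustration only; the sketch relies on the SCWRL convention that lower-case residues keep their input side-chain conformation while upper-case residues are rebuilt.

```python
def mutant_sequence(wt_seq, position, wt_aa, mut_aa, first_residue=1):
    """Return the lower-case sequence with one mutated residue in upper case.

    position is the residue number of the mutation; first_residue is the
    residue number of the first letter in wt_seq (hypothetical parameter,
    needed when the model does not start at residue 1).
    """
    seq = wt_seq.lower()
    idx = position - first_residue
    if not 0 <= idx < len(seq):
        raise ValueError("position %d lies outside the sequence" % position)
    if seq[idx] != wt_aa.lower():
        raise ValueError(
            "expected %s at position %d, found %s"
            % (wt_aa.upper(), position, seq[idx].upper())
        )
    return seq[:idx] + mut_aa.upper() + seq[idx + 1:]


# Toy example, not the ARSA sequence: mutate D at position 2 to N.
print(mutant_sequence("MDPQ", 2, "D", "N"))
```

Checking the wild-type residue before substituting catches numbering offsets early, which matters here because the model does not start at residue 1.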

We also ran SCWRL on the wild type (wt) structure in order to make it comparable to energy predictions by other programs. The minimal energy of the graph for the wt is 415.134.

Nr. | mutation | position | Reference amino acid | mutated amino acid | both (without H-bonds) | Minimal energy | Energy(mutant)/Energy(wt)
1 | Asp-Asn | 29 | Arsa hydro29.png | Arsa scwrl hydro29.png | Arsa both29.png | 419.996 | 1.011712
2 | Pro-Ala | 136 | Arsa hydro136.png | Arsa scwrl hydro136.png | Arsa both136.png | 415.133 | 0.9999976
3 | Gln-His | 153 | Arsa hydro153.png | Arsa scwrl hydro153.png | Arsa both153.png | 420.494 | 1.012911
4 | Trp-Cys | 193 | Arsa hydro193.png | Arsa scwrl hydro193.png | Arsa both193.png | 416.252 | 1.002693
5 | Thr-Met | 274 | Arsa hydro274.png | Arsa scwrl hydro274.png | Arsa both274.png | 434.014 | 1.045479
6 | Phe-Val | 356 | Arsa hydro356.png | Arsa scwrl hydro356.png | Arsa both356.png | 415.134 | 1
7 | Thr-Ile | 409 | Arsa hydro409.png | Arsa scwrl hydro409.png | Arsa both409.png | 414.481 | 0.998427
8 | Asn-Ser | 440 | Arsa hydro440.png | Arsa scwrl hydro440.png | Arsa both440.png | 418.863 | 1.008983
9 | Cys-Gly | 489 | Arsa hydro489.png | Arsa scwrl hydro489.png | Arsa both489.png | 415.136 | 1.000005
10 | Arg-His | 496 | Arsa hydro496.png | Arsa scwrl hydro496.png | Arsa both496.png | 421.011 | 1.014157

FoldX

wt: 510.88

Nr. | mutation | position | Minimal energy | Energy(mutant)/Energy(wt)
1 | Asp-Asn | 29 | 496.88 | 0.9725963
2 | Pro-Ala | 136 | 493.81 | 0.966587
3 | Gln-His | 153 | 493.90 | 0.9667632
4 | Trp-Cys | 193 | 496.23 | 0.971324
5 | Thr-Met | 274 | 503.34 | 0.9852412
6 | Phe-Val | 356 | 495.39 | 0.9696798
7 | Thr-Ile | 409 | 495.45 | 0.9697972
8 | Asn-Ser | 440 | 496.79 | 0.9724201
9 | Cys-Gly | 489 | 495.87 | 0.9706193
10 | Arg-His | 496 | 498.75 | 0.9762567

Minimise

wt: -3839.492677

Nr. | mutation | position | Free energy | Energy(mutant)/Energy(wt)
1 | Asp-Asn | 29 | -4174.487222 | 1.087250
2 | Pro-Ala | 136 | -4164.523707 | 1.084655
3 | Gln-His | 153 | -4109.12924 | 1.070227
4 | Trp-Cys | 193 | -4169.617285 | 1.085981
5 | Thr-Met | 274 | -4065.730562 | 1.058924
6 | Phe-Val | 356 | -4109.021939 | 1.070199
7 | Thr-Ile | 409 | -4121.130656 | 1.073353
8 | Asn-Ser | 440 | -4120.275589 | 1.073130
9 | Cys-Gly | 489 | -4127.969116 | 1.075134
10 | Arg-His | 496 | -4120.151911 | 1.073098
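
The Energy(mutant)/Energy(wt) columns of the SCWRL, FoldX and Minimise tables can be reproduced with a few lines. The sketch below uses the wild-type values and the mutation 1 values from the tables; the function names are illustrative. It also prints the plain difference, because the Minimise energies are negative: there a ratio above 1 means a more negative mutant energy, so ratios are not directly comparable across the three tools.

```python
# Wild-type energies as reported for each tool
WILD_TYPE = {"SCWRL": 415.134, "FoldX": 510.88, "Minimise": -3839.492677}

# Energies of mutation 1 (Asp-Asn, position 29) from the three tables
MUTATION_1 = {"SCWRL": 419.996, "FoldX": 496.88, "Minimise": -4174.487222}


def energy_ratio(mutant, wild_type):
    """Energy(mutant) / Energy(wt), as listed in the tables."""
    return mutant / wild_type


def energy_difference(mutant, wild_type):
    """Energy(mutant) - Energy(wt); its sign does not depend on the sign of wt."""
    return mutant - wild_type


for tool, wt_energy in WILD_TYPE.items():
    mutant_energy = MUTATION_1[tool]
    print(
        tool,
        round(energy_ratio(mutant_energy, wt_energy), 6),
        round(energy_difference(mutant_energy, wt_energy), 3),
    )
```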


Gromacs

GROMACS is an abbreviation for 'GROningen MAchine for Chemical Simulations' and is a package for performing molecular dynamics simulations. It is free software, available under the GNU General Public License.

Workflow

1. Use fetchpdb to get the pdb structure – look at the script, what does it do?

-> We did not use fetchpdb, as we already had the required PDB file. The script simply downloads the PDB file for a given PDB ID from the PDB website.


2. Use repairPDB to clean the PDB and extract the protein only - describe what options you chose and what other options are available. Make sure you choose the right chain.

-> The following options are available:
 -offset value     offset the residue numbering
 -chain char       change Chain ID
 -ratom            renumber Atoms
 -rres             renumber Residues
 -noh              remove hydrogens
 -het              do not change HETATM to ATOM for AA
 -seq              protein sequence from AA
 -seqrs            protein sequence from SEQRES entries
 -nosol            just Protein OR
 -ssw cutoff       print only waters with B-value below cutoff OR
 -cleansol         remove overlapping solvent for GROMACS


We ran repairPDB with the following command:

repairPDB ARSA.pdb -noh -nosol > ARSA_clean.pdb
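
For readers without access to repairPDB, the effect we wanted from this call (protein atoms only, no hydrogens) can be approximated as below. This is not the repairPDB implementation, only a sketch of the intended filtering; the function name is hypothetical, and hydrogens are recognised from the element column when present, otherwise from the atom name.

```python
def clean_pdb_lines(lines):
    """Keep ATOM records of non-hydrogen atoms; drop HETATM and everything else."""
    kept = []
    for line in lines:
        if line[:6].strip() != "ATOM":
            continue  # removes HETATM (water, ligands) and header records
        element = line[76:78].strip().upper()  # element symbol, columns 77-78
        if not element:
            # Fall back to the atom name (columns 13-16), ignoring leading digits
            name = line[12:16].strip().lstrip("0123456789")
            element = name[:1].upper()
        if element == "H":
            continue  # removes hydrogens
        kept.append(line)
    return kept


example = [
    "ATOM      1  N   ALA A  19",
    "ATOM      2  H   ALA A  19",
    "ATOM      3 1HB  ALA A  19",
    "HETATM    4  O   HOH A 601",
    "ATOM      5  CA  ALA A  19",
]
for line in clean_pdb_lines(example):
    print(line)
```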


3. Run SCWRL with the lowercase protein sequence to make sure there are no missing sidechains. The sequence can be extracted using repairPDB. After SCWRL you will have to remove the hydrogens again.

We ran SCWRL with the following command:

scwrl -i ARSA.pdb -s extractedPDB.seq -o ARSA_scwrl.pdb

SCWRL returned a PDB file that still contains HETATM records. These solvent atoms had to be removed before continuing.


4. Use the GROMACS command "pdb2gmx": the option -f defines the input structure, -o the GROMACS output file (.gro) and -p the topology output file (.top). IMPORTANT: the name of the input file must end with "pdb". Choose a force field and, for water, the TIP3P model.

-> We chose the AMBER03 force field, as it is parameterized for proteins.


5. Create a MDP file with the following content

title = PBSA minimization in vacuum
cpp = /usr/bin/cpp
define = -DFLEXIBLE -DPOSRES
implicit_solvent = GBSA
integrator = steep
emtol = 1.0
nsteps = 500
nstenergy = 1
energygrps = System
ns_type = grid
coulombtype = cut-off
rcoulomb = 1.0
rvdw = 1.0
constraints = none
pbc = no

Give a brief description of the different keywords used in this file - for this you should use the GROMACS manual.

-> -DFLEXIBLE: the water model in the topology is treated as flexible instead of rigid
-> -DPOSRES: position restraints, defined in posre.itp, are included in the topology
-> implicit_solvent = GBSA: defines the implicit solvent model, here the Generalized Born formalism is used
-> integrator = steep: defines the algorithm for energy minimization, here the steepest descent algorithm is used
-> emtol: convergence criterion; minimization is stopped when the maximum force is smaller than this value (in kJ mol^-1 nm^-1)
-> nsteps: defines the maximum number of steps in the energy minimization
-> nstenergy: interval (in steps) at which energies are written to the energy file
-> energygrps: group(s) to write to the energy file
-> ns_type: defines the type of neighbor searching; here a grid is built and only atoms in neighboring grid cells are checked when a new neighbor list is constructed
-> coulombtype = cut-off: twin-range cut-off with neighbor-list cut-off rlist and Coulomb cut-off rcoulomb, where rcoulomb ≥ rlist
-> rcoulomb: distance for the Coulomb cut-off
-> rvdw: distance for the LJ or Buckingham cut-off
-> constraints = none: no constraints are applied besides the ones defined in the topology file
-> pbc = no: use no periodic boundary conditions, ignore the box
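
To illustrate how integrator = steep, emtol and nsteps interact, here is a toy steepest-descent minimizer on a harmonic potential. It is a sketch and not GROMACS code: the potential, the force constant `K` and the start coordinates are made up, and the step-size rule (enlarge the step by 1.2 after an accepted step, halve it after a rejected one) follows our reading of the GROMACS manual.

```python
K = 100.0  # hypothetical force constant of the toy potential


def energy(x):
    """Harmonic toy potential V = 0.5 * K * |x|^2."""
    return 0.5 * K * sum(c * c for c in x)


def force(x):
    """Force is the negative gradient of the potential."""
    return [-K * c for c in x]


def steepest_descent(x, emtol=1.0, nsteps=500, h=0.01):
    """Minimize until the largest force component drops below emtol or nsteps is reached."""
    e = energy(x)
    for step in range(nsteps):
        f = force(x)
        fmax = max(abs(c) for c in f)
        if fmax < emtol:
            return x, step, True  # converged
        # Move along the force; the largest component is displaced by h
        trial = [xi + fi / fmax * h for xi, fi in zip(x, f)]
        e_trial = energy(trial)
        if e_trial < e:
            x, e = trial, e_trial  # accept the step and be bolder next time
            h *= 1.2
        else:
            h *= 0.5  # reject the step and try a smaller displacement
    converged = max(abs(c) for c in force(x)) < emtol
    return x, nsteps, converged


final_x, steps_used, converged = steepest_descent([1.0, -2.0])
print(steps_used, converged, max(abs(c) for c in force(final_x)))
```

With emtol = 1.0 the run stops as soon as every force component is below 1.0, well before the 500-step limit would be reached on this toy problem.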


6. Use grompp to prepare the system for GROMACS: grompp -v -f FILE.mdp -c FILE.gro -p FILE.top -o FILE.tpr. FILE.tpr is the system file that is created here and used in the next step.

-> commandline: grompp -v -f gromacs_AMBER03.mdp -c gromacs_AMBER03.gro -p gromacs_AMBER03.top -o gromacs_AMBER03.tpr


7. Now we minimize the system: mdrun -v -deffnm FILE

-> commandline: mdrun -v -deffnm gromacs_AMBER03
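
Since steps 4, 6 and 7 have to be repeated for the wild type and every mutant, the three calls can be assembled in a small helper. This is a sketch under assumptions: the input file name `ARSA_scwrl_clean.pdb` is hypothetical, and the pdb2gmx options -ff and -water are used to select the force field and water model without the interactive prompt.

```python
import subprocess


def minimisation_commands(pdb_file, prefix, force_field="amber03", water="tip3p"):
    """Return the pdb2gmx, grompp and mdrun calls as argument lists."""
    return [
        ["pdb2gmx", "-f", pdb_file, "-o", prefix + ".gro", "-p", prefix + ".top",
         "-ff", force_field, "-water", water],
        ["grompp", "-v", "-f", prefix + ".mdp", "-c", prefix + ".gro",
         "-p", prefix + ".top", "-o", prefix + ".tpr"],
        ["mdrun", "-v", "-deffnm", prefix],
    ]


def run_minimisation(pdb_file, prefix):
    """Run the three steps in order and stop at the first failing command."""
    for command in minimisation_commands(pdb_file, prefix):
        subprocess.run(command, check=True)


# Print the calls instead of running them; GROMACS must be installed to execute them.
for command in minimisation_commands("ARSA_scwrl_clean.pdb", "gromacs_AMBER03"):
    print(" ".join(command))
```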


8. Analyze the minimization of the system with the following command: g_energy -f FILE.edr -o energy_1.xvg. Do the analysis for Bond, Angle and Potential. The xvg graphs can be viewed with xmgrace; in the print settings you can choose eps output, then print and convert to PDF.
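
As an alternative to xmgrace, the .xvg output can be read directly. The sketch below assumes the usual layout of these files: lines starting with '#' are comments, lines starting with '@' are plot directives (including the series legends), and all other lines hold whitespace-separated numbers. The sample text is made up.

```python
import re

LEGEND_PATTERN = re.compile(r'^@\s*s\d+\s+legend\s+"(.*)"')


def parse_xvg(text):
    """Return (legends, rows) from the text of an .xvg file."""
    legends, rows = [], []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("@"):
            match = LEGEND_PATTERN.match(line)
            if match:
                legends.append(match.group(1))
            continue  # other plot directives are ignored
        rows.append([float(value) for value in line.split()])
    return legends, rows


sample = """# made-up example
@    title "GROMACS Energies"
@ s0 legend "Potential"
    0.000000  -1000.5
    1.000000  -1200.25
"""
legends, rows = parse_xvg(sample)
print(legends, rows[-1])
```

The first column of each row is the step or time, so the last row gives the final value of the energy terms selected in g_energy.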

Results

In this section, we give a short summary of the analyses performed above and, based on these results, conclude for each mutation whether we can assign a neutral or non-neutral effect.

Mutation 1

The mutation is located at the metal-binding site of the protein. From this information alone, we would already suggest that the mutation is non-neutral. However, we also want to integrate the other information we have. The H-bond pattern changes and, for SCWRL and FoldX, the mutated protein is more stable - i.e. has more free energy - than the wild-type protein. For minimise, the wild type has more free energy. In summary, the location of the mutation and the changing H-bond pattern suggest that this is a harmful mutation. The fold changes in free energy are very small and thus give no striking evidence either way.
The mutation is indeed harmful.

Mutation 2

Mutation 2 is located near the active site and a substrate binding site in sequence as well as in structure. According to the SCWRL mutagenesis analysis, the H-bond pattern does not change. Again the energy fold changes are very small and contradictory: SCWRL and FoldX assign a destabilizing effect, whereas minimise assigns a stabilizing effect to the protein. As for mutation 1, we are left to consider only the location of the mutation and the H-bond pattern. In this case we cannot make a reliable guess based on this information, although the location of the mutation could be an indicator that it is non-neutral.
HGMD assigns a harmful effect to the mutation.

Mutation 3

The mutation is located near the active site and a substrate binding site in sequence as well as in structure. SCWRL and minimise again predict a slight stabilizing effect, whereas FoldX predicts less free energy for the mutated protein. The SCWRL analysis predicts that the H-bond pattern changes. Here, too, the changes in free energy predicted by the methods give no clear hint about the effect of the mutation. Again, we cannot make a reliable guess based on this information, although the location of the mutation could be an indicator that it is non-neutral.
HGMD assigns a harmful effect to the mutation.

Mutation 4

The mutation is at moderate distance to all important functional sites of the protein. The H-bond pattern changes; SCWRL and minimise predict a stabilizing effect and FoldX again a destabilizing one. As the H-bond pattern changes, it is likely that the local structure of the protein is altered. However, the mutation is not very near to any functional site and thus might be harmless. But there is no striking evidence.
This mutation is taken from dbSNP and neutral.

Mutation 5

The mutation is located very distant from the active site and a substrate binding site in sequence as well as in structure. It is located within a beta sheet, and the SCWRL analysis shows that the H-bond pattern changes in the mutated protein. As this mutation lies within a secondary structure element and changes the H-bond pattern, it could disrupt the beta sheet and thus alter the structure of the whole protein.
The free energy predictions are again contradictory. SCWRL and minimise predict a stabilizing effect, while FoldX predicts a destabilizing effect. As stated above, the mutation could disrupt the beta sheet structure and therefore have a high impact on the overall structure. Thus the mutation might be harmful.
HGMD assigns a deleterious effect to the mutation.

Mutation 6

The mutation is at moderate distance to all important functional sites of the protein. The H-bond pattern changes, and SCWRL assigns exactly the same free energy to the mutated protein and to the wild type. Minimise assigns a destabilizing effect and FoldX a stabilizing one. Again this information is not very useful; the only slightly informative facts are the location and the changing H-bond pattern, which give no evidence for neutrality or non-neutrality.
The mutation is taken from dbSNP and has a neutral effect.

Mutation 7

The mutation is not close to important functional sites, and SCWRL predicts that the H-bond pattern is altered in the mutated protein. SCWRL and FoldX predict a stabilizing effect, while minimise again predicts a destabilizing effect. Again we have no clear hint on whether this mutation could be harmful or not.
The mutation is taken from HGMD and is disease causing.

Mutation 8

The mutation is very distant from all functional sites. SCWRL predicts that the H-bond pattern changes. SCWRL and FoldX predict a stabilizing effect, while minimise again predicts a destabilizing effect. This information gives no clear hint on whether this mutation could be harmful or not.
The mutation is taken from dbSNP and is neutral.

Mutation 9

The mutation is very distant from all functional sites. SCWRL predicts that the H-bond pattern does not change. SCWRL and FoldX predict a stabilizing effect, while minimise again predicts a destabilizing effect. Again, we are left to guess whether the mutation is neutral or not.
The mutation is non-neutral (HGMD).

Mutation 10

The mutation is very distant from all functional sites. SCWRL predicts that the H-bond pattern changes. SCWRL and FoldX predict a stabilizing effect, while minimise again predicts a destabilizing effect. Again, we are left to guess whether the mutation is neutral or not.
The mutation is non-neutral (HGMD).


Summary

Compared to the sequence-based mutation analysis, we found it much more difficult to discriminate neutral from non-neutral mutations using only structural features of the protein. In most cases we were left to guess blindly, because reliable information giving clear evidence was lacking. The most informative facts for this structure-based mutation analysis were the H-bond patterns predicted by SCWRL and the location of the mutation with respect to functionally important sites.
Surprisingly, the analysis of the free energies of the mutated proteins compared to the wild type revealed only very slight fold changes, and we did not know how to interpret their impact on the structure and thus on the function. Moreover, these fold changes were not consistent in predicting (de-)stabilizing effects across all methods. Whereas minimise predicted destabilizing effects, FoldX predicted stabilizing effects for all mutations. This made these analyses of little use to us.
However, some results of the structure-based analysis are informative and useful for distinguishing neutral from non-neutral mutations if they are combined with additional methods, e.g. from the sequence-based mutation analysis.