Difference between revisions of "Sequence-based mutation analysis of ARSA"
Revision as of 21:33, 25 June 2011
Intro
We randomly picked 10 missense mutations from dbSNP and HGMD. The mutations are listed below, each with a PyMOL mutagenesis image and a description of the properties of the exchanged amino acids.
mutation | position | reference | mutation | both | ||
Asp-Asn | 29 | |||||
Description of Asp-Asn
Aspartic acid is an acidic, negatively charged amino acid, while Asparagine is polar but uncharged. Both are hydrophilic, so the main effect of the mutation is the loss of the negative charge at this position. | ||||||
Pro-Ala | 136 | |||||
Description of Pro-Ala
Proline and Alanine are both hydrophobic amino acids, so the behaviour towards water does not change. Proline is a cyclic amino acid: its side chain is bonded back to the backbone nitrogen, which lets it "break" alpha-helices and makes it structurally very important. Alanine is one of the smallest amino acids and lacks this constraint, so the mutation from Pro to Ala could have a big influence on the structure of ARSA. | ||||||
Gln-His | 153 | |||||
Description of Gln-His
Glutamine is a polar, uncharged amino acid, while Histidine is a basic amino acid that can carry a positive charge depending on its environment. Both are hydrophilic, so the main change is the possible charge at this position. Gln and His also differ in structure: His carries an imidazole ring and needs more space than the linear Gln side chain, which could influence the structure of ARSA. | ||||||
Trp-Cys | 193 | |||||
Description of Trp-Cys
Tryptophan is a hydrophobic, aromatic amino acid, while Cysteine is a small amino acid with a polar thiol group. So the behaviour towards water changes. In addition, Trp is the largest amino acid while Cys is rather small, so the space needed at this position changes considerably. This should have a large influence on the structure of ARSA. | ||||||
Thr-Met | 274 | |||||
Description of Thr-Met
Threonine is a hydrophilic amino acid while Methionine is a hydrophobic amino acid. So the behaviour towards water changes. Also, Methionine has a long, flexible side chain while Threonine has a short one, so the structure of ARSA could be altered by this mutation. | ||||||
Phe-Val | 356 | |||||
Description of Phe-Val
Phenylalanine and Valine are both hydrophobic amino acids, so the only impact on structure could come from the structural differences between Phe and Val. Phe has an aromatic ring and therefore needs more space than Val. | ||||||
Thr-Ile | 409 | |||||
Description of Thr-Ile
Threonine is a hydrophilic amino acid while Isoleucine is a hydrophobic amino acid. So the behaviour towards water changes. | ||||||
Asn-Ser | 440 | |||||
Description of Asn-Ser
| ||||||
Cys-Gly | 489 | |||||
Description of Cys-Gly
Paste description of mutation here | ||||||
Arg-His | 496 | |||||
Description of Arg-His
Paste description of mutation here |
SNAP
We ran SNAP using the following command:
snapfun -i ARSA.fasta -m mutants.txt -o snap.out
output:
nsSNP | Prediction | Reliability Index | Expected Accuracy |
---|---|---|---|
D29N | Non-neutral | 7 | 96% |
Q153H | Neutral | 0 | 53% |
T274M | Non-neutral | 6 | 93% |
T409I | Non-neutral | 1 | 63% |
C489G | Non-neutral | 5 | 87% |
W193C | Non-neutral | 3 | 78% |
F356V | Neutral | 1 | 60% |
N440S | Non-neutral | 2 | 70% |
R496H | Neutral | 1 | 60% |
P136A | Non-neutral | 4 | 82% |
To analyze all possible amino acid substitutions at the mutated positions above, we used the Generate Mutants tool on http://rostlab.org/services/snap/submit to create all possible exchanges from patterns of the form referenceAminoAcidPosition* (for example, D29* for position 29). Then we ran SNAP again:
snapfun -i ARSA.fasta -m all_mutants.txt -o snap_all.out
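The expansion performed by the Generate Mutants tool can be approximated locally. The following is a minimal Python sketch, not the tool's actual implementation: the function name `expand_pattern` is hypothetical, and it assumes that the mutants file lists one substitution per line in the form used in the SNAP output (for example D29N).

```python
# Standard one-letter amino acid codes, in the column order of the summary table.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def expand_pattern(pattern):
    """Expand a pattern such as 'D29*' into all 19 possible substitutions.

    Assumption: a pattern is <reference amino acid><position>*.
    """
    if len(pattern) < 3 or not pattern.endswith("*"):
        raise ValueError(f"expected a pattern like 'D29*', got {pattern!r}")
    ref = pattern[0]
    pos = int(pattern[1:-1])
    if ref not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {ref!r}")
    # Skip the reference residue itself: X -> X is not a mutation.
    return [f"{ref}{pos}{aa}" for aa in AMINO_ACIDS if aa != ref]


# The ten positions analysed on this page.
PATTERNS = ["D29*", "Q153*", "T274*", "T409*", "C489*",
            "W193*", "F356*", "N440*", "R496*", "P136*"]

all_mutants = [m for p in PATTERNS for m in expand_pattern(p)]
print(len(all_mutants), "substitutions, starting with", all_mutants[:3])
```

Writing `all_mutants` to a file, one entry per line, would give an input comparable to all_mutants.txt, provided the format assumption above holds.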
Next, we wrote a Perl script to parse and summarize the SNAP output in the following table, which shows for each position which amino acid substitutions are predicted as Non-neutral or Neutral:
ref\mutation | A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
D29 | Non-neutral | Non-neutral | Non-neutral | | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral |
Q153 | Non-neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | | Non-neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Non-neutral | Neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral |
T274 | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | | Non-neutral | Non-neutral | Non-neutral |
T409 | Neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | | Non-neutral | Non-neutral | Non-neutral |
C489 | Non-neutral | Non-neutral | Non-neutral | Non-neutral | | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral |
W193 | Non-neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | | Non-neutral | Non-neutral |
F356 | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Neutral | Neutral | Neutral | Non-neutral | Neutral | | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Neutral | Neutral |
N440 | Non-neutral | Non-neutral | | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral |
R496 | Non-neutral | | Non-neutral | Non-neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Neutral | Non-neutral | Neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Neutral | Non-neutral |
P136 | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral | | Non-neutral | Non-neutral | Non-neutral | Non-neutral | Non-neutral |
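The Perl script itself is not shown on this page. As an illustration of the summarising step only, here is a minimal Python sketch. It assumes that snap_all.out uses the same line layout as the SNAP output shown earlier (for example `D29N Non-neutral 7 96%`); the function names `parse_snap` and `format_table` are hypothetical.

```python
import re
from collections import defaultdict

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# One result line: mutation, prediction, reliability index, expected accuracy.
# Assumption: same layout as the SNAP output shown on this page.
RESULT_LINE = re.compile(
    r"^([A-Z])(\d+)([A-Z])\s+(Non-neutral|Neutral)\s+(\d+)\s+(\d+)%"
)


def parse_snap(lines):
    """Return {site: {mutant amino acid: prediction}}, e.g. {'D29': {'N': 'Non-neutral'}}."""
    table = defaultdict(dict)
    for line in lines:
        match = RESULT_LINE.match(line.strip())
        if match is None:
            continue  # header, separator, or blank line
        ref, pos, mut, prediction, _reliability, _accuracy = match.groups()
        table[f"{ref}{pos}"][mut] = prediction
    return table


def format_table(table):
    """Render one row per site and one column per amino acid.

    The cell for the reference amino acid itself stays empty.
    """
    rows = ["ref\\mutation | " + " | ".join(AMINO_ACIDS) + " |"]
    for site, predictions in table.items():
        cells = [predictions.get(aa, "") for aa in AMINO_ACIDS]
        rows.append(f"{site} | " + " | ".join(cells) + " |")
    return "\n".join(rows)


# Small in-memory example using lines from the SNAP output above.
example = [
    "nsSNP Prediction Reliability Index Expected Accuracy",
    "D29N Non-neutral 7 96%",
    "Q153H Neutral 0 53%",
    "F356V Neutral 1 60%",
]
print(format_table(parse_snap(example)))
```

Looking predictions up by amino acid letter, rather than appending them in file order, keeps every value under its own column header even though each row has one empty cell.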
Multiple sequence alignments
First, we downloaded the HSSP file for ARSA to get all proteins that are homologous to it. Then we downloaded all mammalian protein sequences from UniProt by searching for the term taxonomy:40674, which selects all mammalian protein sequences, and saved them in one multi-FASTA file. We then extracted the mammalian homologs of human ARSA by mapping the IDs from the HSSP file to the sequence IDs in the multi-FASTA file. This yielded 75 mammalian sequences homologous to human ARSA.
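The ID mapping step could look roughly like the following Python sketch. It is an illustration under assumptions, not the procedure actually used: it assumes the homolog IDs have already been read from the HSSP file into a set, and that FASTA headers follow the UniProt style `db|ACCESSION|ENTRY_NAME description`. The function names and the example IDs are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def header_ids(header):
    """Return the identifiers found in a header.

    Assumption: UniProt-style 'db|ACCESSION|ENTRY_NAME ...'; otherwise the
    first whitespace-separated token is used as the only identifier.
    """
    fields = header.split()
    first = fields[0] if fields else ""
    parts = first.split("|")
    return set(parts[1:3]) if len(parts) >= 3 else {first}


def select_homologs(fasta_lines, wanted_ids):
    """Keep only the records whose accession or entry name is in wanted_ids."""
    return [(h, s) for h, s in read_fasta(fasta_lines) if header_ids(h) & wanted_ids]


# Hypothetical example records; the IDs below are placeholders, not real entries.
example_fasta = [
    ">sp|X00001|ARSA_EXAMPLE hypothetical record",
    "MSAP",
    "LLAL",
    ">sp|X00002|OTHER_EXAMPLE hypothetical record",
    "MKKK",
]
hssp_ids = {"X00001"}
for header, sequence in select_homologs(example_fasta, hssp_ids):
    print(">" + header)
    print(sequence)
```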
Next, we calculated a multiple sequence alignment of these proteins (including ARSA) with MUSCLE. The Jalview image of the alignment is shown below.
The following table shows, for each mutated position, the conservation of the reference amino acid and of the mutant amino acid in the alignment.
pos | conservation - reference | conservation - mutant |
---|---|---|
29 | 0.86 | 0 |
153 | 0.14 | 0 |
274 | 0.87 | 0 |
409 | 0.35 | 0.16 |
489 | 0.80 | 0.05 |
193 | 0.13 | 0 |
356 | 0.15 | 0 |
440 | 0.15 | 0 |
496 | 0.14 | 0.01 |
136 | 0.93 | 0 |
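The page does not state how the conservation values were computed. One plausible reading is the fraction of aligned sequences that carry a given residue in the alignment column of the reference position. The Python sketch below illustrates that reading only; the definition, the function names and the toy alignment are all assumptions, and gaps are counted in the denominator.

```python
def column_for_position(aligned_reference, position):
    """Map a 1-based position in the ungapped reference to a 0-based alignment column."""
    seen = 0
    for column, char in enumerate(aligned_reference):
        if char != "-":
            seen += 1
            if seen == position:
                return column
    raise ValueError(f"reference has fewer than {position} residues")


def conservation(alignment, reference_name, position, residue):
    """Fraction of sequences with `residue` in the column of the reference position.

    Assumption: all sequences (including the reference and gapped ones)
    count towards the denominator.
    """
    column = column_for_position(alignment[reference_name], position)
    residues = [sequence[column] for sequence in alignment.values()]
    return residues.count(residue) / len(residues)


# Toy alignment, not taken from the ARSA data.
toy_alignment = {
    "ref": "MD-NK",
    "s1":  "MDANK",
    "s2":  "ME-NR",
    "s3":  "-D-SK",
}
# Position 3 of the ungapped reference (N) lies in alignment column 3.
print(conservation(toy_alignment, "ref", 3, "N"))
print(conservation(toy_alignment, "ref", 3, "S"))
```

Mapping the position through the gapped reference matters because insertions in other sequences shift alignment columns relative to the residue numbering of ARSA.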
PSI-BLAST
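We ran PSI-BLAST (blastpgp) with 5 iterations against the nr database and wrote the position-specific scoring matrix to psiblast.mat:
blastpgp -i ARSA.fasta -d /data/blast/nr/nr -e 10E-6 -j 5 -Q psiblast.mat -o psiblast_eval10E_6.it.5.new.txt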
Last position-specific scoring matrix computed, weighted observed percentages rounded down, information per position, and relative weight of gapless real matches to pseudocounts
A R N D C Q E G H I L K M F P S T W Y V A R N D C Q E G H I L K M F P S T W Y V
29 D -5 -5 -2 8 -7 -3 -1 -4 -4 -6 -7 -4 -6 -7 -5 -3 -4 -7 -6 -6 0 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.49 1.56
153 Q 3 2 -1 4 -4 -1 -1 -2 0 -2 -3 -3 4 -2 -3 -1 -2 -3 -2 -2 26 10 3 23 0 3 3 3 2 2 1 1 13 2 1 3 2 0 1 2 0.53 1.48
274 T -3 -4 -3 -4 -2 -4 -4 -5 -5 -4 -4 -4 -3 -5 -4 1 8 -6 -5 -3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 92 0 0 0 1.94 1.62
409 T -1 0 0 -1 -2 -1 -1 0 -1 -1 -1 0 -1 -1 3 0 1 6 0 -1 5 5 5 4 1 3 4 8 1 3 6 5 1 2 13 6 8 11 3 4 0.26 0.95
489 C 2 -1 1 -4 8 -4 -4 -2 -1 -1 -2 -3 -1 -4 -4 0 0 5 -1 -3 15 4 8 0 36 0 0 2 1 3 3 1 1 0 0 6 5 9 2 0 0.99 1.22
440 N -5 -3 6 5 -6 -2 -1 -4 -3 -6 -6 -3 -6 -6 2 -2 -3 -6 -6 -5 0 1 46 36 0 1 2 0 0 0 0 1 0 0 10 1 1 0 0 0 1.48 1.67
356 F -3 -1 -5 -5 -3 0 -1 -6 1 3 0 -1 0 2 -6 -3 -2 -3 5 3 1 4 0 0 1 5 4 0 3 18 8 5 2 8 0 1 2 0 20 20 0.59 1.62
193 W -2 4 2 3 -5 0 0 -2 0 -3 -4 1 -3 -1 -2 -1 -2 1 1 -3 3 25 11 16 0 4 5 3 2 2 1 7 0 2 2 4 2 2 5 2 0.46 1.45
136 P -3 -5 -5 -5 -6 -4 -4 -5 -5 -6 -6 -4 -6 -7 9 -4 -4 -7 -6 -5 1 0 0 0 0 0 0 0 0 0 0 0 0 0 98 0 0 0 0 0 3.03 1.61
496 R -3 1 0 -3 -4 1 1 -1 1 -3 1 1 -2 2 4 0 -3 -1 -1 -3 1 7 4 1 0 5 10 4 3 1 16 9 0 9 20 8 1 1 1 1 0.34 0.96
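To read the matrix rows for a specific mutation, the score and observed percentage of the mutant amino acid can be looked up by column. The following Python sketch assumes the row layout shown above: position, reference amino acid, 20 scores, 20 percentages, then two trailing values, with both column blocks in the order A R N D C Q E G H I L K M F P S T W Y V. The function names are hypothetical.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def parse_pssm_row(line):
    """Parse one matrix row into (position, reference, scores, percentages).

    Returns None for lines that do not have the expected 44 fields.
    """
    tokens = line.split()
    if len(tokens) != 44 or not tokens[0].isdigit():
        return None
    scores = dict(zip(AMINO_ACIDS, (int(t) for t in tokens[2:22])))
    percentages = dict(zip(AMINO_ACIDS, (int(t) for t in tokens[22:42])))
    return int(tokens[0]), tokens[1], scores, percentages


def lookup(rows, mutation):
    """Return (score of reference, score of mutant, percentage of mutant) for e.g. 'D29N'."""
    ref, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    row_ref, scores, percentages = rows[position]
    if row_ref != ref:
        raise ValueError(f"matrix has {row_ref} at {position}, not {ref}")
    return scores[ref], scores[mutant], percentages[mutant]


# Two rows copied from the matrix above.
matrix_lines = [
    "29 D -5 -5 -2 8 -7 -3 -1 -4 -4 -6 -7 -4 -6 -7 -5 -3 -4 -7 -6 -6 "
    "0 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.49 1.56",
    "356 F -3 -1 -5 -5 -3 0 -1 -6 1 3 0 -1 0 2 -6 -3 -2 -3 5 3 "
    "1 4 0 0 1 5 4 0 3 18 8 5 2 8 0 1 2 0 20 20 0.59 1.62",
]
rows = {}
for line in matrix_lines:
    parsed = parse_pssm_row(line)
    if parsed is not None:
        position, ref, scores, percentages = parsed
        rows[position] = (ref, scores, percentages)

for mutation in ("D29N", "F356V"):
    print(mutation, lookup(rows, mutation))
```

For D29N this gives a reference score of 8 against a mutant score of -2, while for F356V the mutant Val scores 3 against 2 for Phe.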