Sequence-based mutation analysis of ARSA

From Bioinformatikpedia
Revision as of 20:39, 23 June 2011 by Zacher (talk | contribs) (SNAP)

Intro

We randomly picked 10 missense mutations from dbSNP and HGMD. They are listed below, each with PyMOL mutagenesis images of the reference residue, the mutant residue, and both superimposed, followed by a description of the properties of the exchange.

Mutation  Position  Reference  Mutant  Both
Asp-Asn 29
29 arsa asp.png
29 arsa asn.png
29 arsa both.png
Description of Asp-Asn

Paste description of mutation here

Pro-Ala 136
136 arsa PRO.png
136 arsa ALA.png
136 arsa both.png
Description of Pro-Ala

Paste description of mutation here

Gln-His 153
153 arsa GLN.png
153 arsa HIS.png
153 arsa both.png
Description of Gln-His

Paste description of mutation here

Trp-Cys 193
193 arsa TRP.png
193 arsa CYS.png
193 arsa both.png
Description of Trp-Cys

Paste description of mutation here

Thr-Met 274
274 arsa THR.png
274 arsa MET.png
274 arsa both.png
Description of Thr-Met

Paste description of mutation here

Phe-Val 356
356 arsa PHE.png
356 arsa VAL.png
356 arsa both.png
Description of Phe-Val

Paste description of mutation here

Thr-Ile 409
409 arsa THR.png
409 arsa ILE.png
409 arsa both.png
Description of Thr-Ile

Paste description of mutation here

Asn-Ser 440
440 arsa ASN.png
440 arsa SER.png
440 arsa both.png
Description of Asn-Ser

Paste description of mutation here

Cys-Gly 489
489 arsa CYS.png
489 arsa GLY.png
489 arsa both.png
Description of Cys-Gly

Paste description of mutation here

Arg-His 496
496 arsa ARG.png
496 arsa HIS.png
496 arsa both.png
Description of Arg-His

Paste description of mutation here

SNAP

We ran SNAP with the following command:


snapfun -i ARSA.fasta -m mutants.txt -o snap.out

Output:


nsSNP	Prediction	Reliability Index	Expected Accuracy
-----	------------	-------------------	-------------------
D29N	Non-neutral		7			96%
Q153H	 Neutral 		0			53%
T274M	Non-neutral		6			93%
T409I	Non-neutral		1			63%
C489G	Non-neutral		5			87%
W193C	Non-neutral		3			78%
F356V	 Neutral 		1			60%
N440S	Non-neutral		2			70%
R496H	 Neutral 		1			60%
P136A	Non-neutral		4			82%

To analyze every possible amino acid substitution at the mutated positions above, we used the Generate Mutants tool at http://rostlab.org/services/snap/submit to create all possible exchanges from patterns of the form referenceAminoAcidPosition* (for example, D29* for Asp at position 29). We then ran SNAP again:


snapfun -i ARSA.fasta -m all_mutants.txt -o snap_all.out
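The pattern expansion can also be reproduced locally. The following is a minimal sketch, not the Generate Mutants tool itself; it assumes that the mutant file lists one substitution per line in the same notation as the SNAP output (for example D29N), and the function name expand_pattern is made up for illustration.

```python
# Column order of the 20 standard amino acids, as used in the SNAP summary table.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def expand_pattern(pattern):
    """Expand a pattern such as 'D29*' into all 19 substitutions at that position.

    Assumption: a pattern is <reference residue><position>*, and a concrete
    mutation such as 'D29N' is returned unchanged.
    """
    if not pattern.endswith("*"):
        return [pattern]
    ref, pos = pattern[0], int(pattern[1:-1])
    if ref not in AMINO_ACIDS:
        raise ValueError(f"unknown reference residue in {pattern!r}")
    # Skip the reference residue itself: D29D is not a mutation.
    return [f"{ref}{pos}{aa}" for aa in AMINO_ACIDS if aa != ref]


if __name__ == "__main__":
    for pattern in ["D29*", "P136*"]:
        print("\n".join(expand_pattern(pattern)))
```

For the ten positions analyzed here this yields 10 x 19 = 190 substitutions.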

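Next, we wrote a Perl script that parses the SNAP output and summarizes it in the following table, which shows for each position which amino acid substitutions are predicted to be Non-neutral or Neutral. Rows are the reference residues, columns are the substituted amino acids, and the cell for the reference residue itself is left empty.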

{| border="1" style="text-align:center; border-spacing:0;"
! ref\mutation !! A !! R !! N !! D !! C !! Q !! E !! G !! H !! I !! L !! K !! M !! F !! P !! S !! T !! W !! Y !! V
|-
| D29 || Non-neutral || Non-neutral || Non-neutral || || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral
|-
| Q153 || Non-neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || || Non-neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Non-neutral || Neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral
|-
| T274 || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || || Non-neutral || Non-neutral || Non-neutral
|-
| T409 || Neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || || Non-neutral || Non-neutral || Non-neutral
|-
| C489 || Non-neutral || Non-neutral || Non-neutral || Non-neutral || || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral
|-
| W193 || Non-neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || || Non-neutral || Non-neutral
|-
| F356 || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Neutral || Neutral || Neutral || Non-neutral || Neutral || || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Neutral || Neutral
|-
| N440 || Non-neutral || Non-neutral || || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral
|-
| R496 || Non-neutral || || Non-neutral || Non-neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Neutral || Non-neutral || Neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Neutral || Non-neutral
|-
| P136 || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral || || Non-neutral || Non-neutral || Non-neutral || Non-neutral || Non-neutral
|}
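The original Perl script is not shown on this page. As a rough illustration of the summarizing step, here is a minimal Python sketch. It assumes that snap_all.out uses the same whitespace-separated layout as the SNAP output listed above (mutation, prediction, reliability index, expected accuracy); the function names are hypothetical.

```python
import re
from collections import defaultdict

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
# A mutation token such as D29N: reference residue, position, mutant residue.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_snap(lines):
    """Return {(ref, pos): {mutant: prediction}} from SNAP result lines."""
    table = defaultdict(dict)
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or separator line
        match = MUTATION.match(fields[0])
        if not match:
            continue  # header line
        ref, pos, mut = match.group(1), int(match.group(2)), match.group(3)
        table[(ref, pos)][mut] = fields[1]
    return dict(table)


def format_table(table):
    """Render one tab-separated row per position, one column per amino acid."""
    rows = ["ref\\mutation\t" + "\t".join(AMINO_ACIDS)]
    for (ref, pos), preds in sorted(table.items(), key=lambda item: item[0][1]):
        # The reference residue has no prediction, so its cell stays empty.
        cells = [preds.get(aa, "") for aa in AMINO_ACIDS]
        rows.append(f"{ref}{pos}\t" + "\t".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    with open("snap_all.out") as handle:
        print(format_table(parse_snap(handle)))
```

This sketch sorts rows by position, whereas the table above keeps the order of the SNAP output.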

Multiple sequence alignments

First, we downloaded the HSSP file for ARSA to obtain all proteins homologous to it. Then we downloaded all mammalian protein sequences from UniProt by searching for the term taxonomy:40674, which selects all mammalian entries, and saved them in one multi-FASTA file. We then extracted the mammalian homologs of human ARSA by mapping the IDs from the HSSP file to the sequence IDs in the multi-FASTA file. This yielded 75 mammalian sequences homologous to human ARSA.
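The ID mapping can be sketched as a simple filter over the multi-FASTA file. This is only an illustration under stated assumptions: that the FASTA headers follow the UniProt style db|ACCESSION|ENTRY_NAME, and that the IDs taken from the HSSP file are UniProt accessions or entry names. The function names and the example IDs are made up.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_homologs(fasta_lines, wanted_ids):
    """Keep records whose accession or entry name occurs in wanted_ids."""
    wanted = set(wanted_ids)
    selected = []
    for header, seq in read_fasta(fasta_lines):
        tokens = header.split()
        if not tokens:
            continue  # header without an identifier
        parts = tokens[0].split("|")
        # Assumed UniProt style: db|ACCESSION|ENTRY_NAME; else use the first token.
        ids = set(parts[1:3]) if len(parts) >= 3 else {parts[0]}
        if ids & wanted:
            selected.append((header, seq))
    return selected


if __name__ == "__main__":
    # Hypothetical records; real input would be the downloaded multi-FASTA file.
    fasta = [
        ">sp|X00001|TESTA_MOUSE hypothetical entry",
        "MGAPRS",
        "LLLAL",
        ">sp|X00002|TESTB_RAT hypothetical entry",
        "MSTTT",
    ]
    for header, seq in select_homologs(fasta, {"TESTA_MOUSE"}):
        print(f">{header}\n{seq}")
```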
Next, we calculated multiple sequence alignments of these proteins (including ARSA) with MUSCLE. The Jalview images of the alignments are shown below.

Multiple sequence alignments of all 75 homologous sequences using MUSCLE
Position  Conservation (reference residue)  Conservation (mutant residue)
29 0.86 0
153 0.14 0
274 0.87 0
409 0.35 0.16
489 0.80 0.05
193 0.13 0
356 0.15 0
440 0.15 0
496 0.14 0.01
136 0.93 0
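How the conservation values in this table were computed is not stated. One plausible reading, used here purely as an assumption, is the fraction of aligned sequences that carry the reference or the mutant residue in the alignment column of the mutated position. The sketch below shows that calculation; the function names and the toy alignment are hypothetical.

```python
def column_for_position(aligned_ref, position):
    """Map a 1-based ungapped position in the reference to a 0-based alignment column."""
    seen = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"reference has fewer than {position} residues")


def residue_fraction(alignment, col, residue):
    """Fraction of sequences with `residue` in column `col` (gaps count in the total)."""
    column = [seq[col] for seq in alignment]
    return column.count(residue) / len(column)


if __name__ == "__main__":
    # Toy alignment; the first row plays the role of the aligned ARSA sequence.
    alignment = ["M-DK", "MADK", "M-NK", "MADR"]
    col = column_for_position(alignment[0], 2)  # second residue of the reference
    print(residue_fraction(alignment, col, "D"))  # reference residue
    print(residue_fraction(alignment, col, "N"))  # mutant residue
```

Counting gaps in the denominator is a design choice; dividing by the number of non-gap residues instead would give higher values in gappy columns.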

PSI-BLAST


blastpgp -i ARSA.fasta -d /data/blast/nr/nr -e 10E-6 -j 5 -Q psiblast.mat -o psiblast_eval10E_6.it.5.new.txt


Last position-specific scoring matrix computed, weighted observed percentages rounded down, information per position, and relative weight of gapless real matches to pseudocounts
          A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V   A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
  29 D   -5 -5 -2  8 -7 -3 -1 -4 -4 -6 -7 -4 -6 -7 -5 -3 -4 -7 -6 -6    0   0   0 100   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  2.49 1.56
 153 Q    3  2 -1  4 -4 -1 -1 -2  0 -2 -3 -3  4 -2 -3 -1 -2 -3 -2 -2   26  10   3  23   0   3   3   3   2   2   1   1  13   2   1   3   2   0   1   2  0.53 1.48
 274 T   -3 -4 -3 -4 -2 -4 -4 -5 -5 -4 -4 -4 -3 -5 -4  1  8 -6 -5 -3    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   7  92   0   0   0  1.94 1.62
 409 T   -1  0  0 -1 -2 -1 -1  0 -1 -1 -1  0 -1 -1  3  0  1  6  0 -1    5   5   5   4   1   3   4   8   1   3   6   5   1   2  13   6   8  11   3   4  0.26 0.95
 489 C    2 -1  1 -4  8 -4 -4 -2 -1 -1 -2 -3 -1 -4 -4  0  0  5 -1 -3   15   4   8   0  36   0   0   2   1   3   3   1   1   0   0   6   5   9   2   0  0.99 1.22
 440 N   -5 -3  6  5 -6 -2 -1 -4 -3 -6 -6 -3 -6 -6  2 -2 -3 -6 -6 -5    0   1  46  36   0   1   2   0   0   0   0   1   0   0  10   1   1   0   0   0  1.48 1.67
 356 F   -3 -1 -5 -5 -3  0 -1 -6  1  3  0 -1  0  2 -6 -3 -2 -3  5  3    1   4   0   0   1   5   4   0   3  18   8   5   2   8   0   1   2   0  20  20  0.59 1.62
 193 W   -2  4  2  3 -5  0  0 -2  0 -3 -4  1 -3 -1 -2 -1 -2  1  1 -3    3  25  11  16   0   4   5   3   2   2   1   7   0   2   2   4   2   2   5   2  0.46 1.45
 136 P   -3 -5 -5 -5 -6 -4 -4 -5 -5 -6 -6 -4 -6 -7  9 -4 -4 -7 -6 -5    1   0   0   0   0   0   0   0   0   0   0   0   0   0  98   0   0   0   0   0  3.03 1.61
 496 R   -3  1  0 -3 -4  1  1 -1  1 -3  1  1 -2  2  4  0 -3 -1 -1 -3    1   7   4   1   0   5  10   4   3   1  16   9   0   9  20   8   1   1   1   1  0.34 0.96
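To compare each mutation with the profile, the score and the observed percentage of the mutant residue can be read from the row of the mutated position. The following minimal sketch assumes the row layout shown above: position, reference residue, 20 scores, 20 percentages, information per position, and relative weight, with the amino acids in the column order of the header line. The function names are hypothetical.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # column order of the PSSM header line

# Row for position 29, copied from the matrix above.
EXAMPLE_ROW = (
    "   29 D   -5 -5 -2  8 -7 -3 -1 -4 -4 -6 -7 -4 -6 -7 -5 -3 -4 -7 -6 -6"
    "    0   0   0 100   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0"
    "  2.49 1.56"
)


def parse_pssm_row(line):
    """Split one PSSM row into its scores and observed percentages."""
    fields = line.split()
    if len(fields) != 2 + 2 * len(AMINO_ACIDS) + 2:
        raise ValueError("unexpected number of columns in PSSM row")
    return {
        "position": int(fields[0]),
        "reference": fields[1],
        "scores": dict(zip(AMINO_ACIDS, map(int, fields[2:22]))),
        "percent": dict(zip(AMINO_ACIDS, map(int, fields[22:42]))),
        "information": float(fields[42]),
        "relative_weight": float(fields[43]),
    }


def describe_mutation(row, mutant):
    """Return (reference score, mutant score, mutant percentage) for one exchange."""
    scores = row["scores"]
    return scores[row["reference"]], scores[mutant], row["percent"][mutant]


if __name__ == "__main__":
    row = parse_pssm_row(EXAMPLE_ROW)
    print(row["position"], describe_mutation(row, "N"))  # values for D29N
```

A negative score for the mutant residue means that the exchange is observed less often at this position than expected by chance, which fits the Non-neutral SNAP prediction for D29N.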