Sequence-based mutation analysis of ARSA

From Bioinformatikpedia
Revision as of 15:33, 23 June 2011 by Zacher (SNAP)

Intro

SNP type	Mutation	Position
missense	Asp [D]-Asn [N]	29
missense	Gln [Q]-His [H]	153
missense	Thr [T]-Met [M]	274
missense	Thr [T]-Ile [I]	409
missense	Cys [C]-Gly [G]	489
missense	Trp [W]-Cys [C]	193
missense	Phe [F]-Val [V]	356
missense	Asn [N]-Ser [S]	440
missense	Arg [R]-His [H]	496
missense	Pro [P]-Ala [A]	136
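
The SNAP output further down labels each mutation in short form (for example D29N). As a minimal sketch, the table above can be converted to that notation as follows. The names `THREE_TO_ONE`, `MUTATIONS` and `to_short_notation` are illustrative and not part of any tool used on this page, and the exact layout SNAP expects in its mutant file is not shown here.

```python
# Three-letter to one-letter amino acid codes (standard 20 residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# (reference residue, mutant residue, position) as listed in the table above.
MUTATIONS = [
    ("Asp", "Asn", 29), ("Gln", "His", 153), ("Thr", "Met", 274),
    ("Thr", "Ile", 409), ("Cys", "Gly", 489), ("Trp", "Cys", 193),
    ("Phe", "Val", 356), ("Asn", "Ser", 440), ("Arg", "His", 496),
    ("Pro", "Ala", 136),
]


def to_short_notation(ref, mut, pos):
    """Return the short mutation label, e.g. 'D29N' for ('Asp', 'Asn', 29)."""
    return f"{THREE_TO_ONE[ref]}{pos}{THREE_TO_ONE[mut]}"


short_names = [to_short_notation(*m) for m in MUTATIONS]
print("\n".join(short_names))
```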

SNAP

We ran SNAP using the following command:


snapfun -i ARSA.fasta -m mutants.txt -o snap.out


Output:


nsSNP	Prediction	Reliability Index	Expected Accuracy
-----	------------	-------------------	-------------------
D29N	Non-neutral		7			96%
Q153H	 Neutral 		0			53%
T274M	Non-neutral		6			93%
T409I	Non-neutral		1			63%
C489G	Non-neutral		5			87%
D381D	 Neutral 		5			89%
P195P	 Neutral 		6			92%
H151H	 Neutral 		4			85%
W193C	Non-neutral		3			78%
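
To work with these predictions programmatically, the table can be read into records. The following is a rough sketch that assumes the whitespace-separated four-column layout shown above; `SnapCall` and `parse_snap` are hypothetical helper names, not part of SNAP.

```python
from typing import NamedTuple

# Copy of the SNAP table above, including the header and separator lines.
SNAP_OUTPUT = """\
nsSNP  Prediction  Reliability Index  Expected Accuracy
-----  ------------  -------------------  -------------------
D29N   Non-neutral  7  96%
Q153H  Neutral      0  53%
T274M  Non-neutral  6  93%
T409I  Non-neutral  1  63%
C489G  Non-neutral  5  87%
D381D  Neutral      5  89%
P195P  Neutral      6  92%
H151H  Neutral      4  85%
W193C  Non-neutral  3  78%
"""


class SnapCall(NamedTuple):
    mutation: str
    prediction: str
    reliability: int
    accuracy: int  # expected accuracy in percent


def parse_snap(text):
    """Parse data rows; header and separator lines are skipped."""
    calls = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has exactly four fields and the last one ends in '%'.
        if len(fields) != 4 or not fields[3].endswith("%"):
            continue
        calls.append(SnapCall(fields[0], fields[1],
                              int(fields[2]), int(fields[3].rstrip("%"))))
    return calls


calls = parse_snap(SNAP_OUTPUT)
non_neutral = [c.mutation for c in calls if c.prediction == "Non-neutral"]
# Synonymous entries have the same residue before and after the position.
synonymous = [c.mutation for c in calls if c.mutation[0] == c.mutation[-1]]
print(non_neutral)
print(synonymous)
```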

Multiple sequence alignments

First, we downloaded the HSSP file for ARSA to obtain all proteins that are homologous to it. From this file we extracted all 75 mammalian proteins and downloaded their sequences. Their UniProt identifiers are listed below:

  • sp|Q08DD1|ARSA_BOVIN
  • sp|P15289|ARSA_HUMAN
  • sp|P50428|ARSA_MOUSE
  • sp|P15848|ARSB_HUMAN
  • sp|P50429|ARSB_MOUSE
  • sp|P50430|ARSB_RAT
  • sp|P51689|ARSD_HUMAN
  • sp|P51690|ARSE_HUMAN
  • sp|Q60HH5|ARSE_MACFA
  • sp|P54793|ARSF_HUMAN
  • sp|Q32KH9|ARSG_CANFA
  • sp|Q96EG1|ARSG_HUMAN
  • sp|Q3TYD4|ARSG_MOUSE
  • sp|Q32KJ9|ARSG_RAT
  • sp|Q32KH8|ARSH_CANFA
  • sp|Q5FYA8|ARSH_HUMAN
  • sp|Q32KH7|ARSI_CANFA
  • sp|Q5FYB1|ARSI_HUMAN
  • sp|Q32KI9|ARSI_MOUSE
  • sp|Q32KJ8|ARSI_RAT
  • sp|Q32KH5|GALNS_CANFA
  • sp|P34059|GALNS_HUMAN
  • sp|Q571E4|GALNS_MOUSE
  • sp|Q8WNQ7|GALNS_PIG
  • sp|Q32KJ6|GALNS_RAT
  • sp|P08842|STS_HUMAN
  • sp|P50427|STS_MOUSE
  • sp|P15589|STS_RAT
  • tr|Q8N322|Q8N322_HUMAN
  • tr|Q96I49|Q96I49_HUMAN
  • tr|Q6YL38|Q6YL38_HUMAN
  • tr|Q63HL5|Q63HL5_HUMAN
  • tr|Q6ZNJ9|Q6ZNJ9_HUMAN
  • tr|B4DVI5|B4DVI5_HUMAN
  • tr|A8K4A0|A8K4A0_HUMAN
  • tr|C9J5G7|C9J5G7_HUMAN
  • tr|B7XD04|B7XD04_HUMAN
  • tr|B7Z267|B7Z267_HUMAN
  • tr|B2R6P1|B2R6P1_HUMAN
  • tr|B7Z6V4|B7Z6V4_HUMAN
  • tr|B4DQ74|B4DQ74_HUMAN
  • tr|B7WNL6|B7WNL6_HUMAN
  • tr|A1L484|A1L484_HUMAN
  • tr|B2R7S0|B2R7S0_HUMAN
  • tr|B7Z1M0|B7Z1M0_HUMAN
  • tr|A5D7J7|A5D7J7_BOVIN
  • tr|Q32KI0|Q32KI0_CANFA
  • tr|Q32KI2|Q32KI2_CANFA
  • tr|D2HFI0|D2HFI0_AILME
  • tr|Q2XQY2|Q2XQY2_MACFA
  • tr|Q32KI1|Q32KI1_CANFA
  • tr|D2HFI1|D2HFI1_AILME
  • tr|A6MKC3|A6MKC3_CALJA
  • tr|D2H6D4|D2H6D4_AILME
  • tr|D2HFI2|D2HFI2_AILME
  • tr|Q8WNR3|Q8WNR3_PIG
  • tr|D2HXW7|D2HXW7_AILME
  • tr|Q32KI3|Q32KI3_CANFA
  • tr|A6QLR7|A6QLR7_BOVIN
  • tr|D2I3S5|D2I3S5_AILME
  • tr|A1XI21|A1XI21_HORSE
  • tr|Q32KI5|Q32KI5_CANFA
  • tr|Q19AM0|Q19AM0_BOVIN
  • tr|D2HFH9|D2HFH9_AILME
  • tr|A6QLZ3|A6QLZ3_BOVIN
  • tr|Q15B85|Q15B85_MACFA
  • tr|Q9DC66|Q9DC66_MOUSE
  • tr|Q32KK2|Q32KK2_RAT
  • tr|B5DEF1|B5DEF1_RAT
  • tr|B2RWQ7|B2RWQ7_MOUSE
  • tr|B4F7E2|B4F7E2_RAT
  • tr|Q8CC47|Q8CC47_MOUSE
  • tr|Q32KK0|Q32KK0_RAT
  • tr|Q3KR80|Q3KR80_RAT
  • tr|D3ZC09|D3ZC09_RAT
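
The identifiers above follow the UniProt pattern `db|accession|entry name`, where `sp` marks reviewed Swiss-Prot entries, `tr` marks unreviewed TrEMBL entries, and the suffix after the last underscore is the species mnemonic. As a small sketch of how the list could be split by species or review status, using only a handful of the identifiers (`UniProtId` and `parse_uniprot_id` are illustrative names):

```python
from collections import Counter
from typing import NamedTuple


class UniProtId(NamedTuple):
    database: str    # 'sp' (Swiss-Prot) or 'tr' (TrEMBL)
    accession: str
    entry_name: str
    species: str     # mnemonic suffix, e.g. 'HUMAN'


def parse_uniprot_id(identifier):
    """Split an identifier such as 'sp|P15289|ARSA_HUMAN' into its parts."""
    database, accession, entry_name = identifier.split("|")
    species = entry_name.rsplit("_", 1)[1]
    return UniProtId(database, accession, entry_name, species)


# A few identifiers copied from the list above.
ids = [
    "sp|Q08DD1|ARSA_BOVIN",
    "sp|P15289|ARSA_HUMAN",
    "sp|P50428|ARSA_MOUSE",
    "tr|Q8N322|Q8N322_HUMAN",
    "tr|D2HFI0|D2HFI0_AILME",
]

entries = [parse_uniprot_id(i) for i in ids]
species_counts = Counter(e.species for e in entries)
reviewed = [e.accession for e in entries if e.database == "sp"]
print(species_counts)
print(reviewed)
```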

Next, we calculated multiple sequence alignments of these proteins (including ARSA) with MUSCLE. The Jalview images of the alignments are shown below.

Multiple sequence alignments of all 75 homologous sequences using MUSCLE

Position	Conservation (reference residue)	Conservation (mutant residue)
29	0.86	0
153	0.14	0
274	0.87	0
409	0.35	0.16
489	0.80	0.05
193	0.13	0
356	0.15	0
440	0.15	0
496	0.14	0.01
136	0.93	0
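
The page does not state how these conservation values were derived. One plausible reading is the fraction of aligned sequences that carry a given residue in the alignment column of the mutated position. The sketch below illustrates that reading on an invented toy alignment; the function names and the example sequences are assumptions for illustration only.

```python
def column_for_position(aligned_ref, position):
    """Map a 1-based ungapped reference position to a 0-based alignment column."""
    seen = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"position {position} is beyond the reference sequence")


def residue_fraction(alignment, col, residue):
    """Fraction of sequences that have `residue` in alignment column `col`."""
    column = [seq[col] for seq in alignment]
    return column.count(residue) / len(column)


# Toy alignment (hypothetical); the first row plays the role of the reference.
toy_alignment = [
    "MD-AK",
    "MDGAK",
    "MN-AR",
    "MD-SK",
]

col = column_for_position(toy_alignment[0], 3)    # third ungapped residue, 'A'
ref_cons = residue_fraction(toy_alignment, col, "A")
mut_cons = residue_fraction(toy_alignment, col, "S")
print(col, ref_cons, mut_cons)
```

Gaps are counted in the denominator here; excluding them would be an equally defensible choice and would change the values.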

PSI-BLAST


blastpgp -i ARSA.fasta -d /data/blast/nr/nr -e 10E-6 -j 5 -Q psiblast.mat -o psiblast_eval10E_6.it.5.new.txt


Last position-specific scoring matrix computed, weighted observed percentages rounded down, information per position, and relative weight of gapless real matches to pseudocounts
          A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V   A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
  29 D   -5 -5 -2  8 -7 -3 -1 -4 -4 -6 -7 -4 -6 -7 -5 -3 -4 -7 -6 -6    0   0   0 100   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  2.49 1.56
 153 Q    3  2 -1  4 -4 -1 -1 -2  0 -2 -3 -3  4 -2 -3 -1 -2 -3 -2 -2   26  10   3  23   0   3   3   3   2   2   1   1  13   2   1   3   2   0   1   2  0.53 1.48
 274 T   -3 -4 -3 -4 -2 -4 -4 -5 -5 -4 -4 -4 -3 -5 -4  1  8 -6 -5 -3    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   7  92   0   0   0  1.94 1.62
 409 T   -1  0  0 -1 -2 -1 -1  0 -1 -1 -1  0 -1 -1  3  0  1  6  0 -1    5   5   5   4   1   3   4   8   1   3   6   5   1   2  13   6   8  11   3   4  0.26 0.95
 489 C    2 -1  1 -4  8 -4 -4 -2 -1 -1 -2 -3 -1 -4 -4  0  0  5 -1 -3   15   4   8   0  36   0   0   2   1   3   3   1   1   0   0   6   5   9   2   0  0.99 1.22
 440 N   -5 -3  6  5 -6 -2 -1 -4 -3 -6 -6 -3 -6 -6  2 -2 -3 -6 -6 -5    0   1  46  36   0   1   2   0   0   0   0   1   0   0  10   1   1   0   0   0  1.48 1.67
 356 F   -3 -1 -5 -5 -3  0 -1 -6  1  3  0 -1  0  2 -6 -3 -2 -3  5  3    1   4   0   0   1   5   4   0   3  18   8   5   2   8   0   1   2   0  20  20  0.59 1.62
 193 W   -2  4  2  3 -5  0  0 -2  0 -3 -4  1 -3 -1 -2 -1 -2  1  1 -3    3  25  11  16   0   4   5   3   2   2   1   7   0   2   2   4   2   2   5   2  0.46 1.45
 136 P   -3 -5 -5 -5 -6 -4 -4 -5 -5 -6 -6 -4 -6 -7  9 -4 -4 -7 -6 -5    1   0   0   0   0   0   0   0   0   0   0   0   0   0  98   0   0   0   0   0  3.03 1.61
 496 R   -3  1  0 -3 -4  1  1 -1  1 -3  1  1 -2  2  4  0 -3 -1 -1 -3    1   7   4   1   0   5  10   4   3   1  16   9   0   9  20   8   1   1   1   1  0.34 0.96
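
Each PSSM row above holds the position, the reference residue, 20 log-odds scores, 20 weighted observed percentages, the information per position and the relative weight. As a minimal sketch, the score and percentage of the reference and mutant residues can be looked up as follows. `parse_pssm_row` and `compare` are hypothetical helpers, and only two rows are copied here.

```python
# Column order of the amino acids in the PSSM header.
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"

# Two rows copied from the matrix above.
PSSM_ROWS = [
    "29 D -5 -5 -2 8 -7 -3 -1 -4 -4 -6 -7 -4 -6 -7 -5 -3 -4 -7 -6 -6 "
    "0 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.49 1.56",
    "274 T -3 -4 -3 -4 -2 -4 -4 -5 -5 -4 -4 -4 -3 -5 -4 1 8 -6 -5 -3 "
    "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 92 0 0 0 1.94 1.62",
]


def parse_pssm_row(line):
    """Return (position, residue, scores, percents) for one PSSM row."""
    fields = line.split()
    if len(fields) != 44:
        raise ValueError("expected 44 whitespace-separated fields")
    position, residue = int(fields[0]), fields[1]
    scores = dict(zip(AA_ORDER, map(int, fields[2:22])))
    percents = dict(zip(AA_ORDER, map(int, fields[22:42])))
    return position, residue, scores, percents


def compare(line, mutant):
    """Summarise reference vs. mutant score and observed percentage."""
    position, residue, scores, percents = parse_pssm_row(line)
    return {
        "mutation": f"{residue}{position}{mutant}",
        "ref_score": scores[residue], "mut_score": scores[mutant],
        "ref_pct": percents[residue], "mut_pct": percents[mutant],
    }


d29n = compare(PSSM_ROWS[0], "N")
t274m = compare(PSSM_ROWS[1], "M")
print(d29n)
print(t274m)
```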