Difference between revisions of "Sequence-based mutation analysis of ARSA"

From Bioinformatikpedia

Revision as of 15:08, 23 June 2011

Intro

SNP type   mutation           position
missense   Asp [D]-Asn [N]    29
missense   Gln [Q]-His [H]    153
missense   Thr [T]-Met [M]    274
missense   Thr [T]-Ile [I]    409
missense   Cys [C]-Gly [G]    489
missense   Trp [W]-Cys [C]    193
missense   Phe [F]-Val [V]    356
missense   Asn [N]-Ser [S]    440

SNAP

We ran SNAP using the following command:


snapfun -i ARSA.fasta -m mutants.txt -o snap.out
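
The content of mutants.txt is not shown on this page. As a minimal sketch, assuming the file lists one substitution per line in the compact notation reference residue, position, mutant residue (e.g. D29N), it could be generated from the mutation table above as follows. The function names and the file layout are illustrative assumptions, not part of the SNAP documentation.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# (reference, mutant, position) copied from the mutation table.
MUTATIONS = [
    ("Asp", "Asn", 29), ("Gln", "His", 153), ("Thr", "Met", 274),
    ("Thr", "Ile", 409), ("Cys", "Gly", 489), ("Trp", "Cys", 193),
    ("Phe", "Val", 356), ("Asn", "Ser", 440),
]


def to_short_notation(ref, mut, pos):
    """Turn ('Asp', 'Asn', 29) into 'D29N'."""
    return f"{THREE_TO_ONE[ref]}{pos}{THREE_TO_ONE[mut]}"


def mutant_lines(mutations=MUTATIONS):
    """Return one compact substitution string per mutation (assumed file layout)."""
    return [to_short_notation(ref, mut, pos) for ref, mut, pos in mutations]


if __name__ == "__main__":
    # Redirect to a file, e.g. `python make_mutants.py > mutants.txt`.
    print("\n".join(mutant_lines()))
```

If the installed SNAP version expects a different layout, only `to_short_notation` would need to change.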

PSI-BLAST


blastpgp -i ARSA.fasta -d /data/blast/nr/nr -e 10E-6 -j 5 -Q psiblast.mat -o psiblast_eval10E_6.it.5.new.txt


The relevant lines of the PSI-BLAST matrix (psiblast.mat) are shown below:
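
The rows for the eight mutated positions can be pulled out of psiblast.mat with a small script. The sketch below assumes the usual ASCII layout written by `blastpgp -Q`: a header row with the 20 amino acid letters printed twice, followed by one row per residue containing the position, the residue, 20 log-odds scores, 20 percentages and two trailing values. It also assumes that columns are separated by whitespace, which can fail if two wide negative scores run together. `parse_pssm_rows` is a hypothetical helper name.

```python
import os


def parse_pssm_rows(lines, positions):
    """Return {position: (residue, {amino_acid: log_odds_score})} for selected positions."""
    wanted = set(positions)
    alphabet = None
    rows = {}
    for line in lines:
        tokens = line.split()
        if alphabet is None:
            # Header row: 20 letters for the scores, then 20 for the percentages.
            if len(tokens) == 40 and all(len(t) == 1 and t.isalpha() for t in tokens):
                alphabet = tokens[:20]
            continue
        # Data row: position, residue, then at least 20 score columns.
        if len(tokens) >= 22 and tokens[0].isdigit():
            pos = int(tokens[0])
            if pos in wanted:
                scores = [int(t) for t in tokens[2:22]]
                rows[pos] = (tokens[1], dict(zip(alphabet, scores)))
    return rows


if __name__ == "__main__":
    positions = [29, 153, 274, 409, 489, 193, 356, 440]
    if os.path.exists("psiblast.mat"):
        with open("psiblast.mat") as handle:
            for pos, (residue, scores) in sorted(parse_pssm_rows(handle, positions).items()):
                print(pos, residue, scores)
```

For each mutation, comparing the score of the reference residue with the score of the mutant residue in the same row gives a quick impression of how well the substitution is tolerated in the profile.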

Multiple sequence alignments

First, we downloaded the HSSP file for ARSA to obtain all proteins that are homologous to it. From this file we extracted all 75 mammalian proteins and downloaded their sequences. Their UniProt identifiers are listed below:

  • sp|Q08DD1|ARSA_BOVIN
  • sp|P15289|ARSA_HUMAN
  • sp|P50428|ARSA_MOUSE
  • sp|P15848|ARSB_HUMAN
  • sp|P50429|ARSB_MOUSE
  • sp|P50430|ARSB_RAT
  • sp|P51689|ARSD_HUMAN
  • sp|P51690|ARSE_HUMAN
  • sp|Q60HH5|ARSE_MACFA
  • sp|P54793|ARSF_HUMAN
  • sp|Q32KH9|ARSG_CANFA
  • sp|Q96EG1|ARSG_HUMAN
  • sp|Q3TYD4|ARSG_MOUSE
  • sp|Q32KJ9|ARSG_RAT
  • sp|Q32KH8|ARSH_CANFA
  • sp|Q5FYA8|ARSH_HUMAN
  • sp|Q32KH7|ARSI_CANFA
  • sp|Q5FYB1|ARSI_HUMAN
  • sp|Q32KI9|ARSI_MOUSE
  • sp|Q32KJ8|ARSI_RAT
  • sp|Q32KH5|GALNS_CANFA
  • sp|P34059|GALNS_HUMAN
  • sp|Q571E4|GALNS_MOUSE
  • sp|Q8WNQ7|GALNS_PIG
  • sp|Q32KJ6|GALNS_RAT
  • sp|P08842|STS_HUMAN
  • sp|P50427|STS_MOUSE
  • sp|P15589|STS_RAT
  • tr|Q8N322|Q8N322_HUMAN
  • tr|Q96I49|Q96I49_HUMAN
  • tr|Q6YL38|Q6YL38_HUMAN
  • tr|Q63HL5|Q63HL5_HUMAN
  • tr|Q6ZNJ9|Q6ZNJ9_HUMAN
  • tr|B4DVI5|B4DVI5_HUMAN
  • tr|A8K4A0|A8K4A0_HUMAN
  • tr|C9J5G7|C9J5G7_HUMAN
  • tr|B7XD04|B7XD04_HUMAN
  • tr|B7Z267|B7Z267_HUMAN
  • tr|B2R6P1|B2R6P1_HUMAN
  • tr|B7Z6V4|B7Z6V4_HUMAN
  • tr|B4DQ74|B4DQ74_HUMAN
  • tr|B7WNL6|B7WNL6_HUMAN
  • tr|A1L484|A1L484_HUMAN
  • tr|B2R7S0|B2R7S0_HUMAN
  • tr|B7Z1M0|B7Z1M0_HUMAN
  • tr|A5D7J7|A5D7J7_BOVIN
  • tr|Q32KI0|Q32KI0_CANFA
  • tr|Q32KI2|Q32KI2_CANFA
  • tr|D2HFI0|D2HFI0_AILME
  • tr|Q2XQY2|Q2XQY2_MACFA
  • tr|Q32KI1|Q32KI1_CANFA
  • tr|D2HFI1|D2HFI1_AILME
  • tr|A6MKC3|A6MKC3_CALJA
  • tr|D2H6D4|D2H6D4_AILME
  • tr|D2HFI2|D2HFI2_AILME
  • tr|Q8WNR3|Q8WNR3_PIG
  • tr|D2HXW7|D2HXW7_AILME
  • tr|Q32KI3|Q32KI3_CANFA
  • tr|A6QLR7|A6QLR7_BOVIN
  • tr|D2I3S5|D2I3S5_AILME
  • tr|A1XI21|A1XI21_HORSE
  • tr|Q32KI5|Q32KI5_CANFA
  • tr|Q19AM0|Q19AM0_BOVIN
  • tr|D2HFH9|D2HFH9_AILME
  • tr|A6QLZ3|A6QLZ3_BOVIN
  • tr|Q15B85|Q15B85_MACFA
  • tr|Q9DC66|Q9DC66_MOUSE
  • tr|Q32KK2|Q32KK2_RAT
  • tr|B5DEF1|B5DEF1_RAT
  • tr|B2RWQ7|B2RWQ7_MOUSE
  • tr|B4F7E2|B4F7E2_RAT
  • tr|Q8CC47|Q8CC47_MOUSE
  • tr|Q32KK0|Q32KK0_RAT
  • tr|Q3KR80|Q3KR80_RAT
  • tr|D3ZC09|D3ZC09_RAT
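
The page does not state how the sequences were downloaded. As a rough sketch, each identifier in the list can be split into database (sp for Swiss-Prot, tr for TrEMBL), accession and entry name, and the accession can be turned into a FASTA download address. The URL pattern below is an assumption based on the current UniProt REST service and may differ from what was available in 2011. The helper names are hypothetical.

```python
from collections import Counter


def parse_uniprot_identifier(identifier):
    """Split 'sp|P15289|ARSA_HUMAN' into its parts; the species is the entry-name suffix."""
    db, accession, entry_name = identifier.strip().split("|")
    species = entry_name.rsplit("_", 1)[1]
    return {"db": db, "accession": accession, "entry_name": entry_name, "species": species}


def fasta_url(accession):
    """Build a FASTA download URL (assumed pattern of the UniProt REST service)."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


if __name__ == "__main__":
    # A few identifiers from the list above; the full list works the same way.
    identifiers = ["sp|Q08DD1|ARSA_BOVIN", "sp|P15289|ARSA_HUMAN", "tr|D3ZC09|D3ZC09_RAT"]
    records = [parse_uniprot_identifier(i) for i in identifiers]
    for record in records:
        print(record["accession"], fasta_url(record["accession"]))
    # Number of sequences per species mnemonic.
    print(Counter(record["species"] for record in records))
```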

Next, we calculated multiple sequence alignments of these proteins (including ARSA) with ClustalW and MUSCLE. The Jalview images of the alignments are shown below.

Multiple sequence alignment of all 75 homologous sequences using MUSCLE
Multiple sequence alignment of all 75 homologous sequences using ClustalW
position   conservation (reference)   conservation (mutant)
29         0.86                       0
153        0.14                       0
274        0.87                       0
409        0.35                       0.16
489        0.80                       0.05
193        0.13                       0
356        0.15                       0
440        0.15                       0
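
The page does not define how the conservation values in the table above were computed. One plausible reading, sketched here as an assumption, is the fraction of aligned sequences that carry the reference (or mutant) residue in the alignment column of the mutated position, with gaps counted in the denominator. The toy alignment and the function names are made up for illustration.

```python
def column_for_position(aligned_reference, position):
    """Map a 1-based position in the ungapped reference to a 0-based alignment column."""
    seen = 0
    for column, char in enumerate(aligned_reference):
        if char != "-":
            seen += 1
            if seen == position:
                return column
    raise ValueError(f"reference has fewer than {position} residues")


def residue_fraction(alignment, column, residue):
    """Fraction of sequences showing `residue` in `column` (gaps stay in the denominator)."""
    letters = [sequence[column] for sequence in alignment]
    return letters.count(residue) / len(letters)


if __name__ == "__main__":
    # Toy alignment; the first sequence plays the role of the reference (ARSA).
    alignment = ["MK-DA", "MKLDA", "MR-NA", "MK-DG"]
    column = column_for_position(alignment[0], 3)  # third residue of the reference, 'D'
    print("reference D:", residue_fraction(alignment, column, "D"))
    print("mutant N:", residue_fraction(alignment, column, "N"))
```

Under this definition a value of 0 for the mutant means that no sequence in the alignment carries the mutant residue at that position.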