Metachromatic leukodystrophy reference aminoacids
Sequence
>sp|P15289|ARSA_HUMAN Arylsulfatase A OS=Homo sapiens GN=ARSA PE=1 SV=3
MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT
DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM
AGKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIP
LLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAE
RSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRC
GKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAALAGAPLPNVTLDGFDLSP
LLLGTGKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTADPACHASSSL
TAHEPPLLYDLSKDPGENYNLLGGVAGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARG
EDPALQICCHPGCTPRPACCHCPDPHA
Source
Database Searches
BLAST against NR
How To
With BLAST installed, we ran the following command:
blastall -p blastp -i refSeq.fasta -d /data/blast/nr/nr > blastp
Here refSeq.fasta is the file containing the reference sequence and blastp is the output file.
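The best hits can also be pulled out of a BLAST result programmatically. The following is a minimal sketch, not the procedure used on this page: it assumes the search was rerun with tabular output (`blastall ... -m 8`), whose 12 tab-separated columns include the subject ID (column 2), percent identity (column 3), e-value (column 11) and bit score (column 12). The function names `parse_tabular` and `best_hits` are hypothetical.

```python
def parse_tabular(lines):
    """Parse BLAST tabular (-m 8) lines into a list of hit dictionaries."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip lines that are not complete tabular records
        hits.append({
            "subject": fields[1],
            "identity": float(fields[2]),
            "evalue": float(fields[10]),
            "bitscore": float(fields[11]),
        })
    return hits


def best_hits(hits, n=20):
    """Keep the best-scoring record per subject and return the top n subjects."""
    best = {}
    for hit in hits:
        current = best.get(hit["subject"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["subject"]] = hit
    # Lowest e-value first; ties are broken by the higher bit score.
    ranked = sorted(best.values(), key=lambda h: (h["evalue"], -h["bitscore"]))
    return ranked[:n]
```

With a tabular result file, `best_hits(parse_tabular(open("blastp.tsv")))` would return up to 20 records; the file name `blastp.tsv` is only an example.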
Best 20 results
PSI-BLAST against NR
How To
With PSI-BLAST installed, we executed the following command, substituting the e-value cutoff and the number of iterations for the quoted placeholders:
blastpgp -i refSeq.fasta -d /data/blast/nr/nr -e "e-value" -j "#iterations" > psiblast_"e-value"_"#iterations"
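The command template above was run for four parameter combinations (e-value cutoffs 0.005 and 10E-6, each with 3 and 5 iterations). As a rough sketch of how these runs could be generated, the snippet below builds the four argument lists and output file names. It does not execute anything; the helper names are hypothetical, and passing the cutoff strings verbatim to `-e` is an assumption.

```python
import itertools

# Parameter values taken from the runs listed on this page.
E_VALUES = ["0.005", "10E-6"]
ITERATIONS = [3, 5]


def psiblast_command(e_value, iterations,
                     query="refSeq.fasta", db="/data/blast/nr/nr"):
    """Return (argv, outfile) for one blastpgp run of the template above."""
    outfile = f"psiblast_{e_value}_{iterations}"
    argv = ["blastpgp", "-i", query, "-d", db,
            "-e", e_value, "-j", str(iterations)]
    return argv, outfile


def all_runs():
    """Build the commands for every e-value/iteration combination."""
    return [psiblast_command(e, j)
            for e, j in itertools.product(E_VALUES, ITERATIONS)]
```

Each pair could then be passed to `subprocess.run(argv, stdout=open(outfile, "w"))` on a machine where `blastpgp` and the NR database are available.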
e-value cutoff 0.005, 3 iterations
Best 20 results
e-value cutoff 0.005, 5 iterations
Best 20 results
e-value cutoff 10E-6, 3 iterations
Best 20 results
e-value cutoff 10E-6, 5 iterations
Best 20 results
FASTA against NR
How To
FASTA was not yet installed on the computer, so we built it by running the following command from the ./src directory of the software's source code:
make -f ../make/Makefile.linux_sse2 all
We then searched NR with the reference sequence, using the following parameters:
./bin/fasta36 -q ~/Documents/refSeq.fasta /data/blast/nr/nr > fasta_results.txt
hhsearch
How To
We used the online version of HHpred <ref>http://toolkit.lmb.uni-muenchen.de/hhpred</ref> with the following parameters:
- local alignment
- 3 iterations
Because only PDB IDs could be extracted from the HHpred output, we had to map the PDB IDs to RefSeq ACs. We did this in two steps: each PDB ID was first mapped to a UniProt AC, which was then mapped to a RefSeq AC using PIR ID Mapping <ref>http://pir.georgetown.edu/pirwww/search/idmapping.shtml</ref>.
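The two-step mapping can be sketched as a chained lookup. This is an illustration only: it assumes the two mapping tables have already been downloaded (for example from the ID mapping service) and loaded into dictionaries, and that HHpred hit IDs may carry a chain suffix such as `_A`. The function names and all identifiers in the usage example are made-up placeholders, not real mappings.

```python
def normalize_pdb_id(hit_id):
    """Strip an assumed chain suffix ('1abc_A' -> '1ABC') and upper-case the ID."""
    return hit_id.split("_")[0].upper()


def map_pdb_to_refseq(hit_ids, pdb_to_uniprot, uniprot_to_refseq):
    """Map PDB IDs to RefSeq ACs via UniProt ACs.

    Both tables map one ID to a list of IDs, since either step can be
    one-to-many. IDs without a mapping yield an empty list.
    """
    result = {}
    for hit_id in hit_ids:
        pdb_id = normalize_pdb_id(hit_id)
        refseq_acs = []
        for uniprot_ac in pdb_to_uniprot.get(pdb_id, []):
            for refseq_ac in uniprot_to_refseq.get(uniprot_ac, []):
                if refseq_ac not in refseq_acs:  # keep order, drop duplicates
                    refseq_acs.append(refseq_ac)
        result[pdb_id] = refseq_acs
    return result


# Usage with placeholder identifiers (not real database entries):
example = map_pdb_to_refseq(
    ["1abc_A", "9zzz_B"],
    {"1ABC": ["P00000"]},
    {"P00000": ["NP_000000"]},
)
```

Keeping unmapped PDB IDs in the result with an empty list makes it easy to see which hits were lost in the mapping.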
against PDB
Best 20 results
Multiple Alignments
To build the multiple alignments, the results of the PSI-BLAST run with an e-value cutoff of 10E-6 and 5 iterations were divided into 6 groups by sequence identity:
- <20%
- 20% - 39%
- 40% - 59%
- 60% - 89%
- 90% - 99%
- >99%
The sequences with <20% and >99% sequence identity were ignored, and 5 sequences were randomly picked from each of the other four ranges, so 20 sequences were available for the multiple alignments.
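The grouping and sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script actually used: hits are assumed to be available as `(sequence_id, percent_identity)` pairs, fractional identities are assigned to the range whose lower bound they reach (so 39.5% counts as 20% - 39%), and exactly 99% counts as 90% - 99%. The function names are hypothetical.

```python
import random


def bin_label(identity):
    """Return the identity range of a hit, or None for the ignored ranges."""
    if identity < 20.0 or identity > 99.0:
        return None  # <20% and >99% are ignored
    if identity < 40.0:
        return "20-39"
    if identity < 60.0:
        return "40-59"
    if identity < 90.0:
        return "60-89"
    return "90-99"


def sample_hits(hits, per_bin=5, seed=None):
    """Randomly pick up to per_bin sequences from each retained range."""
    groups = {"20-39": [], "40-59": [], "60-89": [], "90-99": []}
    for seq_id, identity in hits:
        label = bin_label(identity)
        if label is not None:
            groups[label].append(seq_id)
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return {label: rng.sample(ids, min(per_bin, len(ids)))
            for label, ids in groups.items()}
```

If every retained range holds at least 5 hits, `sample_hits(hits)` returns 4 × 5 = 20 sequence IDs; passing a `seed` allows the same selection to be reproduced later.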
References
<references />