Sequence-based mutation analysis TSD Journal
To keep the journal readable, only the basic steps and program calls are outlined here; the full source code of self-written scripts is linked from the corresponding steps.
Quantiles can be calculated in R using:
<source lang="bash">
quantile(m)  # where m is a matrix
</source>
To create the PSSM, PSI-BLAST was called as follows, using the parameters from Task 2 except where the task description already specifies them (here, the number of iterations):
<source lang="bash">
# Download the query sequence (HEXA, UniProt P06865)
wget http://www.uniprot.org/uniprot/P06865.fasta
# Remember the working directory for absolute paths
PAT=`pwd`
# 5 iterations (-j 5); -Q writes the ASCII PSSM
blastpgp -m 8 -Q $PAT/blastpgp_pssm -d /mnt/project/pracstrucfunc12/data/big/big -i $PAT/P06865.fasta -v 3800 -b 3800 -j 5 > $PAT/lastpgp.out
</source>
To create the MSA, a PSI-BLAST search with P06865 was run on the NCBI web server with default values (2 iterations, database 'nr'). The resulting sequences were downloaded and aligned with the EBI ClustalW web server. The resulting MSA was downloaded and converted to FASTA format with Jalview. Finally, the PSSM can be created with the following command line:
<source lang="bash">
psiblast -subject ./query.fasta -in_msa clustalw.fasta -out_ascii_pssm pssm.txt
</source>
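To look up the PSSM score of a single substitution, a small helper along the following lines can be used. This is only a sketch: the function name `pssm_score` is hypothetical, and it assumes the usual ASCII PSSM layout written by `blastpgp -Q` and `psiblast -out_ascii_pssm` (position, wild-type residue, then 20 whitespace-separated scores in the order A R N D C Q E G H I L K M F P S T W Y V). If two score columns run together in a file, the row would have to be split by fixed width instead.

```shell
# Hypothetical helper: print the wild-type residue and the PSSM score
# of mutating position $2 to amino acid $3 in the ASCII PSSM file $1.
# Assumption: columns are whitespace-separated and the 20 score
# columns follow the order A R N D C Q E G H I L K M F P S T W Y V.
pssm_score() {
    local file=$1 pos=$2 mut=$3
    awk -v pos="$pos" -v mut="$mut" '
        BEGIN {
            # Map each amino acid letter to its awk field number
            # (fields 1 and 2 hold the position and the residue).
            n = split("A R N D C Q E G H I L K M F P S T W Y V", aa, " ")
            for (i = 1; i <= n; i++) col[aa[i]] = i + 2
        }
        # Matrix rows start with the position and have at least 22 fields.
        $1 == pos && NF >= 22 && (mut in col) { print $2, $(col[mut]); exit }
    ' "$file"
}
```

For example, `pssm_score pssm.txt 39 R` would print the wild-type residue at position 39 together with the score for arginine at that position.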
PyMOL: see Mutagenesis.
snap2 -i P06865.fasta -o snap.out -m all --tolerate
snapfun -i P06865.fasta -o snap.out -m mutation file
egrep "(M1|L39|C58|L127|R170|R178|S210|D258|L451|E482)[A-Z]+" snap.out > filteredsnap.out
egrep "M1V|L39R|C58Y|L127R|R170W|R178H|S210F|D258H|L451V|E482K" snap.out > onlyRealSNPsnap.out
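The two filter patterns above differ only in whether the mutant residue is fixed, so both can be derived from one mutation list. The following is a minimal sketch of that idea; the variable `MUTATIONS` and the functions `exact_pattern` and `position_pattern` are hypothetical names introduced here, and the list itself is copied from the egrep calls above.

```shell
# Mutation list copied from the egrep calls above (wild type, position, mutant).
MUTATIONS="M1V L39R C58Y L127R R170W R178H S210F D258H L451V E482K"

# Build a pattern that matches exactly the listed substitutions,
# e.g. "M1V|L39R|...".
exact_pattern() {
    echo "$1" | tr -s ' ' '|'
}

# Build a pattern that matches any substitution at the listed positions,
# e.g. "(M1|L39|...)[A-Z]+": the trailing mutant residue of each entry
# is stripped before the entries are joined with "|".
position_pattern() {
    local stripped
    stripped=$(echo "$1" | tr -s ' ' '\n' | sed 's/[A-Z]$//' | paste -sd'|' -)
    echo "(${stripped})[A-Z]+"
}
```

The patterns can then be passed to grep, for example `grep -E "$(exact_pattern "$MUTATIONS")" snap.out > onlyRealSNPsnap.out`; `grep -E` is equivalent to `egrep`.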