Difference between revisions of "Sequence-based mutation analysis TSD Journal"

From Bioinformatikpedia
Back to [[Sequence-based_mutation_analysis_TSD|results]].

To keep the journal readable, only the basic steps and program calls are outlined here; the full source code of self-written scripts is linked, like [https://gist.github.com/2919934 this].

== Substitution matrices ==
Quantiles can be calculated easily in R:
<source lang="bash">
quantile(m) # where m is a matrix
</source>

To create the PSSM, PSI-BLAST was called as follows, using the parameters from Task 2 unless the task description already specified them (here, the number of iterations):
<source lang="bash">
# -j 5: up to five iterations; -Q: write the PSSM as ASCII; -m 8: tabular hit table
# -v / -b: maximum number of descriptions / alignments reported
wget http://www.uniprot.org/uniprot/P06865.fasta
PAT=`pwd`
blastpgp -m 8 -Q $PAT/blastpgp_pssm -d /mnt/project/pracstrucfunc12/data/big/big -i $PAT/P06865.fasta -v 3800 -b 3800 -j 5 > $PAT/blastpgp.out
</source>

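The tabular hit table written by <code>-m 8</code> has twelve tab-separated columns (query, subject, % identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E-value, bit score), so it can be filtered with standard tools. The following is only a minimal sketch: the three hit lines, the subject names <code>hitA</code> to <code>hitC</code> and the cutoff are made up for illustration and are not taken from the actual run.

```shell
# Illustrative data only: three invented hits in the 12-column tabular layout.
tmp=$(mktemp -d)
printf 'P06865\thitA\t100.00\t529\t0\t0\t1\t529\t1\t529\t0.0\t1090\n'      >  "$tmp/example.tsv"
printf 'P06865\thitB\t58.20\t500\t200\t5\t20\t519\t15\t510\t3e-52\t201\n' >> "$tmp/example.tsv"
printf 'P06865\thitC\t27.50\t80\t55\t2\t100\t179\t40\t118\t0.5\t31.2\n'   >> "$tmp/example.tsv"

# Hypothetical E-value cutoff; choose whatever suits the analysis.
cutoff=1e-10

# Keep hits whose E-value (column 11) is below the cutoff and
# print subject ID, percent identity and E-value.
awk -F'\t' -v c="$cutoff" '$11 + 0 < c + 0 { print $2, $3, $11 }' \
    "$tmp/example.tsv" > "$tmp/strong_hits.txt"

cat "$tmp/strong_hits.txt"
```

On the real output, the same <code>awk</code> call would be pointed at the redirected output file instead of <code>example.tsv</code>.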
== MSA ==
To create the MSA, a PSI-BLAST query with P06865 was run on the NCBI web server with default values (2 iterations, database 'nr'). The resulting sequences were downloaded and aligned with the EBI ClustalW web server. The resulting MSA was downloaded and converted to FASTA format with Jalview. Finally, the PSSM can be created with the following command line:
<source lang="bash">
psiblast -subject ./query.fasta -in_msa clustalw.fasta -out_ascii_pssm pssm.txt
</source>

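Before passing a converted alignment on, it can be worth checking that every row of the FASTA file has the same length, since a truncated or unaligned record is easy to miss after format conversion. This is a sketch of such a check, not part of the original workflow; the file <code>aln.fasta</code> and its three toy sequences are invented for the example.

```shell
# Toy alignment (invented); seq3 is deliberately two columns short.
tmp=$(mktemp -d)
cat > "$tmp/aln.fasta" <<'EOF'
>seq1
MTSSR-LWFS
LLLAA
>seq2
MTGSRPLWF-
LLLAA
>seq3
MT--RLWFSL
LLA
EOF

# Sum residue and gap characters per record; sequences may wrap over several lines.
awk '/^>/ { if (id != "") print id, len; id = substr($1, 2); len = 0; next }
     { len += length($0) }
     END { if (id != "") print id, len }' "$tmp/aln.fasta" > "$tmp/lengths.txt"

# Number of distinct row lengths; 1 means the alignment is rectangular.
n_lengths=$(awk '{ print $2 }' "$tmp/lengths.txt" | sort -u | awk 'END { print NR }')

cat "$tmp/lengths.txt"
if [ "$n_lengths" -ne 1 ]; then
    echo "rows differ in length; this is not a valid alignment"
fi
```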
== Structural visualisation ==
Structural visualisation was done with PyMOL; see [http://www.pymolwiki.org/index.php/Mutagenesis Mutagenesis].

== Prediction ==
=== PolyPhen2 ===
PolyPhen-2 predictions were run in the web server's [http://genetics.bwh.harvard.edu/pph2/bgi.shtml batch mode]. All settings were left at their default values. Here are the [https://gist.github.com/2944224 batch file] and the [https://gist.github.com/2944226 query sequence].

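A batch file of this kind can be generated from the list of substitutions rather than typed by hand. The sketch below assumes that the batch query takes one substitution per line in the form "accession position reference-residue substituted-residue"; that layout is an assumption here and should be checked against the linked batch file. The variable names are made up for the example.

```shell
# Accession and the ten substitutions analysed in this task.
acc=P06865
muts="M1V L39R C58Y L127R R170W R178H S210F D258H L451V E482K"

# Rewrite each token such as M1V into "P06865 1 M V"
# (assumed layout: accession, position, reference residue, substituted residue).
batch=$(printf '%s\n' $muts | sed -E "s/^([A-Z])([0-9]+)([A-Z])$/$acc \2 \1 \3/")

printf '%s\n' "$batch"
```

The unquoted <code>$muts</code> is intentional so that the shell splits the list into one token per line.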
=== SIFT ===
SIFT predictions were run on the [http://sift.jcvi.org/www/SIFT_seq_submit2.html web server] with default values (database: UniRef90 2011 Apr). Here is the input [https://gist.github.com/2944236 mutation file].

=== SNAP ===
 snap2 -i P06865.fasta -o snap.out -m all --tolerate
or, respectively,
 snapfun -i P06865.fasta -o snap.out -m [https://gist.github.com/2944236 mutation file]
The output was then filtered, first for every substitution at the ten positions of interest and then for the ten actual SNPs only:
 egrep "(M1|L39|C58|L127|R170|R178|S210|D258|L451|E482)[A-Z]+" snap.out > filteredsnap.out
 egrep "M1V|L39R|C58Y|L127R|R170W|R178H|S210F|D258H|L451V|E482K" snap.out > onlyRealSNPsnap.out
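To see what the two filters keep, they can be tried on a small stand-in file. The lines below are invented and do not reproduce the real SNAP output layout; the sketch only assumes that each line contains a mutation token such as <code>M1V</code>. <code>grep -E</code> is used as the equivalent of <code>egrep</code>.

```shell
# Invented stand-in for snap.out; only the mutation token matters here.
tmp=$(mktemp -d)
cat > "$tmp/snap.out" <<'EOF'
M1V example-line
M1A example-line
M10V example-line
L39R example-line
L390R example-line
E482K example-line
EOF

# Filter 1: any substitution at one of the ten positions. The trailing [A-Z]+
# requires a letter right after the position, so M1 does not also match M10.
grep -E "(M1|L39|C58|L127|R170|R178|S210|D258|L451|E482)[A-Z]+" \
    "$tmp/snap.out" > "$tmp/filteredsnap.out"

# Filter 2: only the ten listed substitutions themselves.
grep -E "M1V|L39R|C58Y|L127R|R170W|R178H|S210F|D258H|L451V|E482K" \
    "$tmp/snap.out" > "$tmp/onlyRealSNPsnap.out"

echo "all substitutions at the positions:"; cat "$tmp/filteredsnap.out"
echo "listed SNPs only:";                   cat "$tmp/onlyRealSNPsnap.out"
```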

Latest revision as of 23:51, 18 June 2012
