Gaucher Task06 Protocol

From Bioinformatikpedia
Revision as of 20:58, 17 June 2012 by Zhangg

Mutation list

SIFT and SNAP both need a file "mutations.txt" that lists the mutations, one per line:

H99R
V211I
E150K
L236P
W248R
L509P
W351C
A423D
D482N
R83S
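
A minimal sketch of how this file could be written and sanity-checked from the shell. The pattern used for the check (one wild-type amino acid letter, a position, one mutant amino acid letter) is our assumption based on the list above; it is not taken from the SIFT or SNAP documentation.

```shell
#!/bin/sh
# Write the ten mutations listed above, one per line.
cat > mutations.txt <<'EOF'
H99R
V211I
E150K
L236P
W248R
L509P
W351C
A423D
D482N
R83S
EOF

# Count entries that do NOT look like <wild type><position><mutant>.
# The pattern is an assumption derived from the list, not an official format.
aa='[ACDEFGHIKLMNPQRSTVWY]'
bad=$(grep -E -v -c "^${aa}[0-9]+${aa}\$" mutations.txt)
total=$(wc -l < mutations.txt | tr -d ' ')
echo "entries: $total, malformed: $bad"
```

A non-zero "malformed" count would point to a typo or a stray blank line in the file.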

Sources

You can check out the git repository containing all relevant data and scripts with:

git clone /mnt/home/student/angermue/mp/tasks/task06

PSSM

We created the PSSM as follows:

blastpgp -i data/P04062.seq -d $NR -j 5 -h 1e-3 -b 1000 -o pssm/all/P04062.bla -Q pssm/all/P04062.pssm

We used the script alignhits.pl from the HHsuite to keep only the most similar hits from the PSI-BLAST result file:

alignhits.pl -Q data/P04062.seq -qsc 1.5 pssm/all/P04062.bla pssm/best/P04062.psi

The PSSM for the resulting PSI-BLAST alignment was computed as follows:

blastpgp -i data/P04062.seq -B pssm/best/P04062.psi -d $DUMMY -j 0 -Q pssm/best/P04062.pssm

SIFT

We used the SIFT online server. A run took fairly long (10-15 min) because the server first has to search a database for related sequences.

Input: the protein sequence of P04062 and the list of mutations. All other settings were left at their defaults.

Alternatively, we used the SIFT Blink online server. Its predictions are based on pre-computed BLAST searches and are therefore returned almost immediately. SIFT Blink requires the NCBI GI number (66347912) that corresponds to our protein (UniProt ID: P04062).

Input: the NCBI GI number (66347912) and the list of mutations. All other settings were left at their defaults.

PolyPhen-2

We used the online server of PolyPhen-2.

Input: the protein sequence of P04062 and, for each mutation, the position, the wild-type residue and the mutant residue. All other settings were left at their defaults.
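
Since the mutations are stored as compact strings such as H99R, the three values needed per mutation can be derived with a small helper. The following is only a sketch: the function name split_mutations, the demo file name and the tab-separated layout are our own choices for illustration and are not prescribed by PolyPhen-2.

```shell
#!/bin/sh
# split_mutations FILE
# For each entry such as H99R, print: position, wild-type residue, mutant residue
# (tab-separated). Blank lines are skipped.
split_mutations() {
    awk 'NF {
        n = length($0)
        printf "%s\t%s\t%s\n", substr($0, 2, n - 2), substr($0, 1, 1), substr($0, n, 1)
    }' "$1"
}

# Small demo with two entries from the mutation list (hypothetical file name).
printf 'H99R\nV211I\n' > demo_mutations.txt
split_mutations demo_mutations.txt
```

Running the same function on mutations.txt yields one row per mutation.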

SNAP

The web version of SNAP does not seem to work. SNAP is also installed on the student cluster, where it should be run from the command line only. We need to create our own ~/.snapfunrc (unless Tim changes the default one) so that it points to the correct paths.

SNAP is invoked as follows:

snapfun -i P04062.fasta -m mutations.txt -o snapfun_out.out
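
Before starting a run, it can be worth checking that the wild-type residue of every mutation actually occurs at the stated position of the input sequence. The sketch below does this with awk. The function name check_mutations and the toy files are hypothetical and only serve as illustration; we assume a single-sequence FASTA file and 1-based positions.

```shell
#!/bin/sh
# check_mutations FASTA MUTATIONS
# Prints one line per mutation: OK if the wild-type residue matches the
# residue at that (1-based) position of the sequence, MISMATCH otherwise.
check_mutations() {
    awk '
        NR == FNR {                      # first file: the FASTA sequence
            if ($0 !~ /^>/) {
                gsub(/[ \t\r]/, "")      # drop stray whitespace
                seq = seq $0
            }
            next
        }
        NF {                             # second file: one mutation per line
            n = length($0)
            wt = substr($0, 1, 1)
            pos = substr($0, 2, n - 2) + 0
            found = substr(seq, pos, 1)
            status = (found == wt) ? "OK" : "MISMATCH"
            print $0, status, "found=" found
        }
    ' "$1" "$2"
}

# Toy example (not the real P04062 sequence).
printf '>toy\nMHKL\nAC\n' > toy.fasta
printf 'H2R\nL3P\n' > toy_mutations.txt
check_mutations toy.fasta toy_mutations.txt
```

For the actual data the call would be check_mutations P04062.fasta mutations.txt. A MISMATCH may simply mean that a mutation is numbered against a different form of the sequence than the one in the FASTA file, so it should be investigated rather than treated as an error in the list.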