Difference between revisions of "Resource software"

From Bioinformatikpedia

Revision as of 17:59, 11 May 2013

Here, we collect descriptions of the software used in the practical. This can be software used in online portals or software installed locally on your own computers or the lab resources. In each case, please describe how to access the software and where to find manuals. Also use this site to collect scripts or HOW_TOs that could be useful for others.

Your own scripts

If you have produced a script that does something that could be useful for others, please "publish" it here. E.g. create a page for your tool where you provide information where to find the software (path on the student cluster, git repository, ...) and how to use it. -- For users: If you use a script produced by another group, please document that (e.g. in the "lab book" part of your wiki section). And if you find bugs, please help the other group improve.

Executing blastp, blastpgp and hhblits

A script run.pl executes blastp, blastpgp and hhblits with different options: databases, number of iterations and E-value cutoffs. It also uses checkfiles for blastpgp and outputs PSSMs.

Convert hhr to parseable tsv format

A C program for extraction of statistics results from hhblits output (hhr format) to a tsv (tab separated values) file: hhr2tsv.git

  • Install:
./configure --prefix=$HOME
make install
make clean
  • Usage:
$HOME/bin/hhr2tsv <input_hhr_file> <output_tsv_file>

Parser of (Psi-)BLAST and HHblits hhr output files

A script parse_output.pl parses alignment information from (Psi-)BLAST and HHblits hhr output files into a tab-separated format suitable for plotting. It also calculates the number of hits and the overlap of hits with the same ID between the (Psi-)BLAST and HHblits outputs. In addition, there is an option to evaluate PDB hits against COPS and create files for plotting.

Script for comparing the CATH fold classes of the query and the PDB hits

The script compareCath.py reads the output from parse_output.pl (see above) and compares the fold classes of the query domains with the fold classes of the hits and writes a histogram to stdout.

Script for finding GOAnnotations

This script finds GO annotations for a given protein and writes them to an output file.
The script can be found here
A typical command would be: python goAnnotation.py B2JCG3 /Desktop/result.out


Task 3

Script for filtering the secondary structure of reprof output files

This script reads the output of a ReProf, a PsiPred or a DSSP run and filters for the secondary structure: filter_secStruc.pl

Molecular visualization

To look at protein structures you can use any molecular visualization program. Here are a few options:

  • PyMOL -> installed on the i12k-biolab computers
  • Jmol, e.g. via PDB
  • VMD
  • the SRS 3D server -- unfortunately not working any more. Maybe Aquaria will become publicly available within this practical.

Changing Blast output

By default, Blast lists 500 search hits and 250 alignment details. This can be changed (see Blast manual for details):

  • You can use a custom output format to get a table with "-m 8" (see "-help" or this hint on how to parse Blast output).
  • You can use "-b" to set the number of alignments to be shown, "-b 20000" is the maximum.
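
The tabular "-m 8" output mentioned above has twelve tab-separated columns per hit, which makes it easy to read programmatically. The following is a minimal Python sketch, not part of the Blast distribution: the function names and the field labels are our own choices, and the sample line at the end is made-up data. It also accepts E-values written without the leading 1 (e.g. 'e-100').

```python
# Our own labels for the 12 standard columns of Blast "-m 8" output.
FIELDS = ["query", "subject", "pct_identity", "aln_length", "mismatches",
          "gap_opens", "q_start", "q_end", "s_start", "s_end",
          "evalue", "bit_score"]
INT_FIELDS = ("aln_length", "mismatches", "gap_opens",
              "q_start", "q_end", "s_start", "s_end")
FLOAT_FIELDS = ("pct_identity", "evalue", "bit_score")


def to_float(text):
    """Convert a number to float, tolerating a bare exponent like 'e-100'."""
    text = text.strip()
    if text.startswith("e"):
        text = "1" + text
    return float(text)


def parse_blast_tabular(lines):
    """Yield one dict per hit from an iterable of "-m 8" output lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        values = line.split("\t")
        if len(values) != len(FIELDS):
            raise ValueError("expected 12 columns, got %d" % len(values))
        hit = dict(zip(FIELDS, values))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = to_float(hit[key])
        yield hit


# Made-up example line; with a real file use: parse_blast_tabular(open(path))
sample = ["q1\thitA\t98.5\t200\t3\t0\t1\t200\t5\t204\te-100\t350\n"]
for hit in parse_blast_tabular(sample):
    print(hit["subject"], hit["evalue"], hit["aln_length"])
```

Reading the hits into dictionaries keeps later filtering (for example by E-value cutoff) independent of the column order.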


Modeller

Troubleshooting

A very common error from Modeller is the following: "Sequence difference between alignment and pdb". This usually means that the template structure available in the PDB (which was solved experimentally) has missing residues, which can result from technical problems with the X-ray diffraction data. Therefore, you need to make sure that the target-template alignment uses the sequence implied by the ATOM records, not the SEQRES records. To locate the error, you could e.g. generate a FASTA sequence from the PDB file coordinates, align it with the FASTA sequence from the SEQRES records and check for missing residues (gaps within the alignment). If residues are missing, regenerate your target-template alignment based on the new FASTA sequence made from the coordinates.
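
Extracting the sequence implied by the ATOM records can be sketched in a few lines of Python. This is only a minimal illustration, not a Modeller tool: the function name atom_sequence is our own, it assumes the standard fixed-column PDB format, looks at one chain and the first model only, and ignores HETATM records, so modified residues stored as HETATM will show up as gaps in the numbering.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def atom_sequence(pdb_lines, chain="A"):
    """Return (sequence, gaps) for one chain, using only ATOM records.

    gaps lists (last_seen, next_seen) residue numbers where the numbering
    jumps, which hints at residues missing from the coordinates.
    """
    seq, gaps, seen, prev_num = [], [], set(), None
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only use the first model of a multi-model file
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        if line[21] != chain or line[12:16].strip() != "CA":
            continue  # one C-alpha atom per residue is enough
        res_id = line[22:27]  # residue number plus insertion code
        if res_id in seen:
            continue  # alternate location of a residue already counted
        seen.add(res_id)
        res_num = int(line[22:26])
        if prev_num is not None and res_num > prev_num + 1:
            gaps.append((prev_num, res_num))
        seq.append(THREE_TO_ONE.get(line[17:20], "X"))
        prev_num = res_num
    return "".join(seq), gaps
```

With a real file this could be called as atom_sequence(open("template.pdb"), chain="A"), where template.pdb is a placeholder file name. The returned sequence can then be aligned against the SEQRES sequence, and the gap list points to where residues are missing.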

SNAP

There is a very brief explanation about SNAP available here --> Media:SNAP.pdf.

Energy Minimization

There is a script to automatically run energy minimizations with Gromacs here --> MutEn.pl.


R

Error in hist.default(a$V2, main = "evals") : 'x' must be numeric

Blast uses non-standard scientific notation and omits the leading 1 for E-values like 'e-190'. Change it to '1e-190' and R will stop complaining.
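
If there are many such values, the change can be made before loading the file into R. The following Python sketch is one possible way to do it; the function name fix_evalues is our own, and the pattern assumes that a bare exponent is always a stand-alone token, i.e. not directly preceded or followed by a letter, digit or dot. If identifiers in your file can contain such tokens, apply it to the E-value column only.

```python
import re

# A bare exponent such as "e-190": not preceded by a word character or a
# dot (so "2e-50" and "score-1" are left alone) and not followed by one.
BARE_EXPONENT = re.compile(r"(?<![\w.])e([+-]\d+)(?![\w.])")


def fix_evalues(text):
    """Prefix bare exponents with the missing 1, e.g. 'e-190' -> '1e-190'."""
    return BARE_EXPONENT.sub(r"1e\1", text)


# Made-up example line with an E-value in the second column.
print(fix_evalues("hitA\te-190"))
```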