CompareCath.py

From Bioinformatikpedia
 
Latest revision as of 18:07, 6 May 2013

The script compareCath.py can be found on the biocluster at /mnt/home/student/betza/scripts.

usage: compareCath.py [-h] -i IFILE -q QUERY
optional arguments:
 -h, --help  show this help message and exit
 -i IFILE    with parse_output.pl created results file (default: None)
 -q QUERY    PDB id and chain of query, e.g. 1a6zA (default: None)
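
The help text above resembles output from Python's argparse module. The following is only a sketch of how such an interface could be declared, not the actual source of compareCath.py: the function name build_parser, the attribute names ifile and query, and the input file name results.txt are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage line shown above (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="compareCath.py",
        # This formatter appends "(default: ...)" to each help string.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "-i", dest="ifile", metavar="IFILE", required=True,
        help="with parse_output.pl created results file",
    )
    parser.add_argument(
        "-q", dest="query", metavar="QUERY", required=True,
        help="PDB id and chain of query, e.g. 1a6zA",
    )
    return parser


# Hypothetical invocation: results.txt is a made-up file name.
args = build_parser().parse_args(["-i", "results.txt", "-q", "1a6zA"])
print(args.ifile, args.query)
```

Both options are marked as required here because the usage line shows them without square brackets.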

The input is the results file from parse_output.pl for (Psi)Blast.

In CATH, each domain of a protein is assigned to a fold class. This means that one query protein can have several fold classes, one for each domain. This Python script computes the number of fold classes that each hit has in common with the specified query. The output is a histogram of the number of shared fold classes per protein for all PDB hits.
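
The counting step described above can be sketched as follows. This is a minimal illustration and not the code of compareCath.py: it assumes the hits have already been read from the results file and that a mapping from chain identifier to a set of CATH fold classes is available. The function name shared_fold_histogram, the hit identifiers, and the fold labels are all made up for the example; hits without a CATH entry are assumed to share zero fold classes.

```python
from collections import Counter


def shared_fold_histogram(query_id, hit_ids, cath_folds):
    """Count, for each hit, how many fold classes it shares with the query.

    cath_folds maps a chain identifier to the set of fold classes of its
    domains (assumed input; the real script's data source is not shown here).
    Returns a dict: number of shared fold classes -> number of hits.
    """
    query_folds = cath_folds.get(query_id, set())
    histogram = Counter()
    for hit in hit_ids:
        # Hits missing from the mapping are treated as sharing nothing.
        shared = query_folds & cath_folds.get(hit, set())
        histogram[len(shared)] += 1
    return dict(histogram)


# Made-up data: the fold labels and hit names are placeholders, not CATH entries.
cath_folds = {
    "1a6zA": {"fold_1", "fold_2"},
    "hit1A": {"fold_1", "fold_2"},
    "hit2B": {"fold_1"},
    "hit3C": {"fold_3"},
    "hit4D": {"fold_2", "fold_4"},
}
hits = ["hit1A", "hit2B", "hit3C", "hit4D"]

histogram = shared_fold_histogram("1a6zA", hits, cath_folds)

# Simple text rendering of the histogram, one row per count.
for n_shared in sorted(histogram):
    print(n_shared, "#" * histogram[n_shared])
```

Using sets makes the comparison independent of domain order and counts each fold class only once per protein, which is one possible reading of "number of same fold classes per protein".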