Lab Journal of Task 4 (MSUD)

From Bioinformatikpedia

Structural alignment

  • Generation of the set of structures:
    • The sequence search tool of the PDB website was used to select a set of structures similar and dissimilar to BCKDHA.
    • For sequences identical to BCKDHA, we found only ligand-bound structures, so the structural comparison between ligand-bound and ligand-free PDB structures could not be performed.
    • No structures with 60%-90% sequence similarity to the RefSeq sequence of BCKDHA were found.
    • We also searched for related structures using the CATH classification.
  • All selected structures can be found at: /mnt/home/student/weish/master-practical-2013/task04/structures/
    • The protein structure of BCKDHA is located in indentical/
    • Structurally related structures are located in C/, CA/ and CAT/
    • An arbitrary structure with a different CATH classification is located in diffCATH/
    • Structures with high and low sequence similarity are located in 60_identity/ and 30_identity/, respectively

Evaluation of alignments using structures

Creation of structure models from homologous sequences

The input for the script hhmakemodel.pl was created with HHsearch:

hhsearch -i /mnt/home/student/weish/master-practical-2013/task01/refseq_BCKDHA_protein.fasta -d /mnt/project/pracstrucfunc13/data/hhblits/pdb70_current_hhm_db -o hhsearch_BCKDHA.hhr -Z 200000 -B 200000

The models were made as follows:

/usr/share/hhsuite/scripts/hhmakemodel.pl hhsearch_BCKDHA.hhr -d /mnt/project/pracstrucfunc13/data/pdb/20120401/entries/* -m 2 4 5 6 10 20 28 31 39 43 -ts hhmakemodel_BCKDHA.pdb

As the -m option shows, 10 hits were chosen, and one model was built from each. The hits were selected to cover the whole range of alignment scores (E-value, probability, sequence identity), but hits with very high E-values were excluded.
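The hits above were picked by hand. As an illustration only, a selection that spreads over the ranked hit list could be sketched in Python as follows. The function name select_hits, the (hit number, E-value) input format and the E-value cutoff are all assumptions made for this sketch; they are not part of HH-suite and not the procedure actually used here.

```python
def select_hits(hits, k, max_evalue):
    """Pick up to k hit numbers spread evenly over a ranked hit list.

    hits: list of (hit_number, evalue) tuples, in the rank order of the
          .hhr file (hypothetical input format for this sketch).
    k: number of hits to pick.
    max_evalue: hits with a larger E-value are dropped first
                (hypothetical cutoff, not a value from the journal).
    """
    # Drop hits whose E-value is above the cutoff.
    kept = [number for number, evalue in hits if evalue <= max_evalue]
    if k <= 0 or not kept:
        return []
    if k >= len(kept):
        return kept
    if k == 1:
        return [kept[0]]
    # Evenly spaced positions from the first to the last remaining hit.
    step = (len(kept) - 1) / (k - 1)
    return [kept[round(i * step)] for i in range(k)]
```

The returned hit numbers could then be passed to the -m option of hhmakemodel.pl. A manual choice, as made here, additionally allows probability and sequence identity to be balanced by eye.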

The models are stored in /mnt/home/student/schillerl/MasterPractical/task4/hhmakemodel/.

Comparison with known structure

The models were structurally aligned with the reference structure 2BDF using LGA, which was run on this server with default parameters.

LGA results can be found in /mnt/home/student/schillerl/MasterPractical/task4/LGA/.

Statistical evaluation of the alignment scores (from sequence and structural alignments), in particular the correlation between them, was performed with R. See /mnt/home/student/schillerl/MasterPractical/task4/alignment_statistics.txt for an overview of the scores.
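The correlation analysis itself was done in R and is not reproduced here. As a rough Python equivalent, the Pearson correlation between one sequence-based score and one structure-based score could be computed as in the sketch below. The function name pearson is chosen for this example, and which score columns to pair (for example sequence identity against a structural score from LGA) is an assumption left to the reader.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)
```

Calling pearson with two columns read from alignment_statistics.txt, one value per model, gives a coefficient between -1 and 1. With only 10 models, a single outlier can shift this value noticeably, so a rank-based measure may be worth comparing.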