Difference between revisions of "Lab Journal of Task 4 (MSUD)"

From Bioinformatikpedia

Latest revision as of 12:30, 2 August 2013

Structural alignment

  • Generation of the set of structures:
    • The sequence search tool of the PDB website was used to select a set of structures similar and dissimilar to BCKDHA.
    • For structures with a sequence identical to BCKDHA, we found only ligand-bound structures, so the structural comparison between ligand-bound and ligand-free PDB structures could not be performed.
    • No structures with a sequence similarity between 60% and 90% to the RefSeq sequence of BCKDHA were found.
    • We also searched for related structures using the CATH classification.
  • All selected structures can be found at: /mnt/home/student/weish/master-practical-2013/task04/structures/
    • The protein structure of BCKDHA is located in indentical/
    • Structurally related structures are located in C/, CA/ and CAT/
    • An arbitrary structure from a different CATH classification is located in diffCATH/
    • Structures with high and low sequence similarity are located in 60_identity/ and 30_identity/

Evaluation of alignments using structures

Creation of structure models from homologous sequences

The input for the script hhmakemodel.pl was created with HHsearch:

hhsearch -i /mnt/home/student/weish/master-practical-2013/task01/refseq_BCKDHA_protein.fasta -d /mnt/project/pracstrucfunc13/data/hhblits/pdb70_current_hhm_db -o hhsearch_BCKDHA.hhr -Z 200000 -B 200000

The models were made as follows:

/usr/share/hhsuite/scripts/hhmakemodel.pl hhsearch_BCKDHA.hhr -d /mnt/project/pracstrucfunc13/data/pdb/20120401/entries/* -m 2 4 5 6 10 20 28 31 39 43 -ts hhmakemodel_BCKDHA.pdb

As the -m option shows, 10 hits were chosen, and a model was made from each. The hits were selected to cover, as far as possible, the whole range of alignment scores (E-value, probability, sequence identity), but hits with very high E-values were excluded.
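The journal does not say how the ten hits were picked, so the following is only a sketch of one possible approach: drop hits above an E-value cutoff, then take hits at evenly spaced ranks. The function name `pick_spread`, the cutoff value, and the example hit list are all hypothetical and are not taken from the actual HHsearch output.

```python
from typing import List, Tuple

# (hit number in the .hhr file, E-value); purely illustrative type alias
Hit = Tuple[int, float]


def pick_spread(hits: List[Hit], n: int, max_evalue: float) -> List[int]:
    """Return up to n hit numbers spread evenly over the ranked hit list."""
    # Exclude hits whose E-value exceeds the (assumed) cutoff, then rank by E-value.
    kept = sorted((h for h in hits if h[1] <= max_evalue), key=lambda h: h[1])
    if n <= 0 or not kept:
        return []
    if len(kept) <= n:
        return [number for number, _ in kept]
    if n == 1:
        return [kept[0][0]]
    # Evenly spaced indices from the best hit (0) to the worst kept hit.
    step = (len(kept) - 1) / (n - 1)
    indices = [round(i * step) for i in range(n)]
    return [kept[i][0] for i in indices]


# Made-up E-values, only to show the call; hit 8 is dropped by the cutoff.
example_hits = [(1, 1e-60), (2, 1e-50), (3, 1e-40), (4, 1e-30),
                (5, 1e-20), (6, 1e-10), (7, 1e-3), (8, 50.0)]
print(pick_spread(example_hits, n=3, max_evalue=1.0))
```

Spacing by rank rather than by E-value keeps the selection from clustering at one end when the E-values span many orders of magnitude.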

The models are stored in /mnt/home/student/schillerl/MasterPractical/task4/hhmakemodel/.
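The command shown above passes all ten hit numbers in a single call, and the journal does not state how the separate models were produced from it. If one model file per hit is wanted, one option would be to call the script once per hit. The sketch below only builds those command lines and does not run them. It uses the flags that appear in the command above (-d, -m, -ts); the output file naming and the PDB directory argument without the trailing `*` are assumptions.

```python
import shlex
from typing import List

HHMAKEMODEL = "/usr/share/hhsuite/scripts/hhmakemodel.pl"


def hhmakemodel_commands(hhr_file: str, pdb_dir: str,
                         hits: List[int]) -> List[List[str]]:
    """Build one hhmakemodel.pl command line per selected hit."""
    commands = []
    for hit in hits:
        # Hypothetical naming scheme: one PDB file per hit number.
        out_file = f"hhmakemodel_BCKDHA_hit{hit}.pdb"
        commands.append([HHMAKEMODEL, hhr_file,
                         "-d", pdb_dir,
                         "-m", str(hit),
                         "-ts", out_file])
    return commands


# Hit numbers as listed in the -m option above.
selected = [2, 4, 5, 6, 10, 20, 28, 31, 39, 43]
for cmd in hhmakemodel_commands(
        "hhsearch_BCKDHA.hhr",
        "/mnt/project/pracstrucfunc13/data/pdb/20120401/entries",  # assumed form
        selected):
    print(shlex.join(cmd))  # print only; pass cmd to subprocess.run to execute
```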

Comparison with known structure

The models were structurally aligned with the reference structure 2BDF using LGA, which was run on the LGA server (http://proteinmodel.org/AS2TS/LGA/lga.html) with default parameters.

LGA results can be found in /mnt/home/student/schillerl/MasterPractical/task4/LGA/.

Statistical evaluation of the alignment scores (from sequence and structural alignments), in particular the correlation between them, was performed with R. See /mnt/home/student/schillerl/MasterPractical/task4/alignment_statistics.txt for an overview of the scores.
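The original analysis was done in R and the script is not shown here. As an illustration of the kind of calculation involved, the following Python sketch computes the Pearson and Spearman correlation between one sequence-alignment score and one structural-alignment score. Which scores were actually correlated is not stated, so the variable names and the numbers are placeholders and not results from this task.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Placeholder values only (not taken from alignment_statistics.txt).
sequence_score = [95.0, 40.0, 33.0, 25.0, 18.0]
structure_score = [98.0, 80.0, 85.0, 60.0, 35.0]
print("Pearson:", pearson(sequence_score, structure_score))
print("Spearman:", spearman(sequence_score, structure_score))
```

Spearman is included alongside Pearson because scores such as E-values are strongly non-linear, and a rank-based measure is less sensitive to that.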