Lab journal task 4

From Bioinformatikpedia
 
Latest revision as of 00:52, 13 August 2013

Explore structural alignments

We assembled the set of structures manually, using the results from the sequence search and by browsing CATH. To compute the pairwise sequence identity, we constructed pairwise sequence alignments with EMBOSS Needle (http://www.ebi.ac.uk/Tools/psa/emboss_needle/) using default parameters.
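The identity value can also be recomputed from any pairwise alignment. The following is a minimal sketch, assuming that identity is defined as the number of identical aligned positions divided by the full alignment length (gap columns included), which is how we read the Needle report. The function name percent_identity is our own and not part of EMBOSS.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Assumption: identity = identical columns / alignment length,
    with gap columns counted in the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    a_up, b_up = aligned_a.upper(), aligned_b.upper()
    matches = sum(1 for a, b in zip(a_up, b_up) if a == b and a != gap)
    return 100.0 * matches / len(a_up)


if __name__ == "__main__":
    # Toy alignment for illustration only, not taken from our data.
    print(round(percent_identity("ACDE-G", "ACDFHG"), 1))
```

The two input strings must be the gapped rows of the alignment, not the raw sequences.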

PyMOL was used to view the different structures and to construct structural alignments to the template 1A6Z_A. For this we used the command "align to structure" from the action menu.

By default, this PyMOL alignment uses only the C-alpha atoms. The following command can be used to align all atoms and to create the alignment object ali12:

align structure1 and resi 1-n, structure2 and resi 1-m, object=ali12 

where n is the number of the last residue in structure 1 and m the number of the last residue in structure 2 (resi selects residues, not atoms).

We saved the current view of the first alignment in a variable

my_view = cmd.get_view()

and later used this variable to restore the same orientation, so that the images of the different alignments were all made from the same view.

cmd.set_view(my_view)


Apart from PyMOL, we used the following four alignment methods through their web interfaces:

Method name   Location
LGA           http://proteinmodel.org/AS2TS/LGA/lga.html
SSAP          http://protein.hbu.cn/cath/cathwww.biochem.ucl.ac.uk/cgi-bin/cath/GetSsapRasmol.html
TopMatch      https://topmatch.services.came.sbg.ac.at/
CE            http://source.rcsb.org/jfatcatserver/index.jsp

LGA:

  • superimpose onto the template structure (the second structure, in our case 1A6Z)
  • default parameters

SSAP:

  • no target structure to specify
  • no parameters to set
  • score from 0 to 100

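TopMatch:

  • target 1A6Z
  • "match"
  • default parameters
  • target in blue, query in green
  • structurally equivalent residue pairs: orange (query) and red (target)
  • S = "structural similarity, if the structurally equivalent parts in query and target match perfectly, S is equal to L. With increasing spatial deviation of the aligned residues, S approaches 0." (https://topmatch.services.came.sbg.ac.at/help/tm_help.html)
  • S_r = "Typical distance error" (https://topmatch.services.came.sbg.ac.at/help/tm_help.html)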

CE:

  • no template to specify
  • no parameters to set
  • jCE algorithm


Structural alignments for evaluating sequence alignments

The .hhr input file for hhmakemodel.pl was created with hhsearch using the reference structure 1A6Z_A and the current pdb70 database. The maximum number of lines in the summary hit list (-Z) and the maximum number of alignments (-B) were both set to 10000:

hhsearch -i /mnt/home/student/betza/data/1A6Z_A.fasta -d /mnt/project/pracstrucfunc13/data/hhblits/pdb70_current_hhm_db -o /mnt/home/student/betza/task4/1A6Z_A_pdb.hhr -Z 10000 -B 10000

The models were then created with hhmakemodel.pl:

perl /usr/share/hhsuite/scripts/hhmakemodel.pl -i /mnt/home/student/betza/task4/1A6Z_A_pdb.hhr -ts /mnt/home/student/betza/task4/1A6Z_A_models.pdb

The hhsearch results contain 449 proteins, so we selected 9 proteins that span different ranges of probability, E-value and sequence identity. We did not take proteins with an E-value above 1, because only hits with a low E-value can be considered true hits. The sequence identity was taken from the PSI-BLAST runs where available, but not all of the 9 proteins were contained in those results. For the remaining proteins, we calculated pairwise sequence alignments with EMBOSS Needle, as described above, using default parameters.
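A selection of this kind could also be scripted. The following is only a sketch of one possible strategy, not the procedure we followed (we picked the 9 proteins by hand). It assumes the hits are already available as a ranked list of dictionaries with a hypothetical "evalue" key, drops everything above the E-value cutoff, and then takes evenly spaced entries from the remaining ranking.

```python
def select_spread(hits, max_evalue=1.0, k=9):
    """Keep hits with E-value <= max_evalue, then pick up to k of them
    spread evenly over the (already ranked) list.

    `hits` is a hypothetical list of dicts with at least an "evalue" key;
    parsing the .hhr file itself is not covered here.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    kept = [h for h in hits if h["evalue"] <= max_evalue]
    if len(kept) <= k:
        return kept
    if k == 1:
        return kept[:1]
    # Evenly spaced positions from the first to the last remaining hit.
    step = (len(kept) - 1) / (k - 1)
    return [kept[round(i * step)] for i in range(k)]


if __name__ == "__main__":
    # Made-up ranking for illustration: 20 hits with growing E-values.
    toy_hits = [{"rank": i, "evalue": 10.0 ** (i - 15)} for i in range(1, 21)]
    print([h["rank"] for h in select_spread(toy_hits, k=3)])
```

Spreading over the rank is only a stand-in for spreading over probability, E-value and identity. A real selection would have to look at all three columns.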

hhmakemodel.pl was executed for the selected proteins with the following command:

perl /usr/share/hhsuite/scripts/hhmakemodel.pl -i /mnt/home/student/betza/task4/1A6Z_A_pdb.hhr -d /mnt/project/pracstrucfunc13/data/pdb/20120401/entries -m 2 30 47 65 103 141 151 160 253 -ts /mnt/home/student/betza/task4/1A6Z_A_models.pdb

The resulting structures from 1A6Z_A_models.pdb were then extracted into single PDB files. Using these models, we created structural alignments to the reference with LGA.
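The extraction into single files can be done with a short script. The following is a minimal sketch, assuming that each model in the combined file starts with a MODEL record and ends with ENDMDL or END. This layout is an assumption about the file and should be checked against the actual output. The function names split_models and write_models are our own.

```python
from pathlib import Path


def split_models(text):
    """Split multi-model PDB text into one string per model.

    Assumption: every model starts at a MODEL record and is closed by
    ENDMDL or END; lines outside a model (file headers) are skipped.
    """
    models, current = [], None
    for line in text.splitlines():
        record = line.split()[0] if line.strip() else ""
        if record == "MODEL":
            current = [line]
        elif current is not None:
            current.append(line)
            if record in ("ENDMDL", "END"):
                models.append("\n".join(current) + "\n")
                current = None
    if current is not None:  # last model without a closing record
        models.append("\n".join(current) + "\n")
    return models


def write_models(models, out_dir, prefix="model"):
    """Write each model to <out_dir>/<prefix>_<n>.pdb and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, model in enumerate(models, start=1):
        path = out / f"{prefix}_{number}.pdb"
        path.write_text(model)
        paths.append(path)
    return paths
```

A call such as write_models(split_models(Path("1A6Z_A_models.pdb").read_text()), "models") would then produce one file per model in a directory named models.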

Pearson's correlation coefficient was computed with R.
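For reference, the coefficient is the covariance of the two variables divided by the product of their standard deviations. The following is a minimal Python sketch of that formula, not the R code we ran. The function name pearson_r is our own and the input values are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two value pairs are needed")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up values for illustration only.
    print(pearson_r([1, 2, 3, 4], [1, 3, 2, 4]))
```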