Difference between revisions of "Canavan Disease: Task 04 - Journal"
From Bioinformatikpedia
Revision as of 13:00, 7 August 2013
Task 04 Working Log
Data gathering
- reference structure of protein -> <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=2I3C">2I3C</a>
- identical sequence with an occupied active centre (the ligand is not NAA but N-Hydroxy(methyl)phosphoryl-L-aspartate, an intermediate form) -> <a href="http://www.rcsb.org/pdb/explore.do?structureId=2O4H">2O4H</a>
- identical sequence with unbound active centre -> <a href="http://www.rcsb.org/pdb/explore.do?structureId=2Q51">2Q51</a>
- sequence similarity >60% (ASPA from rat) -> <a href="http://www.rcsb.org/pdb/explore.do?structureId=2GU2">2GU2</a>
- sequence similarity <30% (ASPA family protein from _Mesorhizobium loti_) -> <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=2QJ8">2QJ8</a>; retrieved by looking at the BLAST result for 2I3C and choosing a protein that is not in its 30% similarity cluster, then checking the 30% similarity cluster of 2QJ8 to confirm that 2I3C is not present there either
- CATH -> there is no CATH entry for ASPA; however, a search via FASTA returns the Cytosol amino peptidase-like domain with an E-value of 1.9E-12. A closer look at the EC diversity suggests that ASPA belongs to this superfamily, as its EC number is listed there. The assumed CATH number for ASPA is therefore 3.40.630.10
- Structure with matching CATH classification at the C, A and T levels (CAT) <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=1AYE">1AYE</a>
- Structure with matching CATH classification at the C and A levels (CA) <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=1BKJ">1BKJ</a>
- Structure with matching CATH classification at the C level only (C) <a href="http://www.rcsb.org/pdb/explore/derivedData.do?structureId=1BD0">1BD0</a>
- Structure with a different CATH classification <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=1b3u">1B3U</a>
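The choice of 2QJ8 rests on a reciprocal check against the 30% similarity clusters. A minimal Python sketch of that check, assuming the cluster memberships have already been collected into plain sets; the function name and the cluster contents shown are hypothetical placeholders, not data from this log:

```python
def is_distant_pair(query, candidate, clusters):
    """Return True if neither structure appears in the other's 30% cluster.

    `clusters` maps a PDB ID to the set of IDs in its 30% similarity
    cluster (hypothetical input, to be filled from the cluster listing).
    """
    query_cluster = clusters.get(query, set())
    candidate_cluster = clusters.get(candidate, set())
    return candidate not in query_cluster and query not in candidate_cluster


# Placeholder cluster contents for illustration only, not real cluster data.
example_clusters = {
    "2I3C": {"2I3C", "2O4H", "2Q51", "2GU2"},
    "2QJ8": {"2QJ8"},
}

print(is_distant_pair("2I3C", "2QJ8", example_clusters))
print(is_distant_pair("2I3C", "2GU2", example_clusters))
```

Checking both directions mirrors the double check described above: the candidate must be absent from the cluster of 2I3C, and 2I3C must be absent from the cluster of the candidate.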
PyMOL
- loaded the structures with "fetch XXXX"
- for 2I3C, 2O4H, 2Q51, 2GU2 and 2QJ8 a selection containing only chain A was created (via "select XXXX_A, XXXX and chain A")
- the alignment was done from the action menu -> align -> to selection -> 2I3C (alternatively via "align XXXX, 2I3C_A"; however, this gives a slightly different RMSD, most likely because the alignment via the action menu applies some additional preselection)
- switched to superposition with super (super XXXX & alt A+'', 2I3C_A & B+'')
- background set to white
- zinc selected via the selection tool and shown as a sphere
- all structures represented as cartoon
- all structures colored with one color
- 2I3C always orange
- ray
- save png with "png XXXX.png"
- RMSDs in Å (from align):
  - 2O4H = 0.341
  - 2Q51 = 0.172
  - 2GU2 = 0.619
  - 2QJ8 = 7.001
- Updated RMSDs in Å (after switching to super):
  - 2O4H = 0.445
  - 2Q51 = 0.223
  - 2GU2 = 0.493
  - 2QJ8 = 3.474
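The PyMOL steps above could be collected into a script. The Python sketch below only generates the command lines for one structure paired with 2I3C; it does not call PyMOL itself. It is an illustration under assumptions: the function name `build_pml` and the colour "marine" are hypothetical, the plain chain A selections stand in for the alt-location selectors of the logged super call, and the script assumes each fetch has finished before the next command runs.

```python
def build_pml(mobile, reference="2I3C", chain="A", color="marine"):
    """Return PyMOL command lines for one superposition image.

    The lines are meant to be saved as a .pml file and run inside PyMOL.
    """
    ref_sel = f"{reference}_{chain}"
    mob_sel = f"{mobile}_{chain}"
    return [
        f"fetch {reference}",
        f"fetch {mobile}",
        # Restrict both structures to a single chain, as in the log.
        f"select {ref_sel}, {reference} and chain {chain}",
        f"select {mob_sel}, {mobile} and chain {chain}",
        # Superimpose the mobile chain onto the reference chain.
        f"super {mob_sel}, {ref_sel}",
        "bg_color white",
        "hide everything",
        f"show cartoon, {ref_sel} or {mob_sel}",
        "show spheres, elem Zn",
        # The reference is always orange; the other structure gets one colour.
        f"color orange, {reference}",
        f"color {color}, {mobile}",
        "ray",
        f"png {mobile}.png",
    ]


print("\n".join(build_pml("2O4H")))
```

Writing the printed lines to a file such as 2O4H.pml (a hypothetical name) and loading it in PyMOL would repeat the same sequence for each of the compared structures.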
SSAP
- used the web interface
LGA
- used the web interface
CE
- used the web interface by PDB (jCE)
- stored only the overview
Topmatch
- used the web interface
Evaluation with structures
- generation of models:
  - first get the sequences used to generate the models via: hhsearch -i 2I3C.fasta -d /mnt/project/pracstrucfunc13/data/hhblits/pdb70_current_hhm_db -o 2i3C.hhr -Z 10000 -B 10000
  - choose the sequences based on sequence identity, probability and E-value -> compared the results of hhsearch to the results of PSI-BLAST from task 02 to get the "true sequence identity"
  - Table here:
  - generate the models with: /usr/share/hhsuite/scripts/hhmakemodel.pl -i ~/mapra/2i3C.hhr -d /mnt/project/pracstrucfunc13/data/pdb/20120401/entries/* -m 1 2 4 25 32 35 115 -ts ~/mapra/2I3C.pdb
  - all models are split into single files
  - hhsearch was run on the FASTA sequence of 2I3C_A -> 2I3C.pdb was likewise reduced to chain A only
  - LGA runs via the web server
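The model generation steps above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually scripted for this task: the helper names are hypothetical, the paths are placeholders, and only the command-line flags that appear in the log are used. The chain filter relies on the PDB format convention that the chain ID sits in column 22 of ATOM and HETATM records.

```python
import shlex


def hhsearch_cmd(fasta, database, out_hhr, limit=10000):
    """Argument list for hhsearch, using only the flags that appear in the log."""
    # -Z and -B raise the limits on listed hits and listed alignments.
    return ["hhsearch", "-i", fasta, "-d", database, "-o", out_hhr,
            "-Z", str(limit), "-B", str(limit)]


def hhmakemodel_cmd(script, hhr, template_pdbs, hits, out_pdb):
    """Argument list for hhmakemodel.pl; `hits` are hit numbers from the .hhr file."""
    return [script, "-i", hhr, "-d", template_pdbs,
            "-m", *[str(h) for h in hits], "-ts", out_pdb]


def keep_chain(pdb_lines, chain="A"):
    """Keep only the ATOM/HETATM records that belong to one chain."""
    return [line for line in pdb_lines
            if line.startswith(("ATOM", "HETATM"))
            and len(line) > 21 and line[21] == chain]


# Placeholder paths, not the ones used on the course server.
search = hhsearch_cmd("2I3C.fasta", "path/to/hhm_db", "2i3C.hhr")
model = hhmakemodel_cmd("hhmakemodel.pl", "2i3C.hhr", "path/to/pdb/entries",
                        [1, 2, 4, 25, 32, 35, 115], "2I3C.pdb")
print(shlex.join(search))
print(shlex.join(model))
```

The argument lists can be passed to `subprocess.run`; note that a shell glob such as `entries/*` is not expanded in that case. `keep_chain` takes the lines of a PDB file and returns those of chain A, which corresponds to reducing 2I3C.pdb to chain A before the comparison.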