Canavan Disease: Task 04 - Journal
From Bioinformatikpedia
Link back to Task 04: Structural Alignments
Task 04 Working Log
data gathering
- reference structure of protein -> <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=2I3C">2I3C</a>
- identical sequence with bound active center (the ligand is not NAA but N-Hydroxy(methyl)phosphoryl-L-aspartate, an intermediate form) -> <a href="http://www.rcsb.org/pdb/explore.do?structureId=2O4H">2O4H</a>
- identical sequence with unbound active center -> <a href="http://www.rcsb.org/pdb/explore.do?structureId=2Q51">2Q51</a>
- sequence similarity >60% (ASPA from rat) -> <a href="http://www.rcsb.org/pdb/explore.do?structureId=2GU2">2GU2</a>
- sequence similarity <30% (ASPA family protein from _Mesorhizobium loti_) -> <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=2QJ8">2QJ8</a>. Retrieved by inspecting the BLAST result for 2I3C and choosing a protein that is not in the 30% similarity cluster, then checking the 30% similarity cluster of 2QJ8 to confirm that 2I3C is not in it either
- CATH -> there is no CATH entry for ASPA; however, a search via FASTA returns the Cytosol amino peptidase-like domain with an E-value of 1.9E-12. Further inspection of the EC diversity suggests that ASPA belongs to this superfamily, as its EC number is listed there. The assumed CATH number for ASPA is therefore 3.40.630.10
- Structure with the same CATH classification up to level CAT <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=1AYE">1AYE</a>
- Structure with the same CATH classification up to level CA <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=1BKJ">1BKJ</a>
- Structure with the same CATH classification up to level C <a href="http://www.rcsb.org/pdb/explore/derivedData.do?structureId=1BD0">1BD0</a>
- Structure with a different CATH classification <a href="http://www.rcsb.org/pdb/explore/explore.do?structureId=1b3u">1B3U</a>
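The data set above can be kept in one place so that later steps iterate over the same entries. The following is only a minimal sketch: the role labels and the helper names `explore_url` and `targets` are made up for illustration, and the URL pattern is simply copied from the links in this log.

```python
# PDB entries from the list above, keyed by their role in the comparison.
# The role labels are hypothetical shorthand, not official names.
DATASET = {
    "reference": "2I3C",
    "identical sequence, bound active center": "2O4H",
    "identical sequence, unbound active center": "2Q51",
    "sequence similarity > 60%": "2GU2",
    "sequence similarity < 30%": "2QJ8",
    "CATH level CAT": "1AYE",
    "CATH level CA": "1BKJ",
    "CATH level C": "1BD0",
    "different CATH classification": "1B3U",
}

# URL pattern as used by the links in this log.
EXPLORE_URL = "http://www.rcsb.org/pdb/explore/explore.do?structureId={pdb_id}"


def explore_url(pdb_id):
    """Return the structure page URL for a PDB identifier."""
    return EXPLORE_URL.format(pdb_id=pdb_id.upper())


def targets(dataset=DATASET):
    """Return every PDB identifier except the reference structure."""
    return [pdb_id for role, pdb_id in dataset.items() if role != "reference"]


if __name__ == "__main__":
    # Print one tab-separated line per entry: identifier, role, URL.
    for role, pdb_id in DATASET.items():
        print(f"{pdb_id}\t{role}\t{explore_url(pdb_id)}")
```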
pymol
- loaded structures with "fetch XXXX"
- for 2I3C, 2O4H, 2Q51, 2GU2 and 2QJ8 a selection containing only chain A was created (via "select XXXX_A, XXXX and chain A")
- the alignment was done from the action menu -> align -> to selection -> 2I3C (alternatively via "align XXXX, 2I3C_A"; however, this gives a slightly different RMSD, most likely because the alignment via the action menu does some additional preselection)
- updated to superimpose (super XXXX & alt A+, 2I3C_A & B+)
- background set to white
- zinc selected via the selection tool and shown as sphere
- all structures represented as cartoon
- all structures colored with one color
- 2I3C always orange
- ray
- save png with "png XXXX.png"
- RMSDs (align, in Å):
- 2O4H = 0.341
- 2Q51 = 0.172
- 2GU2 = 0.619
- 2QJ8 = 7.001
- Updated RMSDs (super, in Å):
- 2O4H = 0.445
- 2Q51 = 0.223
- 2GU2 = 0.493
- 2QJ8 = 3.474
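The PyMOL steps above can be written down as a command script for each pair, so that the figures can be reproduced. The sketch below only generates the command text and does not call PyMOL itself. The function name `pymol_script` and the color `marine` for the second structure are assumptions for illustration; the alternate-location qualifiers used with `super` in the log are left out.

```python
REFERENCE = "2I3C"
# Entries for which a chain A selection was created in the log.
TARGETS = ["2O4H", "2Q51", "2GU2", "2QJ8"]


def pymol_script(target, reference=REFERENCE, method="super"):
    """Return PyMOL commands (one per line) for one pairwise superposition."""
    if method not in ("align", "super"):
        raise ValueError("method must be 'align' or 'super'")
    ref_sel = f"{reference}_A"
    tgt_sel = f"{target}_A"
    lines = [
        f"fetch {reference}",
        f"fetch {target}",
        # Restrict both structures to chain A.
        f"select {ref_sel}, {reference} and chain A",
        f"select {tgt_sel}, {target} and chain A",
        # Superpose the target (mobile) onto the reference.
        f"{method} {tgt_sel}, {ref_sel}",
        "bg_color white",
        "hide everything",
        "show cartoon",
        "show spheres, elem Zn",
        f"color orange, {reference}",  # 2I3C is always orange in the log
        f"color marine, {target}",     # assumed color, not from the log
        "ray",
        f"png {target}.png",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write one .pml file per target structure.
    for target in TARGETS:
        with open(f"{target}.pml", "w") as handle:
            handle.write(pymol_script(target))
```

Each generated file can then be loaded in PyMOL, for example with "@2O4H.pml". Passing method="align" instead reproduces the first set of RMSD values rather than the updated ones.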
SSAP
- used the web interface
LGA
- used the web interface
CE
- used the web interface by PDB (jCE)
- stored only the overview
Topmatch
- used the web interface
Evaluation with structures
- generation of models:
- first get the sequences which are used to generate the model via: hhsearch -i 2I3C.fasta -d /mnt/project/pracstrucfunc13/data/hhblits/pdb70currenthhm_db -o 2i3C.hhr -Z 10000 -B 10000
- choosing the sequences based on seqId / probability and E-value -> compared the results of hhsearch to the results of psiblast from task 02 to get the "true sequence identity"
- Table here:
- generate the models with: /usr/share/hhsuite/scripts/hhmakemodel.pl -i ~/mapra/2i3C.hhr -d /mnt/project/pracstrucfunc13/data/pdb/20120401/entries/* -m 1 2 4 25 32 35 115 -ts ~/mapra/2I3C.pdb
- all models are split into single files
- hhsearch was run on the FASTA sequence of 2I3C_A -> 2I3C.pdb was reduced to chain A as well
- LGA runs via the web server
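The two command lines used for model generation can be assembled from their parts, which makes it easier to rerun them with another hit selection. This is only a sketch that builds the command strings and prints them; it does not run anything. The helper names `hhsearch_cmd` and `hhmakemodel_cmd` are hypothetical, while all paths and flags are the ones recorded in this log.

```python
# Paths as recorded in the log above.
HHM_DB = "/mnt/project/pracstrucfunc13/data/hhblits/pdb70currenthhm_db"
PDB_ENTRIES = "/mnt/project/pracstrucfunc13/data/pdb/20120401/entries/*"
HHMAKEMODEL = "/usr/share/hhsuite/scripts/hhmakemodel.pl"


def hhsearch_cmd(fasta, hhr, max_hits=10000):
    """Build the hhsearch call; -Z and -B both get the same hit limit."""
    parts = ["hhsearch", "-i", fasta, "-d", HHM_DB, "-o", hhr,
             "-Z", str(max_hits), "-B", str(max_hits)]
    return " ".join(parts)


def hhmakemodel_cmd(hhr, hits, out_pdb):
    """Build the hhmakemodel.pl call for the given hit numbers."""
    parts = [HHMAKEMODEL, "-i", hhr, "-d", PDB_ENTRIES,
             "-m", *(str(hit) for hit in hits), "-ts", out_pdb]
    return " ".join(parts)


if __name__ == "__main__":
    # Hit numbers as chosen in the log.
    selected_hits = [1, 2, 4, 25, 32, 35, 115]
    print(hhsearch_cmd("2I3C.fasta", "2i3C.hhr"))
    print(hhmakemodel_cmd("~/mapra/2i3C.hhr", selected_hits, "~/mapra/2I3C.pdb"))
```

The strings are joined with plain spaces so that "~" and "*" are still expanded by the shell when a command is pasted there; this assumes that none of the paths contain spaces.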