Homology based structure predictions BCKDHA

From Bioinformatikpedia
Revision as of 21:27, 7 June 2011 by Reisinger (talk | contribs) (Swissmodel)

1. Calculation of models

To find structures similar to BCKDHA, we ran HHsearch:
hhsearch -i query -d database -o output

It found the following 10 hits in the pdb70 database.

No  Hit                             Prob  E-value  P-value  Score  SS   Cols  Query HMM  Template HMM   Identity
1   2bfd_A 2-oxoisovalerate dehydr  1.0   1        1        791.3  0.0  400   1-400      1-400 (400)    99%
2   1qs0_A 2-oxoisovalerate dehydr  1.0   1        1        571.5  0.0  349   32-382     52-407 (407)   39%
3   1w85_A Pyruvate dehydrogenase   1.0   1        1        530.8  0.0  356   8-382      6-362 (368)    34%
4   1umd_A E1-alpha, 2-OXO acid de  1.0   1        1        521.8  0.0  351   34-386     16-367 (367)   37%
5   2ozl_A PDHE1-A type I, pyruvat  1.0   1        1        482.7  0.0  331   46-380     25-356 (365)   27%
6   3l84_A Transketolase; TKT, str  1.0   1        1        85.4   0.0  133   161-297    113-252 (632)  21%
7   2r8o_A Transketolase 1, TK 1;   1.0   1        1        74.5   0.0  121   161-285    113-245 (669)  33%
8   2o1x_A 1-deoxy-D-xylulose-5-ph  1.0   1        1        74.2   0.0  127   161-287    122-254 (629)  18%
9   1gpu_A Transketolase; transfer  1.0   1        1        74.2   0.0  140   161-302    115-265 (680)  22%
10  3m49_A Transketolase; alpha-be  1.0   1        1        68.8   0.0  121   161-285    139-271 (690)  31%

> 60% sequence identity:
-2bfd_A
> 40% sequence identity:
-(none)
< 40% sequence identity (ideally go towards 20%) :
-1qs0_A, 1umd_A, 1w85_A, 2r8o_A, 3m49_A, 2ozl_A, 1gpu_A, 3l84_A, 2o1x_A

HHsearch returned only hits with a sequence identity above 60% or below 40%.
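
The grouping above can be reproduced with a few lines of Python. This is only a minimal sketch: the hit names and identities are copied from the HHsearch table, while the function name `bin_by_identity`, the bin labels and the handling of values exactly at 40% or 60% are assumptions made for illustration.

```python
# Hits copied from the HHsearch table above: (template, sequence identity in %).
HITS = [
    ("2bfd_A", 99), ("1qs0_A", 39), ("1w85_A", 34), ("1umd_A", 37),
    ("2ozl_A", 27), ("3l84_A", 21), ("2r8o_A", 33), ("2o1x_A", 18),
    ("1gpu_A", 22), ("3m49_A", 31),
]


def bin_by_identity(hits, high=60, low=40):
    """Sort templates into the three identity classes used on this page.

    Assumption: values exactly equal to a threshold fall into the lower class.
    """
    bins = {"above_60": [], "between_40_and_60": [], "below_40": []}
    for name, identity in hits:
        if identity > high:
            bins["above_60"].append(name)
        elif identity > low:
            bins["between_40_and_60"].append(name)
        else:
            bins["below_40"].append(name)
    return bins


bins = bin_by_identity(HITS)
for label, names in bins.items():
    print(label, names)
```

With the table above, only 2bfd_A lands in the highest class and the middle class stays empty, which matches the manual grouping.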

These are the templates we will work with:
> 60% sequence identity:
-2bfd_A
< 40% sequence identity (ideally go towards 20%) :
-2r8o_A

Modeller

MODELLER is used for homology or comparative modeling of protein three-dimensional structures. It calculates a model containing all non-hydrogen atoms. MODELLER also provides many other tasks, such as de novo modeling of loops in protein structures, optimization of various models of protein structure with respect to a flexibly defined objective function, multiple alignment of protein sequences and/or structures, clustering, searching of sequence databases, and comparison of protein structures.[1]

Tutorials are provided at [2] and [3].

To run MODELLER with more than one template, we use the following templates:
-1DTW:A 95%
-2BFE:A 94%
-2BFB:A 99%
-2bfd:A 99%
-3m49:A 31%
-1gpu:A 22%
-2o1x:A 18%

Protocol Modeller

SWISS-MODEL

SWISS-MODEL server page


SWISS-MODEL is a fully automated protein structure homology-modeling server. It is accessible via the ExPASy web server or from the program DeepView (Swiss-PdbViewer). As input it needs a protein sequence or a UniProt accession code. Optionally, a template PDB ID with its chain, or a template file, can be specified.

SWISS-MODEL server:

ID link
2bfd_A 2bfd_A
2r8o_A 2r8o_A


Prediction for 2bfd_A

TARGET    51      KPQFPGAS AEFIDKLEFI QPNVISGIPI YRVMDRQGQI INPSEDPHLP
2bfdA     6       kpqfpgas aefidklefi qpnvisgipi yrvmdrqgqi inpsedphlp
                                                                     
TARGET                                             sss   ss s         
2bfdA                                              sss   ss s         


TARGET    99    KEKVLKLYKS MTLLNTMDRI LYESQRQGRI SFYMTNYGEE GTHVGSAAAL
2bfdA     54    kekvlklyks mtllntmdri lyesqrqgri sfymtnygee gthvgsaaal
                                                                     
TARGET          hhhhhhhhhh hhhhhhhhhh hhhhhhh             h hhhhhhhh  
2bfdA           hhhhhhhhhh hhhhhhhhhh hhhhhhh             h hhhhhhhh  


TARGET    149   DNTDLVFGQY REAGVLMYRD YPLELFMAQC YGNISDLGKG RQMPVHYGCK
2bfdA     104   dntdlvfgqa reagvlmyrd yplelfmaqc ygnisdlgkg rqmpvhygck
                                                                     
TARGET              sss       hhhhh     hhhhhhhh h                    
2bfdA               sss       hhhhhh    hhhhhhhh h                    


TARGET    199   ERHFVTISSP LATQIPQAVG AAYAAKRANA NRVVICYFGE GAASEGDAHA
2bfdA     154   erhfvtissp latqipqavg aayaakrana nrvvicyfge gaasegdaha
                                                                     
TARGET                        hhhhhhh hhhhhhhh     ssssssss hhh   hhhh
2bfdA                         hhhhhhh hhhhhhhh     ssssssss hhh   hhhh


TARGET    249   GFNFAATLEC PIIFFCRNNG YAISTPTSEQ YRGDGIAARG PGYGIMSIRV
2bfdA     204   gfnfaatlec piiffcrnng yaistptseq yrgdgiaarg pgygimsirv
                                                                     
TARGET          hhhhhhhh   ssssssss                    hhhh hhh  sssss
2bfdA           hhhhhhhh   ssssssss                    hhhh hhh  sssss


TARGET    299   DGNDVFAVYN ATKEARRRAV AENQPFLIEA MTYRIGHHST SDDSSAYRSV
2bfdA     254   dgndvfavyn atkearrrav aenqpfliea mtyrig---- ----------
                                                                     
TARGET          ss  hhhhhh hhhhhhhhhh hh   sssss ss                   
2bfdA           ss  hhhhhh hhhhhhhhhh hh   sssss ss                   


TARGET    349   DEVNYWDKQD HPISRLRHYL LSQGWWDEEQ EKAWRKQSRR KVMEAFEQAE
2bfdA     292   -------std hpisrlrhyl lsqgwwdeeq ekawrkqsrr kvmeafeqae
                                                                      
TARGET                      hhhhhhhh    h    hhh hhhhhhhhhh hhhhhhhhhh
2bfdA                       hhhhhhhh    h    hhh hhhhhhhhhh hhhhhhhhhh


TARGET    399   RKPKPNPNLL FSDVYQEMPA QLRKQQESLA RHLQTYGEHY PLDHFDK   
2bfdA     354   rkpkpnpnll fsdvyqempa qlrkqqesla rhlqtygehy pldhfdk-  
                                                                     
TARGET          h                   h hhhhhhhhhh hhhhh                
2bfdA           h                   h hhhhhhhhhh hhhhh                
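
To put a number on how closely target and template agree in an alignment like the one above, the percentage of identical aligned residues can be computed per block. The following is a minimal sketch: the function name `percent_identity` is hypothetical, and counting only columns without a gap is an assumption (HHsearch and SWISS-MODEL may define identity differently). The two example strings are the TARGET and 2bfdA rows of the block starting at target position 149.

```python
def percent_identity(target_row, template_row):
    """Percentage of identical residues over aligned, gap-free columns.

    Spaces (used here only for visual grouping) are removed and case is
    ignored, because the template row is printed in lower case.
    """
    a = target_row.replace(" ", "").upper()
    b = template_row.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


target_block = "DNTDLVFGQY REAGVLMYRD YPLELFMAQC YGNISDLGKG RQMPVHYGCK"
template_block = "dntdlvfgqa reagvlmyrd yplelfmaqc ygnisdlgkg rqmpvhygck"
print(percent_identity(target_block, template_block))  # 49 of 50 columns match
```

In this block the only difference is the Y/a pair at the end of the first group of ten residues.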


Prediction for 2r8o_A


TARGET    152           DL -VFG-QYREA ---GVLMYRD --YPLELFMA QCYGNISDLG
2r8oA     52    pswadr--dr fvlsnghgsm liysllhltg ydlpmeelkn -f-rql----
                                                                     
TARGET                              s    ssssss              sssssssss
2r8oA                    s ssss     h hhhhhhhh        hhhh            


TARGET    187   KGRQMPVHYG CK-ERHFVTI SSPLATQIPQ AVGAAYAAKR AN--------
2r8oA     94    -hsktpghpe vgytagvett tgplgqgian avgmaiaekt laaqfnrpgh
                                                                     
TARGET          s  sssss        hhhhh h     hhhh hhhhhhhhhh h         
2r8oA                                       hhhh hhhhhhhhhh hhhhh     


TARGET    228   --ANRVVICY FGEGAASEGD AHAGFNFAAT LEC-PIIFFC RNNGYAISTP
2r8oA     143   divdhytyaf mgdgcmmegi shevcslagt lklgkliafy ddngisidgh
                                                                     
TARGET                ssss s hhhh   h hhhhhhhhhh h    sssss ss sss  ss
2r8oA                sssss s hhhh   h hhhhhhhhhh h   ssssss ss sss  ss


TARGET    275   TSEQYRGDGI AARGPGYGIM SIR-VDGNDV FAVYNATKEA RRRAVAENQP
2r8oA     193   vegwft-ddt amrfeaygwh virdidghda asikraveea ra---vtdkp
                                                                     
TARGET          s        h hhhhhhh  s sss sss  h hhhhhhhhhh h        s
2r8oA           s        h hhhhhh   s ss  sss  h hhhhhhhhhh hh       s


TARGET    324   FLIEAMTYRI GHHSTSDDSS ----AYRSVD EVNYWDKQ - ----------
2r8oA     239   sllmcktiig fgspnkagth dshgaplgda eialtreqlg wkyapfeips
                                                                      
TARGET          sssssss                       hh hhhhhhhh             
2r8oA           sssssss               hh      hh hhhhhhhhh           h

iTasser

2bfd_A
2bfd_A


This prediction is based on several templates found by iTasser itself.

Protocol Swissmodel

2. Evaluation of models

As we want to compare the predicted models to the existing PDB entry, we need the PDB file for 1U5B. Only chain A is relevant here, because the sequence of chain A was used to create the models. Chain B therefore has to be removed from the PDB file. For this we used a program named ExtractChains.pl, provided by <ref>http://www.rosettacommons.org/guide/PDB+Manipulation+Scripts</ref>.
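
For readers without the Perl script, the same filtering step can be sketched in Python. This is not a reimplementation of ExtractChains.pl, whose exact behaviour is not described here. It only relies on the fixed-column PDB layout, in which the chain identifier of a coordinate record sits in column 22. The function names and the toy records with made-up coordinates are assumptions for illustration.

```python
# Record types that carry a chain identifier in column 22 of a PDB file.
COORD_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def extract_chain(pdb_lines, chain_id="A"):
    """Return only the coordinate records that belong to one chain.

    Header and other non-coordinate records are dropped in this sketch.
    """
    kept = []
    for raw in pdb_lines:
        line = raw.rstrip("\n")
        record = line[:6].strip()
        if record in COORD_RECORDS and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    return kept


def make_atom_line(serial, chain, resseq, x):
    """Build a toy C-alpha ATOM record (made-up coordinates, not from 1U5B)."""
    return (f"ATOM  {serial:5d}  CA  ALA {chain}{resseq:4d}    "
            f"{x:8.3f}{0.0:8.3f}{0.0:8.3f}")


toy_pdb = [
    "HEADER    TOY EXAMPLE",
    make_atom_line(1, "A", 1, 0.0),
    make_atom_line(2, "A", 2, 3.8),
    make_atom_line(3, "B", 1, 7.6),
]
chain_a = extract_chain(toy_pdb, "A")
print(len(chain_a))
```

For a real file, the lines would be read with `open(...)` and the result written back out. The file names are left to the reader.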

Swissmodel

Numeric evaluation

QMEAN4 global scores

QMEANscore4

2bfd_A 2r8o_A
0.67 0.271


QMEAN Z-Score

2bfd_A 2r8o_A
-1.604 -6.943
Z-Score plot1 2bfd_A
Z-Score plot1 2r8o_A
Z-Score plot2 2bfd_A
Z-Score plot2 2r8o_A


Score components

2bfd_A 2r8o_A
score components 2bfd_A
score components 2r8o_A


Local scores

2bfd_A 2r8o_A
Coloring by residue error 2bfd_A
Coloring by residue error 2r8o_A
Residue error plot 2bfd_A
Residue error plot 2r8o_A


Global scores: QMEAN4:

                             2bfd_A                 2r8o_A
Scoring function term        Raw score   Z-score    Raw score   Z-score
C_beta interaction energy    -162.66     0.54       -47.91      -1.49
All-atom pairwise energy     -10811.93   0.35       -2558.65    -1.98
Solvation energy             -27.04      -1.02      10.53       -4.08
Torsion angle energy         -75.78      -1.45      18.95       -4.99
QMEAN4 score                 0.670       -1.60      0.271       -6.94


Local Model Quality Estimation

2bfd_A 2r8o_A
Local Model Quality Estimation 2bfd_A
Local Model Quality Estimation 2r8o_A


Comparison to experimental structure

Experimental structure   Model with template   C-alpha RMSD
1U5B_A                   2BFD_A                1.1 [4]
1U5B_A                   2R8O_A                2.9 [5]
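
The C-alpha RMSD is the square root of the mean squared distance between corresponding C-alpha atoms of two structures. The sketch below shows only this final step. It assumes that the two coordinate lists are already superposed and paired residue by residue, which the tools behind [4] and [5] presumably handle themselves. The function name `ca_rmsd` and the toy coordinates are illustrative assumptions, not values from the models.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) C-alpha coordinates.

    Assumption: the structures are already superposed and the i-th entry of
    each list refers to the same residue.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every atom of the second set is shifted by 1 along z.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(ca_rmsd(reference, model))
```

Without prior superposition the value also reflects the overall displacement of the two structures, so it would overestimate the structural difference.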

iTasser

Numeric evaluation