Difference between revisions of "Homology based structure predictions BCKDHA"
Revision as of 17:16, 13 June 2011
1.Calculation of models
Template selection
Homology modelling is a technique to predict the three-dimensional structure of a target protein. It is based on an alignment of the target sequence with one or more template sequences whose structures are known. The target is then modelled on the basis of the template structure. The better the alignment, the better the predicted structure of the target. Template selection is therefore a crucial step in homology modelling.
To find similar structures to BCKDHA we ran HHsearch using the following command:
hhsearch -i query -d database -o output
It found the following 10 hits in the pdb70 database.
No | Hit | Prob | E-value | P-value | Score | SS | Cols | Query HMM | Template HMM | Identity |
---|---|---|---|---|---|---|---|---|---|---|
1 | 2bfd_A 2-oxoisovalerate dehydr | 1.0 | 1 | 1 | 791.3 | 0.0 | 400 | 1-400 | 1-400 (400) | 99% |
2 | 1qs0_A 2-oxoisovalerate dehydr | 1.0 | 1 | 1 | 571.5 | 0.0 | 349 | 32-382 | 52-407 (407) | 39% |
3 | 1w85_A Pyruvate dehydrogenase | 1.0 | 1 | 1 | 530.8 | 0.0 | 356 | 8-382 | 6-362 (368) | 34% |
4 | 1umd_A E1-alpha, 2-OXO acid de | 1.0 | 1 | 1 | 521.8 | 0.0 | 351 | 34-386 | 16-367 (367) | 37% |
5 | 2ozl_A PDHE1-A type I, pyruvat | 1.0 | 1 | 1 | 482.7 | 0.0 | 331 | 46-380 | 25-356 (365) | 27% |
6 | 3l84_A Transketolase; TKT, str | 1.0 | 1 | 1 | 85.4 | 0.0 | 133 | 161-297 | 113-252 (632) | 21% |
7 | 2r8o_A Transketolase 1, TK 1; | 1.0 | 1 | 1 | 74.5 | 0.0 | 121 | 161-285 | 113-245 (669) | 33% |
8 | 2o1x_A 1-deoxy-D-xylulose-5-ph | 1.0 | 1 | 1 | 74.2 | 0.0 | 127 | 161-287 | 122-254 (629) | 18% |
9 | 1gpu_A Transketolase; transfer | 1.0 | 1 | 1 | 74.2 | 0.0 | 140 | 161-302 | 115-265 (680) | 22% |
10 | 3m49_A Transketolase; alpha-be | 1.0 | 1 | 1 | 68.8 | 0.0 | 121 | 161-285 | 139-271 (690) | 31% |
- > 60% sequence identity: 2bfd_A
- > 40% sequence identity: none
- < 40% sequence identity (ideally go towards 20%): 1qs0_A, 1umd_A, 1w85_A, 2r8o_A, 3m49_A, 2ozl_A, 1gpu_A, 3l84_A, 2o1x_A
HHsearch returned only hits with a sequence identity above 60% or below 40%.
These are the templates we will work with:
- > 60% sequence identity: 2bfd_A
- < 40% sequence identity (ideally go towards 20%): 2r8o_A
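The grouping of hits by sequence identity can be sketched in a few lines of Python. This is only an illustration: the identity values are copied from the HHsearch table above, while the function name `bin_by_identity` and the bin labels are hypothetical and not part of HHsearch.

```python
# Identity values (in percent) copied from the HHsearch hit table.
hits = {
    "2bfd_A": 99, "1qs0_A": 39, "1w85_A": 34, "1umd_A": 37, "2ozl_A": 27,
    "3l84_A": 21, "2r8o_A": 33, "2o1x_A": 18, "1gpu_A": 22, "3m49_A": 31,
}


def bin_by_identity(hits, high=60, low=40):
    """Group template hits into the three identity classes used in the text.

    Assumption: a hit at exactly the `low` cutoff is counted as 'low'.
    """
    bins = {"high": [], "medium": [], "low": []}
    for name, identity in hits.items():
        if identity > high:
            bins["high"].append(name)
        elif identity > low:
            bins["medium"].append(name)
        else:
            bins["low"].append(name)
    return bins


bins = bin_by_identity(hits)
for label, names in bins.items():
    print(label, names)
```

With the values from the table, only 2bfd_A falls into the high-identity class and the middle class stays empty, which is why one template was chosen from each of the two remaining classes.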
Modeller
MODELLER is used for homology or comparative modelling of protein three-dimensional structures. It calculates a model containing all non-hydrogen atoms. There are also many other tasks provided by MODELLER like de novo modelling of loops in protein structures, optimization of various models of protein structure with respect to a flexibly defined objective function, multiple alignment of protein sequences and/or structures, clustering, searching of sequence databases, comparison of protein structures, etc.[1]
A tutorial is provided on [2] and on [3]
To run MODELLER with more than one template we used the following templates (the percentages indicate the sequence identity to the target):
- 1dtw:A 95%
- 2bfe:A 94%
- 2bfb:A 99%
- 2bfd:A 99%
- 1gpu:A 22%
- 2o1x:A 18%
- 2r8o:A 33%
SWISS-MODEL
SWISS-MODEL is a fully automated protein structure homology-modelling server. As input it needs a protein sequence or a UniProt accession code. Optionally, a template PDB ID with chain or a template file can be supplied. The server is accessible via the ExPASy web server or from the program DeepView (Swiss Pdb-Viewer).
SWISS-MODEL server:
ID | link |
---|---|
2bfd_A | 2bfd_A |
2r8o_A | 2r8o_A |
Prediction for 2bfd_A
TARGET  51 KPQFPGAS AEFIDKLEFI QPNVISGIPI YRVMDRQGQI INPSEDPHLP
2bfdA    6 kpqfpgas aefidklefi qpnvisgipi yrvmdrqgqi inpsedphlp
TARGET     sss ss s
2bfdA      sss ss s

TARGET  99 KEKVLKLYKS MTLLNTMDRI LYESQRQGRI SFYMTNYGEE GTHVGSAAAL
2bfdA   54 kekvlklyks mtllntmdri lyesqrqgri sfymtnygee gthvgsaaal
TARGET     hhhhhhhhhh hhhhhhhhhh hhhhhhh h hhhhhhhh
2bfdA      hhhhhhhhhh hhhhhhhhhh hhhhhhh h hhhhhhhh

TARGET 149 DNTDLVFGQY REAGVLMYRD YPLELFMAQC YGNISDLGKG RQMPVHYGCK
2bfdA  104 dntdlvfgqa reagvlmyrd yplelfmaqc ygnisdlgkg rqmpvhygck
TARGET     sss hhhhh hhhhhhhh h
2bfdA      sss hhhhhh hhhhhhhh h

TARGET 199 ERHFVTISSP LATQIPQAVG AAYAAKRANA NRVVICYFGE GAASEGDAHA
2bfdA  154 erhfvtissp latqipqavg aayaakrana nrvvicyfge gaasegdaha
TARGET     hhhhhhh hhhhhhhh ssssssss hhh hhhh
2bfdA      hhhhhhh hhhhhhhh ssssssss hhh hhhh

TARGET 249 GFNFAATLEC PIIFFCRNNG YAISTPTSEQ YRGDGIAARG PGYGIMSIRV
2bfdA  204 gfnfaatlec piiffcrnng yaistptseq yrgdgiaarg pgygimsirv
TARGET     hhhhhhhh ssssssss hhhh hhh sssss
2bfdA      hhhhhhhh ssssssss hhhh hhh sssss

TARGET 299 DGNDVFAVYN ATKEARRRAV AENQPFLIEA MTYRIGHHST SDDSSAYRSV
2bfdA  254 dgndvfavyn atkearrrav aenqpfliea mtyrig---- ----------
TARGET     ss hhhhhh hhhhhhhhhh hh sssss ss
2bfdA      ss hhhhhh hhhhhhhhhh hh sssss ss

TARGET 349 DEVNYWDKQD HPISRLRHYL LSQGWWDEEQ EKAWRKQSRR KVMEAFEQAE
2bfdA  292 -------std hpisrlrhyl lsqgwwdeeq ekawrkqsrr kvmeafeqae
TARGET     hhhhhhhh h hhh hhhhhhhhhh hhhhhhhhhh
2bfdA      hhhhhhhh h hhh hhhhhhhhhh hhhhhhhhhh

TARGET 399 RKPKPNPNLL FSDVYQEMPA QLRKQQESLA RHLQTYGEHY PLDHFDK
2bfdA  354 rkpkpnpnll fsdvyqempa qlrkqqesla rhlqtygehy pldhfdk-
TARGET     h h hhhhhhhhhh hhhhh
2bfdA      h h hhhhhhhhhh hhhhh
Prediction for 2r8o
TARGET   1 SS LDDKPQFPGA SAEFIDKLEF IQPNVISGIP
2r8oA    2 ssrkelanai ralsmd--av qkaksghpga pmgmadiaev lwrdflkhnp
TARGET     h hhh hh hh hhhhh hhhh
2r8oA      hhhhhhhh hhhhhh hh hhh hh hh hhhhh hhhh

TARGET  33 IYRVMDRQGQ IINPSEDPHL PKEKVLKLYK ---SMTLLNT MDRILYESQR
2r8oA   50 qnpswadrdr fvlsnghgsm liysllhltg ydlpmeelkn frqlhsktpg
TARGET     s ssss h hhhhhhhhh
2r8oA      s ssss h hhhhhhhh hhhh

TARGET  80 QGRISF---Y MTNYGEEGTH VGSAAALDNT DLVFG-QYRE AGVLMYRDYP
2r8oA  100 hpevgytagv etttgplgqg ianavgmaia ektlaaqfnr pghdivdhyt
TARGET     h hhhhhhhhhh hhhh ss
2r8oA      h hhhhhhhhhh hhhhhhhh ss

TARGET 126 LELFMAQCYG NIS-----DL GKGRQMPVHY GCKERHFVTI SS--------
2r8oA  150 yafmgdgcmm egishevcsl agtlklgkli afyddngisi dghvegwftd
TARGET     ssss hhhhh sss sssss
2r8oA      ssss hhhh hhhhhhhh hhhh sss sssss sss sss

TARGET 163 ---------- ---------P LATQIPQAVG AAYAAKRANA NRVVICYFG-
2r8oA  200 dtamrfeayg whvirdidgh daasikrave earavtdkps llmcktiigf
TARGET     hhhhhhhhh hhhh s ssssss
2r8oA      hhhhhhh sss sss hhhhhhhhh hhhh ss ssssss

TARGET 193 ---------- EGAASEGDAH A--------- -GFNFAATLE CPIIFFCRNN
2r8oA  250 gspnkagthd shgaplgdae ialtreqlgw kyapfeipse iyaqwdakea
TARGET     h hhhh hhh
2r8oA      h h hhh hhhhhhhh hh hhhh hhh

TARGET 223 GYAISTPTSE QYRGDG---- ---------- ---------- ----------
2r8oA  300 gqakesawne kfaayakayp qeaaeftrrm kgempsdfda kakefiaklq
TARGET     hhhhhhhhhh
2r8oA      hhhhhhhhhh hhhhhhh hhhhhhhhhh h hhhh hhhhhhhhhh

TARGET 239 -----IAARG PGYGI----- --MSIRVDGN DVFAVYNATK EARRRAVAEN
2r8oA  350 anpakiasrk asqnaieafg pllpeflggs adlapsnltl wsgskained
TARGET     hhhh
2r8oA      h hhh hhhhhhhhhh hh ssssss s

TARGET 277 QPFLIEAM-- --------TY RIGHHS---- ---------- ----------
2r8oA  400 aagnyihygv refgmtaian gislhggflp ytstflmfve yarnavrmaa
TARGET     h hhhh
2r8oA      sss hhhhhhhhh hhhh ss ssss h hhhhhhh

TARGET 293 ---------- ------TSDD SSAYRSVDEV NYW-----DK QDHPISRLRH
2r8oA  450 lmkqrqvmvy thdsiglged gpthqpveqv aslrvtpnms twrpcdqves
TARGET     hhhh
2r8oA      hh sssss ss hh hhhh s ss hhhh

TARGET 322 YLLSQGWWDE EQEKAWRKQS RRKVMEAFEQ AE-------- ----RKPKPN
2r8oA  500 avawkygver qdgptalils rqnlaqqert eeqlaniarg gyvlkdcagq
TARGET     hhhhhhhhh
2r8oA      hhhhhhhhh ssssss hhhh h s ssss

TARGET 360 PNLLFSDVYQ EMPAQLRKQQ ESLARHLQTY GEHY-PLDHF DK -------
2r8oA  550 pelifiatgs evelavaaye kltaegvkar vvsmpstdaf dkqdaayres
TARGET     sssss hhhhhhhhhh hhhhh sss ss hhhh h
2r8oA      sssss hhhhhhhhhh hhhhh sss sssss hhhh h hhhhhh

TARGET     ---------- ---------- ---------- ---------- ----------
2r8oA  600 vlpkavtarv aveagiadyw ykyvglngai vgmttfgesa paellfeefg
TARGET
2r8oA      h sss sss h hhh ss s hhhhhhhh

TARGET     ---------- ----
2r8oA  650 ftvdnvvaka kell
TARGET
2r8oA      hhhhhhhh hh
Prediction for 2r8o_A using the improved alignment
TARGET   6 GASAE FIDKLEFIQP NVISG----I PIYRVMDRQG
2r8oA    2 ssrkelan-- -----airal smdavqkaks ghpgapmgma diaevlwrdf
TARGET     hhhh hhhhhhhh hhhhhhh
2r8oA      hhhhhh hhhhh hhhhhhhh hhhh hhhhhhhhh

TARGET  37 QIINPSEDPH LPKEKVLKLY KSMTLLNTMD RILYESQRQG RISFYMTNYG
2r8oA   45 lkhnpqnpsw adrdrfvlsn ghgsmliysl lh-l--t--- ------gydl
TARGET     hhhhhh sssssss s ssssss
2r8oA      sssss hhhhhh hh h

TARGET  87 EEGTHVGSAA ALDNTDLVFG QYREAGVLMY RDYPLELFMA QCYGNISDLG
2r8oA   83 pmeelknfrq lhsktpghpe vgytagvett tgplgqgian avgmaiaekt
TARGET     hhhh hhhh hhhhhhhhhh
2r8oA      hhhh hhhh hhhhhhhhhh

TARGET 137 KGRQMPVHYG CKERHFV--- ---------- -------TIS SPLATQ----
2r8oA  133 laaqfnrpgh divdhytyaf mgdgcmmegi shevcslagt lklgkliafy
TARGET     hhhh
2r8oA      hhhhh sssss s hhhh h hhhhhhhhhh h ssssss

TARGET 163 ---------- ---------- ---IPQAVGA AYAAKRANAN RVVICYFGEG
2r8oA  183 ddngisidgh vegwftddta mrfe--ayg- ----whvird idghdaasik
TARGET     sss sss sss sss hhhhh
2r8oA      ss sss ss s hh hhhh h sss sss hhhhh

TARGET 190 AASEGDAHAG FNFAATLECP IIFFCRNNGY AISTPTSEQY RGDGIAARGP
2r8oA  226 raveearavt dkpsllmckt iigfgspnka gthdshgapl gdaeialtre
TARGET     hhhhhhhh sssssss hhhhhhhhh
2r8oA      hhhhhhhh ssssssss hh hhhhhhhhh

TARGET 240 GYGIMSIRVD GNDVFAVYNA TKEARRRAVA ENQ------- ---PFLIEAM
2r8oA  276 qlgwkyapfe ipseiyaqwd akeagqakes awnekfaaya kaypqeaaef
TARGET     hh hhhhhh hhhhhhhhh h hhh
2r8oA      hh hhhhhh hhhhhhhhh hhhhhhhhhh h hhhhhh

TARGET 280 TYRIGHHSTS DDSSAYRVNY WDKQDHPISR LRHY--LLSQ --------GW
2r8oA  326 trrmkgemps dfdakakefi aklqanpaki asrkasqnai eafgpllpef
TARGET     hhhhh hhhhhhhhhh hhhhh
2r8oA      hhhhh hhhhhhhhhh hhhhh hhhhhhhhh hhhhhh ss

TARGET 320 WDEEQEK--- AWRKQSRR-- ---------- --KVMEAFEQ ----------
2r8oA  376 lggsadlaps nltlwsgska inedaagnyi hygvrefgmt aiangislhg
TARGET     hhhh
2r8oA      sssss s ss hhhhh hhhhhhhh

TARGET 343 ---------- --------AE RKPK------ ---------- ---------P
2r8oA  426 gflpytstfl mfveyarnav rmaalmkqrq vmvythdsig lgedgpthqp
TARGET
2r8oA      ssssss h hhh hhhhhh s ssssss

TARGET 350 NPNLLFSDVY QEMPAQLRKQ QESLARHLQT YGEHYPLDHF ----------
2r8oA  476 veqvaslrvt pnmstwrpcd qvesavawky gverqdgpta lilsrqnlaq
TARGET     hhhhhh hhhhhhhhhh hhh
2r8oA      hhhhhh sss hhhhhhhhhh hhh sss sss

TARGET 390 ---------- ---------D K -------- ---------- ----------
2r8oA  526 qerteeqlan iarggyvlkd cagqpelifi atgsevelav aayekltaeg
TARGET
2r8oA      hhhh h sssss ssss s hhhhhh hhhhhhhhh

TARGET     ---------- ---------- ---------- ---------- ----------
2r8oA  576 vkarvvsmps tdafdkqdaa yresvlpkav tarvaveagi adywykyvgl
TARGET
2r8oA      ssssssss hhhhh hh hhhhh ssssss h hhh

TARGET     ---------- ---------- ---------- --------
2r8oA  626 ngaivgmttf gesapaellf eefgftvdnv vakakell
TARGET
2r8oA      sss hhhhh hhh hhhh hhhhhh
iTasser
2bfd_A |
2bfd_A |
Prediction for 2bfd
Seq SSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSA
Pred ccccccccccccccccccccccccccccccccSSSSSccccccccccccccccHHHHHHHHHHHHHHHHHHHHHHHHHHcccccccccccccHHHHHHHH
Conf 9867789999988665555664786666789768888999988884236898999999999999999999999999996798467658877389999999
Seq AALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGD
Pred HHcccccSSScccHHHHHHHHccccHHHHHHHHHccccccccccccccccccccccccccccHHHccHcHHHHHHHHHHHcccccSSSSSSccccccccc
Conf 9769989775570357899837998999999983777788989998673426212872246336336308999999999709998899994577444210
Seq AHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDGNDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAY
Pred HHHHHHHHHHHcccSSSSSScccSSccccHHHHHccccHHHHcHcccccSSSSccccHHHHHHHHHHHHHHHHcccccSSSSSSSSSccccccccccccc
Conf 9999999999679979999559821467788772698789843106988689769479999999999999998189988999998750686788998667
Seq RSVDEVNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
Pred ccHHHHHHHHHcccHHHHHHHHHHHcccccHHHHHHHHHHHHHHHHHHHHHHHHcccccHHHHHHHHHccccHHHHHHHHHHHHHHHHccccccHHHHcc
Conf 8999999998639869999999998799999999999999999999999999858998999999675318998899999999999996733188555249
Prediction for 2r8o
Seq SSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSA
Pred CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCSSSSSCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHH
Conf 9867789999988665555664786666789768888999988884236898999999999999999999999999996798467658877389999999
Seq AALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGD
Pred HHCCCCCSSSCCCHHHHHHHHCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHCCHCHHHHHHHHHHHCCCCCSSSSSSCCCCCCCCC
Conf 9769989775570357899837998999999983777788989998673426212872246336336308999999999709998899994577444210
Seq AHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDGNDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAY
Pred HHHHHHHHHHHCCCSSSSSSCCCSSCCCCHHHHHCCCCHHHHCHCCCCCSSSSCCCCHHHHHHHHHHHHHHHHCCCCCSSSSSSSSSCCCCCCCCCCCCC
Conf 9999999999679979999559821467788772698789843106988689769479999999999999998189988999998750686788998667
Seq RSVDEVNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
Pred CCHHHHHHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHCCCCHHHHHHHHHHHHHHHHCCCCCCHHHHCC
Conf 8999999998639869999999998799999999999999999999999999858998999999675318998899999999999996733188555249
Secondary structure elements are shown as H for alpha helix, S for beta sheet and C (or c) for coil.
In addition, iTasser builds several different models and presents the top five. To build these models it uses many templates. iTasser searches for the templates itself and also evaluates which one is the best.
2.Evaluation of models
General
A detailed description of how the created models were evaluated can be found in the Evaluation Protocol. The following section presents only the modelling and evaluation results.
Two useful scores for comparing the structural similarity of two structures are the RMSD and the TM-score. Both are commonly used to measure the accuracy of a model when the native structure is known.
The RMSD (root-mean-square deviation) is the square root of the mean squared distance between all equivalent residue pairs of two superimposed structures. The C-alpha RMSD is computed from the distances between aligned alpha-carbons. The smaller the RMSD, the better the predicted structure. A local error (e.g. a misoriented tail) results in a high RMSD even if the global structure is correct.
Because the RMSD is sensitive to such local errors, the TM-score was proposed. It weights close matches more strongly than distant ones and thereby overcomes the local error problem. A TM-score below about 0.17 corresponds to random structural similarity, whereas a TM-score above 0.5 means that the two compared structures have about the same fold, so that the predicted model has a correct topology.
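The difference between the two measures can be illustrated with a small Python sketch. It assumes that the two structures are already superimposed and that the distances between aligned C-alpha atoms are given; the real TM-score additionally searches for the superposition that maximises the score, which is not done here. The function names and the example distances are hypothetical; the distance scale d0 = 1.24 (L - 15)^(1/3) - 1.8 is the one from the TM-score definition.

```python
import math


def rmsd(distances):
    """Root-mean-square deviation over paired C-alpha distances (in Angstrom)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score for one fixed superposition, normalised by the target length.

    Assumption: the target is long enough (here at least 20 residues)
    for the length-dependent scale d0 to be positive.
    """
    if target_length < 20:
        raise ValueError("target too short for this simple d0 formula")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical 400-residue model: 390 residues fit well (0.5 A),
# but a misoriented tail of 10 residues is 40 A away.
distances = [0.5] * 390 + [40.0] * 10
print("RMSD:", round(rmsd(distances), 2))
print("TM-score:", round(tm_score(distances, 400), 2))
```

In this made-up case the ten tail residues dominate the RMSD, while the TM-score stays close to 1 because each distant pair contributes almost nothing to the sum.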
Modeller
Numeric evaluation
template | molpdf | DOPE score | GA341 score |
---|---|---|---|
2R8O | 11049.43 | -7610.51 | 0.00000 |
2BFD | 2247.36 | -41979.05 | 1.00000 |
1DTW, 1GPU, 2BFB, 2BFD, 2BFE, 2O1X, 2R8O | 13873.63 | -43399.59 | 1.00000 |
The DOPE (Discrete Optimized Protein Energy) score is calculated to assess homology models. The lower the DOPE score, the better the model. This can also be seen in our three models. The first one (2r8o), whose template has the lowest sequence identity, has a quite high DOPE score. The model built with 2bfd as template has a very low score, which is reasonable since 2bfd has a very high sequence identity. Interestingly, the model built with seven templates is hardly better: its DOPE score is only slightly lower than that of the model built with 2bfd alone, and its molpdf value is much higher. This can be explained by the influence of the templates which have a low sequence identity with 1u5b.
GA341 is calculated to decide whether the result is a good model or not. A good model has a score near one, and a model with a score lower than 0.6 is a bad model. This is also reflected by our results. The model with 2r8o as template has a GA341 score of 0, which shows that it is a really bad model, in line with the low sequence identity and the quite high DOPE score. The other two models have a GA341 score of one, which shows that they are good models.
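The two criteria can be applied to the table above with a short Python sketch. The values are copied from the table; the dictionary layout, the variable names and the constant `GA341_CUTOFF` are only illustrative and are not part of MODELLER.

```python
# Scores copied from the numeric evaluation table of the MODELLER models.
models = [
    {"templates": "2R8O", "molpdf": 11049.43, "dope": -7610.51, "ga341": 0.0},
    {"templates": "2BFD", "molpdf": 2247.36, "dope": -41979.05, "ga341": 1.0},
    {"templates": "multi (7 templates)", "molpdf": 13873.63,
     "dope": -43399.59, "ga341": 1.0},
]

GA341_CUTOFF = 0.6  # below this value a model is considered bad (see text)

# Lower DOPE is better, so sort in ascending order.
ranked = sorted(models, key=lambda m: m["dope"])

# Keep only the models that pass the GA341 criterion.
accepted = [m["templates"] for m in ranked if m["ga341"] >= GA341_CUTOFF]

for m in ranked:
    verdict = "good" if m["ga341"] >= GA341_CUTOFF else "bad"
    print(f'{m["templates"]:22s} DOPE {m["dope"]:10.2f}  GA341 {verdict}')
```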
Comparison to experimental structure
experimental structure | model with template | RMSD (DaliLite) | RMSD (sap) | TM-score | Superposition |
---|---|---|---|---|---|
1U5B_A | 2BFD_A | 1.1 | 0.442 | 0.3526 | |
1U5B_A | 2R8O_A | no value | 95.095 | 0.1749 | |
1U5B_A | 1DTW_A, 2BFE_A, 2BFB_A, 2BFD_A, 1GPU_A, 2O1X_A, 2R8O_A | 1.4 | 0.396 | 0.3596 |
C-alpha RMSD is a measure of the average deviation in distance between aligned alpha-carbons. The higher this value, the worse the model. The model with 2r8o as template has no C-alpha RMSD from DaliLite, since the program could not find enough significant similarities because the structures are too dissimilar. The model built with 2bfd has a C-alpha RMSD of 1.1, which is a very good value. It is interesting that the model built from the seven templates again does not score better (1.4). This shows that the model built with 2bfd is the best one.
improved alignment
The model built with 2r8o was so bad that DaliLite could not calculate a C-alpha RMSD, so we had to improve it. For this improvement we loaded the alignment of 1u5b and 2r8o into Jalview <ref>http://www.jalview.org/download.html</ref> to compare the two sequences. To match more identical residues in both sequences we deleted some gaps and checked the consensus line to find the amino acids that occur in both sequences. With this hand-made alignment we repeated the MODELLER run. To evaluate the resulting model we calculated the C-alpha RMSD and the TM-score.
template | C-alpha score | TMscore | Superposition |
---|---|---|---|
2r8o | 3.1 | 0.1740 |
As we can see, the improvement of the alignment was successful in that DaliLite can now calculate a C-alpha RMSD for the model. Compared with the C-alpha RMSD values of the other MODELLER results, this model, which has the lowest sequence identity, still performs worst. The TM-score is even slightly lower than with the unimproved alignment (0.1740 versus 0.1749), indicating that the overall model did not improve.
Swissmodel
Numeric evaluation
QMEAN4 global scores
QMEANscore4
2bfd_A | 2r8o_A |
---|---|
0.67 | 0.203 |
QMEANscore4 is calculated to compare whole models. The score ranges between 0 and 1; the higher the value, the better the quality of the model. Comparing the models built with 2bfd and with 2r8o as template, it is obvious that the first one is the better one, since it has a much higher QMEANscore4.
QMEAN Z-Score
2bfd_A | 2r8o_A |
---|---|
-1.604 | -9.522 |
The QMEAN Z-score represents the absolute quality of a model. Models of low quality have strongly negative QMEAN Z-scores. The 2bfd model has a less negative score than the 2r8o model, which shows again that it has the better quality.
Score components
2bfd_A | 2r8o_A |
---|---|
Local scores
2bfd_A | 2r8o_A |
---|---|
With the coloring by residue error the inaccuracy of each residue is estimated. The results are visualised using a color gradient in which blue marks a reliable region and red an unreliable one.
In the 2bfd model there are many blue alpha helices, which means that they are modelled reliably, and only a few red coils. Since blue is the dominant color, the model is mainly reliable. In contrast, the other model has many red and orange alpha helices and coils and nearly no blue region. This reflects the bad quality of this model.
The residue error plot shows the predicted error (y-axis) per residue (x-axis). The highest error of the 2bfd model is 12 and the average is about 3, whereas the highest peak of the 2r8o model is 15 and the average is about 5. Again it can be seen that the 2bfd model is the better one.
Global scores: QMEAN4:
Scoring function term | 2bfd_A raw score | 2bfd_A Z-score | 2r8o_A raw score | 2r8o_A Z-score |
---|---|---|---|---|
C_beta interaction energy | -162.66 | 0.54 | 74.97 | -4.18 |
All-atom pairwise energy | -10811.93 | 0.35 | 2113.21 | -5.03 |
Solvation energy | -27.04 | -1.02 | 26.87 | -5.92 |
Torsion angle energy | -75.78 | -1.45 | 36.84 | -6.47 |
QMEAN4 score | 0.670 | -1.60 | 0.203 | -9.52 |
Local Model Quality Estimation
2bfd_A | 2r8o_A |
---|---|
For the local model quality estimation we chose the ANOLEA potential. This program performs energy calculations on a protein chain. The y-axis shows the energy of each amino acid. Negative energy values (in green) represent a favourable energy environment, whereas positive values (in red) represent an unfavourable energy environment for a given amino acid. The comparison between the 2bfd model and the 2r8o model is quite clear, since nearly the whole left plot is green and nearly the whole right plot is red. These two plots show that the 2bfd model is much better than the other one.
Comparison to experimental structure
experimental structure | model with template | RMSD (DaliLite) | RMSD (sap) | TMscore |
---|---|---|---|---|
1U5B_A | 2BFD_A | 1.1 | 0.288 | |
1U5B_A | 2R8O_A | 3.1 | 2.110 | 0.1639 |
C-alpha RMSD is a measure of the average deviation in distance between aligned alpha-carbons. The higher this value, the worse the model. The 2bfd model has a score of 1.1 and the 2r8o model has a score of 3.1. This comparison shows clearly that the first model is much better than the second one.
improved alignment
experimental structure | model with template | C-alpha RMSD | TMscore |
---|---|---|---|
1U5B_A | 2R8O_A | 0.1592 |
iTasser
Numeric evaluation
C-score
2bfd | ||||
---|---|---|---|---|
model1 | model2 | model3 | model4 | model5 |
1.999 | -3.781 | -4.970 | -4.970 | -3.781 |
The C-score is a measure of the quality of the models predicted by I-TASSER. It typically ranges between -5 and 2, where a higher C-score signifies a model with higher confidence.
Comparison to experimental structure
No | 2bfd TMscore | 2bfd RMSD (DaliLite) | 2bfd RMSD (sap) | 2bfd Superposition | 2r8o TMscore | 2r8o RMSD (DaliLite) | 2r8o RMSD (sap) | 2r8o Superposition |
---|---|---|---|---|---|---|---|---|
1 | 0.9709 | 0.49 | 0.312 | | 0.5190 | 3.4 | 3.377 | |
2 | 0.8609 | 1.44 | 0.354 | | 0.4979 | 3.2 | 3.935 | |
3 | 0.8597 | 1.43 | 0.478 | | 0.4871 | 3.0 | 3.476 | |
4 | 0.8549 | 1.71 | 0.493 | | 0.5354 | 4.8 | 2.449 | |
5 | 0.8251 | 1.73 | 0.348 | | 0.5107 | 6.0 | 2.540 | |
The models for 2bfd are all very good, as the table shows: they all have a high TM-score and a low C-alpha RMSD. This is to be expected, because they are the top five models of iTasser. The first model is somewhat better than the other four, as its scores are a bit better. The models for 2r8o are clearly worse, with TM-scores around 0.5 and RMSD (DaliLite) values of 3.0 or more.
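As a small illustration, the TM-score criterion from the evaluation section (a TM-score above 0.5 indicates a correct topology) can be applied to the table values with a few lines of Python. The values are copied from the table; the names `tm_scores`, `CUTOFF` and `correct_topology` are hypothetical.

```python
# TM-scores of the five iTasser models per template, copied from the table.
tm_scores = {
    "2bfd": [0.9709, 0.8609, 0.8597, 0.8549, 0.8251],
    "2r8o": [0.5190, 0.4979, 0.4871, 0.5354, 0.5107],
}

CUTOFF = 0.5  # TM-score above which a model is taken to have the correct fold

# For each template, list the model numbers (1-based) that pass the cutoff.
correct_topology = {
    template: [no for no, tm in enumerate(scores, start=1) if tm > CUTOFF]
    for template, scores in tm_scores.items()
}

for template, model_numbers in correct_topology.items():
    print(template, "models above cutoff:", model_numbers)
```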
Comparison of the methods
modeller
Template | C-alpha RMSD | TMscore |
---|---|---|
2BFD_A | 1.1 | 0.3526 |
2R8O_A | 3.1 | 0.1749 |
Multi | 1.4 | 0.3596 |
Swissmodel
Template | C-alpha RMSD | TMscore |
---|---|---|
2BFD_A | 1.1 | 0.1640 |
2R8O_A | 3.1 | 0.1639 |
iTasser
Template | Model | RMSD | TMscore |
---|---|---|---|
2bfd | model1 | 0.49 | 0.9709 |
2bfd | model2 | 1.44 | 0.8609 |
2bfd | model3 | 1.43 | 0.8597 |
2bfd | model4 | 1.71 | 0.8549 |
2bfd | model5 | 1.73 | 0.8251 |
2r8o | model1 | 3.4 | 0.5190 |
2r8o | model2 | 3.2 | 0.4979 |
2r8o | model3 | 3.0 | 0.4871 |
2r8o | model4 | 4.8 | 0.5354 |
2r8o | model5 | 6.0 | 0.5107 |
References
<references />