Difference between revisions of "Homology based structure predictions BCKDHA"

Revision as of 17:16, 13 June 2011

1.Calculation of models

Template selection

Homology modelling is a technique for predicting the three-dimensional structure of a target protein. It is based on an alignment of the target sequence with one or more template sequences whose structures are known. The target is assigned a structure based on the template structure. The better the alignment, the better the predicted structure for our target. Template selection is therefore a crucial step in homology modelling.

To find similar structures to BCKDHA we ran HHsearch using the following command:
hhsearch -i query -d database -o output

It found the following 10 hits in the pdb70 database.

No	Hit	Prob	E-value	P-value	Score	Cols	Query HMM	Template HMM	Identity
1	2bfd_A 2-oxoisovalerate dehydr	1.0	1	1	791.3	400	1-400	1-400 (400)	99%
2	1qs0_A 2-oxoisovalerate dehydr	1.0	1	1	571.5	349	32-382	52-407 (407)	39%
3	1w85_A Pyruvate dehydrogenase	1.0	1	1	530.8	356	8-382	6-362 (368)	34%
4	1umd_A E1-alpha, 2-OXO acid de	1.0	1	1	521.8	351	34-386	16-367 (367)	37%
5	2ozl_A PDHE1-A type I, pyruvat	1.0	1	1	482.7	331	46-380	25-356 (365)	27%
6	3l84_A Transketolase; TKT, str	1.0	1	1	85.4	133	161-297	113-252 (632)	21%
7	2r8o_A Transketolase 1, TK 1;	1.0	1	1	74.5	121	161-285	113-245 (669)	33%
8	2o1x_A 1-deoxy-D-xylulose-5-ph	1.0	1	1	74.2	127	161-287	122-254 (629)	18%
9	1gpu_A Transketolase; transfer	1.0	1	1	74.2	140	161-302	115-265 (680)	22%
10	3m49_A Transketolase; alpha-be	1.0	1	1	68.8	121	161-285	139-271 (690)	31%

> 60% sequence identity: 2bfd_A
> 40% sequence identity: none
< 40% sequence identity (ideally go towards 20%): 1qs0_A, 1umd_A, 1w85_A, 2r8o_A, 3m49_A, 2ozl_A, 1gpu_A, 3l84_A, 2o1x_A

HHsearch returned only hits with a sequence identity above 60% or below 40%.
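The grouping of hits into identity classes can be sketched in a few lines of Python. This is only an illustration: the identity values are copied from the hit table above, while the function name `bin_by_identity` and the bin labels are assumptions made for this example and are not part of HHsearch.

```python
# Identity values (in percent) copied from the HHsearch hit table above.
hits = {
    "2bfd_A": 99, "1qs0_A": 39, "1w85_A": 34, "1umd_A": 37, "2ozl_A": 27,
    "3l84_A": 21, "2r8o_A": 33, "2o1x_A": 18, "1gpu_A": 22, "3m49_A": 31,
}

def bin_by_identity(hits, high=60, low=40):
    """Group hit IDs into the three identity classes used in this section."""
    bins = {"high": [], "middle": [], "low": []}
    for hit_id, identity in hits.items():
        if identity > high:
            bins["high"].append(hit_id)
        elif identity >= low:
            bins["middle"].append(hit_id)
        else:
            bins["low"].append(hit_id)
    return bins

bins = bin_by_identity(hits)
for name, members in bins.items():
    print(name, members)
```

With these values the middle class stays empty, which is why only two templates (one from each of the remaining classes) were selected.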

These are the templates we will work with:

> 60% sequence identity: 2bfd_A
< 40% sequence identity (ideally go towards 20%): 2r8o_A

Modeller

MODELLER is used for homology or comparative modelling of protein three-dimensional structures. It calculates a model containing all non-hydrogen atoms. There are also many other tasks provided by MODELLER like de novo modelling of loops in protein structures, optimization of various models of protein structure with respect to a flexibly defined objective function, multiple alignment of protein sequences and/or structures, clustering, searching of sequence databases, comparison of protein structures, etc.[1]

A tutorial is provided at [2] and [3].

To run MODELLER with more than one template we use the following templates (the percentage values indicate the sequence identity to the target):

1dtw:A 95%
2bfe:A 94%
2bfb:A 99%
2bfd:A 99%
1gpu:A 22%
2o1x:A 18%
2r8o:A 33%

Protocol Modeller

SWISS-MODEL

SWISS-MODEL server page

SWISS-MODEL is a fully automated protein structure homology-modelling server that can be used to build homology models. As input it needs a protein sequence or a UniProt accession code. Optionally, a template PDB ID with chain, or a template file, can be specified. The server is accessible via the ExPASy web server or from the program DeepView (Swiss-PdbViewer).

SWISS-MODEL

Protocol Swissmodel

SWISS-MODEL server:

ID	link
2bfd_A	2bfd_A
2r8o_A	2r8o_A

Prediction for 2bfd_A

TARGET    51      KPQFPGAS AEFIDKLEFI QPNVISGIPI YRVMDRQGQI INPSEDPHLP
2bfdA     6       kpqfpgas aefidklefi qpnvisgipi yrvmdrqgqi inpsedphlp
                                                                     
TARGET                                             sss   ss s         
2bfdA                                              sss   ss s         


TARGET    99    KEKVLKLYKS MTLLNTMDRI LYESQRQGRI SFYMTNYGEE GTHVGSAAAL
2bfdA     54    kekvlklyks mtllntmdri lyesqrqgri sfymtnygee gthvgsaaal
                                                                     
TARGET          hhhhhhhhhh hhhhhhhhhh hhhhhhh             h hhhhhhhh  
2bfdA           hhhhhhhhhh hhhhhhhhhh hhhhhhh             h hhhhhhhh  


TARGET    149   DNTDLVFGQY REAGVLMYRD YPLELFMAQC YGNISDLGKG RQMPVHYGCK
2bfdA     104   dntdlvfgqa reagvlmyrd yplelfmaqc ygnisdlgkg rqmpvhygck
                                                                     
TARGET              sss       hhhhh     hhhhhhhh h                    
2bfdA               sss       hhhhhh    hhhhhhhh h                    


TARGET    199   ERHFVTISSP LATQIPQAVG AAYAAKRANA NRVVICYFGE GAASEGDAHA
2bfdA     154   erhfvtissp latqipqavg aayaakrana nrvvicyfge gaasegdaha
                                                                     
TARGET                        hhhhhhh hhhhhhhh     ssssssss hhh   hhhh
2bfdA                         hhhhhhh hhhhhhhh     ssssssss hhh   hhhh


TARGET    249   GFNFAATLEC PIIFFCRNNG YAISTPTSEQ YRGDGIAARG PGYGIMSIRV
2bfdA     204   gfnfaatlec piiffcrnng yaistptseq yrgdgiaarg pgygimsirv
                                                                     
TARGET          hhhhhhhh   ssssssss                    hhhh hhh  sssss
2bfdA           hhhhhhhh   ssssssss                    hhhh hhh  sssss


TARGET    299   DGNDVFAVYN ATKEARRRAV AENQPFLIEA MTYRIGHHST SDDSSAYRSV
2bfdA     254   dgndvfavyn atkearrrav aenqpfliea mtyrig---- ----------
                                                                     
TARGET          ss  hhhhhh hhhhhhhhhh hh   sssss ss                   
2bfdA           ss  hhhhhh hhhhhhhhhh hh   sssss ss                   


TARGET    349   DEVNYWDKQD HPISRLRHYL LSQGWWDEEQ EKAWRKQSRR KVMEAFEQAE
2bfdA     292   -------std hpisrlrhyl lsqgwwdeeq ekawrkqsrr kvmeafeqae
                                                                      
TARGET                      hhhhhhhh    h    hhh hhhhhhhhhh hhhhhhhhhh
2bfdA                       hhhhhhhh    h    hhh hhhhhhhhhh hhhhhhhhhh


TARGET    399   RKPKPNPNLL FSDVYQEMPA QLRKQQESLA RHLQTYGEHY PLDHFDK   
2bfdA     354   rkpkpnpnll fsdvyqempa qlrkqqesla rhlqtygehy pldhfdk-  
                                                                     
TARGET          h                   h hhhhhhhhhh hhhhh                
2bfdA           h                   h hhhhhhhhhh hhhhh

Prediction for 2r8o

TARGET    1                        SS LDDKPQFPGA SAEFIDKLEF IQPNVISGIP
2r8oA     2     ssrkelanai ralsmd--av qkaksghpga pmgmadiaev lwrdflkhnp
                                                                     
TARGET                              h hhh     hh hh   hhhhh hhhh      
2r8oA             hhhhhhhh hhhhhh  hh hhh     hh hh   hhhhh hhhh      

TARGET    33    IYRVMDRQGQ IINPSEDPHL PKEKVLKLYK ---SMTLLNT MDRILYESQR
2r8oA     50    qnpswadrdr fvlsnghgsm liysllhltg ydlpmeelkn frqlhsktpg
                                                                     
TARGET                   s ssss     h hhhhhhhhh                       
2r8oA                    s ssss     h hhhhhhhh        hhhh            
 
TARGET    80    QGRISF---Y MTNYGEEGTH VGSAAALDNT DLVFG-QYRE AGVLMYRDYP
2r8oA     100   hpevgytagv etttgplgqg ianavgmaia ektlaaqfnr pghdivdhyt
                                                                     
TARGET                              h hhhhhhhhhh hhhh               ss
2r8oA                               h hhhhhhhhhh hhhhhhhh           ss

TARGET    126   LELFMAQCYG NIS-----DL GKGRQMPVHY GCKERHFVTI SS--------
2r8oA     150   yafmgdgcmm egishevcsl agtlklgkli afyddngisi dghvegwftd
                                                                     
TARGET          ssss hhhhh                   sss sssss                
2r8oA           ssss hhhh    hhhhhhhh hhhh   sss sssss sss   sss      

TARGET    163   ---------- ---------P LATQIPQAVG AAYAAKRANA NRVVICYFG-
2r8oA     200   dtamrfeayg whvirdidgh daasikrave earavtdkps llmcktiigf
                                                                     
TARGET                                 hhhhhhhhh hhhh     s ssssss    
2r8oA            hhhhhhh    sss  sss   hhhhhhhhh hhhh    ss ssssss    

TARGET    193   ---------- EGAASEGDAH A--------- -GFNFAATLE CPIIFFCRNN
2r8oA     250   gspnkagthd shgaplgdae ialtreqlgw kyapfeipse iyaqwdakea
                                                                     
TARGET                                                    h hhhh   hhh
2r8oA                    h h      hhh hhhhhhhh           hh hhhh   hhh

TARGET    223   GYAISTPTSE QYRGDG---- ---------- ---------- ----------
2r8oA     300   gqakesawne kfaayakayp qeaaeftrrm kgempsdfda kakefiaklq
                                                                     
TARGET          hhhhhhhhhh                                            
2r8oA           hhhhhhhhhh hhhhhhh    hhhhhhhhhh h     hhhh hhhhhhhhhh

TARGET    239   -----IAARG PGYGI----- --MSIRVDGN DVFAVYNATK EARRRAVAEN
2r8oA     350   anpakiasrk asqnaieafg pllpeflggs adlapsnltl wsgskained
                                                                     
TARGET                                              hhhh              
2r8oA           h      hhh hhhhhhhhhh hh  ssssss s                    

TARGET    277   QPFLIEAM-- --------TY RIGHHS---- ---------- ----------
2r8oA     400   aagnyihygv refgmtaian gislhggflp ytstflmfve yarnavrmaa
                                                                     
TARGET                              h hhhh                            
2r8oA                sss    hhhhhhhhh hhhh    ss ssss     h    hhhhhhh

TARGET    293   ---------- ------TSDD SSAYRSVDEV NYW-----DK QDHPISRLRH
2r8oA     450   lmkqrqvmvy thdsiglged gpthqpveqv aslrvtpnms twrpcdqves
                                                                     
TARGET                                                            hhhh
2r8oA           hh   sssss ss                 hh hhhh     s ss    hhhh

TARGET    322   YLLSQGWWDE EQEKAWRKQS RRKVMEAFEQ AE-------- ----RKPKPN
2r8oA     500   avawkygver qdgptalils rqnlaqqert eeqlaniarg gyvlkdcagq
                                                                      
TARGET          hhhhhhhhh                                             
2r8oA           hhhhhhhhh     ssssss             hhhh   h s ssss      

TARGET    360   PNLLFSDVYQ EMPAQLRKQQ ESLARHLQTY GEHY-PLDHF DK -------
2r8oA     550   pelifiatgs evelavaaye kltaegvkar vvsmpstdaf dkqdaayres
                                                                     
TARGET            sssss    hhhhhhhhhh hhhhh  sss ss    hhhh h         
2r8oA             sssss    hhhhhhhhhh hhhhh  sss sssss hhhh h   hhhhhh

TARGET          ---------- ---------- ---------- ---------- ----------
2r8oA     600   vlpkavtarv aveagiadyw ykyvglngai vgmttfgesa paellfeefg
                                                                      
TARGET                                                                
2r8oA           h      sss sss    h    hhh    ss s           hhhhhhhh 

TARGET          ---------- ----                                       
2r8oA     650   ftvdnvvaka kell                                       
                                                                      
TARGET                                                                
2r8oA             hhhhhhhh hh

Prediction for 2r8o_A using the improved alignment

TARGET    6                     GASAE FIDKLEFIQP NVISG----I PIYRVMDRQG
2r8oA     2     ssrkelan-- -----airal smdavqkaks ghpgapmgma diaevlwrdf
                                                                      
TARGET                           hhhh hhhhhhhh                hhhhhhh 
2r8oA             hhhhhh        hhhhh hhhhhhhh      hhhh    hhhhhhhhh 

TARGET    37    QIINPSEDPH LPKEKVLKLY KSMTLLNTMD RILYESQRQG RISFYMTNYG
2r8oA     45    lkhnpqnpsw adrdrfvlsn ghgsmliysl lh-l--t--- ------gydl
                                                                      
TARGET                                    hhhhhh sssssss  s ssssss    
2r8oA                          sssss      hhhhhh hh h                 

TARGET    87    EEGTHVGSAA ALDNTDLVFG QYREAGVLMY RDYPLELFMA QCYGNISDLG
2r8oA     83    pmeelknfrq lhsktpghpe vgytagvett tgplgqgian avgmaiaekt
                                                                      
TARGET            hhhh                                 hhhh hhhhhhhhhh
2r8oA             hhhh                                 hhhh hhhhhhhhhh

TARGET    137   KGRQMPVHYG CKERHFV--- ---------- -------TIS SPLATQ----
2r8oA     133   laaqfnrpgh divdhytyaf mgdgcmmegi shevcslagt lklgkliafy
                                                                      
TARGET          hhhh                                                  
2r8oA           hhhhh           sssss s hhhh   h hhhhhhhhhh h   ssssss

TARGET    163   ---------- ---------- ---IPQAVGA AYAAKRANAN RVVICYFGEG
2r8oA     183   ddngisidgh vegwftddta mrfe--ayg- ----whvird idghdaasik
                                                                      
TARGET                                      sss   sss sss   sss  hhhhh
2r8oA           ss sss  ss s       hh hhhh  h         sss   sss  hhhhh

TARGET    190   AASEGDAHAG FNFAATLECP IIFFCRNNGY AISTPTSEQY RGDGIAARGP
2r8oA     226   raveearavt dkpsllmckt iigfgspnka gthdshgapl gdaeialtre
                                                                      
TARGET          hhhhhhhh     sssssss                         hhhhhhhhh
2r8oA           hhhhhhhh     ssssssss               hh       hhhhhhhhh

TARGET    240   GYGIMSIRVD GNDVFAVYNA TKEARRRAVA ENQ------- ---PFLIEAM
2r8oA     276   qlgwkyapfe ipseiyaqwd akeagqakes awnekfaaya kaypqeaaef
                                                                      
TARGET          hh           hhhhhh    hhhhhhhhh h                 hhh
2r8oA           hh           hhhhhh    hhhhhhhhh hhhhhhhhhh h   hhhhhh

TARGET    280   TYRIGHHSTS DDSSAYRVNY WDKQDHPISR LRHY--LLSQ --------GW
2r8oA     326   trrmkgemps dfdakakefi aklqanpaki asrkasqnai eafgpllpef
                                                                     
TARGET          hhhhh      hhhhhhhhhh hhhhh                           
2r8oA           hhhhh      hhhhhhhhhh hhhhh       hhhhhhhhh hhhhhh  ss

TARGET    320   WDEEQEK--- AWRKQSRR-- ---------- --KVMEAFEQ ----------
2r8oA     376   lggsadlaps nltlwsgska inedaagnyi hygvrefgmt aiangislhg
                                                                     
TARGET                                                hhhh            
2r8oA           sssss                          s ss   hhhhh hhhhhhhh  

TARGET    343   ---------- --------AE RKPK------ ---------- ---------P
2r8oA     426   gflpytstfl mfveyarnav rmaalmkqrq vmvythdsig lgedgpthqp
                                                                     
TARGET                                                                
2r8oA             ssssss      h   hhh hhhhhh   s ssssss               

TARGET    350   NPNLLFSDVY QEMPAQLRKQ QESLARHLQT YGEHYPLDHF ----------
2r8oA     476   veqvaslrvt pnmstwrpcd qvesavawky gverqdgpta lilsrqnlaq
                                                                      
TARGET            hhhhhh              hhhhhhhhhh hhh                  
2r8oA             hhhhhh      sss     hhhhhhhhhh hhh    sss sss       

TARGET    390   ---------- ---------D K -------- ---------- ----------
2r8oA     526   qerteeqlan iarggyvlkd cagqpelifi atgsevelav aayekltaeg
                                                                     
TARGET                                                                
2r8oA               hhhh    h sssss         ssss s   hhhhhh hhhhhhhhh 

TARGET          ---------- ---------- ---------- ---------- ----------
2r8oA     576   vkarvvsmps tdafdkqdaa yresvlpkav tarvaveagi adywykyvgl
                                                                     
TARGET                                                                
2r8oA            ssssssss  hhhhh   hh hhhhh       ssssss     h   hhh  

TARGET          ---------- ---------- ---------- --------             
2r8oA     626   ngaivgmttf gesapaellf eefgftvdnv vakakell             
                                                                     
TARGET                                                                
2r8oA             sss           hhhhh hhh   hhhh hhhhhh

iTasser

2bfd_A

Prediction for 2bfd

Seq    SSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSA 
Pred   ccccccccccccccccccccccccccccccccSSSSSccccccccccccccccHHHHHHHHHHHHHHHHHHHHHHHHHHcccccccccccccHHHHHHHH
Conf   9867789999988665555664786666789768888999988884236898999999999999999999999999996798467658877389999999

Seq    AALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGD
Pred   HHcccccSSScccHHHHHHHHccccHHHHHHHHHccccccccccccccccccccccccccccHHHccHcHHHHHHHHHHHcccccSSSSSSccccccccc
Conf   9769989775570357899837998999999983777788989998673426212872246336336308999999999709998899994577444210

Seq    AHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDGNDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAY
Pred   HHHHHHHHHHHcccSSSSSScccSSccccHHHHHccccHHHHcHcccccSSSSccccHHHHHHHHHHHHHHHHcccccSSSSSSSSSccccccccccccc
Conf   9999999999679979999559821467788772698789843106988689769479999999999999998189988999998750686788998667

Seq    RSVDEVNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
Pred   ccHHHHHHHHHcccHHHHHHHHHHHcccccHHHHHHHHHHHHHHHHHHHHHHHHcccccHHHHHHHHHccccHHHHHHHHHHHHHHHHccccccHHHHcc
Conf   8999999998639869999999998799999999999999999999999999858998999999675318998899999999999996733188555249

Prediction for 2r8o

Seq    SSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSA
Pred   CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCSSSSSCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHH
Conf   9867789999988665555664786666789768888999988884236898999999999999999999999999996798467658877389999999

Seq    AALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGD
Pred   HHCCCCCSSSCCCHHHHHHHHCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHCCHCHHHHHHHHHHHCCCCCSSSSSSCCCCCCCCC
Conf   9769989775570357899837998999999983777788989998673426212872246336336308999999999709998899994577444210

Seq    AHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDGNDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAY
Pred   HHHHHHHHHHHCCCSSSSSSCCCSSCCCCHHHHHCCCCHHHHCHCCCCCSSSSCCCCHHHHHHHHHHHHHHHHCCCCCSSSSSSSSSCCCCCCCCCCCCC
Conf   9999999999679979999559821467788772698789843106988689769479999999999999998189988999998750686788998667

Seq    RSVDEVNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
Pred   CCHHHHHHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHCCCCHHHHHHHHHHHHHHHHCCCCCCHHHHCC
Conf   8999999998639869999999998799999999999999999999999999858998999999675318998899999999999996733188555249

Secondary structure elements are shown as H for alpha helix, S for beta strand and C (or c) for coil.

In addition, iTasser predicts several different models and presents the top five. To build these models it uses many templates, which it searches for itself, and it also evaluates which template is the best.

2.Evaluation of models

General

A detailed description of how the created models were evaluated can be found in the Evaluation Protocol. The following section presents only the modelling and evaluation results.

Two useful scores for comparing two structures for their structural similarity are the RMSD and the TM-score. Both are commonly used to measure the accuracy of a model when the native structure is known.

The RMSD (root-mean-square deviation) is the square root of the mean squared distance between corresponding atoms of two superimposed structures. The C-alpha RMSD restricts this to aligned alpha-carbons. The smaller the RMSD, the better the predicted structure. Because distances are squared, a local error (e.g. a misoriented tail) can produce a high RMSD even though the global structure is correct.

Because the RMSD is sensitive to such local errors, the TM-score was proposed. The TM-score weights close matches more strongly than distant matches, which overcomes the local error problem. It ranges from 0 to 1: a TM-score below roughly 0.2 corresponds to random structural similarity, whereas a TM-score above 0.5 generally means that the two structures share the same fold and the predicted model therefore has a correct topology.
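The difference between the two measures can be illustrated with a small Python sketch. It assumes that the structures are already superimposed and that the per-residue C-alpha distances are given; the real TM-score program additionally searches for the superposition that maximises the score. The toy distances are hypothetical and do not come from our models.

```python
import math

def rmsd(distances):
    """Root-mean-square deviation over per-residue C-alpha distances (Angstrom)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def tm_score(distances, target_length):
    """TM-score for aligned residue pairs of a fixed superposition.

    Uses the length-dependent scale d0 = 1.24 * (L - 15)^(1/3) - 1.8,
    which is only meaningful for targets clearly longer than 15 residues.
    """
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Hypothetical toy model: 90 well-placed residues and a misoriented 10-residue tail.
distances = [1.0] * 90 + [30.0] * 10

print("RMSD:     %.2f" % rmsd(distances))
print("TM-score: %.2f" % tm_score(distances, 100))
```

In this toy case the tail alone pushes the RMSD above 9, while the TM-score stays well above 0.5 because each badly placed residue contributes almost nothing to the sum instead of dominating it.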

Modeller

Numeric evaluation

template	molpdf	DOPE score	GA341 score
2R8O	11049.43	-7610.51	0.00000
2BFD	2247.36	-41979.05	1.00000
1DTW, 1GPU, 2BFB, 2BFD, 2BFE, 2O1X, 2R8O	13873.63	-43399.59	1.00000

The DOPE (Discrete Optimized Protein Energy) score is calculated to assess homology models. The lower the DOPE score, the better the model. This can also be seen in our three models. The first one (2r8o), which has the lowest sequence identity, has a comparatively high DOPE score. The model built with 2bfd as template has a very low score, which is reasonable since 2bfd has a very high sequence identity. The model built with seven templates has a slightly lower DOPE score (-43399.59) than the one built with 2bfd alone (-41979.05), but its molpdf value is considerably higher. The higher molpdf may reflect the influence of the templates that have a low sequence identity with 1u5b.

GA341 is calculated to decide whether the result is a good model or not. A good model has a score near one; a model with a score lower than 0.6 is a bad model. This is also reflected by our results. The model with 2r8o as template has a GA341 score of 0, consistent with its low sequence identity and high DOPE score, which shows that it is a really bad model. The other two models have a GA341 score of one, which shows that they are good models.
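How the two scores are read together can be sketched as follows. The numbers are copied from the numeric evaluation table above and the cutoff of 0.6 is the one quoted in the text; the function name `rank_models` is an assumption for this example and not part of MODELLER.

```python
# (template set, molpdf, DOPE score, GA341 score) from the table above.
models = [
    ("2R8O", 11049.43, -7610.51, 0.0),
    ("2BFD", 2247.36, -41979.05, 1.0),
    ("multi-template", 13873.63, -43399.59, 1.0),
]

GA341_CUTOFF = 0.6  # threshold quoted in the text above

def rank_models(models, cutoff=GA341_CUTOFF):
    """Sort models by DOPE score (lowest first) and flag whether GA341 passes."""
    ranked = sorted(models, key=lambda m: m[2])
    return [(name, dope, ga341 >= cutoff) for name, _molpdf, dope, ga341 in ranked]

ranking = rank_models(models)
for name, dope, passes in ranking:
    print(f"{name:15s} DOPE={dope:10.2f} GA341 ok={passes}")
```

Sorting by DOPE places the multi-template model first and the 2R8O model last, and only the 2R8O model fails the GA341 cutoff.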

Comparison to experimental structure

experimental structure	model with template	RMSD (DaliLite)	RMSD (sap)	TM-score	Superposition
1U5B_A	2BFD_A	1.1	0.442	0.3526	Superimposed structures of 1U5B and the modeller model with template 2bfd
1U5B_A	2R8O_A	no value	95.095	0.1749	Superimposed structures of 1U5B and the modeller model with template 2r8o
1U5B_A	1DTW_A, 2BFE_A, 2BFB_A, 2BFD_A, 1GPU_A, 2O1X_A, 2R8O_A	1.4	0.396	0.3596	Superimposed structures of 1U5B and the modeller model with more than one template

C-alpha RMSD is a measure of the average deviation in distance between aligned alpha-carbons. The higher this value, the worse the model. The first model, with 2r8o as template, has no DaliLite C-alpha RMSD because the program could not find enough significant similarities; the structures are too dissimilar. The model built with 2bfd has a C-alpha RMSD of 1.1, which is a very good score. It is interesting that the model built from the seven templates again has no better score (1.4). By this measure the model built with 2bfd is the best one.

improved alignment

The model built with 2r8o was so bad that DaliLite could not compute a C-alpha RMSD, so we had to improve it. For this improvement we loaded the alignment of 1u5b and 2r8o into Jalview (http://www.jalview.org/download.html) to compare the two sequences. To bring more identical residues into register, we deleted some gaps and used the consensus line to find the amino acids that occur in both sequences. With this hand-edited alignment we repeated the MODELLER run. To evaluate the resulting model we calculated the C-alpha RMSD and the TM-score.

template	C-alpha RMSD	TMscore	Superposition
2r8o	3.1	0.1740	Superimposed structures of 1U5B and the modeller model with the improved alignment for template 2r8o

The improvement of the alignment was successful in that DaliLite can now compute a C-alpha RMSD (3.1) for the model. Compared to the C-alpha RMSD values of the other Modeller results, this model, which has the smallest sequence identity, still performs worst. The TM-score is even slightly smaller than with the original alignment (0.1740 versus 0.1749), indicating that the overall model did not improve.

Swissmodel

Numeric evaluation

QMEAN4 global scores

QMEANscore4

2bfd_A	2r8o_A
0.67	0.203

QMEANscore4 is calculated to compare whole models. The score ranges between 0 and 1; the higher the value, the better the quality of the model. Comparing the model with 2bfd as template to the model with 2r8o as template, it is obvious that the first one is better, since it has a much higher QMEANscore4.

QMEAN Z-Score

2bfd_A	2r8o_A
-1.604	-9.522
Z-Score plot1 2bfd_A	Z-Score plot1 2r8o_A
Z-Score plot2 2bfd_A	Z-Score plot2 2r8o_A

The QMEAN Z-score represents the absolute quality of a model. Models of low quality have strongly negative QMEAN Z-scores. The 2bfd-model has a less negative score than the 2r8o-model, which shows again that this model has a better quality.
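A Z-score expresses how many standard deviations a value lies from the mean of a reference distribution; for QMEAN the reference is a set of high-resolution experimental structures. The sketch below shows only the arithmetic. The reference mean and standard deviation are illustrative assumptions, chosen so that the result is roughly consistent with the two scores reported above; they are not the actual QMEAN reference statistics.

```python
def z_score(value, reference_mean, reference_sd):
    """Number of standard deviations between value and the reference mean."""
    return (value - reference_mean) / reference_sd

# Assumed reference statistics (illustrative only, not taken from QMEAN).
REFERENCE_MEAN = 0.765
REFERENCE_SD = 0.059

# QMEANscore4 values reported above for the two SWISS-MODEL models.
qmean4 = {"2bfd_A": 0.67, "2r8o_A": 0.203}

z_scores = {name: z_score(q, REFERENCE_MEAN, REFERENCE_SD) for name, q in qmean4.items()}
for name, z in z_scores.items():
    print(f"{name}: Z = {z:.2f}")
```

Under these assumptions the 2bfd-model lies within about two standard deviations of the reference structures, whereas the 2r8o-model lies more than nine standard deviations below them.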

Score components

2bfd_A	2r8o_A
score components 2bfd_A	score components 2r8o_A

Local scores

2bfd_A	2r8o_A
Coloring by residue error 2bfd_A	Coloring by residue error 2r8o_A
Residue error plot 2bfd_A	Residue error plot 2r8o_A

The coloring by residue error estimates the inaccuracy of each residue. The results are visualised using a color gradient in which blue marks a reliable region and red marks an unreliable region. In the 2bfd-model there are many blue alpha helices, which means that they are predicted reliably, and only a few red coils. Since blue is the dominant color, the model is mainly reliable. In contrast, the other model has many red and orange alpha helices and coils and nearly no blue region. This reflects the bad quality of this model.

The residue error plot shows the predicted error (y-axis) per residue (x-axis). The highest error score of the 2bfd-model is 12 and the average is about 3 whereas the highest peak score of the 2r8o-model is 15 and the average is about 5. Again it can be seen that the 2bfd-model is the better one.

Global scores: QMEAN4:

	2bfd_A		2r8o_A
Scoring function term	Raw score	Z-score	Raw score	Z-score
C_beta interaction energy	-162.66	0.54	74.97	-4.18
All-atom pairwise energy	-10811.93	0.35	2113.21	-5.03
Solvation energy	-27.04	-1.02	26.87	-5.92
Torsion angle energy	-75.78	-1.45	36.84	-6.47
QMEAN4 score	0.670	-1.60	0.203	-9.52

Local Model Quality Estimation

2bfd_A	2r8o_A
Local Model Quality Estimation 2bfd_A	Local Model Quality Estimation 2r8o_A

For the local model quality estimation we chose the ANOLEA potential. This program performs energy calculations on a protein chain. The y-axis shows the energy of each amino acid. Negative energy values (in green) represent a favourable energy environment, whereas positive values (in red) represent an unfavourable energy environment for a given amino acid. The comparison between the 2bfd-model and the 2r8o-model is quite clear, since nearly the whole left plot is green and nearly the whole right plot is red. These two plots show that the 2bfd-model is much better than the other one.

Comparison to experimental structure

experimental structure	model with template	RMSD (DaliLite)	RMSD (sap)	TMscore
1U5B_A	2BFD_A	1.1	0.288
1U5B_A	2R8O_A	3.1	2.110	0.1639

C-alpha RMSD is a measure of the average deviation in distance between aligned alpha-carbons. The higher this value, the worse the model. The 2bfd-model has a score of 1.1 and the 2r8o-model has a score of 3.1. This comparison shows clearly that the first model is much better than the second one.

improved alignment

experimental structure	model with template	C-alpha RMSD	TMscore
1U5B_A	2R8O_A		0.1592

iTasser

Numeric evaluation

C-score

2bfd
model1	model2	model3	model4	model5
1.999	-3.781	-4.970	-4.970	-3.781

The C-score is I-TASSER's measure of the quality of its predicted models. It typically lies in the range [-5, 2], where a higher C-score signifies a model with higher confidence.

Comparison to experimental structure

	2bfd				2r8o
No	TMscore	RMSD (DaliLite)	RMSD (sap)	Superposition	TMscore	RMSD (DaliLite)	RMSD (sap)	Superposition
1	0.9709	0.49	0.312	iTasser model 1 for template 2bfd superimposed on target 1U5B	0.5190	3.4	3.377	iTasser model 1 for template 2r8o superimposed on target 1U5B
2	0.8609	1.44	0.354	iTasser model 2 for template 2bfd superimposed on target 1U5B	0.4979	3.2	3.935	iTasser model 2 for template 2r8o superimposed on target 1U5B
3	0.8597	1.43	0.478	iTasser model 3 for template 2bfd superimposed on target 1U5B	0.4871	3.0	3.476	iTasser model 3 for template 2r8o superimposed on target 1U5B
4	0.8549	1.71	0.493	iTasser model 4 for template 2bfd superimposed on target 1U5B	0.5354	4.8	2.449	iTasser model 4 for template 2r8o superimposed on target 1U5B
5	0.8251	1.73	0.348	iTasser model 5 for template 2bfd superimposed on target 1U5B	0.5107	6.0	2.540	iTasser model 5 for template 2r8o superimposed on target 1U5B

All five models for 2bfd are very good: the table shows a high TM-score (above 0.8) and a low C-alpha RMSD for each of them. This is to be expected since they are iTasser's top five models. The first model is somewhat better than the other four, as its scores are the best in each column. The models for 2r8o are clearly worse, with TM-scores around 0.5 and C-alpha RMSD values (DaliLite) between 3.0 and 6.0.

Comparison of the methods

modeller

2BFD_A		2R8O_A		Multi
C-alpha RMSD	TMscore	C-alpha RMSD	TMscore	C-alpha RMSD	TMscore
1.1	0.3526	3.1	0.1749	1.4	0.3596

Swissmodel

2BFD_A		2R8O_A
C-alpha RMSD	TMscore	C-alpha RMSD	TMscore
1.1	0.1640	3.1	0.1639

iTasser

2bfd										2r8o
model1		model2		model3		model4		model5		model1		model2		model3		model4		model5
RMSD	TMscore	RMSD	TMscore	RMSD	TMscore	RMSD	TMscore	RMSD	TMscore	RMSD	TMscore	RMSD	TMscore	RMSD	TMscore	RMSD	TMscore	RMSD	TMscore
0.49	0.9709	1.44	0.8609	1.43	0.8597	1.71	0.8549	1.73	0.8251	3.4	0.5190	3.2	0.4979	3.0	0.4871	4.8	0.5354	6.0	0.5107
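The comparison tables can also be summarised programmatically. The following sketch copies the C-alpha RMSD and TM-score values from the tables above (for iTasser only model 1 is included) and applies the TM-score cutoff of 0.5 from the General section; the function names are assumptions made for this example.

```python
# (method, template) -> (C-alpha RMSD, TM-score), copied from the tables above.
results = {
    ("Modeller", "2bfd"): (1.1, 0.3526),
    ("Modeller", "2r8o"): (3.1, 0.1749),
    ("Modeller", "multi"): (1.4, 0.3596),
    ("Swissmodel", "2bfd"): (1.1, 0.1640),
    ("Swissmodel", "2r8o"): (3.1, 0.1639),
    ("iTasser model1", "2bfd"): (0.49, 0.9709),
    ("iTasser model1", "2r8o"): (3.4, 0.5190),
}

def best_by_tm(results):
    """Return the (key, scores) entry with the highest TM-score."""
    return max(results.items(), key=lambda item: item[1][1])

def same_fold(results, cutoff=0.5):
    """Return the sorted keys of all models whose TM-score exceeds the cutoff."""
    return sorted(key for key, (_rmsd, tm) in results.items() if tm > cutoff)

best_key, (best_rmsd, best_tm) = best_by_tm(results)
print("best model:", best_key, "RMSD", best_rmsd, "TM-score", best_tm)
print("TM-score > 0.5:", same_fold(results))
```

By these numbers only the iTasser models exceed the TM-score cutoff, and iTasser model 1 for 2bfd has both the highest TM-score and the lowest C-alpha RMSD.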

References

back to Maple syrup urine disease main page

back to Secondary_Structure_Prediction_BCKDHA

