Homology based structure predictions

From Bioinformatikpedia
Revision as of 12:42, 9 June 2011 by Landerer (talk | contribs) (Homologous)

Homologous

Because we found no homologous structures in Task 2, we extended our list by using HHSearch.

We will use 13 proteins to create a multiple alignment for homology modeling. We chose the sequences so that they cover the whole protein, and we paid specific attention to the transmembrane region.


PDB-ID Identity Description
1s79 37% Kram
2yc6 32% Kram
3p73 28% Kram
1kcg 22% Kram
1jfm 14% Kram
1bii 22% Kram
2p24 21% Kram
1cd1 21% Kram
2wy3 29% Kram
1lqv 14% Kram
3jts 25% Kram
1ow0 22% Kram
1hxm 18% Kram
3lmy 19% Kram

With these sequences, including 1a6z, we computed a multiple alignment with T-Coffee.


1A6Z_A  RL---------------------LRSHSLHYLFMGASEQDLGLS-L-FEALGYVD-DQ---LFVFYDHE----------------SRR--VEPRTPWVSS
1S79_A  GR--------------------------------------------WILKNDVKN-RS---VYIKGF----------------PTDAT--LDDIKEWLE-
3P73_A  EF----------------------GSHSLRYFLTGMTDPGPGMP-R-FVIVGYVD-DK---IFGTYNSK----------------SRT--AQPIVEMLPQ
1KCG_C  ------------------------DAHSLWYNFTIIHLPRHGQQ-W-CEVQSQVD-QK---NFLSYDCG----------------SDK--VLSMGHLEEQ
1JFM_A  DAHS------------------------LRCNLTI-KDPTPADP-LWYEAKCFVG-EI---LILHLS----------------NINKT--MTSGDPGET-
1BII_A  MGAMAPRTLLLLLAAALGPTQTRAGSHSLRYFVTAVSRPGFGEP-R-YMEVGYVD-NT---EFVRFDSDA--------------ENPR--YEPRARWIEQ
2P24_A  MAIMA---------------------------------PRT-------LVLLLS--GALALTQTWAGSH-SRGED-------------------DI----
1CD1_A  QQKN----------------------YTFRCLQMS-SFANRSWS-R-TDSVVWLG-DL---QTHRWS----------------NDSAT--ISFTKPWSQG
2WY3_A  M-----------------------EPHSLRYNLMVLSQDESVQS-G-FLAEGHLD-GQ---PFLRYDRQ----------------KRR--AKPQGQWAED
1LQV_A  SQDAS---------------------DGLQR-------LHM-------LQISYFR-DP---YHVWYQGNASLGGHLTHVLEGPDTNTT--IIQLQP----
3JTS_A  ------------------------GSHSMRYFYTSMSRPGRWEP-R-FIAVGYVD-DT---QFVRFDSDA--------------ASQR--MEPRAPWVEQ
1OW0_A  ACHP------------------------R---LSL-HRPAL-ED-LL--LGSEA----------------------------------------------
1HXM_A  AIEL---------------------------------VPEHQTVPVS--IGV------------------------------------------------
3LMY_A  MELC------------------------G---LGL-PRPPMLLA-LL--LATLLAAML---ALLTQV----------------ALVVQVAEAARAPSVSA
                                                                                 
1A6Z_A  RISSQMW------------------------------------------------------LQLSQSLKGWDHMFTV----DFWTI----------MENH
1S79_A  D---------------------------------------------------------------------KGQVLNI----Q------------------
3P73_A  -EDQEHW------------------------------------------------------DTQTQKAQGGERDFDW----NLNRL----------PERY
1KCG_C  LYATDAW------------------------------------------------------GKQLEMLREVGQRLRL----ELADT----------ELED
1JFM_A  ANATEVK------------------------------------------------------KCLTQPLKNLCQKLRN----KVSNT----------KVDT
1BII_A  -EGPEYW------------------------------------------------------ERETRRAKGNEQSFRV----DLRTA----------LRYY
2P24_A  EAD---------------------------------------------------------------HV--------------------------------
1CD1_A  KLSNQQW------------------------------------------------------EKLQHMFQVYRVSFTR----DIQEL----------VKMM
2WY3_A  VLGAETW------------------------------------------------------DTETEDLTENGQDLRR----TLTHI-------------K
1LQV_A  LQEPESW------------------------------------------------------ARTQSGLQSYLLQFHG-----LVRL----------VHQ-
3JTS_A  -EGPEYW------------------------------------------------------DRETRNMKAETQNAPV----NLRNL----------RGYY
1OW0_A  -------------------------------------NLTCTLTGL--RDASGVTFTWTPS---SG----K-----------------------------
1HXM_A  -----------------------------------PATLRCSMKGEAIGNYY---INW---YRKTQ---GN--------TMTFIYREKDIYGPGFKDNFQ
3LMY_A  KPGPALWPLPLSVKMTPNLLHLAPENFYISHSPNSTAGPSCTLLEEAFRRYHGYIFGFYKWHHEPAEFQAKTQVQQLLVSITLQSE--------CDAFPN
                                                                               
1A6Z_A  NHS-KESHTLQVILGCEM----------------------------------------------------------------------------------
1S79_A  ----------------------------------------------------------------------------------------------------
3P73_A  NK-SKGSHTMQMMFGCDI----------------------------------------------------------------------------------
1KCG_C  FT-PSGPLTLQVRMSCEC----------------------------------------------------------------------------------
1JFM_A  HKT-NGYPHLQVTMIYPQ----------------------------------------------------------------------------------
1BII_A  NQSAGGSHTLQWMAGCDV----------------------------------------------------------------------------------
2P24_A  -----------GSYGIVV----------------------------------------------------------------------------------
1CD1_A  SPKEDYPIEIQLSAGCEM----------------------------------------------------------------------------------
2WY3_A  DQK-GGLHSLQEIRVCEI----------------------------------------------------------------------------------
1LQV_A  ERTLAFPLTIRCFLGCEL----------------------------------------------------------------------------------
3JTS_A  NQSEAGSHTIQRMYGCDL----------------------------------------------------------------------------------
1OW0_A  ------------------------------------------S---------------------------------------------------------
1HXM_A  GDI-DI---------------------------------------------------------------------------------------------A
3LMY_A  ISS-DESYTLLVKEPVAVLKANRVWGALRGLETFSQLVYQDSYGTFTINESTIIDSPRFSHRGILIDTSRHYLPVKIILKTLDAMAFNKFNVLHWHIVDD
                                                                              
1A6Z_A  --------------------QEDNS--------------------------TEGYWKYGYDGQDHLEFCPDTL--DWRAA-----EPRAWPTKLEWERHK
1S79_A  ---------------------------------------------------MRRTLHKAFKGSIF-----------------------------------
3P73_A  --------------------LEDGS--------------------------IRGYDQYAFDGRDFLAFDMDTM--TFTAA-----DPVAEITKRRWETEG
1KCG_C  --------------------EADGY--------------------------IRGSWQFSFDGRKFLLFDSNNR--KWTVV-----HAGARRMKEKWEKDS
1JFM_A  --------------------SQ-GRT-------------------------PSATWEFNISDSYF-----------------------------------
1BII_A  --------------------ESDGRL-------------------------LRGYWQFAYDGCDYIALNEDLK--TWTAA-----DMAAQITRRKWEQAG
2P24_A  --------------------YQSPGD--------------------------IGQYTFEFDGDELFYVDLDKKETIWM-----LPEFA-Q--LRSFDPQ-
1CD1_A  --------------------YP-GNA-------------------------SESFLHVAFQGKYVVRFWGT----SWQTVP--GAPSWLDLPIKVLNADQ
2WY3_A  --------------------HEDSS--------------------------TRGSRHFYYNGELFLSQNLETQ--ESTVPQSSRAQTLAMNVTNFWKEDA
1LQV_A  --------------------PPEGSR-------------------------AHVFFEVAVNGSSFVSFRPERA--LWQ-----ADTQV------------
3JTS_A  --------------------GPDGRL-------------------------LRGYHQSAYDGKDYIALNEDLR--SWTAA-----DMAAQNTQRKWEAAG
1OW0_A  ------AVQGPPERDLCGCYSV-SSV-------------------------LPGCAEPWNHGKTF------------TCTA---A---------------
1HXM_A  KNLAVLKILAPSERD-EGSYYC-A------------------------------CDTLGM----------------------------------------
3LMY_A  QSFPYQSITFPELSN-KGSYSL-SHVYTPNDVRMVIEYARLRGIRVLPEFDTPGHTLSWGKGQKD------------LLTP---C---------------
                                                                               
1A6Z_A  IRARQNRAYL-----ERDCPAQLQQLLELGRGV--LDQQVPPLVKVT-H----H-VTSSVTTLRCRAL--------------------------------
1S79_A  -------------------------------------VVFDSIES-----------------------AKKFVETP------------------------
3P73_A  TYAERWKHEL-----GTVCVQNLRRYLEHGKAA--LKRRVQPEVRVW-G----K-EADGILTLSCHAH--------------------------------
1KCG_C  GLTTFFKMVS-----MRDCKSWLRDFLMHRKK--------------------------------------------------------------------
1JFM_A  -------------------------------------FTFYTENM-----------------------SWRSANDE------------------------
1BII_A  -AAERDRAYL-----EGECVEWLRRYLKNGNAT--LLRTDPPKAHVT-H----HRRPEGDVTLRCWAL--------------------------------
2P24_A  GGLQNI---A-----TGK--HNLGVLTKRSNST--PATNEAPQATVF---PKSPVLLGQPNTLICFVD--------------------------------
1CD1_A  GTSATVQMLL-----NDTCPLFVRGLLEAGKSD--LEKQEKPVAWLS-SV---PSSAHGHRQLVCHVS--------------------------------
2WY3_A  MKTKTHYRAM-----QADCLQKLQRYLKSGVA---IRRTVPPMVNVT-C----SEVSEGNITVTCRAS--------------------------------
1LQV_A  --------------------------------------------------------TSGVVTFTLQQL--------------------------------
3JTS_A  -EAEQHRTYL-----EGECLEWLRRYLENGKET--LQRADPPKTHVT-H----HPVSDQEATLRCWAL--------------------------------
1OW0_A  --------YP-----ESKTP--LTATL--SKS---G-NTFRPEVHLL-PPPSEELALNELVTLTCLAR--------------------------------
1HXM_A  ---------------GG--E--YTDKLIFGKGT--R-VTVEPRSQPH-TKPSVFV-MKNGTNVACLVK--------------------------------
3LMY_A  --------YSRQNKLDSFGP--INPTL--NTTYSFL-TTFFKEISEVFPDQFIHLG-GDEVEFKCWESNPKIQDFMRQKGFGTDFKKLESFYIQKVLDII
                                                                               
1A6Z_A  -------------------------------------------NYYPQNI----TMKWLKDKQPMDAKEFEP--KDVLPNGDGT----------------
1S79_A  -------------------------GQKYKET----------------DL----LILFKDDYF-------------------------------------
3P73_A  -------------------------------------------GFYPRPI----TISWMKDGMVRDQET-RW--GGIVPNSDGT----------------
1KCG_C  ----------------------------------------------------------------------------------------------------
1JFM_A  ---------------------SGVIMNKWKDD----------------GEFVKQLKFLIHECS-------------------------------------
1BII_A  -------------------------------------------GFYPADI----TLTWQLNGEELTQEM-EL--VETRPAGDGT----------------
2P24_A  -------------------------------------------NIFPPVI----NITWLRNSKSVADGV-YE--TSFFVNRDYS----------------
1CD1_A  -------------------------------------------GFYPKPV----WVMWMRGDQEQ-QGT-HR--GDFLPNADET----------------
2WY3_A  -------------------------------------------SFYPRNI----TLTWRQDGVSLSHNTQQW--GDVLPDGNGT----------------
1LQV_A  -------------------------------------------NAYNR----------------------------------------------------
3JTS_A  -------------------------------------------GFYPAEI----TLTWQRDGEDQTQDT-EL--VETRPAGDGT----------------
1OW0_A  -------------------------------------------GFSPKDV----LVRWLQGSQELPREK-YLTWASRQEPSQGTTT--------------
1HXM_A  -------------------------------------------EFYPKDI----RINLVSSKKITEFD-----PA-IVISPSGK----------------
3LMY_A  ATINKGSIVWQEVFDDKAKLAPGTIVEVWKDSAYPEELSRVTASGFPVIL----SAPWYLDLISYGQDW-R-KYYKVEPLDFGGTQKQKQLFIGGEACLW
                                                                               
1A6Z_A  ----------YQG-----------WITLAVPP-----GEEQRYTCQVEHPGLD-QPL-------------------------------------------
1S79_A  ----------------------------------------------------------------------------------------------------
3P73_A  ----------YHA-----------SAAIDVLP-----EDGDKYWCRVEHASLP-QPGLFSW---------------------------------------
1KCG_C  ----------------------------------------------------------------------------------------------------
1JFM_A  ----------------------------------------------------------------------------------------------------
1BII_A  ----------FQK-----------WASVVVPL-----GKEQKYTCHVEHEGLP-EPLTLRWGKEEPPSSTKTNTVII----AVPVVLGAVVILGAVMAFV
2P24_A  ----------FHK-----------LSYLTFIP-----SDDDIYDCKVEHWGLE-EPVLKHWEPE-IPAPMS-----------------------------
1CD1_A  ----------WYL-----------QATLDVEA-----GEEAGLACRVKHSSLGGQDIILYWDARQAPVGLIVFIV-----------LIMLVVVGAVV---
2WY3_A  ----------YQT-----------WVATRIRQ-----GEEQRFTCYMEHSGNH-GTHPVPSGKVLVLQSQRTDFPYVSAAMPCFVIIIILC-----VPC-
1LQV_A  ------------------------TRYELREF-----LED---TC------------VQYVQKH-ISAENT-----------------------------
3JTS_A  ----------FQK-----------WAAVVVPS-----GKEQRYTCHVQHEGLR-EPLTLRW---------------------------------------
1OW0_A  ----------FAV-----------TSILRVAAEDW--KKGDTFSCMVGHEALP-L---------------------------------------------
1HXM_A  ----------YN--------------AVKLG--KY--EDSNSVTCSVQHDNKT-VHSTDF----EVKTDSTDH-----------------------V---
3LMY_A  GEYVDATNLTPRLWPRASAVGERLWSSKDVRDMDDAYDRLTRHRCRMVERGIA-----------------------------------------------
                                                                               
1A6Z_A  ----------------------------IVIW
1S79_A  -------------------AKKNEERKQNKVE
3P73_A  -----------------------------EPQ
1KCG_C  -----------------------------RLE
1JFM_A  -------------------QKMDEFLKQSKEK
1BII_A  MKRRRNTGGKGGDYALAPGSQSSDMSLPDCKV
2P24_A  -----------------ELTETSGSRLEVLFQ
1CD1_A  -----------------Y-YIWRRRSAYQDIR
2WY3_A  CKKK---------------------TSAAEGP
1LQV_A  -----------------KGSQT----SRSYTS
3JTS_A  ------------------------------EP
1OW0_A  -------------------AFTQKTIDRLAGK
1HXM_A  -----------------K-PKETENTKQPSKS
3LMY_A  ----------------------AQPLYAGYCN
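
The identity values in the table above can be checked against any two rows of such an alignment. The following is a minimal sketch, assuming identity is counted only over columns in which both sequences have a residue (other definitions divide by the alignment length or by the shorter sequence). The function name `percent_identity` and the two short rows are illustrative and are not taken from the alignment above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where both aligned rows have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither row has a gap character.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Illustrative toy rows (hypothetical, not copied from the T-Coffee output).
row_a = "RSHSLHYLF-MG"
row_b = "GSHSMRYFYT-G"
print(percent_identity(row_a, row_b))
```

Because the choice of denominator changes the result, the values from this sketch need not match the identities reported by HHSearch.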

I-Tasser

Predicted Secondary Structure by I-Tasser

Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Predicted:  CCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCSSSSSCCCCCCCCCCSSSSSSSCCCSSSSCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHH
Conf-Score: 985028899999999899875122045421036641367999985269985643743686068998778788540145583478888887676654315558

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Predicted:  HHHHHHHCCCCCCSSSSSSSCCCCCCCCCCCCCCCCCCCCCCSSSSCCCHHHCHHHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCHHHHHHHHHCCHHHHHC
Conf-Score: 888755315777644463525565898763541000558873365263022202455666677878887004598888767064299999999747666642

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Predicted:  CCCCCCCCCCCCCCCHHHHCHHHHCCCCCCSSSSSSSCCCCCCCCCCSSSSCCCCCCCCCCCSSSSSCCCCCCCCSSSSCCCCCCCCCSSSSCCCCCCCCCC
Conf-Score: 599877567699854442101541541332479864358754456553541024888652112699807986310267512589998726840688766531

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Prediction: CCCCCCHHHHHHHCCHHHHHHHHHCCCCCCCCCCCCCHCCCC
Conf-Score: 010211112222100246665443013678898651020169

Secondary structure elements are shown as H for alpha helix, S for beta sheet, and C for coil.
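
A prediction string in this H/S/C encoding can be split into its individual elements to make it easier to read. The sketch below is one possible way to do this; the helper names `ss_segments` and `ss_fractions` and the short example string are hypothetical and are not part of the I-Tasser output.

```python
from itertools import groupby


def ss_segments(pred):
    """Return (state, start, length) for each run of identical states.

    Positions are 1-based, as in residue numbering.
    """
    segments = []
    pos = 1
    for state, run in groupby(pred):
        length = len(list(run))
        segments.append((state, pos, length))
        pos += length
    return segments


def ss_fractions(pred):
    """Return the fraction of residues predicted as H, S and C."""
    return {state: pred.count(state) / len(pred) for state in "HSC"}


# Illustrative toy prediction (hypothetical, not copied from the output above).
toy = "CCCHHHHHHCCSSSSC"
print(ss_segments(toy))
print(ss_fractions(toy))
```

To apply this to the real prediction, the four "Predicted" lines would first have to be concatenated into one string.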

Predicted Solvent Accessibility by I-Tasser

Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Prediction: 723312000000000101112222011200120120023333331200000102322003123724434241311436413610352044144313323230

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Prediction: 220132133351310001010021136231211333023032003016303403102321432433044143404422010333005103400630351154

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Prediction: 342353313321443300000100101014010203346564435434135233334221320000000347533120214264144202020214542200

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Prediction: 000001100000011100000001334446443132333438

Values range from 0 (buried residue) to 9 (highly exposed residue)
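
The accessibility string can be read together with the sequence to list the residues predicted to be exposed. This is a minimal sketch; the cutoff used to call a residue exposed is an assumption chosen for illustration and is not defined by I-Tasser, and `exposed_positions` is a hypothetical helper name.

```python
def exposed_positions(sequence, accessibility, threshold=5):
    """Return (position, residue, value) for residues at or above the cutoff.

    Positions are 1-based. The threshold is an illustrative assumption.
    """
    if len(sequence) != len(accessibility):
        raise ValueError("sequence and accessibility must have equal length")
    result = []
    for pos, (aa, digit) in enumerate(zip(sequence, accessibility), start=1):
        value = int(digit)
        if value >= threshold:
            result.append((pos, aa, value))
    return result


# First ten residues and accessibility values from the prediction above.
print(exposed_positions("MGPRARPALL", "7233120000", threshold=3))
```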

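I-Tasser predicted five models with C-scores between -0.557 and -3.298. They are ranked from one to five as listed below. The ranking does not strictly follow the C-score: Model 3 has a higher C-score than Model 2, because I-Tasser ranks its models by cluster size rather than by C-score.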

Model 1 with a C-Score of -0.557
Model 2 with a C-Score of -2.539
Model 3 with a C-Score of -2.266
Model 4 with a C-Score of -2.772
Model 5 with a C-Score of -3.298
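
To compare the listed rank with the order implied by the C-scores alone, the values can be sorted. The sketch below uses the five C-scores listed above; the function name `order_by_c_score` is hypothetical.

```python
# C-scores as listed above, keyed by I-Tasser model number.
c_scores = {1: -0.557, 2: -2.539, 3: -2.266, 4: -2.772, 5: -3.298}


def order_by_c_score(scores):
    """Return model numbers sorted from highest to lowest C-score."""
    return sorted(scores, key=scores.get, reverse=True)


by_score = order_by_c_score(c_scores)
# Models whose position in the C-score order differs from their listed rank.
moved = [model for position, model in enumerate(by_score, start=1)
         if position != model]
print(by_score)
print(moved)
```

Sorting by C-score swaps Models 2 and 3 relative to the listed ranking and leaves the other three in place.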

Model 1 has a TM-score of about 0.64 and an RMSD of 7.7 Å. For the prediction, I-Tasser used 10 templates from the PDB, which are:

SwissModel

SwissModel is a server-based tool provided by the SIB. It combines tools such as PSIPRED and DISOPRED for the prediction of secondary structure and disordered regions.


The model created by SwissModel is based on a self hit, because we had no way to exclude the protein itself from the prediction. Therefore we also ran SwissModel in alignment mode. (TODO)

Automated Mode

Figure: predicted model


Model information:
Modelled residue range: 26 to 297
Based on template: 1a6zC (2.60 Å)
Sequence identity [%]: 100
E-value: 7.66e-163

Quality information:
QMEAN Z-Score: -1.035


Figures: estimated absolute model quality, estimated density of model quality, Z-score by category, predicted error

Even though the model is based on a self hit, the QMEAN Z-score is about -1. This means that the QMEAN score of the model lies roughly one standard deviation below the mean score of the experimental reference structures. The model is therefore not unlikely, but it is also not the most probable one.

Alignment Mode

Modeller
