Homology based structure predictions

From Bioinformatikpedia
Revision as of 13:25, 9 June 2011 by Landerer (talk | contribs) (Homologous)

Homologous

Because we found no homologous structures in Task 2, we extended our list by using HHSearch.

HHSearch found only sequences with an identity below 40%. We therefore use the 12 proteins shown below to create a multiple alignment for homology modelling. We chose the sequences so that they cover the whole protein, paying specific attention to the transmembrane region.


PDB-ID Identity Description
1s79 37% Kram
3p73 28% Kram
1kcg 22% Kram
1jfm 14% Kram
1bii 22% Kram
2p24 21% Kram
1cd1 21% Kram
2wy3 29% Kram
1lqv 14% Kram
3jts 25% Kram
1ow0 22% Kram
1hxm 18% Kram
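
The identity cut-off mentioned above can be checked with a short Python sketch. The pairs are copied from the table; the 40% threshold comes from the text, while the names `hits` and `below_threshold` are illustrative and not part of any tool we used.

```python
# (PDB ID, sequence identity in %) pairs copied from the table above.
hits = [
    ("1s79", 37), ("3p73", 28), ("1kcg", 22), ("1jfm", 14),
    ("1bii", 22), ("2p24", 21), ("1cd1", 21), ("2wy3", 29),
    ("1lqv", 14), ("3jts", 25), ("1ow0", 22), ("1hxm", 18),
]


def below_threshold(hits, max_identity=40):
    """Return the hits whose identity is strictly below max_identity."""
    return [(pdb_id, ident) for pdb_id, ident in hits if ident < max_identity]


selected = below_threshold(hits)

# Sort by identity, highest first, so the closest templates come first.
ranked = sorted(selected, key=lambda hit: hit[1], reverse=True)
for pdb_id, ident in ranked:
    print(f"{pdb_id}\t{ident}%")
```

Sorting by identity only orders the templates; it does not say anything about how well each one covers the target sequence.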

With these sequences, including 1a6z, we built a multiple alignment with T-Coffee (EXPRESSO). This multiple alignment is later used in the Alignment Mode of SwissModel and in Modeller.

  DSSP                                   --EEEEEEEEEEB-SS-SSB--EEE
Q30201          MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEAL
1S79_A          --------------------------------------------------
3P73_A          -----------------------EFGSHSLRYFLTGMTDPGPGMPRFVIV
1KCG_C          -------------------------DAHSLWYNFTIIHLPRHGQQWCEVQ
1JFM_A          -------------------------DAHSLRCNLTIKDPTPADPLWYEAK
1BII_A          -MGAMAPRTLLLLLAAALGPTQTRAGSHSLRYFVTAVSRPGFGEPRYMEV
2P24_A          --------------------------------------------------
1CD1_A          -----------------------QQKNYTFRCLQMSSFANR-SWSRTDSV
2WY3_A          ------------------------MEPHSLRYNLMVLSQDESVQSGFLAE
1LQV_A          -------------------SQDASDGLQRLHMLQISYFR-DPYHVWYQGN
3JTS_A          -------------------------GSHSMRYFYTSMSRPGRWEPRFIAV
1OW0_A          --------------------------------------------------
1HXM_A          --------------------------------------------------
                                                                 
  DSSP          EEETTEEEEEEESSS--EEE--STTS-SSTTTTHHHHHHHHHHHHHHHHH
Q30201          GYVDDQLFVFYDHES--RRVE-PRTPWVSSRISSQMWLQLSQSLKGWDHM
1S79_A          ------------------GRW-IL-KNDVKNRSVYIKGFPTDATLDDIKE
3P73_A          GYVDDKIFGTYNSKS--RTAQ-PIVEML-PQEDQEHWDTQTQKAQGGERD
1KCG_C          SQVDQKNFLSYDCGS--DKVLSMGHL-EEQLYATDAWGKQLEMLREVGQR
1JFM_A          CFVGEILILHLSNIN--KTMT-SG-DPGETANATEVKKCLTQPLKNLCQK
1BII_A          GYVDNTEFVRFDSDAENPRYE-PRARWIE-QEGPEYWERETRRAKGNEQS
2P24_A          --------------------------M----AIMAPRTLVLLLSGALALT
1CD1_A          VWLGDLQTHRWSNDS--ATIS-FTKPWSQGKLSNQQWEKLQHMFQVYRVS
2WY3_A          GHLDGQPFLRYDRQK--RRAK-PQGQWAEDVLGAETWDTETEDLTENGQD
1LQV_A          ASLGGHLTHVLEGPDTNTTII-QLQPL----QEPESWARTQSGLQSYLLQ
3JTS_A          GYVDDTQFVRFDSDAASQRME-PRAPWVE-QEGPEYWDRETRNMKAETQN
1OW0_A          --------------------------------------------------
1HXM_A          -----------------------------------AIELVPEHQTVPVSI
                                                                 
  DSSP          HHHHHHHHTTT-SSS--E--------EEEEEE-EEE-TTS-E-EEE-E--
Q30201          FTVDFWTIMENHN-HSKE--------SHTLQV-ILGCEMQED-NST-E--
1S79_A          WLEDKGQV-LNIQMRRTL--------HKAFKG-SIFVVFDSI-ESA-KKF
3P73_A          FDWNLNRLPERYN-KSKG--------SHTMQM-MFGCDILED-GSI-R--
1KCG_C          LRLELADT---------ELEDFTPSGPLTLQV-RMSCECEAD-GYI-R--
1JFM_A          LRNKVSNT-KVDTHKTNG--------YPHLQV-TMIYPQSQG-RTP-S--
1BII_A          FRVDLRTALRYYNQSAGG--------SHTLQW-MAGCDVESD-GRLLR--
2P24_A          QTWAGSHSRGEDD--IEA--------DHVGSYGIVVYQSP----GD-I--
1CD1_A          FTRDIQELVKMMSPKEDY--------PIEIQL-SAGCEMYPG-NAS-E--
2WY3_A          LRRTLTHI----KDQKGG--------LHSLQE-IRVCEIHED-SST-R--
1LQV_A          FHGLVRLVHQERT--LAF--------PLTIRC-FLGCELPPEGSRA-H--
3JTS_A          APVNLRNLRGYYNQSEAG--------SHTIQR-MYGCDLGPD-GRLLR--
1OW0_A          -----ACHPRLSLHRPAL--------EDLLLG-SEANLTCTL-TGLRD--
1HXM_A          GVPATLRCSMKGEAIGNY--------YINWYR-KTQGNTMTF-IYRE---
                                                                 
  DSSP          ----------EEEETTEE----------------EEEEEGGGTEEEES--
Q30201          ----------GYWKYGYD----------------GQDHLEFCPDTLDW--
1S79_A          VETPGQKYKETDLLILFKDDYFAKKNEERKQNKVE---------------
3P73_A          ----------GYDQYAFD----------------GRDFLAFDMDTMTF--
1KCG_C          ----------GSWQFSFD----------------GRKFLLFDSNNRKW--
1JFM_A          ----------ATWEFNIS----------------DSYFFTFYTENMSW--
1BII_A          ----------GYWQFAYD----------------GCDYIALNEDLKTW--
2P24_A          ----------GQYTFEFD----------------GDELFYVDLDKKET--
1CD1_A          ----------SFLHVAFQ----------------GKYVVRFWG--TSWQT
2WY3_A          ----------GSRHFYYN----------------GELFLSQNLETQES--
1LQV_A          ----------VFFEVAVN----------------GSSFVSFRPERALW--
3JTS_A          ----------GYHQSAYD----------------GKDYIALNEDLRSW--
1OW0_A          ----------ASGVTFTW----------------TPSSGKSAV--QGPPE
1HXM_A          ----------KDIYGPGF----------------KDNFQGDIDIAKNL--
                                                                 
  DSSP          SGG-G----HHH-HHHHHSSTHHH--HHHHHHHHTHHHHHHHHHHHHHTT
Q30201          RAA-E----PRA-WPTKLEWERHK--IRARQNRAYLERDCPAQLQQLLEL
1S79_A          --------------------------------------------------
3P73_A          TAA-D----PVA-EITKRRWETEG--TYAERWKHELGTVCVQNLRRYLEH
1KCG_C          TVV-H----AGA-RRMKEKWEKDS--GLTTFFKMVSMRDCKSWLRDFLMH
1JFM_A          RSA-N----DES-GVIMNKWKDDG--EFVKQLKFLI-HECSQKMDEFLKQ
1BII_A          TAA-D----MAA-QITRRKWEQA---GAAERDRAYLEGECVEWLRRYLKN
2P24_A          IWM-------------LPEFAQLR--SFDPQGGLQNIATGKHNLGVLTKR
1CD1_A          VPGAP----SWL-DLPIKVLNADQ--GTSATVQMLLNDTCPLFVRGLLEA
2WY3_A          TVP-QSSRAQTLAMNVTNFW-KEDAMKTKTHYRAMQ-ADCLQKLQRYLKS
1LQV_A          QAD-TQVTSGVV-TFTLQQLNAYN--RTRYELREFLEDTCVQYVQKHISA
3JTS_A          TAA-D----MAA-QNTQRKWEAA---GEAEQHRTYLEGECLEWLRRYLEN
1OW0_A          R--DL----CGC-YSVSSVLPGCA--EPWNHGKTFTCTAAYPESKTPLTA
1HXM_A          AVL-K----ILA-PSERDEGSYYC--ACDTLGMGGEYTDKLIFGKGTRVT
                                                                  
  DSSP          TSS--B--EEEEEEEE-SS-----E-EEEEEEEEEBSS--EEEEEETTEE  
Q30201          GRGVLDQQVPPLVKVTHHVT----S-SVTTLRCRALNYYPQNITMKWLKD
1S79_A          --------------------------------------------------
3P73_A          GKAALKRRVQPEVRVWGKEA----D-GILTLSCHAHGFYPRPITISWMKD
1KCG_C          RKKRLE--------------------------------------------
1JFM_A          SKEK----------------------------------------------
1BII_A          GNATLLRTDPPKAHVTHHRR----PEGDVTLRCWALGFYPADITLTWQLN
2P24_A          SNSTPATNEAPQATVFPKSP--VLLGQPNTLICFVDNIFPPVINITWLRN
1CD1_A          GKSDLEKQEKPVAWLSSVP---SSAHGHRQLVCHVSGFYPKPVWVMWMRG
2WY3_A          GVAIRRTVPPMVNVTCSEVS----EGNITVTCRASSFYPRNITLTWRQDG
1LQV_A          ENTKGSQTSRSYTS------------------------------------
3JTS_A          GKETLQRADPPKTHVTHHPV----SDQEATLRCWALGFYPAEITLTWQRD
1OW0_A          TLSKSGNTFRPEVHLLPPPSEELALNELVTLTCLARGFSPKDVLVRWLQG
1HXM_A          VEPRSQPHTKPSVFVMKNG---------TNVACLVKEFYPKDIRINLVSS
                                                                  
  DSSP          --GGGS---EEEE-TTS-E----EEEEEEEE-TTGGGGEE---EEEE-TT
Q30201          K-QPMDAKEFEPKDVLPNG----DGTYQGWITLAVPPGEE---QRYTCQV
1S79_A          --------------------------------------------------
3P73_A          --GMVRDQETRWGGIVPNS----DGTYHASAAIDVLPEDG---DKYWCRV
1KCG_C          --------------------------------------------------
1JFM_A          --------------------------------------------------
1BII_A          --GEELTQEMELVETRPAG----DGTFQKWASVVVPLGKE---QKYTCHV
2P24_A          --SKSVADGVYETSFFVNR----DYSFHKLSYLTFIPSDD---DIYDCKV
1CD1_A          --DQ-EQQGTHRGDFLPNA----DETWYLQATLDVEAGEE---AGLACRV
2WY3_A          --VSLSHNTQQWGDVLPDG----NGTYQTWVATRIRQGEE---QRFTCYM
1LQV_A          --------------------------------------------------
3JTS_A          --GEDQTQDTELVETRPAG----DGTFQKWAAVVVPSGKE---QRYTCHV
1OW0_A          SQEL-PREKYLTW-ASRQEPSQGTTTFAVTSILRVAAEDWKKGDTFSCMV
1HXM_A          -----KKITEFDPAIVISP----SGKYNAVKLGKYE--DS---NSVTCSV
                                                                  
  DSSP          SSS-EEE-E-
Q30201          EHPGLDQ-PLIVIWEPSPSGTLVIGVISGIAVFVVILFIGILFIILRKRQ
1S79_A          --------------------------------------------------
3P73_A          EHASLPQ-PGLFSWEPQ---------------------------------
1KCG_C          --------------------------------------------------
1JFM_A          --------------------------------------------------
1BII_A          EHEGLPE-PLTLRWGKEEPPSSTKTNTVIIAVPVVLGAVVILGAVMAFVM
2P24_A          EHWGLEE-PVLKHWEPEIPAPMSELTETSGSRLEVLFQ------------
1CD1_A          KHSSLGG-QDIILYWDARQAPVGLIVFIVLIMLVVVGAVVYYIWRRRSAY
2WY3_A          EHSGNHG-THPVPSGKVLVLQSQRTDFPYVSAAMPCFVIIIILCVPCCKK
1LQV_A          --------------------------------------------------
3JTS_A          QHEGLRE-PLTLRWEP----------------------------------
1OW0_A          GHEALPLAFTQKTIDRLAGK------------------------------
1HXM_A          QHDNK---TVHSTDFEVKTDSTDHVKPKETENTKQPSKS-----------
                                                                  
  DSSP
Q30201          GSRGAMGHYVLAERE----------------
1S79_A          -------------------------------
3P73_A          -------------------------------
1KCG_C          -------------------------------
1JFM_A          -------------------------------
1BII_A          KRRRNTGGKGGDYALAPGSQSSDMSLPDCKV
2P24_A          -------------------------------
1CD1_A          QDIR---------------------------
2WY3_A          KTSAAEGP-----------------------
1LQV_A          -------------------------------
3JTS_A          -------------------------------
1OW0_A          -------------------------------
1HXM_A          -------------------------------
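
Because we want the templates to cover the whole protein, it can help to quantify how much of the target each aligned template row covers. The following is a minimal sketch, not part of the T-Coffee output: the function name `query_coverage` is hypothetical, and the example rows are the first alignment block for Q30201, 1S79_A and 3P73_A copied from above.

```python
def query_coverage(query_row, template_row, gap="-"):
    """Fraction of query residues that are aligned to a template residue."""
    aligned = 0
    query_residues = 0
    for q, t in zip(query_row, template_row):
        if q == gap:
            continue  # column is a gap in the query, so it does not count
        query_residues += 1
        if t != gap:
            aligned += 1
    return aligned / query_residues if query_residues else 0.0


# First alignment block, copied from the alignment above.
query = "MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEAL"
rows = {
    "1S79_A": "--------------------------------------------------",
    "3P73_A": "-----------------------EFGSHSLRYFLTGMTDPGPGMPRFVIV",
}

for name, row in rows.items():
    print(f"{name}: {query_coverage(query, row):.2f}")
```

To get the coverage over the full protein, the blocks of each row would first have to be concatenated in order.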

ITasser

Predicted Secondary Structure by I-Tasser

Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Predicted:  CCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCSSSSSCCCCCCCCCCSSSSSSSCCCSSSSCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHH
Conf-Score: 985028899999999899875122045421036641367999985269985643743686068998778788540145583478888887676654315558

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Predicted:  HHHHHHHCCCCCCSSSSSSSCCCCCCCCCCCCCCCCCCCCCCSSSSCCCHHHCHHHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCHHHHHHHHHCCHHHHHC
Conf-Score: 888755315777644463525565898763541000558873365263022202455666677878887004598888767064299999999747666642

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Predicted:  CCCCCCCCCCCCCCCHHHHCHHHHCCCCCCSSSSSSSCCCCCCCCCCSSSSCCCCCCCCCCCSSSSSCCCCCCCCSSSSCCCCCCCCCSSSSCCCCCCCCCC
Conf-Score: 599877567699854442101541541332479864358754456553541024888652112699807986310267512589998726840688766531

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Predicted:  CCCCCCHHHHHHHCCHHHHHHHHHCCCCCCCCCCCCCHCCCC
Conf-Score: 010211112222100246665443013678898651020169

Secondary structure elements are shown as H for alpha helix, S for beta sheet and C for coil.
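
The H/S/C strings can be summarised as a composition. The sketch below is only an illustration: the function name `ss_composition` is hypothetical, and the example string is the last segment of the prediction above.

```python
from collections import Counter


def ss_composition(predicted):
    """Return the fraction of helix (H), sheet (S) and coil (C) states."""
    total = len(predicted)
    if total == 0:
        return {"H": 0.0, "S": 0.0, "C": 0.0}
    counts = Counter(predicted)
    return {state: counts.get(state, 0) / total for state in "HSC"}


# Last segment of the I-Tasser prediction (residues IGVISGIAVF...LAERE).
segment = "CCCCCCHHHHHHHCCHHHHHHHHHCCCCCCCCCCCCCHCCCC"
for state, fraction in ss_composition(segment).items():
    print(f"{state}: {fraction:.2f}")
```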

Predicted Solvent Accessibility by I-Tasser

Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Prediction: 723312000000000101112222011200120120023333331200000102322003123724434241311436413610352044144313323230

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Prediction: 220132133351310001010021136231211333023032003016303403102321432433044143404422010333005103400630351154

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Prediction: 342353313321443300000100101014010203346564435434135233334221320000000347533120214264144202020214542200

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Prediction: 000001100000011100000001334446443132333438

Values range from 0 (buried residue) to 9 (highly exposed residue)
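
The accessibility digits can be turned into a rough buried/exposed split. This is a sketch under an assumption of ours: the cutoff of 2 is arbitrary and is not defined by I-Tasser, and `buried_fraction` is a hypothetical helper name. The example string is the last segment of the prediction above.

```python
def buried_fraction(accessibility, cutoff=2):
    """Fraction of residues whose accessibility digit is <= cutoff.

    The cutoff is an assumption for illustration, not an I-Tasser default.
    """
    values = [int(ch) for ch in accessibility]
    if not values:
        return 0.0
    return sum(1 for v in values if v <= cutoff) / len(values)


# Last segment of the prediction (residues IGVISGIAVF...LAERE).
segment = "000001100000011100000001334446443132333438"
print(f"buried fraction: {buried_fraction(segment):.2f}")
```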

I-Tasser predicted five models with C-Scores ranging from -0.557 down to -3.298. They are ranked from one to five as listed below.

Model 1 with a C-Score of -0.557
Model 2 with a C-Score of -2.539
Model 3 with a C-Score of -2.266
Model 4 with a C-Score of -2.772
Model 5 with a C-Score of -3.298
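
Sorting the models by C-Score shows that the server's rank and the C-Score order are not identical: Model 3 has a higher C-Score than Model 2. This would be consistent with a ranking that does not depend on the C-Score alone. The sketch below uses only the values listed above; the variable names are illustrative.

```python
# Model rank -> C-Score, copied from the list above.
c_scores = {1: -0.557, 2: -2.539, 3: -2.266, 4: -2.772, 5: -3.298}

# Order the model numbers by C-Score, best (highest) first.
by_score = sorted(c_scores, key=c_scores.get, reverse=True)
print("order by C-Score:", by_score)

# Report models whose position by C-Score differs from their server rank.
for position, model in enumerate(by_score, start=1):
    if position != model:
        print(f"Model {model} is at position {position} by C-Score")
```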

Model 1 has a TM-Score of about 0.64 and an RMSD of 7.7 Å. For the prediction, I-Tasser used 10 templates from the PDB.

SwissModel

SwissModel is a server-based tool provided by the SIB. It combines tools like PSI-PRED and DISOPRED for secondary structure and disordered region prediction.


The model created by SwissModel is based on a self hit, because we had no way to exclude the protein itself from the prediction. Therefore we also ran SwissModel in Alignment Mode. (TODO)

Automated Mode

predicted model


Model information: Modelled residue range: 26 to 297
Based on template: 1a6zC (2.60 Å)
Sequence Identity [%]: 100
Evalue: 7.66e-163

Quality information: QMEAN Z-Score: -1.035


Estimated absolute model quality
Estimated density of model quality
Z-Score by category
predicted error

Even though the model is based on a self hit, the QMEAN Z-Score is about -1, which means that the model scores roughly one standard deviation below the mean of the reference structures. The model is therefore not an unlikely one, but it is also not among the most probable.
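
To put a Z-Score of about -1 into perspective, it can be converted into a percentile. The sketch below assumes that the reference scores are approximately normally distributed; this is our assumption for illustration and is not stated in the SwissModel output. The helper name `normal_cdf` is hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


qmean_z = -1.035  # QMEAN Z-Score reported for the automated-mode model

# Under the normality assumption, this is the share of reference
# structures that would score lower than our model.
fraction_below = normal_cdf(qmean_z)
print(f"fraction of reference scores below the model: {fraction_below:.1%}")
```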

Alignment Mode

Modeller

References