Difference between revisions of "Homology based structure predictions"

From Bioinformatikpedia

Revision as of 13:25, 9 June 2011

Homologous

Because we found no homologous structures in Task 2, we extended our list by using HHSearch.

HHSearch found only sequences with an identity below 40%. We therefore use the 12 proteins shown below to create a multiple alignment for homology modeling. We chose sequences that cover the whole protein and paid particular attention to the transmembrane region.


PDB-ID Identity Description
1s79 37% Kram
3p73 28% Kram
1kcg 22% Kram
1jfm 14% Kram
1bii 22% Kram
2p24 21% Kram
1cd1 21% Kram
2wy3 29% Kram
1lqv 14% Kram
3jts 25% Kram
1ow0 22% Kram
1hxm 18% Kram
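
The selection criterion above can be checked mechanically. The following is a minimal sketch, not part of the original workflow: it copies the identities from the table into a dictionary (the names `TEMPLATES`, `below_cutoff` and `ranked` are made up for illustration) and confirms that every template lies below the 40% cutoff.

```python
# Template table from this page: PDB ID -> sequence identity to the target (%).
TEMPLATES = {
    "1s79": 37, "3p73": 28, "1kcg": 22, "1jfm": 14, "1bii": 22, "2p24": 21,
    "1cd1": 21, "2wy3": 29, "1lqv": 14, "3jts": 25, "1ow0": 22, "1hxm": 18,
}


def below_cutoff(templates, cutoff=40):
    """Return the PDB IDs whose identity is strictly below `cutoff` percent."""
    return sorted(pdb for pdb, ident in templates.items() if ident < cutoff)


def ranked(templates):
    """Return (pdb_id, identity) pairs, highest identity first (ties by ID)."""
    return sorted(templates.items(), key=lambda kv: (-kv[1], kv[0]))


# Print the templates from most to least similar.
for pdb, ident in ranked(TEMPLATES):
    print(f"{pdb}  {ident:2d}%")
```

Sorting by identity makes it easy to see which templates are likely to contribute most to the model and which ones are in the low-identity range where alignment errors become more likely.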

With these sequences, including 1a6z, we built a multiple alignment with T-Coffee (Expresso). This multiple alignment is later used in the alignment mode of SwissModel and in Modeller.

  DSSP                                   --EEEEEEEEEEB-SS-SSB--EEE
Q30201          MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEAL
1S79_A          --------------------------------------------------
3P73_A          -----------------------EFGSHSLRYFLTGMTDPGPGMPRFVIV
1KCG_C          -------------------------DAHSLWYNFTIIHLPRHGQQWCEVQ
1JFM_A          -------------------------DAHSLRCNLTIKDPTPADPLWYEAK
1BII_A          -MGAMAPRTLLLLLAAALGPTQTRAGSHSLRYFVTAVSRPGFGEPRYMEV
2P24_A          --------------------------------------------------
1CD1_A          -----------------------QQKNYTFRCLQMSSFANR-SWSRTDSV
2WY3_A          ------------------------MEPHSLRYNLMVLSQDESVQSGFLAE
1LQV_A          -------------------SQDASDGLQRLHMLQISYFR-DPYHVWYQGN
3JTS_A          -------------------------GSHSMRYFYTSMSRPGRWEPRFIAV
1OW0_A          --------------------------------------------------
1HXM_A          --------------------------------------------------
                                                                 
  DSSP          EEETTEEEEEEESSS--EEE--STTS-SSTTTTHHHHHHHHHHHHHHHHH
Q30201          GYVDDQLFVFYDHES--RRVE-PRTPWVSSRISSQMWLQLSQSLKGWDHM
1S79_A          ------------------GRW-IL-KNDVKNRSVYIKGFPTDATLDDIKE
3P73_A          GYVDDKIFGTYNSKS--RTAQ-PIVEML-PQEDQEHWDTQTQKAQGGERD
1KCG_C          SQVDQKNFLSYDCGS--DKVLSMGHL-EEQLYATDAWGKQLEMLREVGQR
1JFM_A          CFVGEILILHLSNIN--KTMT-SG-DPGETANATEVKKCLTQPLKNLCQK
1BII_A          GYVDNTEFVRFDSDAENPRYE-PRARWIE-QEGPEYWERETRRAKGNEQS
2P24_A          --------------------------M----AIMAPRTLVLLLSGALALT
1CD1_A          VWLGDLQTHRWSNDS--ATIS-FTKPWSQGKLSNQQWEKLQHMFQVYRVS
2WY3_A          GHLDGQPFLRYDRQK--RRAK-PQGQWAEDVLGAETWDTETEDLTENGQD
1LQV_A          ASLGGHLTHVLEGPDTNTTII-QLQPL----QEPESWARTQSGLQSYLLQ
3JTS_A          GYVDDTQFVRFDSDAASQRME-PRAPWVE-QEGPEYWDRETRNMKAETQN
1OW0_A          --------------------------------------------------
1HXM_A          -----------------------------------AIELVPEHQTVPVSI
                                                                 
  DSSP          HHHHHHHHTTT-SSS--E--------EEEEEE-EEE-TTS-E-EEE-E--
Q30201          FTVDFWTIMENHN-HSKE--------SHTLQV-ILGCEMQED-NST-E--
1S79_A          WLEDKGQV-LNIQMRRTL--------HKAFKG-SIFVVFDSI-ESA-KKF
3P73_A          FDWNLNRLPERYN-KSKG--------SHTMQM-MFGCDILED-GSI-R--
1KCG_C          LRLELADT---------ELEDFTPSGPLTLQV-RMSCECEAD-GYI-R--
1JFM_A          LRNKVSNT-KVDTHKTNG--------YPHLQV-TMIYPQSQG-RTP-S--
1BII_A          FRVDLRTALRYYNQSAGG--------SHTLQW-MAGCDVESD-GRLLR--
2P24_A          QTWAGSHSRGEDD--IEA--------DHVGSYGIVVYQSP----GD-I--
1CD1_A          FTRDIQELVKMMSPKEDY--------PIEIQL-SAGCEMYPG-NAS-E--
2WY3_A          LRRTLTHI----KDQKGG--------LHSLQE-IRVCEIHED-SST-R--
1LQV_A          FHGLVRLVHQERT--LAF--------PLTIRC-FLGCELPPEGSRA-H--
3JTS_A          APVNLRNLRGYYNQSEAG--------SHTIQR-MYGCDLGPD-GRLLR--
1OW0_A          -----ACHPRLSLHRPAL--------EDLLLG-SEANLTCTL-TGLRD--
1HXM_A          GVPATLRCSMKGEAIGNY--------YINWYR-KTQGNTMTF-IYRE---
                                                                 
  DSSP          ----------EEEETTEE----------------EEEEEGGGTEEEES--
Q30201          ----------GYWKYGYD----------------GQDHLEFCPDTLDW--
1S79_A          VETPGQKYKETDLLILFKDDYFAKKNEERKQNKVE---------------
3P73_A          ----------GYDQYAFD----------------GRDFLAFDMDTMTF--
1KCG_C          ----------GSWQFSFD----------------GRKFLLFDSNNRKW--
1JFM_A          ----------ATWEFNIS----------------DSYFFTFYTENMSW--
1BII_A          ----------GYWQFAYD----------------GCDYIALNEDLKTW--
2P24_A          ----------GQYTFEFD----------------GDELFYVDLDKKET--
1CD1_A          ----------SFLHVAFQ----------------GKYVVRFWG--TSWQT
2WY3_A          ----------GSRHFYYN----------------GELFLSQNLETQES--
1LQV_A          ----------VFFEVAVN----------------GSSFVSFRPERALW--
3JTS_A          ----------GYHQSAYD----------------GKDYIALNEDLRSW--
1OW0_A          ----------ASGVTFTW----------------TPSSGKSAV--QGPPE
1HXM_A          ----------KDIYGPGF----------------KDNFQGDIDIAKNL--
                                                                 
  DSSP          SGG-G----HHH-HHHHHSSTHHH--HHHHHHHHTHHHHHHHHHHHHHTT
Q30201          RAA-E----PRA-WPTKLEWERHK--IRARQNRAYLERDCPAQLQQLLEL
1S79_A          --------------------------------------------------
3P73_A          TAA-D----PVA-EITKRRWETEG--TYAERWKHELGTVCVQNLRRYLEH
1KCG_C          TVV-H----AGA-RRMKEKWEKDS--GLTTFFKMVSMRDCKSWLRDFLMH
1JFM_A          RSA-N----DES-GVIMNKWKDDG--EFVKQLKFLI-HECSQKMDEFLKQ
1BII_A          TAA-D----MAA-QITRRKWEQA---GAAERDRAYLEGECVEWLRRYLKN
2P24_A          IWM-------------LPEFAQLR--SFDPQGGLQNIATGKHNLGVLTKR
1CD1_A          VPGAP----SWL-DLPIKVLNADQ--GTSATVQMLLNDTCPLFVRGLLEA
2WY3_A          TVP-QSSRAQTLAMNVTNFW-KEDAMKTKTHYRAMQ-ADCLQKLQRYLKS
1LQV_A          QAD-TQVTSGVV-TFTLQQLNAYN--RTRYELREFLEDTCVQYVQKHISA
3JTS_A          TAA-D----MAA-QNTQRKWEAA---GEAEQHRTYLEGECLEWLRRYLEN
1OW0_A          R--DL----CGC-YSVSSVLPGCA--EPWNHGKTFTCTAAYPESKTPLTA
1HXM_A          AVL-K----ILA-PSERDEGSYYC--ACDTLGMGGEYTDKLIFGKGTRVT
                                                                  
  DSSP          TSS--B--EEEEEEEE-SS-----E-EEEEEEEEEBSS--EEEEEETTEE  
Q30201          GRGVLDQQVPPLVKVTHHVT----S-SVTTLRCRALNYYPQNITMKWLKD
1S79_A          --------------------------------------------------
3P73_A          GKAALKRRVQPEVRVWGKEA----D-GILTLSCHAHGFYPRPITISWMKD
1KCG_C          RKKRLE--------------------------------------------
1JFM_A          SKEK----------------------------------------------
1BII_A          GNATLLRTDPPKAHVTHHRR----PEGDVTLRCWALGFYPADITLTWQLN
2P24_A          SNSTPATNEAPQATVFPKSP--VLLGQPNTLICFVDNIFPPVINITWLRN
1CD1_A          GKSDLEKQEKPVAWLSSVP---SSAHGHRQLVCHVSGFYPKPVWVMWMRG
2WY3_A          GVAIRRTVPPMVNVTCSEVS----EGNITVTCRASSFYPRNITLTWRQDG
1LQV_A          ENTKGSQTSRSYTS------------------------------------
3JTS_A          GKETLQRADPPKTHVTHHPV----SDQEATLRCWALGFYPAEITLTWQRD
1OW0_A          TLSKSGNTFRPEVHLLPPPSEELALNELVTLTCLARGFSPKDVLVRWLQG
1HXM_A          VEPRSQPHTKPSVFVMKNG---------TNVACLVKEFYPKDIRINLVSS
                                                                  
  DSSP          --GGGS---EEEE-TTS-E----EEEEEEEE-TTGGGGEE---EEEE-TT
Q30201          K-QPMDAKEFEPKDVLPNG----DGTYQGWITLAVPPGEE---QRYTCQV
1S79_A          --------------------------------------------------
3P73_A          --GMVRDQETRWGGIVPNS----DGTYHASAAIDVLPEDG---DKYWCRV
1KCG_C          --------------------------------------------------
1JFM_A          --------------------------------------------------
1BII_A          --GEELTQEMELVETRPAG----DGTFQKWASVVVPLGKE---QKYTCHV
2P24_A          --SKSVADGVYETSFFVNR----DYSFHKLSYLTFIPSDD---DIYDCKV
1CD1_A          --DQ-EQQGTHRGDFLPNA----DETWYLQATLDVEAGEE---AGLACRV
2WY3_A          --VSLSHNTQQWGDVLPDG----NGTYQTWVATRIRQGEE---QRFTCYM
1LQV_A          --------------------------------------------------
3JTS_A          --GEDQTQDTELVETRPAG----DGTFQKWAAVVVPSGKE---QRYTCHV
1OW0_A          SQEL-PREKYLTW-ASRQEPSQGTTTFAVTSILRVAAEDWKKGDTFSCMV
1HXM_A          -----KKITEFDPAIVISP----SGKYNAVKLGKYE--DS---NSVTCSV
                                                                  
  DSSP          SSS-EEE-E-
Q30201          EHPGLDQ-PLIVIWEPSPSGTLVIGVISGIAVFVVILFIGILFIILRKRQ
1S79_A          --------------------------------------------------
3P73_A          EHASLPQ-PGLFSWEPQ---------------------------------
1KCG_C          --------------------------------------------------
1JFM_A          --------------------------------------------------
1BII_A          EHEGLPE-PLTLRWGKEEPPSSTKTNTVIIAVPVVLGAVVILGAVMAFVM
2P24_A          EHWGLEE-PVLKHWEPEIPAPMSELTETSGSRLEVLFQ------------
1CD1_A          KHSSLGG-QDIILYWDARQAPVGLIVFIVLIMLVVVGAVVYYIWRRRSAY
2WY3_A          EHSGNHG-THPVPSGKVLVLQSQRTDFPYVSAAMPCFVIIIILCVPCCKK
1LQV_A          --------------------------------------------------
3JTS_A          QHEGLRE-PLTLRWEP----------------------------------
1OW0_A          GHEALPLAFTQKTIDRLAGK------------------------------
1HXM_A          QHDNK---TVHSTDFEVKTDSTDHVKPKETENTKQPSKS-----------
                                                                  
  DSSP
Q30201          GSRGAMGHYVLAERE----------------
1S79_A          -------------------------------
3P73_A          -------------------------------
1KCG_C          -------------------------------
1JFM_A          -------------------------------
1BII_A          KRRRNTGGKGGDYALAPGSQSSDMSLPDCKV
2P24_A          -------------------------------
1CD1_A          QDIR---------------------------
2WY3_A          KTSAAEGP-----------------------
1LQV_A          -------------------------------
3JTS_A          -------------------------------
1OW0_A          -------------------------------
1HXM_A          -------------------------------
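
To get a feeling for how similar two rows of the alignment are, one can count identical residues column by column. The sketch below assumes one common convention (columns with a gap in either row are ignored); the function name `percent_identity` is hypothetical, and the resulting numbers will not necessarily match the HHSearch identities in the table, which are computed on HHSearch's own alignments.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent identity over columns where both aligned rows have a residue.

    Columns with a gap in either row are skipped. Other conventions
    (e.g. dividing by the length of the shorter sequence) give other values.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Two rows copied from the second alignment block above.
q30201 = "GYVDDQLFVFYDHES--RRVE-PRTPWVSSRISSQMWLQLSQSLKGWDHM"
t_3p73 = "GYVDDKIFGTYNSKS--RTAQ-PIVEML-PQEDQEHWDTQTQKAQGGERD"
print(round(percent_identity(q30201, t_3p73), 1))
```

Running the function on every block and template shows which regions of the target are well covered and which rely on only one or two distantly related templates.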

I-Tasser

Predicted Secondary Structure by I-Tasser

Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Predicted:  CCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCSSSSSCCCCCCCCCCSSSSSSSCCCSSSSCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHH
Conf-Score: 985028899999999899875122045421036641367999985269985643743686068998778788540145583478888887676654315558

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Predicted:  HHHHHHHCCCCCCSSSSSSSCCCCCCCCCCCCCCCCCCCCCCSSSSCCCHHHCHHHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCHHHHHHHHHCCHHHHHC
Conf-Score: 888755315777644463525565898763541000558873365263022202455666677878887004598888767064299999999747666642

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Predicted:  CCCCCCCCCCCCCCCHHHHCHHHHCCCCCCSSSSSSSCCCCCCCCCCSSSSCCCCCCCCCCCSSSSSCCCCCCCCSSSSCCCCCCCCCSSSSCCCCCCCCCC
Conf-Score: 599877567699854442101541541332479864358754456553541024888652112699807986310267512589998726840688766531

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Predicted:  CCCCCCHHHHHHHCCHHHHHHHHHCCCCCCCCCCCCCHCCCC
Conf-Score: 010211112222100246665443013678898651020169

Secondary structure elements are shown as H for alpha helix, S for beta sheet and C for coil.
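
The per-residue string is easier to read once it is collapsed into segments. The following sketch is an illustration only (the helper names `segments` and `composition` are made up); it turns an H/S/C string into runs and reports the fraction of each state. Positions are counted relative to the start of the string passed in, not to the full protein.

```python
from itertools import groupby


def segments(ss, start=1):
    """Collapse a per-residue H/S/C string into (state, first, last) runs."""
    out = []
    pos = start
    for state, run in groupby(ss):
        length = len(list(run))
        out.append((state, pos, pos + length - 1))
        pos += length
    return out


def composition(ss):
    """Fraction of residues in each state, keyed by state letter."""
    return {state: ss.count(state) / len(ss) for state in sorted(set(ss))}


# Last prediction block above (the C-terminal 42 residues).
tail = "CCCCCCHHHHHHHCCHHHHHHHHHCCCCCCCCCCCCCHCCCC"
for state, first, last in segments(tail):
    print(state, first, last)
```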

Predicted Solvent Accessibility by I-Tasser

Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Prediction: 723312000000000101112222011200120120023333331200000102322003123724434241311436413610352044144313323230

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Prediction: 220132133351310001010021136231211333023032003016303403102321432433044143404422010333005103400630351154

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Prediction: 342353313321443300000100101014010203346564435434135233334221320000000347533120214264144202020214542200

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Prediction: 000001100000011100000001334446443132333438

Values range from 0 (buried residue) to 9 (highly exposed residue)
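
Long runs of buried residues are one hint for a membrane-spanning or core segment. The sketch below scans an accessibility string for such runs. Both the cutoff (values of 0 or 1 count as buried) and the minimum run length are assumptions chosen for illustration and are not I-Tasser settings; the name `buried_stretches` is hypothetical.

```python
def buried_stretches(seq, acc, cutoff=1, min_len=5):
    """Return (start, end, subsequence) for runs with accessibility <= cutoff.

    Positions are 1-based and relative to the strings passed in.
    """
    if len(seq) != len(acc):
        raise ValueError("sequence and accessibility must have equal length")
    runs = []
    start = None
    for i, ch in enumerate(acc):
        if int(ch) <= cutoff:
            if start is None:
                start = i  # a new buried run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start + 1, i, seq[start:i]))
            start = None
    # Close a run that extends to the end of the string.
    if start is not None and len(acc) - start >= min_len:
        runs.append((start + 1, len(acc), seq[start:]))
    return runs


# Last block above (the C-terminal 42 residues).
tail_seq = "IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE"
tail_acc = "000001100000011100000001334446443132333438"
for start, end, sub in buried_stretches(tail_seq, tail_acc):
    print(start, end, sub)
```

With these assumed thresholds the C-terminal block yields a single long buried run at its start, which fits the hydrophobic stretch at the end of the alignment where the transmembrane region is expected.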

I-Tasser predicted five models with C-scores between -0.557 and -3.298. They are ranked from one to five as listed below.

Model 1 with a C-Score of -0.557
Model 2 with a C-Score of -2.539
Model 3 with a C-Score of -2.266
Model 4 with a C-Score of -2.772
Model 5 with a C-Score of -3.298
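
Note that the listed order is not strictly descending in C-score: Model 3 has a higher C-score than Model 2. The sketch below, with made-up helper names (`by_c_score`, `rank_mismatches`), sorts the models by C-score and reports which ones sit at a different position than their model number suggests.

```python
# C-scores copied from the list above: model number -> C-score.
C_SCORES = {1: -0.557, 2: -2.539, 3: -2.266, 4: -2.772, 5: -3.298}


def by_c_score(scores):
    """Model numbers ordered from highest (best) to lowest C-score."""
    return sorted(scores, key=scores.get, reverse=True)


def rank_mismatches(scores):
    """Models whose position in the C-score order differs from their number."""
    order = by_c_score(scores)
    return [model for rank, model in enumerate(order, start=1) if model != rank]


print(by_c_score(C_SCORES))
print(rank_mismatches(C_SCORES))
```

Model 1 is the best model under either ordering, so the choice of Model 1 for further analysis does not depend on how the remaining models are ranked.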

Model 1 has a TM-score of about 0.64 and an RMSD of 7.7 Å. For the prediction, I-Tasser used 10 templates from the PDB, which are:

SwissModel

SwissModel is a server-based tool provided by the SIB. It combines tools like PSI-PRED and DISOPRED for secondary structure and disordered region prediction.


The model created by SwissModel is based on a self-hit, because we had no way to exclude the protein itself from the prediction. We therefore also ran SwissModel in alignment mode. (TODO)

Automated Mode

predicted model


Model information:
Modelled residue range: 26 to 297
Based on template: 1a6zC (2.60 Å)
Sequence identity [%]: 100
E-value: 7.66e-163

Quality information:
QMEAN Z-score: -1.035


Estimated absolute model quality
Estimated density of model quality
Z-Score by category
predicted error

Even though the model is based on a self-hit, the QMEAN Z-score is about -1, which means that the model's score lies one standard deviation below the mean. The model is therefore plausible, but it is not among the best-scoring ones.

Alignment Mode

Modeller
