Homology based structure predictions
Homologous
Because we found no homologous structures in Task 2, we extended our list by using HHSearch.
HHSearch found only sequences with an identity below 40%. We therefore use the 12 proteins shown below to create a multiple alignment for homology modeling. We chose sequences that together cover the whole protein, and we paid specific attention to the transmembrane region.
PDB-ID | Identity | Description |
1s79 | 37% | Kram |
3p73 | 28% | Kram |
1kcg | 22% | Kram |
1jfm | 14% | Kram |
1bii | 22% | Kram |
2p24 | 21% | Kram |
1cd1 | 21% | Kram |
2wy3 | 29% | Kram |
1lqv | 14% | Kram |
3jts | 25% | Kram |
1ow0 | 22% | Kram |
1hxm | 18% | Kram |
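The template list can also be handled programmatically, for example to rank the templates by identity or to confirm that all of them fall below the 40% mark. The following is only a minimal sketch: the identities are copied from the table above, while the dictionary layout and the function names are our own illustrative choices and not part of any tool used here.

```python
# Sequence identities (%) to the query, copied from the table above.
templates = {
    "1s79": 37, "3p73": 28, "1kcg": 22, "1jfm": 14,
    "1bii": 22, "2p24": 21, "1cd1": 21, "2wy3": 29,
    "1lqv": 14, "3jts": 25, "1ow0": 22, "1hxm": 18,
}


def rank_by_identity(identities):
    """Return (pdb_id, identity) pairs sorted from highest to lowest identity."""
    return sorted(identities.items(), key=lambda item: item[1], reverse=True)


def all_below(identities, threshold):
    """Return True if every identity is strictly below the threshold (in %)."""
    return all(value < threshold for value in identities.values())


ranked = rank_by_identity(templates)
print(ranked[:3])                # the three closest templates
print(all_below(templates, 40))  # checks the "below 40%" statement
```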
With these sequences plus the HFE sequence (Q30201), we built a multiple sequence alignment with T-Coffee (Expresso). This multiple sequence alignment is later used in the alignment mode of SwissModel and in Modeller.
DSSP                            --EEEEEEEEEEB-SS-SSB--EEE
Q30201 MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEAL
1S79_A --------------------------------------------------
3P73_A -----------------------EFGSHSLRYFLTGMTDPGPGMPRFVIV
1KCG_C -------------------------DAHSLWYNFTIIHLPRHGQQWCEVQ
1JFM_A -------------------------DAHSLRCNLTIKDPTPADPLWYEAK
1BII_A -MGAMAPRTLLLLLAAALGPTQTRAGSHSLRYFVTAVSRPGFGEPRYMEV
2P24_A --------------------------------------------------
1CD1_A -----------------------QQKNYTFRCLQMSSFANR-SWSRTDSV
2WY3_A ------------------------MEPHSLRYNLMVLSQDESVQSGFLAE
1LQV_A -------------------SQDASDGLQRLHMLQISYFR-DPYHVWYQGN
3JTS_A -------------------------GSHSMRYFYTSMSRPGRWEPRFIAV
1OW0_A --------------------------------------------------
1HXM_A --------------------------------------------------

DSSP   EEETTEEEEEEESSS--EEE--STTS-SSTTTTHHHHHHHHHHHHHHHHH
Q30201 GYVDDQLFVFYDHES--RRVE-PRTPWVSSRISSQMWLQLSQSLKGWDHM
1S79_A ------------------GRW-IL-KNDVKNRSVYIKGFPTDATLDDIKE
3P73_A GYVDDKIFGTYNSKS--RTAQ-PIVEML-PQEDQEHWDTQTQKAQGGERD
1KCG_C SQVDQKNFLSYDCGS--DKVLSMGHL-EEQLYATDAWGKQLEMLREVGQR
1JFM_A CFVGEILILHLSNIN--KTMT-SG-DPGETANATEVKKCLTQPLKNLCQK
1BII_A GYVDNTEFVRFDSDAENPRYE-PRARWIE-QEGPEYWERETRRAKGNEQS
2P24_A --------------------------M----AIMAPRTLVLLLSGALALT
1CD1_A VWLGDLQTHRWSNDS--ATIS-FTKPWSQGKLSNQQWEKLQHMFQVYRVS
2WY3_A GHLDGQPFLRYDRQK--RRAK-PQGQWAEDVLGAETWDTETEDLTENGQD
1LQV_A ASLGGHLTHVLEGPDTNTTII-QLQPL----QEPESWARTQSGLQSYLLQ
3JTS_A GYVDDTQFVRFDSDAASQRME-PRAPWVE-QEGPEYWDRETRNMKAETQN
1OW0_A --------------------------------------------------
1HXM_A -----------------------------------AIELVPEHQTVPVSI

DSSP   HHHHHHHHTTT-SSS--E--------EEEEEE-EEE-TTS-E-EEE-E--
Q30201 FTVDFWTIMENHN-HSKE--------SHTLQV-ILGCEMQED-NST-E--
1S79_A WLEDKGQV-LNIQMRRTL--------HKAFKG-SIFVVFDSI-ESA-KKF
3P73_A FDWNLNRLPERYN-KSKG--------SHTMQM-MFGCDILED-GSI-R--
1KCG_C LRLELADT---------ELEDFTPSGPLTLQV-RMSCECEAD-GYI-R--
1JFM_A LRNKVSNT-KVDTHKTNG--------YPHLQV-TMIYPQSQG-RTP-S--
1BII_A FRVDLRTALRYYNQSAGG--------SHTLQW-MAGCDVESD-GRLLR--
2P24_A QTWAGSHSRGEDD--IEA--------DHVGSYGIVVYQSP----GD-I--
1CD1_A FTRDIQELVKMMSPKEDY--------PIEIQL-SAGCEMYPG-NAS-E--
2WY3_A LRRTLTHI----KDQKGG--------LHSLQE-IRVCEIHED-SST-R--
1LQV_A FHGLVRLVHQERT--LAF--------PLTIRC-FLGCELPPEGSRA-H--
3JTS_A APVNLRNLRGYYNQSEAG--------SHTIQR-MYGCDLGPD-GRLLR--
1OW0_A -----ACHPRLSLHRPAL--------EDLLLG-SEANLTCTL-TGLRD--
1HXM_A GVPATLRCSMKGEAIGNY--------YINWYR-KTQGNTMTF-IYRE---

DSSP   ----------EEEETTEE----------------EEEEEGGGTEEEES--
Q30201 ----------GYWKYGYD----------------GQDHLEFCPDTLDW--
1S79_A VETPGQKYKETDLLILFKDDYFAKKNEERKQNKVE---------------
3P73_A ----------GYDQYAFD----------------GRDFLAFDMDTMTF--
1KCG_C ----------GSWQFSFD----------------GRKFLLFDSNNRKW--
1JFM_A ----------ATWEFNIS----------------DSYFFTFYTENMSW--
1BII_A ----------GYWQFAYD----------------GCDYIALNEDLKTW--
2P24_A ----------GQYTFEFD----------------GDELFYVDLDKKET--
1CD1_A ----------SFLHVAFQ----------------GKYVVRFWG--TSWQT
2WY3_A ----------GSRHFYYN----------------GELFLSQNLETQES--
1LQV_A ----------VFFEVAVN----------------GSSFVSFRPERALW--
3JTS_A ----------GYHQSAYD----------------GKDYIALNEDLRSW--
1OW0_A ----------ASGVTFTW----------------TPSSGKSAV--QGPPE
1HXM_A ----------KDIYGPGF----------------KDNFQGDIDIAKNL--

DSSP   SGG-G----HHH-HHHHHSSTHHH--HHHHHHHHTHHHHHHHHHHHHHTT
Q30201 RAA-E----PRA-WPTKLEWERHK--IRARQNRAYLERDCPAQLQQLLEL
1S79_A --------------------------------------------------
3P73_A TAA-D----PVA-EITKRRWETEG--TYAERWKHELGTVCVQNLRRYLEH
1KCG_C TVV-H----AGA-RRMKEKWEKDS--GLTTFFKMVSMRDCKSWLRDFLMH
1JFM_A RSA-N----DES-GVIMNKWKDDG--EFVKQLKFLI-HECSQKMDEFLKQ
1BII_A TAA-D----MAA-QITRRKWEQA---GAAERDRAYLEGECVEWLRRYLKN
2P24_A IWM-------------LPEFAQLR--SFDPQGGLQNIATGKHNLGVLTKR
1CD1_A VPGAP----SWL-DLPIKVLNADQ--GTSATVQMLLNDTCPLFVRGLLEA
2WY3_A TVP-QSSRAQTLAMNVTNFW-KEDAMKTKTHYRAMQ-ADCLQKLQRYLKS
1LQV_A QAD-TQVTSGVV-TFTLQQLNAYN--RTRYELREFLEDTCVQYVQKHISA
3JTS_A TAA-D----MAA-QNTQRKWEAA---GEAEQHRTYLEGECLEWLRRYLEN
1OW0_A R--DL----CGC-YSVSSVLPGCA--EPWNHGKTFTCTAAYPESKTPLTA
1HXM_A AVL-K----ILA-PSERDEGSYYC--ACDTLGMGGEYTDKLIFGKGTRVT

DSSP   TSS--B--EEEEEEEE-SS-----E-EEEEEEEEEBSS--EEEEEETTEE
Q30201 GRGVLDQQVPPLVKVTHHVT----S-SVTTLRCRALNYYPQNITMKWLKD
1S79_A --------------------------------------------------
3P73_A GKAALKRRVQPEVRVWGKEA----D-GILTLSCHAHGFYPRPITISWMKD
1KCG_C RKKRLE--------------------------------------------
1JFM_A SKEK----------------------------------------------
1BII_A GNATLLRTDPPKAHVTHHRR----PEGDVTLRCWALGFYPADITLTWQLN
2P24_A SNSTPATNEAPQATVFPKSP--VLLGQPNTLICFVDNIFPPVINITWLRN
1CD1_A GKSDLEKQEKPVAWLSSVP---SSAHGHRQLVCHVSGFYPKPVWVMWMRG
2WY3_A GVAIRRTVPPMVNVTCSEVS----EGNITVTCRASSFYPRNITLTWRQDG
1LQV_A ENTKGSQTSRSYTS------------------------------------
3JTS_A GKETLQRADPPKTHVTHHPV----SDQEATLRCWALGFYPAEITLTWQRD
1OW0_A TLSKSGNTFRPEVHLLPPPSEELALNELVTLTCLARGFSPKDVLVRWLQG
1HXM_A VEPRSQPHTKPSVFVMKNG---------TNVACLVKEFYPKDIRINLVSS

DSSP   --GGGS---EEEE-TTS-E----EEEEEEEE-TTGGGGEE---EEEE-TT
Q30201 K-QPMDAKEFEPKDVLPNG----DGTYQGWITLAVPPGEE---QRYTCQV
1S79_A --------------------------------------------------
3P73_A --GMVRDQETRWGGIVPNS----DGTYHASAAIDVLPEDG---DKYWCRV
1KCG_C --------------------------------------------------
1JFM_A --------------------------------------------------
1BII_A --GEELTQEMELVETRPAG----DGTFQKWASVVVPLGKE---QKYTCHV
2P24_A --SKSVADGVYETSFFVNR----DYSFHKLSYLTFIPSDD---DIYDCKV
1CD1_A --DQ-EQQGTHRGDFLPNA----DETWYLQATLDVEAGEE---AGLACRV
2WY3_A --VSLSHNTQQWGDVLPDG----NGTYQTWVATRIRQGEE---QRFTCYM
1LQV_A --------------------------------------------------
3JTS_A --GEDQTQDTELVETRPAG----DGTFQKWAAVVVPSGKE---QRYTCHV
1OW0_A SQEL-PREKYLTW-ASRQEPSQGTTTFAVTSILRVAAEDWKKGDTFSCMV
1HXM_A -----KKITEFDPAIVISP----SGKYNAVKLGKYE--DS---NSVTCSV

DSSP   SSS-EEE-E-
Q30201 EHPGLDQ-PLIVIWEPSPSGTLVIGVISGIAVFVVILFIGILFIILRKRQ
1S79_A --------------------------------------------------
3P73_A EHASLPQ-PGLFSWEPQ---------------------------------
1KCG_C --------------------------------------------------
1JFM_A --------------------------------------------------
1BII_A EHEGLPE-PLTLRWGKEEPPSSTKTNTVIIAVPVVLGAVVILGAVMAFVM
2P24_A EHWGLEE-PVLKHWEPEIPAPMSELTETSGSRLEVLFQ------------
1CD1_A KHSSLGG-QDIILYWDARQAPVGLIVFIVLIMLVVVGAVVYYIWRRRSAY
2WY3_A EHSGNHG-THPVPSGKVLVLQSQRTDFPYVSAAMPCFVIIIILCVPCCKK
1LQV_A --------------------------------------------------
3JTS_A QHEGLRE-PLTLRWEP----------------------------------
1OW0_A GHEALPLAFTQKTIDRLAGK------------------------------
1HXM_A QHDNK---TVHSTDFEVKTDSTDHVKPKETENTKQPSKS-----------

DSSP
Q30201 GSRGAMGHYVLAERE----------------
1S79_A -------------------------------
3P73_A -------------------------------
1KCG_C -------------------------------
1JFM_A -------------------------------
1BII_A KRRRNTGGKGGDYALAPGSQSSDMSLPDCKV
2P24_A -------------------------------
1CD1_A QDIR---------------------------
2WY3_A KTSAAEGP-----------------------
1LQV_A -------------------------------
3JTS_A -------------------------------
1OW0_A -------------------------------
1HXM_A -------------------------------
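Pairwise identities can be recomputed from the aligned rows. The following is a minimal sketch, assuming identity is defined as the number of identical residues divided by the number of columns in which neither sequence has a gap. Other definitions (for example dividing by the full query length) give different numbers, so the values reported by HHSearch need not be reproduced exactly. The function name is our own; the two example rows are the first alignment block of Q30201 and 3P73_A.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# First alignment block only; a real calculation would use the full rows.
query = "MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEAL"
template = "-----------------------EFGSHSLRYFLTGMTDPGPGMPRFVIV"
print(round(percent_identity(query, template), 1))
```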
I-Tasser
Predicted Secondary Structure by I-Tasser
Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Predicted:  CCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCSSSSSCCCCCCCCCCSSSSSSSCCCSSSSCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHH
Conf-Score: 985028899999999899875122045421036641367999985269985643743686068998778788540145583478888887676654315558

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Predicted:  HHHHHHHCCCCCCSSSSSSSCCCCCCCCCCCCCCCCCCCCCCSSSSCCCHHHCHHHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCHHHHHHHHHCCHHHHHC
Conf-Score: 888755315777644463525565898763541000558873365263022202455666677878887004598888767064299999999747666642

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Predicted:  CCCCCCCCCCCCCCCHHHHCHHHHCCCCCCSSSSSSSCCCCCCCCCCSSSSCCCCCCCCCCCSSSSSCCCCCCCCSSSSCCCCCCCCCSSSSCCCCCCCCCC
Conf-Score: 599877567699854442101541541332479864358754456553541024888652112699807986310267512589998726840688766531

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Predicted:  CCCCCCHHHHHHHCCHHHHHHHHHCCCCCCCCCCCCCHCCCC
Conf-Score: 010211112222100246665443013678898651020169

Secondary structure elements are shown as H for alpha helix, S for beta strand, and C for coil.
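To read such a prediction string more easily, it can be collapsed into contiguous segments. The following is a minimal sketch with a helper name of our own choosing. Positions are counted from 1 within the string passed in, so for the last block (shown here) they are relative to that block and not to the full protein.

```python
from collections import Counter
from itertools import groupby


def ss_segments(prediction):
    """Collapse a per-residue string into (state, start, end) segments.

    Positions are 1-based and inclusive, relative to the given string.
    """
    segments = []
    position = 1
    for state, run in groupby(prediction):
        length = len(list(run))
        segments.append((state, position, position + length - 1))
        position += length
    return segments


# Last block of the I-Tasser prediction (C-terminal part of the sequence).
tail = "CCCCCCHHHHHHHCCHHHHHHHHHCCCCCCCCCCCCCHCCCC"
for state, start, end in ss_segments(tail):
    print(state, start, end)
print(Counter(tail))  # residues per state
```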
Predicted Solvent Accessibility by I-Tasser
Sequence:   MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDF
Prediction: 723312000000000101112222011200120120023333331200000102322003123724434241311436413610352044144313323230

Sequence:   WTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQ
Prediction: 220132133351310001010021136231211333023032003016303403102321432433044143404422010333005103400630351154

Sequence:   QVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPSPSGTLV
Prediction: 342353313321443300000100101014010203346564435434135233334221320000000347533120214264144202020214542200

Sequence:   IGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
Prediction: 000001100000011100000001334446443132333438
Values range from 0 (buried residue) to 9 (highly exposed residue)
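The accessibility digits can be summarised, for example, as the fraction of buried residues or as the longest buried stretch, which is of interest for the transmembrane region. The following is a minimal sketch. The cutoff of 2 for calling a residue "buried" is an assumption made for illustration only and is not a threshold defined by I-Tasser; the function names are our own. Positions are 1-based and relative to the string passed in.

```python
def parse_accessibility(digits):
    """Turn a string of digits (0-9) into a list of integers."""
    return [int(ch) for ch in digits]


def buried_fraction(values, cutoff=2):
    """Fraction of residues with accessibility <= cutoff (assumed cutoff)."""
    if not values:
        return 0.0
    return sum(v <= cutoff for v in values) / len(values)


def longest_buried_run(values, cutoff=2):
    """Return (start, end), 1-based inclusive, of the longest buried stretch,
    or None if no residue is at or below the cutoff."""
    best = None
    start = None
    for i, v in enumerate(values, start=1):
        if v <= cutoff:
            if start is None:
                start = i
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)
        else:
            start = None
    return best


# Last block of the accessibility prediction (C-terminal part).
tail = parse_accessibility("000001100000011100000001334446443132333438")
print(buried_fraction(tail))
print(longest_buried_run(tail))
```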
I-Tasser predicted five models with C-scores ranging from -0.557 (best) to -3.298. They are ranked from one to five.
Model 1 has a TM-score of about 0.64 and an RMSD of 7.7 Å. For the prediction, I-Tasser used 10 templates from the PDB, which are:
SwissModel
SwissModel is a server-based tool provided by the SIB. It combines tools such as PSI-PRED and DISOPRED for secondary structure and disordered region prediction.
The model created by SwissModel is based on a self hit, and we had no option to exclude the protein itself from the prediction. Therefore we also ran SwissModel in alignment mode. (TODO)
Automated Mode
Model information:
Modelled residue range: 26 to 297
Based on template: 1a6zC (2.60 Å)
Sequence Identity [%]: 100
Evalue: 7.66e-163
Quality information:
QMEAN Z-Score: -1.035
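Even though the model is based on a self hit, the QMEAN Z-score is about -1, which means that the model's QMEAN score lies about one standard deviation below the mean score of the reference structures that QMEAN compares against. The model is therefore not unlikely, but also not among the most probable ones.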
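How a Z-score relates a raw score to a reference distribution can be sketched as follows. This is only an illustration: the function names are our own, and converting the Z-score into a fraction assumes that the reference scores are normally distributed, which is an assumption on our side and not something stated by SwissModel.

```python
import math


def z_score(value, reference_mean, reference_std):
    """Number of standard deviations by which value differs from the mean."""
    if reference_std <= 0:
        raise ValueError("reference_std must be positive")
    return (value - reference_mean) / reference_std


def normal_cdf(z):
    """Fraction of a standard normal distribution lying below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# QMEAN Z-score reported for the automated-mode model.
z = -1.035
print(f"{normal_cdf(z):.2f}")
```

Under the normality assumption, a Z-score of -1.035 corresponds to roughly 15% of the reference distribution scoring lower than the model.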