Difference between revisions of "Task 4: Homology based structure predictions"
(→Modeller) |
(→Discussion) |
||
(126 intermediate revisions by 2 users not shown) | |||
Line 181: | Line 181: | ||
|} |
|} |
||
+ | Important remark: the Prob, E-value and P-value scores could not be calculated, which is why they are reported as 1. |
||
− | |||
==== Template structure selection ==== |
==== Template structure selection ==== |
||
Line 191: | Line 191: | ||
* < 40% sequence identity: 1ltz |
* < 40% sequence identity: 1ltz |
||
− | === |
+ | === Alignment Refinement === |
− | Modeller uses two types of files to be run. The first one contains the used alignment in the PIR-format (see [http://salilab.org/modeller/manual/node463.html PIR FORMAT]), the second one is a python script, which tells Modeller which steps it has to perform. There are some examples in <code>/apps/modeller9.9/examples/automodel</code>. But python seems to be falsely configured on the virtual machines (at least the linux virtual machine). |
||
− | * The used fix of the python installation is described at the [[resource_software|software section]]. |
||
− | * The Modeller modules were still not importable by Python, that is why it was necessary to reinstall Modeller. The steps for this are described in [[resource_software|software section]]. |
||
+ | We searched PFAM with the reference sequence. Two PFAM domains were detected on it: ACT and Biopterin_H.
||
− | After all this stuff we tried to write a alignment-file with the hhsearch alignments and a python-file (according to the example <code>/apps/modeller9.9/examples/automodel/model-default.py</code>. But modeller seems to be very sensible due to missing acids in the coordinate section of the pdb, which are of course mentioned in the sequence used in the alignment. |
||
+ | Then we searched PFAM with the sequences of the three proteins 1PHZ, 1TOH and 1LTZ. Their alignments
||
+ | with the HMMs of ACT and Biopterin_H, together with the alignment of the reference sequence with the same HMMs, served
||
+ | as seeds for the improved alignments.
||
+ | The crucial parts of the reference sequence according to the annotation in UniProt were already aligned by the seeds.
||
+ | We predicted the secondary structure of the three protein sequences (TODO ref... seems to be better to predict) and
||
+ | tried to extend the seeds. In 1PHZ a large gap could be closed because of the high sequence identity. Afterwards we deleted the unaligned
||
+ | ends of the reference sequence to improve the resulting model. The essential parts of the protein, the domains, should now be
||
+ | better modeled.
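The end-trimming step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name <code>trim_unaligned_ends</code> is hypothetical, both inputs are assumed to be rows of one pairwise alignment of equal length, and <code>-</code> is assumed to be the gap character. Gaps inside the aligned region are left untouched.

```python
def trim_unaligned_ends(reference, template, gap="-"):
    """Drop leading and trailing alignment columns in which the template has a gap.

    Hypothetical helper: both arguments are rows of the same pairwise
    alignment, so they must have the same length.
    """
    if len(reference) != len(template):
        raise ValueError("aligned sequences must have the same length")
    # Column indices where the template actually contributes a residue.
    aligned = [i for i, residue in enumerate(template) if residue != gap]
    if not aligned:
        return "", ""
    start, end = aligned[0], aligned[-1] + 1
    return reference[start:end], template[start:end]


# Toy example modeled on the start of the 1PHZ alignment:
# the first 18 reference residues have no counterpart in the template.
ref_row = "MSTAVLENPGLGRKLSDF" + "GQETSYIEDN"
tmpl_row = "-" * 18 + "GQETSYIEDN"
print(trim_unaligned_ends(ref_row, tmpl_row))
```

Trimming only the ends keeps internal gaps, so a loop that is missing in the template still has to be modeled.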
||
+ | <code> |
||
− | To avoid this problem there are at least two possibilities: |
||
+ | Seeds by PFAM - ACT: |
||
− | * Repair the pdb by the script repairPDB. Map the sequence in the alignment on the sequence in the coordinate section of the used pdb (a lot of work - need some kind of alignment and a pdb-parser... both is not trivial) |
||
+ | {| border="1" |
||
− | * The other possibility is to let Modeller create the alignment on the basis of the repaired PDB (with repairPDB). |
||
+ | ! REFERENCE |
||
+ | | SLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKkDEYEFFTHLD-KRSL |
||
+ | |- |
||
+ | ! 1PHZ |
||
+ | | SLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNkDEYEFFTYL------ |
||
+ | |} |
||
+ | Seeds by PFAM - Biopterin_H |
||
− | We have chosen the second option, which is really nice described in the Modeller tutorial (see [http://salilab.org/modeller/tutorial/basic.html basic modeller tutorial]). |
||
+ | {| border="1" |
||
+ | ! REFERENCE |
||
+ | | PWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYK<br> |
||
+ | THACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPE<br> |
||
+ | PDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCL-S<br> |
||
+ | EKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSE<br> |
||
+ | IGILCSALQK |
||
+ | |- |
||
+ | ! 1PHZ |
||
+ | | PWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYK<br> |
||
+ | THACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPE<br> |
||
+ | PDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCL-S<br> |
||
+ | DKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNT-------------<br> |
||
+ | <nowiki>----------</nowiki> |
||
+ | |- |
||
+ | ! 1TOH |
||
+ | | PWFPRKVSELDKC-----------DLDHPGFSDQVYRQRRKLIAEIAFQYKHGEPIPHVEYTAEEIATWKEVYVTLKGLYA<br> |
||
+ | THACREHLEGFQLLERYCGYREDSIPQLEDVSRFLKERTGFQLRPVAGLLSARDFLASLAFRVFQCTQYIRHASSPMHSPE<br> |
||
+ | PDCCHELLGHVPMLADRTFAQFSQDIGLASLGASD-EEIEKLSTVYWFTVEFGLCKQNGELKAYGAGLLSSYGELLHSL-S<br> |
||
+ | EEPEVRAFDPDTAAVQPYQDQTYQPVYFVSESFNDAKDKLRNYASRIQRPFSVKFDPYTLAIDVLDSPHTIQRSLEGVQDE<br> |
||
+ | LHTLAHALS- |
||
+ | |- |
||
+ | ! 1LTZ |
||
+ | | ------------------------------------------------------PQPLDRYSAEDHATWATLYQRQCKLLP<br> |
||
+ | GRACDEFLEGLERLEV----DADRVPDFNKLNEKLMAATGWKIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQE<br> |
||
+ | PDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKAlGALPMLARLYWYTVEFGLINTPAGMRIYGAGILSSKSESIYCLdS<br> |
||
+ | ASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQ----------------------------------------------<br> |
||
+ | <nowiki>----------</nowiki> |
||
+ | |} |
||
+ | unadjusted alignment of 1PHZ |
||
− | At the example of 1PHZ. We prepared three files. |
||
+ | {| border="1" |
||
+ | ! REFERENCE |
||
+ | | MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFF<br> |
||
+ | THLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ<br> |
||
+ | FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF<br> |
||
+ | RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK<br> |
||
+ | LATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVR<br> |
||
+ | NFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEE<br> |
||
+ | EECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHH<br> |
||
+ | HHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCC<br> |
||
+ | EEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHH<br> |
||
+ | HHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHH<br> |
||
+ | HHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
||
+ | |- |
||
+ | ! 1PHZ |
||
+ | | ------------------GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFF<br> |
||
+ | TYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQ<br> |
||
+ | FADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGF<br> |
||
+ | RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK<br> |
||
+ | LATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCLSDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVR<br> |
||
+ | TFAATIPRPFSVRYDPYTQRVEVLDNT------------------------- |
||
+ | |- |
||
+ | ! PSI |
||
+ | | ------------------CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEE<br> |
||
+ | EEECCCCCCCHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHH<br> |
||
+ | HHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCC<br> |
||
+ | EEEECCCCCCHHHHHHHHHCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHH<br> |
||
+ | HHHHEEEEEEEEEEEECCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHH<br> |
||
+ | HHHHHCCCCCCCCCCCCCCEEEECCCC------------------------- |
||
+ | |} |
||
+ | unadjusted alignment of 1TOH |
||
− | * The alignment file, which contains only the reference sequence: pah.ali <br> <code> >P1;PAH<br> sequence:reference::::::::<br>MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEEN<br>DVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKD<br>TVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPI<br>PRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQ<br>FLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGH<br>VPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLL<br>SSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATI<br>PRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK</code> |
||
+ | {| border = "1" |
||
+ | ! REFERENCE |
||
+ | | MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFF<br> |
||
+ | THLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ<br> |
||
+ | FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF<br> |
||
+ | RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK<br> |
||
+ | LATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVR<br> |
||
+ | NFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEE<br> |
||
+ | EECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHH<br> |
||
+ | HHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCC<br> |
||
+ | EEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHH<br> |
||
+ | HHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHH<br> |
||
+ | HHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
||
+ | |- |
||
+ | ! 1PHZ |
||
+ | | ------------------GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFF<br> |
||
+ | TYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQ<br> |
||
+ | FADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGF<br> |
||
+ | RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK<br> |
||
+ | LATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCLSDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVR<br> |
||
+ | TFAATIPRPFSVRYDPYTQRVEVLDNT------------------------- |
||
+ | |- |
||
+ | ! PSI |
||
+ | | ------------------CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEE<br> |
||
+ | EEECCCCCCCHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHH<br> |
||
+ | HHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCC<br> |
||
+ | EEEECCCCCCHHHHHHHHHCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHH<br> |
||
+ | HHHHEEEEEEEEEEEECCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHH<br> |
||
+ | HHHHHCCCCCCCCCCCCCCEEEECCCC------------------------- |
||
+ | |} |
||
+ | unadjusted alignment of 1LTZ |
||
− | * The python script, which tells Modeller to create an alignment <br><code>from modeller import *<br><br>env = environ()<br>aln = alignment(env)<br>mdl = model(env, file='../structures/1phz.pdb', model_segment=('FIRST:A','LAST:A'))<br>aln.append_model(mdl, align_codes='1phz', atom_files='../structures/1phz.pdb')<br>aln.append(file='pah.ali', align_codes='PAH')<br>aln.align2d()<br>aln.write(file='phz.ali', alignment_format='PIR')<br> |
||
+ | {| border="1" |
||
− | </code> |
||
+ | ! REFERENCE |
||
+ | | MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFF<br> |
||
+ | THLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ<br> |
||
+ | FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF<br> |
||
+ | RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIE<br> |
||
+ | KLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCL-SEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEK<br> |
||
+ | VRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEE<br> |
||
+ | EECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHH<br> |
||
+ | HHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCC<br> |
||
+ | EEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCH-HHHH<br> |
||
+ | HHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHH-CCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHH<br> |
||
+ | HHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
||
+ | |- |
||
+ | ! 1LTZ |
||
+ | | -----------------F--------------------------V-----V-----------------------------<br> |
||
+ | <nowiki>--------PDITT-----RKNVG-----LSHDANDFTLP------QPLDR-------YSA--------------------</nowiki><br> |
||
+ | <nowiki>---------------------EDHATWATLYQRQCKLLPGRACDEFLEGLERLE----VDADRVPDFNKLNEKLMAATGW</nowiki><br> |
||
+ | KIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKALGALP<br> |
||
+ | MLARLYWYTVEFGLINTPAGMRIYGAGILSSKSESIYCLDSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQL---<br> |
||
+ | <nowiki>---FDAD----FAPLY------LQLAD-AQPWG--AGDIAP------DDL--VL</nowiki> |
||
+ | |- |
||
+ | ! PSI |
||
+ | | -----------------C--------------------------C-----C-----------------------------<br> |
||
+ | <nowiki>--------CCCCC-----CCCCC-----CCCCCCCCCCC------CCCCC-------CCH--------------------</nowiki><br> |
||
+ | <nowiki>---------------------HHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCC----CCCCCCCCCHHHHHHHHHHHCC</nowiki><br> |
||
+ | EEEECCCCCCHHHHHHHHHCCCCCEEECCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCHHH<br> |
||
+ | HHHHHHEEEEEEEEEECCCCEEEECCCCCCCCCCCCCCCCCCCCEEECCCHHHHHCCCCCCCCCCCCEEEECCHHHH---<br> |
||
+ | <nowiki>---HHHC----CHHHH------HHHHH-CCCCC--CCCCCC------CCC--CC</nowiki> |
||
+ | |} |
||
+ | adjusted alignment of 1LTZ |
||
− | * Then we tried to optimize the alignment. (for the different criteria see below) |
||
+ | {| border="1" |
||
+ | ! REFERENCE |
||
+ | | PIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSR<br> |
||
+ | DFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFG<br> |
||
+ | LCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAAT |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHH<br> |
||
+ | HHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEE<br> |
||
+ | EECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHH |
||
+ | |- |
||
+ | ! 1LTZ |
||
+ | | PQPLDRYSAEDHATWATLYQRQCKLLPGRACDEFLEGLERLEV----DADRVPDFNKLNEKLMAATGWKIVAVPGLIPDD<br> |
||
+ | VFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKAlGALPMLARLYWYTVEF<br> |
||
+ | GLINTPAGMRIYGAGILSSKSESIYCLdSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQLFDADFAPL |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHCCEEEECCCCCCHHHHHH<br> |
||
+ | HHHCCCCCEEECCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHEEEEEEEEEE<br> |
||
+ | CCCCEEEECCCCCCCCCCCCCCCCCCCCEEECCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHCCHHHHHHH |
||
+ | |} |
||
+ | |||
+ | adjusted alignment of 1PHZ |
||
+ | {| border="1" |
||
+ | ! REFERENCE |
||
+ | | GQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKIL<br> |
||
+ | RHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVE<br> |
||
+ | YMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGL<br> |
||
+ | AFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGD<br> |
||
+ | SIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYT<br> |
||
+ | QRIEVLDNTQQ |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEEEECCCCCCHHHHHHHHHH<br> |
||
+ | CCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCC<br> |
||
+ | CCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHC<br> |
||
+ | CCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCC<br> |
||
+ | CEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCC<br> |
||
+ | CEEEECCCHHH |
||
+ | |- |
||
+ | ! 1PHZ |
||
+ | | GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNkDEYEFFTYLDKRTKPVLGSIIKSL<br> |
||
+ | RNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVE<br> |
||
+ | YTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGL<br> |
||
+ | AFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKEG<br> |
||
+ | DSIKAYGAGLLSSFGELQYCL-SDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDP<br> |
||
+ | YTQRVEVLDNT |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEEEEECCCCCCCHHHHHHHH<br> |
||
+ | HCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCCCCC<br> |
||
+ | CCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCCEEEECCCCCCHHHHHHHH<br> |
||
+ | HCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCH-HHHHHHHHHEEEEEEEEEEEEC<br> |
||
+ | CCCEEECCCCCCCCCCCCCCC-CCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHHHHHHHCCCCCCCCCCC<br> |
||
+ | CCCEEEECCCC |
||
+ | |} |
||
+ | |||
+ | adjusted alignment of 1TOH |
||
+ | {| border="1" |
||
+ | ! REFERENCE |
||
+ | | VPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSL<br> |
||
+ | YKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMY<br> |
||
+ | TPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYC<br> |
||
+ | LSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSI<br> |
||
+ | NSEIGILCSALQKIK |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH<br> |
||
+ | HCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCC<br> |
||
+ | CCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHH<br> |
||
+ | HCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHH<br> |
||
+ | HHHHHHHHHHHHHHC |
||
+ | |- |
||
+ | ! 1TOH |
||
+ | | VPWFPRKVSELDKC-----------DLDHPGFSDQVYRQRRKLIAEIAFQYKHGEPIPHVEYTAEEIATWKEVYVTLKGL<br> |
||
+ | YATHACREHLEGFQLLERYCGYREDSIPQLEDVSRFLKERTGFQLRPVAGLLSARDFLASLAFRVFQCTQYIRHASSPMH<br> |
||
+ | SPEPDCCHELLGHVPMLADRTFAQFSQDIGLASLGASD-EEIEKLSTVYWFTVEFGLCKQNGELKAYGAGLLSSYGELLH<br> |
||
+ | SL-SEEPEVRAFDPDTAAVQPYQDQTYQPVYFVSESFNDAKDKLRNYASRIQRPFSVKFDPYTLAIDVLDSPHTIQRSLE<br> |
||
+ | GVQDELHTLAHALSA |
||
+ | |- |
||
+ | ! PSI |
||
+ | | CCCCCCCCCCCCCC-----------CCCCCCCCHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH<br> |
||
+ | HHHCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCEEEECCCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCC<br> |
||
+ | CCCCCHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHHHHCEEEEEEEEEEEEECCCCEECCCCCCCCCCHHCCC<br> |
||
+ | CCCCCCCCCCCHHHHHCCCCCCCCCCCEEEEECCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEECCCCHHHHHHHHHHH<br> |
||
+ | HHHHHHHHHHHHHHC |
||
+ | |} |
||
− | * The python script, which tells Modeller to create a model<br><code>from modeller.automodel import *<br><br>a = automodel(env, alnfile='phz.ali',<br> knowns='1phz', sequence='PAH',<br> assess_methods=(assess.DOPE, assess.GA341))<br>a.starting_model = 1<br>a.ending_model = 1<br>a.make()<br> |
||
</code> |
</code> |
||
+ | === Homology modeling with Modeller === |
||
+ | Modeller needs three files. The first one contains the alignment in PIR format (see [http://salilab.org/modeller/manual/node463.html PIR FORMAT]). This file must contain at least the sequence of the target; the actual alignment can either be calculated by Modeller or be specified in this file. The second one is a Python script, which tells Modeller which steps it has to perform. There are some examples in <code>/apps/modeller9.9/examples/automodel</code>. The third file is the PDB file of the template. However, Python seems to be misconfigured on the virtual machines (at least on the Linux virtual machine with the non-sudo user). |
||
+ | * The fix we used for the Python installation is described in the [[resource_software|software section]]. |
||
+ | * Even after this fix Python could not import the Modeller modules, so it was necessary to reinstall Modeller. The steps for this are described in the [[resource_software|software section]]. |
||
+ | In the automated workflow Modeller calculated the alignment itself (see [http://salilab.org/modeller/tutorial/basic.html basic Modeller tutorial]). |
||
− | As already mentioned above, we took a closer look at the sequence above. |
||
− | + | In the adjusted workfow we changed the alignmentfile to match our alignment. |
|
− | * We tried to shift the alignment, such that the dssp secondary structures of the reference (used an alignment to 2PAH from task 3) and the template (new dssp run) have a higher coverage |
||
− | * We tried to match the corresponding annotated functional sites (PDB). |
||
+ | We prepared three files for the modeling with one template (e.g. for 1PHZ). |
||
− | 1phz |
||
+ | |||
− | Filename molpdf DOPE score GA341 score |
||
+ | * The alignment input file, which contains only the reference sequence: pah.ali. This file is the basis for Modeller's own alignment to the template. <br> <code> >P1;PAH<br> sequence:reference::::::::<br>MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEEN<br>DVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKD<br>TVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPI<br>PRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQ<br>FLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGH<br>VPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLL<br>SSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATI<br>PRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK*</code> |
||
− | ---------------------------------------------------------------------- |
||
+ | |||
− | PAH.B99990001.pdb 2568.91309 -53912.11328 1.00000 |
||
+ | * We split the Python script into two parts so that we could modify the alignment file, which is created by Modeller in the first part of the script and used by the second part. |
||
+ | |||
+ | * The python script, which tells Modeller to create an alignment <br><code>from modeller import *<br><br>env = environ()<br>aln = alignment(env)<br>mdl = model(env, file='../structures/1phz.pdb', model_segment=('FIRST:A','LAST:A'))<br>aln.append_model(mdl, align_codes='1phz', atom_files='../structures/1phz.pdb')<br>aln.append(file='pah.ali', align_codes='PAH')<br>aln.align2d()<br>aln.write(file='phz.ali', alignment_format='PIR')<br> |
||
+ | </code> |
||
+ | |||
+ | * The python script, which tells Modeller to create a model<br><code>from modeller import *<br>from modeller.automodel import *<br><br>env = environ()<br>a = automodel(env, alnfile='phz.ali',<br> knowns='1phz', sequence='PAH',<br> assess_methods=(assess.DOPE, assess.GA341))<br>a.starting_model = 1<br>a.ending_model = 1<br>a.make()<br> |
||
+ | </code> |
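An alignment input file like pah.ali can also be written by a small script instead of by hand. The following is a minimal sketch, not part of Modeller: the function name <code>pir_target_entry</code> and the line width of 58 are assumptions taken from the layout of the file shown above. It emits the <code>>P1;</code> header, the ten-field description line and the sequence terminated by <code>*</code>.

```python
def pir_target_entry(code, sequence, name=None, width=58):
    """Format a target sequence as a PIR entry in the layout of pah.ali.

    Hypothetical helper; 'name' is the second field of the description
    line and defaults to the alignment code.
    """
    name = code if name is None else name
    # Ten colon-separated fields: 'sequence', the name, and eight empty fields.
    lines = [">P1;" + code, "sequence:" + name + ":" * 8]
    body = sequence + "*"  # PIR sequences end with an asterisk
    lines.extend(body[i:i + width] for i in range(0, len(body), width))
    return "\n".join(lines) + "\n"


# Example with the first 20 residues of the reference sequence.
print(pir_target_entry("PAH", "MSTAVLENPGLGRKLSDFGQ", name="reference"))
```

The alignment code (<code>PAH</code> here) is the value that the scripts above pass as <code>align_codes</code> and <code>sequence</code>.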
||
+ | For the modeling with multiple templates we used the alignment file described above and a new script (see [http://salilab.org/modeller/tutorial/advanced.html advanced Modeller tutorial]). We used the structures of 1MLW, 1TOH and 1PHZ as templates. |
||
− | 1toh |
||
− | Filename molpdf DOPE score GA341 score |
||
− | ---------------------------------------------------------------------- |
||
− | PAH.B99990001.pdb 20481.25977 -37609.82031 1.00000 |
||
+ | * <code>from modeller import *<br>from modeller.automodel import *<br><br>env = environ()<br>aln = alignment(env)<br><br>path = '../structures/'<br>l = (path+'1MLW'+'.pdb', path+'1TOH'+'.pdb', path+'1PHZ'+'.pdb')<br><br>for (code) in l:<br> mdl = model(env, file=code, model_segment=('FIRST:A','LAST:A'))<br> aln.append_model(mdl, align_codes=code, atom_files=code)<br>aln.append(file='pah.ali', align_codes='PAH')<br>aln.align2d()<br>aln.write(file='mult.ali', alignment_format='PIR')<br><br>a = automodel(env, alnfile='mult.ali',<br> knowns=l, sequence='PAH',<br> assess_methods=(assess.DOPE, assess.GA341))<br>a.starting_model = 1<br>a.ending_model = 1<br>a.make()</code> |
||
− | 1ltz |
||
− | Filename molpdf DOPE score GA341 score |
||
− | ---------------------------------------------------------------------- |
||
− | PAH.B99990001.pdb 6824.36182 -37422.13672 0.98868 |
||
− | === Homology |
+ | === Homology modeling with Swissmodel === |
==== Standard workflow ==== |
==== Standard workflow ==== |
||
Line 255: | Line 482: | ||
| 1PHZ Chain: A |
| 1PHZ Chain: A |
||
| P00439 (Phenylalanine-4-hydroxylase) |
| P00439 (Phenylalanine-4-hydroxylase) |
||
− | | [[File:1phz A template auto model.png|400px]] |
+ | | [[File:1phz A template auto model.png|thumb|400px|'''Figure 1:''' A screenshot of the input form of Swissmodel for the target PAH and the template 1PHZ.]] |
|- |
|- |
||
| > 40% sequence identity |
| > 40% sequence identity |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
| P00439 (Phenylalanine-4-hydroxylase) |
| P00439 (Phenylalanine-4-hydroxylase) |
||
− | | [[File:1toh A template auto model.png|400px]] |
+ | | [[File:1toh A template auto model.png|thumb|400px|'''Figure 2:''' A screenshot of the input form of Swissmodel for the target PAH and the template 1TOH.]] |
|- |
|- |
||
| < 40% sequence identity |
| < 40% sequence identity |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
| P00439 (Phenylalanine-4-hydroxylase) |
| P00439 (Phenylalanine-4-hydroxylase) |
||
− | | [[File:1ltz A template auto model.png|400px]] |
+ | | [[File:1ltz A template auto model.png|thumb|400px|'''Figure 3:''' A screenshot of the input form of Swissmodel for the target PAH and the template 1LTZ.]] |
+ | |} |
||
+ | |||
+ | |||
+ | ==== Workflow with own alignment ==== |
||
+ | |||
+ | The Swissmodel workflow in which you can use your own alignment is the alignment mode. For this mode the alignment has to be specified. Swissmodel |
||
+ | accepts the alignment in different formats. We have chosen FASTA. |
||
+ | In a new window you have to specify which FASTA id is the target and which FASTA id is the template. |
||
+ | For the template you have to specify the corresponding pdb-id with chain. Afterwards you have to check the updated alignment with respect to the pdb structure. |
||
+ | |||
+ | The screenshots of the workflow for 1phz are shown below. |
||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! 1. Specify the alignment |
||
+ | ! 2. Specify target and template with structure |
||
+ | ! 3. Checking of the adjusted alignment |
||
+ | |- |
||
+ | | [[File:Pah swiss 1.png|thumb|400px|'''Figure 4:''' A screenshot of the input form of Swissmodel for the target PAH and the template 1PHZ with own alignment.]] |
||
+ | | [[File:Pah swiss 2.png|thumb|400px|'''Figure 5:''' A screenshot of the input form of Swissmodel for the target PAH and the template 1PHZ with own alignment. Specification of the target and template sequence and the corresponding template structure.]] |
||
+ | | [[File:Pah swiss 3.png|thumb|400px|'''Figure 6:''' A screenshot of the input form of Swissmodel for the target PAH and the template 1PHZ with own alignment. Adjustment of the alignment to the specified template structure.]] |
||
|} |
|} |
||
=== Homology modelling with iTasser === |
=== Homology modelling with iTasser === |
||
+ | |||
+ | [[Image:Pah itasser 1.png|thumb|top|'''Figure 7:''' A screenshot of the iTasser workflow with template restriction for the target PAH and the template 1PHZ.]] |
||
+ | [[Image:Pah itasser 2.png|thumb|top|'''Figure 8:''' A screenshot of the iTasser workflow with alignment and template restriction for the target PAH and the template 1PHZ.]] |
||
[http://zhanglab.ccmb.med.umich.edu/I-TASSER/ ITasser] is a prediction server, which participated in several CASP competitions. |
[http://zhanglab.ccmb.med.umich.edu/I-TASSER/ ITasser] is a prediction server, which participated in several CASP competitions. |
||
− | It claims of itself to be the best one. |
+ | It claims to be the best one. |
The runtime of the jobs is approximately 24 to 48 hours. The server seems to receive a lot of jobs, which is why it does not allow |
The runtime of the jobs is approximately 24 to 48 hours. The server seems to receive a lot of jobs, which is why it does not allow |
||
submitting more than one job at a time from the same IP address. Therefore it is probably not possible to run all six variants described |
submitting more than one job at a time from the same IP address. Therefore it is probably not possible to run all six variants described |
||
in the task. |
in the task. |
||
+ | |||
+ | iTasser offers several options. For the automated workflow we just specified a template to be used (a screenshot can be seen in figure 7). |
||
+ | For the workflow with our adjusted alignment we specified the template with alignment (a screenshot can be seen in figure 8). |
||
+ | |||
+ | The file which was needed for the workflow with the adjusted alignment contains the alignment in FASTA format and |
||
+ | the ATOM coordinates of the template:<br> |
||
+ | <code> |
||
+ | * >TARGET<br>GQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTN<br>IIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNY<br>RHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLR<br>PVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYI<br>EKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFN<br>DAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQ<br>>1phy:A<br>GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFFTYLDKRTKPVLGS<br>IIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNY<br>RHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLR<br>PVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EY<br>IEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCL-SDKPKLLPLELEKTACQEYSVTEFQPLYYVAES<br>FSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNT<br><br>ATOM 1 N GLY A 19 40.338 -7.649 17.712 1.00117.94 N <br>ATOM 2 CA GLY A 19 40.523 -8.622 16.594 1.00119.56 C <br>ATOM 3 C GLY A 19 39.585 -8.344 15.430 1.00119.47 C <br>ATOM 4 O GLY A 19 39.005 -9.256 14.849 1.00122.09 O <br>...<br>ATOM 3284 OG1 THR A 427 12.515 -6.231 40.981 1.00107.96 O <br> ATOM 3285 CG2 THR A 427 14.463 -5.654 39.679 1.00107.51 C <br> END |
||
+ | </code> |
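A file of this shape (two FASTA records followed by the ATOM records of the template chain and a closing END line) can be assembled with a short script. The sketch below only mirrors the layout shown above; it is not an official iTasser tool. The function names are hypothetical, the sequences are written unwrapped, and the chain identifier is assumed to sit in column 22 of each ATOM record, as in the fixed-column PDB format.

```python
def chain_atom_records(pdb_lines, chain):
    """Return the ATOM records of one chain (chain ID is column 22 of a PDB record)."""
    return [line.rstrip("\n") for line in pdb_lines
            if line.startswith("ATOM") and len(line) > 21 and line[21] == chain]


def alignment_with_structure(target_row, template_row, template_id, pdb_lines, chain):
    """Join a pairwise FASTA alignment and the template coordinates into one text.

    Hypothetical helper that follows the layout of the example file above.
    """
    if len(target_row) != len(template_row):
        raise ValueError("aligned sequences must have the same length")
    parts = [">TARGET", target_row, ">" + template_id, template_row, ""]
    parts.extend(chain_atom_records(pdb_lines, chain))
    parts.append("END")
    return "\n".join(parts) + "\n"


# Example usage with a file path as used in the Modeller scripts above:
# with open("../structures/1phz.pdb") as handle:
#     text = alignment_with_structure(target_row, template_row, "1phz:A", handle, "A")
```

Filtering by chain matters because a PDB file may contain several chains, while the alignment refers to chain A only.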
||
+ | |||
+ | === Model-refinement with 3D-JigSaw === |
||
+ | Without a reference structure it is hard to say which predicted model is the best. There are methods and servers which try to rank several models. If their performance were perfect, protein modeling would be a solved problem, but their performance is sometimes far from perfect. One of these servers is [http://bmm.cancerresearchuk.org/~populus/populus_submit.html 3D-Jigsaw]. In fact, 3D-Jigsaw is a complete protein modeling server which takes the models, tries to improve them and ranks them afterwards. |
||
+ | |||
+ | Ranking the models needs a lot of resources on the 3D-Jigsaw server. Our goal is to select five of the six models for each group. Therefore we try to discard models which are too similar: we calculated the pairwise TM-score for the different models of one prediction class and kept only one member of each high-scoring pair. |
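The selection rule can be written down as a small filter over the pairwise score matrix. The sketch below is an illustration, not the procedure that was actually run: the cutoff of 0.95, the use of the smaller of the two directional scores for a pair (the tables are not symmetric) and the rule of dropping the first-listed member of a pair are all assumptions. The matrix holds the values of the 1phz table below.

```python
from itertools import combinations

NAMES = ["modeller", "modeller adjusted", "swissmodel",
         "swissmodel adjusted", "iTasser", "iTasser adjusted"]

# Pairwise TM-scores for the 1phz models, row by row as in the table.
TM_1PHZ = [
    [1.0000, 0.0941, 0.9773, 0.0933, 0.8911, 0.0958],
    [0.0885, 1.0000, 0.0918, 0.9808, 0.0887, 0.8141],
    [0.8845, 0.0915, 1.0000, 0.0906, 0.8883, 0.0942],
    [0.0879, 0.9808, 0.0909, 1.0000, 0.0877, 0.8181],
    [0.8911, 0.0937, 0.9815, 0.0932, 1.0000, 0.0959],
    [0.0902, 0.8141, 0.0945, 0.8181, 0.0898, 1.0000],
]


def redundant_pairs(names, scores, threshold):
    """List pairs whose score reaches the threshold in both directions."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        score = min(scores[i][j], scores[j][i])  # assumed symmetrisation
        if score >= threshold:
            pairs.append((names[i], names[j], score))
    return pairs


def select_models(names, scores, threshold):
    """Keep one member of every redundant pair (drops the first-listed one)."""
    dropped = set()
    for first, second, _ in redundant_pairs(names, scores, threshold):
        if first not in dropped and second not in dropped:
            dropped.add(first)
    return [name for name in names if name not in dropped]


print(select_models(NAMES, TM_1PHZ, threshold=0.95))
```

With these assumptions only the pair of adjusted Modeller and Swissmodel models is flagged. A looser cutoff or a different way of combining the two directions would flag more pairs.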
||
+ | |||
+ | ====Selection of models==== |
||
+ | =====1phz - sequence identity > 60 %===== |
||
+ | {| border="1" |
||
+ | ! |
||
+ | ! modeller |
||
+ | ! modeller adjusted |
||
+ | ! swissmodel |
||
+ | ! swissmodel adjusted |
||
+ | ! iTasser |
||
+ | ! iTasser adjusted |
||
+ | |- |
||
+ | ! modeller |
||
+ | | 1.0000 |
||
+ | | 0.0941 |
||
+ | | 0.9773 |
||
+ | | 0.0933 |
||
+ | | 0.8911 |
||
+ | | 0.0958 |
||
+ | |- |
||
+ | ! modeller adjusted |
||
+ | | 0.0885 |
||
+ | | 1.0000 |
||
+ | | 0.0918 |
||
+ | | 0.9808 |
||
+ | | 0.0887 |
||
+ | | 0.8141 |
||
+ | |- |
||
+ | ! swissmodel |
||
+ | | 0.8845 |
||
+ | | 0.0915 |
||
+ | | 1.0000 |
||
+ | | 0.0906 |
||
+ | | 0.8883 |
||
+ | | 0.0942 |
||
+ | |- |
||
+ | ! swissmodel adjusted |
||
+ | | 0.0879 |
||
+ | | 0.9808 |
||
+ | | 0.0909 |
||
+ | | 1.0000 |
||
+ | | 0.0877 |
||
+ | | 0.8181 |
||
+ | |- |
||
+ | ! iTasser |
||
+ | | 0.8911 |
||
+ | | 0.0937 |
||
+ | | 0.9815 |
||
+ | | 0.0932 |
||
+ | | 1.0000 |
||
+ | | 0.0959 |
||
+ | |- |
||
+ | ! iTasser adjusted |
||
+ | | 0.0902 |
||
+ | | 0.8141 |
||
+ | | 0.0945 |
||
+ | | 0.8181 |
||
+ | | 0.0898 |
||
+ | | 1.0000 |
||
+ | |} |
||
+ | The Modeller and Swissmodel models built from the adjusted alignment are too similar (TM-score 0.9808 in both directions). We decided to discard the Modeller model. |
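The selection rule described above can be sketched in a few lines of Python, using the pairwise TM-scores of the 1phz table. This is only an illustrative sketch: the table is not symmetric, so averaging the two directions of each pair is an assumption made here, and the function name <code>most_similar_pair</code> is hypothetical.

```python
from itertools import combinations

# Model names and pairwise TM-scores copied from the 1phz table (row vs. column).
NAMES = [
    "modeller", "modeller adjusted", "swissmodel",
    "swissmodel adjusted", "iTasser", "iTasser adjusted",
]
TM = [
    [1.0000, 0.0941, 0.9773, 0.0933, 0.8911, 0.0958],
    [0.0885, 1.0000, 0.0918, 0.9808, 0.0887, 0.8141],
    [0.8845, 0.0915, 1.0000, 0.0906, 0.8883, 0.0942],
    [0.0879, 0.9808, 0.0909, 1.0000, 0.0877, 0.8181],
    [0.8911, 0.0937, 0.9815, 0.0932, 1.0000, 0.0959],
    [0.0902, 0.8141, 0.0945, 0.8181, 0.0898, 1.0000],
]


def most_similar_pair(names, scores):
    """Return (score, name_a, name_b) for the pair with the highest mean TM-score.

    Assumption: the two directions of a pair are averaged, because the
    table is not symmetric.
    """
    best = None
    for i, j in combinations(range(len(names)), 2):
        sym = (scores[i][j] + scores[j][i]) / 2.0
        if best is None or sym > best[0]:
            best = (sym, names[i], names[j])
    return best


score, first, second = most_similar_pair(NAMES, TM)
print(f"most similar pair: {first} / {second} (mean TM-score {score:.4f})")
```

One member of the reported pair is then dropped before submitting the remaining models to 3D-JigSaw.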
||
+ | |||
+ | ===== 1toh - sequence identity > 40 % ===== |
||
+ | {| border="1" |
||
+ | ! |
||
+ | ! modeller |
||
+ | ! modeller adjusted |
||
+ | ! swissmodel |
||
+ | ! swissmodel adjusted |
||
+ | ! iTasser |
||
+ | ! iTasser adjusted |
||
+ | |- |
||
+ | ! modeller |
||
+ | | 1.0000 |
||
+ | | 0.0941 |
||
+ | | 0.7839 |
||
+ | | 0.0929 |
||
+ | | 0.5782 |
||
+ | | 0.0910 |
||
+ | |- |
||
+ | ! modeller adjusted |
||
+ | | 0.0759 |
||
+ | | 1.0000 |
||
+ | | 0.0928 |
||
+ | | 0.9379 |
||
+ | | 0.0812 |
||
+ | | 0.7005 |
||
+ | |- |
||
+ | ! swissmodel |
||
+ | | 0.5818 |
||
+ | | 0.0926 |
||
+ | | 1.0000 |
||
+ | | 0.0893 |
||
+ | | 0.6602 |
||
+ | | 0.0878 |
||
+ | |- |
||
+ | ! swissmodel adjusted |
||
+ | | 0.0750 |
||
+ | | 0.9379 |
||
+ | | 0.0895 |
||
+ | | 1.0000 |
||
+ | | 0.0792 |
||
+ | | 0.7118 |
||
+ | |- |
||
+ | ! iTasser |
||
+ | | 0.5782 |
||
+ | | 0.1008 |
||
+ | | 0.8902 |
||
+ | | 0.0980 |
||
+ | | 1.0000 |
||
+ | | 0.0961 |
||
+ | |- |
||
+ | ! iTasser adjusted |
||
+ | | 0.0733 |
||
+ | | 0.7005 |
||
+ | | 0.0881 |
||
+ | | 0.7118 |
||
+ | | 0.0779 |
||
+ | | 1.0000 |
||
+ | |} |
||
+ | The Modeller and Swissmodel models built from the adjusted alignment are too similar (TM-score 0.9379 in both directions). We decided to discard the Modeller model. |
||
+ | |||
+ | ===== 1ltz - sequence identity < 40 % ===== |
||
+ | {| border="1" |
||
+ | ! |
||
+ | ! modeller |
||
+ | ! modeller adjusted |
||
+ | ! swissmodel |
||
+ | ! swissmodel adjusted |
||
+ | ! iTasser |
||
+ | |- |
||
+ | ! modeller |
||
+ | | 1.0000 |
||
+ | | 0.0898 |
||
+ | | 0.8383 |
||
+ | | 0.0889 |
||
+ | | 0.4521 |
||
+ | |- |
||
+ | ! modeller adjusted |
||
+ | | 0.0589 |
||
+ | | 1.0000 |
||
+ | | 0.0661 |
||
+ | | 0.9722 |
||
+ | | 0.0592 |
||
+ | |- |
||
+ | ! swissmodel |
||
+ | | 0.4601 |
||
+ | | 0.0695 |
||
+ | | 1.0000 |
||
+ | | 0.0689 |
||
+ | | 0.4566 |
||
+ | |- |
||
+ | ! swissmodel adjusted |
||
+ | | 0.0577 |
||
+ | | 0.9722 |
||
+ | | 0.0656 |
||
+ | | 1.0000 |
||
+ | | 0.0601 |
||
+ | |- |
||
+ | ! iTasser |
||
+ | | 0.4521 |
||
+ | | 0.0941 |
||
+ | | 0.8266 |
||
+ | | 0.0951 |
||
+ | | 1.0000 |
||
+ | |} |
||
== Evaluation of the calculated models == |
== Evaluation of the calculated models == |
||
Line 388: | Line 819: | ||
==== Modeller ==== |
==== Modeller ==== |
||
+ | |||
+ | ===== Description of the quantitative Modeller scores ===== |
||
+ | |||
+ | '''molpdf (molecular PDF):''' This measure is the sum of all restraints and is the standard score of Modeller. The score is not absolute, which means it can only be used to rank models calculated from the same alignment. Lower means better for this score. (Source: [http://salilab.org/modeller/tutorial/cryoem/assess.html]) |
||
+ | |||
+ | '''DOPE (Discrete Optimized Protein Energy):''' "this statistical potential is based on an improved reference state that corresponds to non interacting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures". It is an absolute score. Lower means better for this score. (Source: [http://en.wikipedia.org/wiki/Discrete_optimized_protein_energy]) |
||
+ | |||
+ | '''GA341:''' "The method uses the percentage sequence identity between the template and the model as a parameter." The range is from 0 to 1. Higher means better. However, this method is not as good as the other described methods to judge the quality of a model. (Source: [http://salilab.org/modeller/manual/node195.html] and [http://salilab.org/modeller/tutorial/cryoem/assess.html]) |
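How the three scores could be combined to pick a model can be sketched as follows. This is a minimal illustration and not part of Modeller: the function <code>rank_by_dope</code> and the GA341 cutoff parameter are assumptions, and the two models labeled "hypothetical" carry invented values that only serve the example. Only the first entry is taken from the results table.

```python
# One entry from the results table plus two hypothetical entries for illustration.
models = [
    {"name": "1toh with adjusted alignment",
     "molpdf": 1790.78723, "DOPE": -38113.64453, "GA341": 1.00000},
    {"name": "hypothetical model A",  # invented values
     "molpdf": 2100.0, "DOPE": -36000.0, "GA341": 0.95},
    {"name": "hypothetical model B",  # invented values
     "molpdf": 1500.0, "DOPE": -30000.0, "GA341": 0.40},
]


def rank_by_dope(entries, min_ga341=0.7):
    """Drop models below an (assumed) GA341 cutoff, then sort by DOPE.

    Lower DOPE means better, so the best model comes first. molpdf is
    ignored here because it is only comparable between models built
    from the same alignment.
    """
    kept = [m for m in entries if m["GA341"] >= min_ga341]
    return sorted(kept, key=lambda m: m["DOPE"])


ranked = rank_by_dope(models)
for position, model in enumerate(ranked, start=1):
    print(position, model["name"], model["DOPE"])
```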
||
+ | |||
+ | ===== Results ===== |
||
We have chosen three scores to be calculated by Modeller: molpdf, DOPE and GA341 |
We have chosen three scores to be calculated by Modeller: molpdf, DOPE and GA341 |
||
Line 393: | Line 834: | ||
{| border="1" |
{| border="1" |
||
! model |
! model |
||
− | ! |
+ | ! molpdf |
! DOPE |
! DOPE |
||
! GA341 |
! GA341 |
||
Line 417: | Line 858: | ||
| 1.00000 |
| 1.00000 |
||
|- |
|- |
||
− | | |
+ | | 1toh with adjusted alignment |
| 1790.78723 |
| 1790.78723 |
||
| -38113.64453 |
| -38113.64453 |
||
Line 434: | Line 875: | ||
==== Swissmodel ==== |
==== Swissmodel ==== |
||
+ | |||
+ | ===== Description of the quantitative scores of Swissmodel ===== |
||
+ | |||
+ | '''QMEAN Z-Score:''' The QMEAN Z-score compares the QMEAN global score of our model with the scores of experimentally solved structures (e.g. by X-ray crystallography) of approximately the same size (a difference of +/- 10% is allowed). Higher means better for this score. (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#QMEAN4] and [http://swissmodel.expasy.org/qmean/cgi/index.cgi?page=help]) |
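The idea behind such a Z-score can be illustrated with a short Python sketch. It assumes the usual definition (model score minus the mean of the reference scores, divided by their standard deviation); the reference values below are hypothetical and are not the set used by Swissmodel, so the result does not reproduce the Z-scores reported on this page.

```python
from statistics import mean, stdev


def z_score(model_score, reference_scores):
    """Number of standard deviations the model score lies from the reference mean."""
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation (an assumption)
    return (model_score - mu) / sigma


# Hypothetical QMEAN scores of experimental structures of similar size.
reference = [0.70, 0.75, 0.80, 0.85, 0.90]

z = z_score(0.6, reference)
print(f"Z-score: {z:.2f}")
```

A negative value means that the model scores worse than the average reference structure.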
||
+ | |||
+ | '''QMEANscore4:''' This is the QMEAN global score used to judge the overall quality of the calculated model. The score values range from 0 to 1, higher means better. The score combines four sub-scores, each of which focuses on one aspect: C_beta interaction energy, all-atom pairwise energy, solvation energy and torsion angle energy. (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#QMEAN4]) |
||
+ | |||
+ | '''Estimated absolute model quality:''' The first plot in this column shows the QMEANscore4 of experimentally solved structures as grey and black circles. The red cross marks the position of our model in this plot. Hence, it is possible to compare our score with the scores of experimental structures of the same size. The second plot in this column is a density plot and shows the distribution of the QMEANscore4 of all experimental structures which were used to calculate the Z-score. The red line marks the position of our model in this plot. (Source: [http://swissmodel.expasy.org/qmean/cgi/index.cgi?page=help]) |
||
+ | |||
+ | '''Score components:''' This plot shows the Z-scores of the different quality measures of the model. A large negative Z-score indicates that the model is of poor quality in this aspect. (Source: [http://swissmodel.expasy.org/qmean/cgi/index.cgi?page=help]) |
||
+ | |||
+ | '''Coloring by residue error:''' The figure in this column shows the estimated residue error along the 3D structure of the modeled target sequence. Blue means that the residue error is smaller than 1 Angstrom and red parts indicate a residue error greater than 3.5 Angstrom. (Source: [http://swissmodel.expasy.org/qmean/cgi/index.cgi?page=help]) |
||
+ | |||
+ | '''Residue error plot:''' "model energy profile with estimated residue errors along the sequence" (Source: [http://swissmodel.expasy.org/qmean/cgi/index.cgi?page=help]) |
||
+ | |||
+ | '''C_beta interaction energy:''' this value assesses the long-range interactions. The raw score is a pseudo statistical energy score (lower means better here) and the Z-score compares the raw score of our model to experimental structures of the same size (higher means better here). (Source: [http://swissmodel.expasy.org/qmean/cgi/index.cgi?page=help]) |
||
+ | |||
+ | '''C_beta interaction energy:''' this value assesses the long-range interactions by only using the C-beta atoms of each residue. The raw score is a pseudo statistical energy score (lower means better here) and the Z-score compares the raw score of our model to experimental structures of the same size (higher means better here). (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#QMEAN4]) |
||
+ | |||
+ | '''All-atom pairwise energy:''' this value assesses the long-range interactions by using all atoms of each residue. The raw score is a pseudo statistical energy score (lower means better here) and the Z-score compares the raw score of our model to experimental structures of the same size (higher means better here). (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#QMEAN4]) |
||
+ | |||
+ | '''Solvation energy:''' "A solvation potential investigates the burial status of the residues." The raw score is a pseudo statistical energy score (lower means better here) and the Z-score compares the raw score of our model to experimental structures of the same size (higher means better here). (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#QMEAN4]) |
||
+ | |||
+ | '''Torsion angle energy:''' This energy score investigates the local geometry by a torsion angle potential over three consecutive residues. The raw score is a pseudo statistical energy score (lower means better here) and the Z-score compares the raw score of our model to experimental structures of the same size (higher means better here). (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#QMEAN4]) |
||
+ | |||
+ | '''Anolea:''' "The atomic empirical mean force potential ANOLEA (Melo et al.) is used to assess packing quality of the models. The program performs energy calculations on a protein chain, evaluating the "Non- Local Environment" (NLE) of each heavy atom in the molecule. The y-axis of the plot represents the energy for each amino acid of the protein chain. Negative energy values (in green) represent favourable energy environment whereas positive values (in red) unfavourable energy environment for a given amino acid. " (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#A]) |
||
+ | |||
+ | '''QMEAN:''' "is a composite scoring function for both the estimation of the global quality of the entire model as well as for the local per-residue analysis of different regions within a model." (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#A]) |
||
+ | |||
+ | '''Gromos:''' "GROMOS (van Gunsteren et al.) is a general-purpose molecular dynamics computer simulation package for the study of biomolecular systems and can be applied to the analysis of conformations obtained by experiment or by computer simulation. |
||
+ | |||
+ | The y-axis of the plot represents the energy for each amino acid of the protein chain. Negative energy values (in green) represent favourable energy environment whereas positive values (in red) unfavourable energy environment for a given amino acid. " (Source: [http://swissmodel.expasy.org/workspace/index.php?func=special_help&=#A]) |
||
===== Automated Mode: Modelling with template structure 1phz_A (>60%) ===== |
===== Automated Mode: Modelling with template structure 1phz_A (>60%) ===== |
||
Line 448: | Line 921: | ||
|- |
|- |
||
| 0.715 |
| 0.715 |
||
− | | [[File:QMEAN plots 1phz template.pdb plot.png |
+ | | [[File:QMEAN plots 1phz template.pdb plot.png|thumb|300px|'''Figure 9:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1PHZ without alignment restrictions with a non-redundant set of pdb structures and their QMEAN4 scores.]] [[File:QMEAN plots 1phz template pdb plot.png density plot.png|thumb|300px|'''Figure 10:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1PHZ without alignment restrictions with the density of QMEAN4 scores of a non-redundant set of pdb structures.]] |
− | | [[File:QMEAN plots 1phz template plot.png slider.png |
+ | | [[File:QMEAN plots 1phz template plot.png slider.png|thumb|300px|'''Figure 11:''' A plot of Swissmodel showing the different parts of the QMEAN4 score for the result of the created model for the target PAH and the template 1PHZ without alignment restrictions.]] |
|} |
|} |
||
Line 461: | Line 934: | ||
! Residue error plot |
! Residue error plot |
||
|- |
|- |
||
− | | [[File:1phz template coloring by residue error.jpeg |
+ | | [[File:1phz template coloring by residue error.jpeg|thumb|300px|'''Figure 12:''' The model created by Swissmodel for the target PAH and the template 1PHZ without alignment restrictions colored by the estimated residue error.]] |
− | | [[File:QMEAN plots energy profile plots 1phz template.pdb local energy profile QMEANlocal.png |
+ | | [[File:QMEAN plots energy profile plots 1phz template.pdb local energy profile QMEANlocal.png|thumb|300px|'''Figure 13:''' A plot showing the estimated residue error for the model created by Swissmodel for the target PAH and the template 1PHZ without alignment restrictions.]] |
|} |
|} |
||
Line 499: | Line 972: | ||
'''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
'''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
||
− | [[File:Local quality estimation 1phz template annolea qmean gromos.png |
+ | [[File:Local quality estimation 1phz template annolea qmean gromos.png|thumb|center|300px|'''Figure 14:''' A plot of Swissmodel showing an estimation of the quality of the model for the target PAH and the template 1PHZ without alignment restrictions by using the scores of Anolea / QMEAN / Gromos.]] |
Line 515: | Line 988: | ||
|- |
|- |
||
| 0.604 |
| 0.604 |
||
− | | [[File:1toh QMEAN plots Batch.1.short.pdb plot.png | |
+ | | [[File:1toh QMEAN plots Batch.1.short.pdb plot.png |thumb|300px|'''Figure 15:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1TOH without alignment restrictions with a non-redundant set of pdb structures and their QMEAN4 scores.]] [[File:1toh QMEAN plots Batch.1.short.pdb plot.png density plot.png|thumb|300px|'''Figure 16:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1TOH without alignment restrictions with the density of QMEAN4 scores of a non-redundant set of pdb structures.]] |
− | | [[File:1toh QMEAN plots Batch.1.short.pdb plot.png slider.png |
+ | | [[File:1toh QMEAN plots Batch.1.short.pdb plot.png slider.png|thumb|300px|'''Figure 17:''' A plot of Swissmodel showing the different parts of the QMEAN4 score for the result of the created model for the target PAH and the template 1TOH without alignment restrictions.]] |
|} |
|} |
||
Line 528: | Line 1,001: | ||
! Residue error plot |
! Residue error plot |
||
|- |
|- |
||
− | | [[File:1toh residue error structure.jpeg |
+ | | [[File:1toh residue error structure.jpeg|thumb|300px|'''Figure 18:''' The model created by Swissmodel for the target PAH and the template 1TOH without alignment restrictions colored by the estimated residue error.]] |
− | | [[File:1toh QMEAN plots energy profile plots Batch.1.short.pdb local energy profile QMEANlocal.png |
+ | | [[File:1toh QMEAN plots energy profile plots Batch.1.short.pdb local energy profile QMEANlocal.png|thumb|300px|'''Figure 19:''' A plot showing the estimated residue error for the model created by Swissmodel for the target PAH and the template 1TOH without alignment restrictions.]] |
|} |
|} |
||
Line 567: | Line 1,040: | ||
'''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
'''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
||
− | [[File:1toh local mode quality estimation anolea qmean gromos.png | |
+ | [[File:1toh local mode quality estimation anolea qmean gromos.png |thumb|center|300px|'''Figure 20:''' A plot of Swissmodel showing an estimation of the quality of the model for the target PAH and the template 1TOH without alignment restrictions by using the scores of Anolea / QMEAN / Gromos.]] |
Line 585: | Line 1,058: | ||
|- |
|- |
||
| 0.47 |
| 0.47 |
||
− | | [[File:1ltz QMEAN plots Batch.1.short.pdb plot.png |
+ | | [[File:1ltz QMEAN plots Batch.1.short.pdb plot.png|thumb|300px|'''Figure 21:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1LTZ without alignment restrictions with a non-redundant set of pdb structures and their QMEAN4 scores.]] [[File:1 ltz QMEAN plots Batch.1.short.pdb plot.png density plot.png|thumb|300px|'''Figure 22:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1LTZ without alignment restrictions with the density of QMEAN4 scores of a non-redundant set of pdb structures.]] |
− | | [[File:1 ltz QMEAN plots Batch.1.short.pdb plot.png slider.png |
+ | | [[File:1 ltz QMEAN plots Batch.1.short.pdb plot.png slider.png|thumb|300px|'''Figure 23:''' A plot of Swissmodel showing the different parts of the QMEAN4 score for the result of the created model for the target PAH and the template 1LTZ without alignment restrictions.]] |
|} |
|} |
||
Line 598: | Line 1,071: | ||
! Residue error plot |
! Residue error plot |
||
|- |
|- |
||
− | | [[File:1ltz residue error pdb plot.jpeg |
+ | | [[File:1ltz residue error pdb plot.jpeg|thumb|300px|'''Figure 24:''' The model created by Swissmodel for the target PAH and the template 1LTZ without alignment restrictions colored by the estimated residue error.]] |
− | | [[File:1ltz QMEAN plots energy profile plots Batch.1.short.pdb local energy profile QMEANlocal.png |
+ | | [[File:1ltz QMEAN plots energy profile plots Batch.1.short.pdb local energy profile QMEANlocal.png|thumb|300px|'''Figure 25:''' A plot showing the estimated residue error for the model created by Swissmodel for the target PAH and the template 1LTZ without alignment restrictions.]] |
|} |
|} |
||
Line 638: | Line 1,111: | ||
'''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
'''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
||
− | [[File:1ltz local model quality estimation anolea qmean gromos.png |
+ | [[File:1ltz local model quality estimation anolea qmean gromos.png|thumb|center|300px|'''Figure 26:''' A plot of Swissmodel showing an estimation of the quality of the model for the target PAH and the template 1LTZ without alignment restrictions by using the scores of Anolea / QMEAN / Gromos.]] |
+ | |||
+ | |||
+ | ===== Alignment Mode: Modeling with adjusted alignment 1phz_A (>60%) ===== |
||
+ | |||
+ | '''QMEAN Z-Score:''' -2.786 |
||
+ | |||
+ | '''QMEAN4 global scores:''' |
||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! QMEANscore4 |
||
+ | ! Estimated absolute model quality |
||
+ | ! Score components |
||
+ | |- |
||
+ | | 0.6 |
||
+ | | [[File:QMEAN plots 1phz imp template.pdb plot.png|thumb|300px|'''Figure 27:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1PHZ with alignment restriction with a non-redundant set of pdb structures and their QMEAN4 scores.]] [[File:QMEAN plots 1phz imp template pdb plot.png density plot.png|thumb|300px|'''Figure 28:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1PHZ with alignment restriction with the density of QMEAN4 scores of a non-redundant set of pdb structures.]] |
||
+ | | [[File:QMEAN plots 1phz imp template plot.png slider.png |thumb|300px|'''Figure 29:''' A plot of Swissmodel showing the different parts of the QMEAN4 score for the result of the created model for the target PAH and the template 1PHZ with alignment restriction.]] |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | '''Local scores:''' |
||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! Coloring by residue error |
||
+ | ! Residue error plot |
||
+ | |- |
||
+ | | [[File:1phz imp template coloring by residue error.jpeg|thumb|300px|'''Figure 30:''' The model created by Swissmodel for the target PAH and the template 1PHZ with alignment restriction colored by the estimated residue error.]] |
||
+ | | [[File:QMEAN plots energy profile plots 1phz imp template.pdb local energy profile QMEANlocal.png|thumb|300px|'''Figure 31:''' A plot showing the estimated residue error for the model created by Swissmodel for the target PAH and the template 1PHZ with alignment restriction.]] |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | '''Global scores: QMEAN4:''' |
||
+ | |||
+ | {| border="1" |
||
+ | ! Scoring function term |
||
+ | ! Raw score |
||
+ | ! Z-score |
||
+ | |- |
||
+ | | C_beta interaction energy |
||
+ | | -87.10 |
||
+ | | -1.26 |
||
+ | |- |
||
+ | | All-atom pairwise energy |
||
+ | | -8126.09 |
||
+ | | -1.51 |
||
+ | |- |
||
+ | | Solvation energy |
||
+ | | -29.98 |
||
+ | | -0.84 |
||
+ | |- |
||
+ | | Torsion angle energy |
||
+ | | -58.30 |
||
+ | | -2.39 |
||
+ | |- |
||
+ | | QMEAN4 score |
||
+ | | 0.600 |
||
+ | | -2.79 |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | '''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
||
+ | |||
+ | [[File:Local quality estimation 1phz imp template annolea qmean gromos.png|thumb|center|300px|'''Figure 32:''' A plot of Swissmodel showing an estimation of the quality of the model for the target PAH and the template 1PHZ with alignment restriction by using the scores of Anolea / QMEAN / Gromos.]] |
||
+ | |||
+ | |||
+ | ===== Alignment Mode: Modeling with adjusted alignment 1toh_A (>40%) ===== |
||
+ | |||
+ | '''QMEAN Z-Score:''' -4.498 |
||
+ | |||
+ | '''QMEAN4 global scores:''' |
||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! QMEANscore4 |
||
+ | ! Estimated absolute model quality |
||
+ | ! Score components |
||
+ | |- |
||
+ | | 0.494 |
||
+ | | [[File:QMEAN plots 1toh imp template.pdb plot.png|thumb|300px|'''Figure 33:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1TOH with alignment restriction with a non-redundant set of pdb structures and their QMEAN4 scores.]] [[File:QMEAN plots 1toh imp template pdb plot.png density plot.png|thumb|300px|'''Figure 34:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1TOH with alignment restrictions with the density of QMEAN4 scores of a non-redundant set of pdb structures.]] |
||
+ | | [[File:QMEAN plots 1toh imp template plot.png slider.png|thumb|300px|'''Figure 35:''' A plot of Swissmodel showing the different parts of the QMEAN4 score for the result of the created model for the target PAH and the template 1TOH with alignment restrictions.]] |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | '''Local scores:''' |
||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! Coloring by residue error |
||
+ | ! Residue error plot |
||
+ | |- |
||
+ | | [[File:1toh imp template coloring by residue error.jpeg|thumb|300px|'''Figure 36:''' The model created by Swissmodel for the target PAH and the template 1TOH with alignment restriction colored by the estimated residue error.]] |
||
+ | | [[File:QMEAN plots energy profile plots 1toh imp template.pdb local energy profile QMEANlocal.png|thumb|300px|'''Figure 37:''' A plot showing the estimated residue error for the model created by Swissmodel for the target PAH and the template 1TOH with alignment restriction.]] |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | '''Global scores: QMEAN4:''' |
||
+ | |||
+ | {| border="1" |
||
+ | ! Scoring function term |
||
+ | ! Raw score |
||
+ | ! Z-score |
||
+ | |- |
||
+ | | C_beta interaction energy |
||
+ | | -21.70 |
||
+ | | -2.19 |
||
+ | |- |
||
+ | | All-atom pairwise energy |
||
+ | | -4082.14 |
||
+ | | -2.32 |
||
+ | |- |
||
+ | | Solvation energy |
||
+ | | -3.04 |
||
+ | | -3.04 |
||
+ | |- |
||
+ | | Torsion angle energy |
||
+ | | -34.43 |
||
+ | | -2.83 |
||
+ | |- |
||
+ | | QMEAN4 score |
||
+ | | 0.494 |
||
+ | | -4.50 |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | |||
+ | '''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
||
+ | |||
+ | [[File:Local quality estimation 1toh imp template annolea qmean gromos.png|thumb|center|300px|'''Figure 38:''' A plot of Swissmodel showing an estimation of the quality of the model for the target PAH and the template 1TOH with alignment restriction by using the scores of Anolea / QMEAN / Gromos.]] |
||
+ | |||
+ | ===== Alignment Mode: Modeling with adjusted alignment 1ltz_A (<40%) ===== |
||
+ | |||
+ | '''QMEAN Z-Score:''' -4.673 |
||
+ | |||
+ | '''QMEAN4 global scores:''' |
||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! QMEANscore4 |
||
+ | ! Estimated absolute model quality |
||
+ | ! Score components |
||
+ | |- |
||
+ | | 0.447 |
||
+ | | [[File:QMEAN plots 1ltz imp template.pdb plot.png |thumb|300px|'''Figure 39:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1LTZ with alignment restriction with a non-redundant set of pdb structures and their QMEAN4 scores.]] [[File:QMEAN plots 1ltz imp template pdb plot.png density plot.png|thumb|300px|'''Figure 40:''' A plot of Swissmodel comparing the result of the created model for the target PAH and the template 1LTZ with alignment restriction with the density of QMEAN4 scores of a non-redundant set of pdb structures.]] |
||
+ | | [[File:QMEAN plots 1ltz imp template plot.png slider.png|thumb|300px|'''Figure 41:''' A plot of Swissmodel showing the different parts of the QMEAN4 score for the result of the created model for the target PAH and the template 1LTZ with alignment restriction.]] |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | '''Local scores:''' |
||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! Coloring by residue error |
||
+ | ! Residue error plot |
||
+ | |- |
||
+ | | [[File:1ltz imp template coloring by residue error.jpeg|thumb|300px|'''Figure 42:''' The model created by Swissmodel for the target PAH and the template 1LTZ with alignment restriction colored by the estimated residue error.]] |
||
+ | | [[File:QMEAN plots energy profile plots 1ltz imp template.pdb local energy profile QMEANlocal.png|thumb|300px|'''Figure 43:''' A plot showing the estimated residue error for the model created by Swissmodel for the target PAH and the template 1LTZ with alignment restriction.]] |
||
+ | |} |
||
+ | |||
+ | |||
+ | |||
+ | '''Global scores: QMEAN4:''' |
||
+ | |||
+ | {| border="1" |
||
+ | ! Scoring function term |
||
+ | ! Raw score |
||
+ | ! Z-score |
||
+ | |- |
||
+ | | C_beta interaction energy |
||
+ | | -18.68 |
||
+ | | -2.69 |
||
+ | |- |
||
+ | | All-atom pairwise energy |
||
+ | | -1884.30 |
||
+ | | -2.99 |
||
+ | |- |
||
+ | | Solvation energy |
||
+ | | -9.58 |
||
+ | | -1.72 |
||
+ | |- |
||
+ | | Torsion angle energy |
||
+ | | -7.34 |
||
+ | | -3.55 |
||
+ | |- |
||
+ | | QMEAN4 score |
||
+ | | 0.447 |
||
+ | | -4.67 |
||
+ | |} |
||
+ | |||
+ | '''Local Model Quality Estimation: Anolea / QMEAN / Gromos:''' |
||
+ | |||
+ | [[File:Local quality estimation 1ltz imp template annolea qmean gromos.png|thumb|300px|center|'''Figure 44:''' A plot of Swissmodel showing an estimation of the quality of the model for the target PAH and the template 1LTZ with alignment restriction by using the scores of Anolea / QMEAN / Gromos.]] |
||
==== iTasser ==== |
==== iTasser ==== |
||
+ | iTasser produces up to five models. For each model it calculates a C-Score. For the best model it also estimates the TM-Score and the RMSD with respect to the native structure. |
||
+ | |||
+ | "C-score is a confidence score for estimating the quality of predicted models by I-TASSER. It is calculated |
||
+ | based on the significance of threading template alignments and the convergence parameters of the structure |
||
+ | assembly simulations. C-score is typically in the range of [-5,2], where a C-score of higher value signifies |
||
+ | a model with a high confidence and vice-versa." (source: [http://zhanglab.ccmb.med.umich.edu/I-TASSER/output/S73106/cscore.txt scores by iTasser]) |
||
+ | |||
+ | {| border="1" |
||
+ | | Model |
||
+ | | C-Score |
||
+ | | estimated TM-score |
||
+ | | estimated RMSD |
||
+ | |- |
||
+ | | 1phz |
||
+ | | 0.150 |
||
+ | | 0.73±0.11 |
||
+ | | 6.8±4.0Å |
||
+ | |- |
||
+ | | 1toh |
||
+ | | 0.074 |
||
+ | | 0.72±0.11 |
||
+ | | 6.9±4.1Å |
||
+ | |- |
||
+ | | 1ltz |
||
+ | | -0.380 |
||
+ | | 0.66±0.13 |
||
+ | | 7.9±4.4Å |
||
+ | |- |
||
+ | | 1phz adjusted |
||
+ | | 1.650 |
||
+ | | 0.95±0.05 |
||
+ | | 3.5±2.4Å |
||
+ | |- |
||
+ | | 1toh adjusted |
||
+ | | 1.944 |
||
+ | | 0.99±0.04 |
||
+ | | 2.6±1.9Å |
||
+ | |- |
||
+ | | 1ltz adjusted |
||
+ | | 2.053 |
||
+ | | 0.99±0.04 |
||
+ | | 1.7±1.5Å |
||
+ | |} |
||
+ | |||
+ | For the unadjusted alignments, the upper and lower bounds of the estimated RMSD and TM-Score span the range from a useless model to a quite good one. |
||
+ | It is hard to say whether a model is good based only on these vague estimates. |
||
+ | For the adjusted alignments the estimated deviations of these scores are much smaller. |
||
+ | According to these scores, the models built from the adjusted alignments are much improved. |
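The bounds mentioned above follow directly from the table. The short Python sketch below computes them for three of the rows; the helper <code>interval</code> is hypothetical, and clipping the TM-Score at 1.0 and the RMSD at 0 is an assumption made because values outside these limits are not meaningful.

```python
# (estimated TM-score, deviation, estimated RMSD in Angstrom, deviation), from the table.
ESTIMATES = {
    "1phz": (0.73, 0.11, 6.8, 4.0),
    "1ltz": (0.66, 0.13, 7.9, 4.4),
    "1phz adjusted": (0.95, 0.05, 3.5, 2.4),
}


def interval(value, deviation, lower=0.0, upper=None):
    """Return (low, high) for value +/- deviation, clipped to [lower, upper]."""
    low = max(lower, value - deviation)
    high = value + deviation
    if upper is not None:
        high = min(upper, high)
    return round(low, 2), round(high, 2)


for name, (tm, tm_dev, rmsd, rmsd_dev) in ESTIMATES.items():
    tm_low, tm_high = interval(tm, tm_dev, upper=1.0)
    rmsd_low, rmsd_high = interval(rmsd, rmsd_dev)
    print(f"{name}: TM-score {tm_low}-{tm_high}, RMSD {rmsd_low}-{rmsd_high} A")
```

For the unadjusted 1phz model, for example, the estimated RMSD covers everything from 2.8 to 10.8 Angstrom, which illustrates why these estimates alone are not conclusive.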
||
+ | |||
+ | ====3D-JigSaw==== |
||
+ | 3D-JigSaw works in iterations. In each iteration new models are calculated. |
||
+ | |||
+ | In its output, 3D-JigSaw plots the energy of the calculated models over the generations |
||
+ | (the energy seems to be normalized such that the lowest predicted energy is mapped to zero). |
||
+ | One can therefore see in which generation the models reach zero energy. This can be regarded |
||
+ | as a measure of how difficult it is to model the target sequence with the given models: the easier the task, the earlier |
||
+ | this value is reached. |
||
+ | |||
+ | For each selected model in the final output, 3D-JigSaw gives the energy of the model, the coverage |
||
+ | of the target sequence by the model, and the positions in the target sequence where the model starts and ends. |
||
+ | |||
+ | Additionally, 3D-JigSaw calculates the Ramachandran plot for each of the final models. |
||
+ | This plot shows how often a residue could not be modeled such that its phi and psi angles |
||
+ | lie in the allowed regions of the Ramachandran plot. Together with the predicted energy of the model, |
||
+ | this lets the user estimate the quality of the model. |
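The phi and psi angles shown in a Ramachandran plot are dihedral angles defined by four consecutive backbone atoms (phi: C of the previous residue, N, CA, C; psi: N, CA, C, N of the next residue). The following Python sketch shows one common way to compute such an angle from coordinates. It is not taken from 3D-JigSaw; the function name <code>dihedral</code> and the test coordinates are illustrative assumptions, and the sign convention should be checked against the tool in use.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees around the p1-p2 axis, in the range (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Artificial coordinates: the outer atoms lie on the same side (cis) or on opposite sides (trans).
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, trans)
```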
||
+ | |||
+ | Within one run the energies of the final models are almost the same. Comparing the different runs, the models of 1phz have |
||
+ | lower energies than the models of 1toh, which in turn have lower energies than the models of 1ltz. This indicates |
||
+ | that more accurate models could be calculated when the initial models were built from templates with higher sequence identity. |
||
+ | |||
+ | =====1phz - sequence identity > 60 % ===== |
||
+ | [[File:Pah jig 1phz Energy.gif|thumb|1000px|center|'''Figure 45:''' A plot of 3D-JigSaw showing the energy of the models in different generations for the target PAH and chosen models of the template 1phz.]] |
||
+ | |||
+ | {| border="1" |
||
+ | ! MODEL |
||
+ | ! ENERGY |
||
+ | ! COVERAGE |
||
+ | ! START |
||
+ | ! END |
||
+ | ! RAMACHANDRAN PLOT |
||
+ | |- |
||
+ | | MODEL_1 |
||
+ | | -638.18 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1phz rmp model 1.png|thumb|300px|'''Figure 46:''' The ramachandran plot of the first model created by 3D-JigSaw for the target PAH and chosen models of the template 1phz.]] |
||
+ | |- |
||
+ | | MODEL_2 |
||
+ | | -637.27 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1phz rmp model 2.png|thumb|300px|'''Figure 47:''' The ramachandran plot of the second model created by 3D-JigSaw for the target PAH and chosen models of the template 1phz.]] |
||
+ | |- |
||
+ | | MODEL_3 |
||
+ | | -634.89 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1phz rmp model 3.png|thumb|300px|'''Figure 48:''' The ramachandran plot of the third model created by 3D-JigSaw for the target PAH and chosen models of the template 1phz.]] |
||
+ | |- |
||
+ | | MODEL_4 |
||
+ | | -634.77 |
||
+ | | 0.96 |
||
+ | | 19 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1phz rmp model 4.png|thumb|300px|'''Figure 49:''' The ramachandran plot of the fourth model created by 3D-JigSaw for the target PAH and chosen models of the template 1phz.]] |
||
+ | |- |
||
+ | | MODEL_5 |
||
+ | | -632.78 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1phz rmp model 5.png|thumb|300px|'''Figure 50:''' The ramachandran plot of the fifth model created by 3D-JigSaw for the target PAH and chosen models of the template 1phz.]] |
||
+ | |} |
||
+ | |||
+ | =====1toh - sequence identity > 40 % ===== |
||
+ | [[File:pah_jig_1toh_Energy.gif|thumb|1000px|center|'''Figure 51:''' A plot of 3D-JigSaw showing the energy of the models in different generations for the target PAH and chosen models of the template 1toh.]] |
||
+ | |||
+ | {| border="1" |
||
+ | ! MODEL |
||
+ | ! ENERGY |
||
+ | ! COVERAGE |
||
+ | ! START |
||
+ | ! END |
||
+ | ! RAMACHANDRAN PLOT |
||
+ | |- |
||
+ | | MODEL_1 |
||
+ | | -557.84 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:pah_jig_1toh_rmp_model_1.png|thumb|300px|'''Figure 52:''' The ramachandran plot of the first model created by 3D-JigSaw for the target PAH and chosen models of the template 1toh.]] |
||
+ | |- |
||
+ | | MODEL_2 |
||
+ | | -557.83 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:pah_jig_1toh_rmp_model_2.png|thumb|300px|'''Figure 53:''' The ramachandran plot of the second model created by 3D-JigSaw for the target PAH and chosen models of the template 1toh.]] |
||
+ | |- |
||
+ | | MODEL_3 |
||
+ | | -556.11 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 451 |
||
+ | | [[File:pah_jig_1toh_rmp_model_3.png|thumb|300px|'''Figure 54:''' The ramachandran plot of the third model created by 3D-JigSaw for the target PAH and chosen models of the template 1toh.]] |
||
+ | |- |
||
+ | | MODEL_4 |
||
+ | | -548.57 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:pah_jig_1toh_rmp_model_4.png|thumb|300px|'''Figure 55:''' The ramachandran plot of the fourth model created by 3D-JigSaw for the target PAH and chosen models of the template 1toh.]] |
||
+ | |- |
||
+ | | MODEL_5 |
||
+ | | -547.22 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:pah_jig_1toh_rmp_model_5.png|thumb|300px|'''Figure 56:''' The ramachandran plot of the fifth model created by 3D-JigSaw for the target PAH and chosen models of the template 1toh.]] |
||
+ | |} |
||
+ | |||
+ | =====1ltz - sequence identity < 40 % ===== |
||
+ | [[File:pah_jig_1ltz_Energy.gif|thumb|1000px|center|'''Figure 57:''' A plot of 3D-JigSaw showing the energy of the models in different generations for the target PAH and chosen models of the template 1ltz.]] |
||
+ | |||
+ | |||
+ | {|border="1" |
||
+ | ! MODEL |
||
+ | ! ENERGY |
||
+ | ! COVERAGE |
||
+ | ! START |
||
+ | ! END |
||
+ | ! RAMACHANDRAN PLOT |
||
+ | |- |
||
+ | | MODEL_1 |
||
+ | | -492.57 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1ltz rmp model 1.png|thumb|300px|'''Figure 58:''' The ramachandran plot of the first model created by 3D-JigSaw for the target PAH and chosen models of the template 1ltz.]] |
||
+ | |- |
||
+ | | MODEL_2 |
||
+ | | -490.72 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1ltz rmp model 2.png|thumb|300px|'''Figure 59:''' The ramachandran plot of the second model created by 3D-JigSaw for the target PAH and chosen models of the template 1ltz.]] |
||
+ | |- |
||
+ | | MODEL_3 |
||
+ | | -489.81 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1ltz rmp model 3.png|thumb|300px|'''Figure 60:''' The ramachandran plot of the third model created by 3D-JigSaw for the target PAH and chosen models of the template 1ltz.]] |
||
+ | |- |
||
+ | | MODEL_4 |
||
+ | | -489.75 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1ltz rmp model 4.png|thumb|300px|'''Figure 61:''' The ramachandran plot of the fourth model created by 3D-JigSaw for the target PAH and chosen models of the template 1ltz.]] |
||
+ | |- |
||
+ | | MODEL_5 |
||
+ | | -489.73 |
||
+ | | 1.00 |
||
+ | | 1 |
||
+ | | 452 |
||
+ | | [[File:Pah jig 1ltz rmp model 5.png|thumb|300px|'''Figure 62:''' The ramachandran plot of the fifth model created by 3D-JigSaw for the target PAH and chosen models of the template 1ltz.]] |
||
+ | |} |
||
=== Comparison to experimental structure === |
=== Comparison to experimental structure === |
||
− | To calculate the C-alpha RMSD we used [http://www.ebi.ac.uk/Tools/dalilite/index.html? DaliLite]. |
+ | To calculate the C-alpha RMSD we used [http://www.ebi.ac.uk/Tools/dalilite/index.html? DaliLite]. Later we changed to the command line tool <code>sap file1.pdb file2.pdb</code>. |
− | To calculate the TM-Score we used the [http://zhanglab.ccmb.med.umich.edu/TM-score/ TM-score webservice] from the University of Michigan. |
+ | To calculate the TM-Score we used the [http://zhanglab.ccmb.med.umich.edu/TM-score/ TM-score webservice] from the University of Michigan. Later we changed to the command line tool <code>TMS model.pdb native.pdb</code>. |
To calculate the RMSD within a 6 Å radius of the catalytic center we first had to identify the catalytic center. We defined the center of the catalytic site as the position of the Fe2+ atom. Given this position, we extracted the residues within a 6 Å radius around the Fe2+ atom by executing the following steps: |
To calculate the RMSD within a 6 Å radius of the catalytic center we first had to identify the catalytic center. We defined the center of the catalytic site as the position of the Fe2+ atom. Given this position, we extracted the residues within a 6 Å radius around the Fe2+ atom by executing the following steps: |
||
Line 655: | Line 1,538: | ||
* Then we extracted the selected residues into two objects; each object contains only the residues of either the apo/complexed structure or the modeled structure |
* Then we extracted the selected residues into two objects; each object contains only the residues of either the apo/complexed structure or the modeled structure |
||
* Then we saved both objects in separate PDB structures |
* Then we saved both objects in separate PDB structures |
||
− | * Now we used the [http://blue11.bch.msu.edu/mmtsb/rms.pl rms.pl script] to calculate the all atom RMSD with the following command "./rms.pl -out all first.pdb second.pdb" |
+ | * Now we used the [http://blue11.bch.msu.edu/mmtsb/rms.pl rms.pl script] to calculate the all-atom RMSD with the following command "./rms.pl -out all first.pdb second.pdb". This script is already installed in /apps/bin and can be called on the command line as rms.pl. |
+ | |||
+ | We had a lot of models, therefore it was useful to write a program to do the evaluation automatically ([[pah_hom_autom_eval|sourcecode]]). |
||
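The residue selection around the iron atom described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the linked evaluation program: the function names <code>parse_atoms</code> and <code>residues_near_iron</code> are hypothetical, the iron is assumed to be stored under the atom name <code>FE</code>, and only protein residues (<code>ATOM</code> records) are collected. The superposition and RMSD calculation itself is still left to <code>rms.pl</code>.

```python
import math

def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records using the fixed PDB column layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "record": line[0:6].strip(),
                "name": line[12:16].strip(),
                "resname": line[17:20].strip(),
                "chain": line[21],
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]),
                        float(line[38:46]),
                        float(line[46:54])),
            })
    return atoms

def residues_near_iron(atoms, radius=6.0):
    """Return protein residues with at least one atom within `radius` of the iron.

    Assumption: the iron atom is named "FE" in the PDB file.
    """
    iron = next(a for a in atoms if a["name"].upper() == "FE")
    near = set()
    for atom in atoms:
        if atom["record"] != "ATOM":
            continue  # skip the iron itself, waters and other hetero atoms
        if math.dist(atom["xyz"], iron["xyz"]) <= radius:
            near.add((atom["chain"], atom["resseq"], atom["resname"]))
    return sorted(near)
```

The returned (chain, residue number, residue name) tuples can then be used to write the two reduced PDB files that are passed to <code>rms.pl</code>.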
==== Modeller ==== |
==== Modeller ==== |
||
Line 676: | Line 1,561: | ||
| 0.9 |
| 0.9 |
||
| 0.9711 |
| 0.9711 |
||
+ | | 0.276 |
||
− | | |
||
|- |
|- |
||
| 1PHZ Chain: A |
| 1PHZ Chain: A |
||
Line 682: | Line 1,567: | ||
| Complexed |
| Complexed |
||
| 0.9 |
| 0.9 |
||
− | | 0. |
+ | | 0.6494 |
+ | | 0.271 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
Line 689: | Line 1,574: | ||
| Apo |
| Apo |
||
| 1.9 |
| 1.9 |
||
− | | 0. |
+ | | 0.6502 |
+ | | 0.282 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
Line 697: | Line 1,582: | ||
| 1.9 |
| 1.9 |
||
| 0.8850 |
| 0.8850 |
||
+ | | 0.242 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
Line 704: | Line 1,589: | ||
| 2.3 |
| 2.3 |
||
| 0.6986 |
| 0.6986 |
||
+ | | 1.638 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
Line 711: | Line 1,596: | ||
| 2.3 |
| 2.3 |
||
| 0.6985 |
| 0.6985 |
||
+ | | 1.581 |
||
− | | |
||
|- |
|- |
||
| Multiple |
| Multiple |
||
Line 718: | Line 1,603: | ||
| 2.1 |
| 2.1 |
||
| 0.1886 |
| 0.1886 |
||
− | | |
+ | | 3.814 |
|- |
|- |
||
| Multiple |
| Multiple |
||
Line 725: | Line 1,610: | ||
| 2.1 |
| 2.1 |
||
| 0.1879 |
| 0.1879 |
||
− | | |
+ | | 3.809 |
|} |
|} |
||
Line 744: | Line 1,629: | ||
| 1.0 |
| 1.0 |
||
| 0.1957 |
| 0.1957 |
||
+ | | 0.298 |
||
− | | |
||
|- |
|- |
||
| 1PHZ Chain: A |
| 1PHZ Chain: A |
||
Line 751: | Line 1,636: | ||
| 1.0 |
| 1.0 |
||
| 0.1954 |
| 0.1954 |
||
+ | | 0.310 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
Line 758: | Line 1,643: | ||
| 1.1 |
| 1.1 |
||
| 0.1592 |
| 0.1592 |
||
+ | | 0.276 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
Line 765: | Line 1,650: | ||
| 1.1 |
| 1.1 |
||
| 0.1592 |
| 0.1592 |
||
+ | | 0.328 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
Line 772: | Line 1,657: | ||
| 1.7 |
| 1.7 |
||
| 0.1021 |
| 0.1021 |
||
+ | | 3.562 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
Line 779: | Line 1,664: | ||
| 1.7 |
| 1.7 |
||
| 0.1020 |
| 0.1020 |
||
+ | | 0.832 |
||
− | | |
||
|} |
|} |
||
Line 853: | Line 1,738: | ||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 0.645 |
− | | |
+ | | 0.0880 |
+ | | 0.355 |
||
− | | |
||
|- |
|- |
||
| 1PHZ Chain: A |
| 1PHZ Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 0.615 |
− | | |
+ | | 0.0878 |
+ | | 0.267 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 0.996 |
− | | |
+ | | 0.0894 |
+ | | 0.227 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 0.993 |
− | | |
+ | | 0.0894 |
+ | | 0.262 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 1.542 |
− | | |
+ | | 0.0782 |
+ | | 1.204 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 1.542 |
− | | |
+ | | 0.0790 |
+ | | 3.839 |
||
− | | |
||
|} |
|} |
||
Line 910: | Line 1,795: | ||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 0.722 |
− | | |
+ | | 0.6596 |
+ | | 0.241 |
||
− | | |
||
|- |
|- |
||
| 1PHZ Chain: A |
| 1PHZ Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 0.698 |
− | | |
+ | | 0.6604 |
+ | | 0.279 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 0.749 |
− | | |
+ | | 0.6571 |
+ | | 0.299 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 0.731 |
− | | |
+ | | 0.6578 |
+ | | 0.297 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 0.644 |
− | | |
+ | | 0.6606 |
+ | | 0.374 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 0.610 |
− | | |
+ | | 0.6620 |
+ | | 0.351 |
||
− | | |
||
|} |
|} |
||
− | |||
===== Adjusted Alignment Workflow ===== |
===== Adjusted Alignment Workflow ===== |
||
Line 965: | Line 1,849: | ||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 0.829 |
− | | |
+ | | 0.0903 |
+ | | 0.424 |
||
− | | |
||
|- |
|- |
||
| 1PHZ Chain: A |
| 1PHZ Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 0.816 |
− | | |
+ | | 0.0909 |
+ | | 0.441 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 0.720 |
− | | |
+ | | 0.0875 |
+ | | 0.449 |
||
− | | |
||
|- |
|- |
||
| 1TOH Chain: A |
| 1TOH Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 0.671 |
− | | |
+ | | 0.0873 |
+ | | 0.416 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
| 1J8T |
| 1J8T |
||
| Apo |
| Apo |
||
− | | |
+ | | 1.382 |
− | | |
+ | | 0.0776 |
+ | | 0.700 |
||
− | | |
||
|- |
|- |
||
| 1LTZ Chain: A |
| 1LTZ Chain: A |
||
| 1J8U |
| 1J8U |
||
| Complexed |
| Complexed |
||
− | | |
+ | | 1.372 |
− | | |
+ | | 0.0775 |
+ | | 0.851 |
||
− | | |
||
|} |
|} |
||
− | === |
+ | ==== 3D JigSaw ==== |
+ | |||
+ | ===== 1phz ===== |
||
+ | |||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! Template Structure |
||
+ | ! Compared To |
||
+ | ! Apo/Complexed |
||
+ | ! C-alpha RMSD |
||
+ | ! TM score |
||
+ | ! All Atoms RMSD, 6A |
||
+ | |- |
||
+ | | Model 1 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 1.121 |
||
+ | | 0.6466 |
||
+ | | 0.237 |
||
+ | |- |
||
+ | | Model 1 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 1.106 |
||
+ | | 0.6475 |
||
+ | | 0.439 |
||
+ | |- |
||
+ | | Model 2 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 1.055 |
||
+ | | 0.6422 |
||
+ | | 0.265 |
||
+ | |- |
||
+ | | Model 2 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 1.040 |
||
+ | | 0.6428 |
||
+ | | 0.310 |
||
+ | |- |
||
+ | | Model 3 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 1.035 |
||
+ | | 0.6340 |
||
+ | | 0.261 |
||
+ | |- |
||
+ | | Model 3 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 1.025 |
||
+ | | 0.6347 |
||
+ | | 0.277 |
||
+ | |- |
||
+ | | Model 4 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 1.069 |
||
+ | | 0.6664 |
||
+ | | 0.450 |
||
+ | |- |
||
+ | | Model 4 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 1.063 |
||
+ | | 0.6670 |
||
+ | | 0.512 |
||
+ | |- |
||
+ | | Model 5 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 1.055 |
||
+ | | 0.6422 |
||
+ | | 0.251 |
||
+ | |- |
||
+ | | Model 5 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 1.040 |
||
+ | | 0.6428 |
||
+ | | 0.369 |
||
+ | |} |
||
+ | |||
+ | ===== 1toh ===== |
||
+ | |||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! Template Structure |
||
+ | ! Compared To |
||
+ | ! Apo/Complexed |
||
+ | ! C-alpha RMSD |
||
+ | ! TM score |
||
+ | ! All Atoms RMSD, 6A |
||
+ | |- |
||
+ | | Model 1 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 5.443 |
||
+ | | 0.5090 |
||
+ | | 0.734 |
||
+ | |- |
||
+ | | Model 1 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 5.453 |
||
+ | | 0.5095 |
||
+ | | 0.766 |
||
+ | |- |
||
+ | | Model 2 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 5.443 |
||
+ | | 0.5090 |
||
+ | | 0.734 |
||
+ | |- |
||
+ | | Model 2 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 5.453 |
||
+ | | 0.5095 |
||
+ | | 0.766 |
||
+ | |- |
||
+ | | Model 3 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 4.832 |
||
+ | | 0.4450 |
||
+ | | 1.879 |
||
+ | |- |
||
+ | | Model 3 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 4.838 |
||
+ | | 0.4456 |
||
+ | | 0.276 |
||
+ | |- |
||
+ | | Model 4 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 3.262 |
||
+ | | 0.5717 |
||
+ | | 0.432 |
||
+ | |- |
||
+ | | Model 4 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 3.258 |
||
+ | | 0.5722 |
||
+ | | 0.453 |
||
+ | |- |
||
+ | | Model 5 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 3.137 |
||
+ | | 0.5708 |
||
+ | | 0.432 |
||
+ | |- |
||
+ | | Model 5 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 3.128 |
||
+ | | 0.5715 |
||
+ | | 0.453 |
||
+ | |} |
||
+ | |||
+ | ===== 1ltz ===== |
||
+ | |||
+ | |||
+ | {| border="1" |
||
+ | |- |
||
+ | ! Template Structure |
||
+ | ! Compared To |
||
+ | ! Apo/Complexed |
||
+ | ! C-alpha RMSD |
||
+ | ! TM score |
||
+ | ! All Atoms RMSD, 6A |
||
+ | |- |
||
+ | | Model 1 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 2.538 |
||
+ | | 0.6247 |
||
+ | | 0.344 |
||
+ | |- |
||
+ | | Model 1 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 2.528 |
||
+ | | 0.6255 |
||
+ | | 0.361 |
||
+ | |- |
||
+ | | Model 2 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 2.540 |
||
+ | | 0.6244 |
||
+ | | 0.358 |
||
+ | |- |
||
+ | | Model 2 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 2.531 |
||
+ | | 0.6255 |
||
+ | | 0.385 |
||
+ | |- |
||
+ | | Model 3 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 2.540 |
||
+ | | 0.6244 |
||
+ | | 0.358 |
||
+ | |- |
||
+ | | Model 3 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 2.531 |
||
+ | | 0.6255 |
||
+ | | 0.384 |
||
+ | |- |
||
+ | | Model 4 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 2.541 |
||
+ | | 0.6244 |
||
+ | | 0.349 |
||
+ | |- |
||
+ | | Model 4 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 2.531 |
||
+ | | 0.6255 |
||
+ | | 0.377 |
||
+ | |- |
||
+ | | Model 5 |
||
+ | | 1J8T |
||
+ | | Apo |
||
+ | | 2.540 |
||
+ | | 0.6244 |
||
+ | | 0.349 |
||
+ | |- |
||
+ | | Model 5 |
||
+ | | 1J8U |
||
+ | | Complexed |
||
+ | | 2.531 |
||
+ | | 0.6255 |
||
+ | | 0.377 |
||
+ | |} |
||
+ | |||
+ | == Discussion == |
||
+ | [[Image:Pah_multi_modeller.png|thumb|top|'''Figure 63:''' One of the non-sense models created by modeller with multiple templates]] |
||
+ | |||
+ | Our chosen templates were not easy cases for homology modelling. Phenylalanine hydroxylase contains two domains, |
||
+ | Biopterin_H and ACT. Only 1phz contains both domains. |
||
+ | All three templates have large gaps in the Biopterin_H domain, especially in 1ltz the first part of the domain |
||
+ | seems to be missing. |
||
+ | |||
+ | Modeller was the fastest method and relatively easy to handle. But it has several drawbacks. |
||
+ | The alignment by Modeller is sometimes far from good. For example the Biopterin_H domain of |
||
+ | 1ltz was split by Modeller. This is also reflected in the low TM-scores of 1ltz and 1toh. |
||
+ | But it was also relatively easy to improve the alignment, such that the alignments are much |
||
+ | more similar to the family prediction by PFAM. For 1phz the alignment was almost perfect and |
||
+ | only small adjustments were necessary. For the other two templates we deleted large parts at |
||
+ | the beginning and at the end of the target, which were not covered by the templates. Most modeling methods tend to produce nonsense, especially at the beginning |
||
+ | and end of a model, if there is no information given by the template (one kind of possible nonsense is shown |
||
+ | in Figure 63). All three adjusted alignments are shorter than the |
||
+ | target sequence. Therefore the TM-score in the experimental validation is not reliable, because it depends on the model length. |
||
+ | Regarding the RMSD and the RMSD of the residues around the Fe in the native structures, the adjustment of the |
||
+ | alignments was at least for 1ltz and 1toh very successful. In our opinion the adjustment of an |
||
+ | alignment is almost always wise, but it is especially crucial for templates with low sequence identity. |
||
+ | We tried to use the automated workflow of Modeller with multiple templates. Even with templates with |
||
+ | high sequence identity to the target, the created model was obviously not reasonable. That is why |
||
+ | we did not use any of these multiple template models. |
||
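The length dependence of the TM-score mentioned above can be illustrated with a small sketch. It assumes the commonly published definition of the score (a sum of 1 / (1 + (d_i / d_0)^2) over the aligned residue pairs, divided by the normalization length L, with d_0 = 1.24 (L - 15)^(1/3) - 1.8) and takes the pair distances after superposition as given; the real TM-score program additionally searches for the best superposition. The function names and the lengths used in the usage note are illustrative only.

```python
def d0(norm_length):
    """Distance scale of the TM-score for a given normalization length."""
    return 1.24 * (norm_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score(distances, norm_length):
    """TM-score for fixed pair distances (in Angstrom) after superposition.

    `distances` holds one value per aligned residue pair; residues that are
    missing from the model contribute nothing to the sum.
    """
    scale = d0(norm_length)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / norm_length
```

For example, <code>tm_score([0.0] * 300, 452)</code> evaluates to 300/452: even a model that matches perfectly in every modeled position cannot reach a score of 1 if it covers only part of the normalization length, which is why truncated models are penalized.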
+ | |||
+ | Swissmodel was run in the server version. The modeling took less than ten minutes. |
||
+ | Regarding the RMSD and the RMSD of the residues around the Fe in the native structures |
||
+ | the usage of the adjusted alignments seems to improve the modeling. |
||
+ | It was surprising that Swissmodel performed better than Modeller for the low identity templates, |
||
+ | but created a worse model than Modeller for 1phz. |
||
+ | Perhaps Swissmodel does too much in the automatic workflow and loses information |
||
+ | of templates with high sequence identity. |
||
+ | |||
+ | iTasser ran for about 2 days per model. Users are only allowed to upload one |
||
+ | request at a time, which made iTasser tedious to use. iTasser seems to perform many |
||
+ | processing steps of its own, which may be why the models were worse with our adjusted alignments. But overall |
||
+ | the models of the automated iTasser workflow are not that good. Adjusted models of |
||
+ | Modeller and SwissModel are better in the RMSD and Fe-RMSD. The TM-score is always |
||
+ | below that of the automated workflow models of Modeller and SwissModel. |
||
+ | This result was surprising, because iTasser is known from the CASP contest to be |
||
+ | at the top of the automated prediction servers. Perhaps simple homology modeling |
||
+ | and manually curated alignments are more powerful than these servers. |
||
+ | |||
+ | The results of 3D-JigSaw are quite disappointing. It looks as if 3D-JigSaw was |
||
+ | influenced too much by the iTasser models. Perhaps with another set of models |
||
+ | 3D-JigSaw would be more reasonable. In any case it was not able to distinguish |
||
+ | the "good" models from the "bad" models in the sets, and therefore it is doubtful |
||
+ | that it would perform better in practice. Users usually do not know how good |
||
+ | or bad their models are and that is why they use such a server. The ranking |
||
+ | of the chosen models is not really analyzable, because all the produced models |
||
+ | are almost of the same quality. A nice feature of 3D-JigSaw is the Ramachandran |
||
+ | plot, which is calculated for each of the models. With the Ramachandran plot |
||
+ | one can check whether the backbone phi and psi angles lie in the allowed regions |
||
+ | or how often the model violates the restrictions on these angles. |
||
+ | |||
+ | === Qualitative comparison of the best models for each method === |
||
+ | |||
+ | In order to select the top model of each method we had to create a ranking to identify the high scoring models. The ranking of the Modeller results was done by first sorting them in descending order by the values of the column GA341. This was done because GA341 is an alignment independent score and can thus be used to compare models from different alignments. However, this was not sufficient to create a total order, since several of our models had a GA341 value of 1.0. To break these ties we sorted the tied models in ascending order by the molpdf score to get our final ranking. |
||
+ | |||
+ | The final ranking for Modeller is as follows: |
||
+ | |||
+ | {| border="1" |
||
+ | ! Rank |
||
+ | ! Model |
||
+ | ! Molpdf |
||
+ | ! DOPE |
||
+ | ! GA341 |
||
+ | |- |
||
+ | | 1 |
||
+ | | 1ltz with adjusted alignment |
||
+ | | 1197.55518 |
||
+ | | -26429.44531 |
||
+ | | 1 |
||
+ | |- |
||
+ | | 2 |
||
+ | | 1toh with adjusted alignment |
||
+ | | 1790.78723 |
||
+ | | -38113.64453 |
||
+ | | 1 |
||
+ | |- |
||
+ | | 3 |
||
+ | | 1phz with adjusted alignment |
||
+ | | 2567.52441 |
||
+ | | -49139.3125 |
||
+ | | 1 |
||
+ | |- |
||
+ | | 4 |
||
+ | | 1phz |
||
+ | | 2568.91309 |
||
+ | | -53912.11328 |
||
+ | | 1 |
||
+ | |- |
||
+ | | 5 |
||
+ | | 1toh |
||
+ | | 20481.25977 |
||
+ | | -37609.82031 |
||
+ | | 1 |
||
+ | |- |
||
+ | | 6 |
||
+ | | 1ltz |
||
+ | | 6824.36182 |
||
+ | | -37422.13672 |
||
+ | | 0.98868 |
||
+ | |- |
||
+ | | 7 |
||
+ | | multiple alignment |
||
+ | | 61561.77344 |
||
+ | | -29682.17578 |
||
+ | | 0.00012 |
||
+ | |} |
||
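The two-step ordering used for the Modeller table can be written as a single sort with a compound key. The sketch below uses the molpdf and GA341 values from the table above; the function name <code>rank_models</code> is hypothetical, and the ascending order for molpdf is inferred from the table rather than stated by Modeller.

```python
# (model, molpdf, GA341) taken from the Modeller ranking table
models = [
    ("1phz", 2568.91309, 1.0),
    ("1toh", 20481.25977, 1.0),
    ("1ltz", 6824.36182, 0.98868),
    ("multiple alignment", 61561.77344, 0.00012),
    ("1phz with adjusted alignment", 2567.52441, 1.0),
    ("1toh with adjusted alignment", 1790.78723, 1.0),
    ("1ltz with adjusted alignment", 1197.55518, 1.0),
]

def rank_models(entries):
    """Sort by GA341 (descending), break ties by molpdf (ascending)."""
    return sorted(entries, key=lambda m: (-m[2], m[1]))

for rank, (name, molpdf, ga341) in enumerate(rank_models(models), start=1):
    print(rank, name, molpdf, ga341)
```

Using one compound key avoids having to sort twice and makes the tie-breaking rule explicit.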
+ | |||
+ | |||
+ | For the ranking of our swissmodel results we used the global QMEANscore4. We received the following ranking: |
||
+ | |||
+ | {| border="1" |
||
+ | ! Rank |
||
+ | ! Model |
||
+ | ! QMEANscore4 |
||
+ | |- |
||
+ | | 1 |
||
+ | | 1phz_A |
||
+ | | 0.715 |
||
+ | |- |
||
+ | | 2 |
||
+ | | 1toh_A |
||
+ | | 0.604 |
||
+ | |- |
||
+ | | 3 |
||
+ | | adjusted alignment 1phz_A |
||
+ | | 0.6 |
||
+ | |- |
||
+ | | 4 |
||
+ | | adjusted alignment 1toh_A |
||
+ | | 0.494 |
||
+ | |- |
||
+ | | 5 |
||
+ | | 1ltz_A |
||
+ | | 0.47 |
||
+ | |- |
||
+ | | 6 |
||
+ | | adjusted alignment 1ltz_A |
||
+ | | 0.447 |
||
+ | |} |
||
+ | |||
+ | |||
+ | To rank the resulting models of iTasser we used the C-Score. We received the following ranking: |
||
+ | |||
+ | {| border="1" |
||
+ | ! Rank |
||
+ | ! Model |
||
+ | ! C-Score |
||
+ | ! estimated TM-score |
||
+ | ! estimated RMSD |
||
+ | |- |
||
+ | | 1 |
||
+ | | 1ltz adjusted |
||
+ | | 2.053 |
||
+ | | 0.99±0.04 |
||
+ | | 1.7±1.5Å |
||
+ | |- |
||
+ | | 2 |
||
+ | | 1toh adjusted |
||
+ | | 1.944 |
||
+ | | 0.99±0.04 |
||
+ | | 2.6±1.9Å |
||
+ | |- |
||
+ | | 3 |
||
+ | | 1phz adjusted |
||
+ | | 1.65 |
||
+ | | 0.95±0.05 |
||
+ | | 3.5±2.4Å |
||
+ | |- |
||
+ | | 4 |
||
+ | | 1phz |
||
+ | | 0.15 |
||
+ | | 0.73±0.11 |
||
+ | | 6.8±4.0Å |
||
+ | |- |
||
+ | | 5 |
||
+ | | 1toh |
||
+ | | 0.074 |
||
+ | | 0.72±0.11 |
||
+ | | 6.9±4.1Å |
||
+ | |- |
||
+ | | 6 |
||
+ | | 1ltz |
||
+ | | -0.38 |
||
+ | | 0.66±0.13 |
||
+ | | 7.9±4.4Å |
||
+ | |} |
||
+ | |||
+ | |||
+ | [[Image:3top.png|thumb|top|'''Figure 64:''' Shows the top three models aligned to the experimental structure 1J8T of PAH. Green: Modeller; Blue: 1J8T; Pink: Swissmodel; Yellow: iTasser]] |
||
+ | The top ranking model for Modeller is the model based on the template 1ltz with an adjusted alignment, for Swissmodel it is 1phz_A, and for iTasser it is again 1ltz with an adjusted alignment. |
||
+ | |||
+ | Surprisingly, the models of Swissmodel were ranked completely differently from the models of Modeller and iTasser. To be more precise, the order for iTasser and Modeller is exactly the same: on top are the models with the adjusted alignment, followed by the models based on the native alignment. Another interesting observation for iTasser and Modeller is that the best ranked model was not based on the template with the highest sequence identity to our target protein, but on the one with the lowest sequence identity. Swissmodel assigned the best score to the model whose template has the highest sequence identity to our target sequence, namely 1phz_A. |
||
+ | |||
+ | |||
+ | If we compare this ranking with the C-alpha RMSD values to 1J8T or 1J8U, our control structures, we see that the quantitative scores of Modeller (molpdf and GA341) do not really reflect the actual similarity to the experimentally solved structure, as they should. For 1ltz_A with adjusted alignment we get a C-alpha RMSD of 1.7, which is only the 4th best when ranked by C-alpha RMSD. So we may conclude that the given scores are of little use for real life predictions: normally, if someone does homology modelling, there is no experimental structure at hand to compare to, so the scientist has to rely completely on the scores given by the method. |
||
+ | |||
+ | The QMEANscore4 seems to be more reliable when compared to the C-alpha RMSD. The top structure by QMEANscore4 is on the second rank when ranked by C-alpha RMSD values. |
||
+ | |||
+ | The C-Score given by iTasser seems to be even less reliable than the Modeller scores. The top structure by C-Score is on the last rank (rank 6) when ranked by C-alpha RMSD values. |
||
+ | |||
+ | |||
+ | Finally we visualized the top structures of the three methods with PyMOL and aligned them to the reference structure 1J8T (see Figure 64). The first obvious observation is that the Swissmodel model has an extended chain which is not present in the models of iTasser and Modeller. This is reasonable, since Swissmodel's top structure used 1phz_A as a template, whereas iTasser and Modeller used 1ltz_A. The second observation is that the loop regions are modeled very differently. Some helices also seem to be more variable in their position than others. Overall, at least the catalytic center seems to be quite conserved in all three models. |
Latest revision as of 11:27, 30 August 2011
Task description
The full description of this task can be found here.
In this task we are going to learn more about several methods of homology modelling. Only a small number of protein structures is known. Therefore, if someone wants to predict the structure of a new protein (newly sequenced, a mutant, etc.), they can use known structures to calculate a model of the unknown structure.
Homology modelling is based on an alignment between the target sequence and one or more template structures. The aligned coordinates are directly used for the model. That is why it is important to select a good template and alignment.
Calculation of models
Overview of available homologous structures
Search
We used hhsearch with the standard parameters to find homologous structures of our protein. The following command was executed:
- ./hhsearch -i reference_pah_aa.fasta -d pdb70.db -b 500 -o hhsearch.out
We received the following hits:
No. | PDB ID | Description | Prob | E-Value | P-Value | Score | SS | Cols | Query HMM | Template HMM | Residues | Sequence Identity |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1phz_A | Protein (phenylalanine | 1 | 1 | 1 | 1084.4 | 0 | 429 | 1-429 | 1-429 | (429) | 92% |
2 | 1j8u_A | Phenylalanine-4-hydroxy | 1 | 1 | 1 | 894.5 | 0 | 325 | 103-427 | 1-325 | (325) | 100% |
3 | 1toh_A | Tyroh tyrosine hydroxy | 1 | 1 | 1 | 890.7 | 0 | 342 | 111-452 | 2-343 | (343) | 60% |
4 | 1mlw_A | Tryptophan 5-monooxygen | 1 | 1 | 1 | 804.2 | 0 | 300 | 116-415 | 2-301 | (301) | 66% |
5 | 1ltz_A | Phenylalanine-4-hydroxy | 1 | 1 | 1 | 504.9 | 0 | 265 | 144-414 | 2-269 | (297) | 30% |
6 | 2v27_A | Phenylalanine hydroxyla | 1 | 1 | 1 | 471.1 | 0 | 254 | 167-424 | 4-271 | (275) | 30% |
7 | 2qmx_A | Prephenate dehydratase; | 1 | 1 | 1 | 70.0 | 0 | 53 | 33-85 | 199-251 | (283) | 40% |
8 | 2qmw_A | PDT prephenate dehydra | 1 | 1 | 1 | 66.1 | 0 | 51 | 35-85 | 190-240 | (267) | 37% |
9 | 3luy_A | Probable chorismate mut | 1 | 1 | 1 | 66.0 | 0 | 53 | 33-85 | 207-259 | (329) | 28% |
10 | 1y7p_A | Hypothetical protein AF | 1 | 1 | 1 | 19.9 | 0 | 38 | 36-73 | 6-43 | (223) | 16% |
Important remark: the values for Prob, E-Value and P-Value could not be calculated, which is why they are all reported as 1.
Template structure selection
We selected the following structures as our template structures:
- > 60% sequence identity: 1phz
- > 40% sequence identity: 1toh
- < 40% sequence identity: 1ltz
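The assignment of hits to the three identity classes can be sketched as follows. The identities are copied from the hhsearch table above; the helper names and the handling of the boundary values (exactly 60% counts as "> 40%", exactly 40% as "< 40%") are assumptions made for this illustration, not part of the original selection procedure.

```python
# Sequence identities (in percent) from the hhsearch hit table
hits = {
    "1phz_A": 92, "1j8u_A": 100, "1toh_A": 60, "1mlw_A": 66, "1ltz_A": 30,
    "2v27_A": 30, "2qmx_A": 40, "2qmw_A": 37, "3luy_A": 28, "1y7p_A": 16,
}

def identity_class(identity):
    """Map a sequence identity to one of the three template classes."""
    if identity > 60:
        return "> 60%"
    if identity > 40:
        return "> 40%"
    return "< 40%"  # assumption: exactly 40% is put into the lowest class

def group_by_class(hit_table):
    """Collect the hit IDs of each identity class."""
    groups = {"> 60%": [], "> 40%": [], "< 40%": []}
    for pdb_id, identity in hit_table.items():
        groups[identity_class(identity)].append(pdb_id)
    return groups

print(group_by_class(hits))
```

Within each class the template still has to be chosen by hand, for example by coverage of the target sequence.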
Alignment Refinement
We used the reference sequence for a search in PFAM. Two PFAM domains were detected on the reference sequence: ACT and Biopterin_H. Then we used the sequences of the three proteins 1PHZ, 1TOH and 1LTZ to run a search against PFAM, and used their alignments with the HMMs of ACT and Biopterin_H together with the alignment of the reference sequence with the same HMMs as seeds for the improved alignments. The crucial parts of the reference sequence according to the annotation in UniProt were already aligned by the seeds. We predicted the secondary structure of the three protein sequences (TODO ref... seems to be better to predict) and tried to extend the seeds. In 1PHZ a large gap could be filled because of high sequence identity. Afterwards we deleted the unaligned ends of the reference sequence to improve the resulting model. The essential part of the protein, the domains, should now be modeled better.
Seeds by PFAM - ACT:
REFERENCE
SLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKkDEYEFFTHLD-KRSL
1PHZ
SLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNkDEYEFFTYL------
Seeds by PFAM - Biopterin_H
REFERENCE
PWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYK
THACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPE
PDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCL-S
EKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSE
IGILCSALQK
1PHZ
PWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYK
THACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPE
PDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCL-S
DKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNT-------------
----------
1TOH
PWFPRKVSELDKC-----------DLDHPGFSDQVYRQRRKLIAEIAFQYKHGEPIPHVEYTAEEIATWKEVYVTLKGLYA
THACREHLEGFQLLERYCGYREDSIPQLEDVSRFLKERTGFQLRPVAGLLSARDFLASLAFRVFQCTQYIRHASSPMHSPE
PDCCHELLGHVPMLADRTFAQFSQDIGLASLGASD-EEIEKLSTVYWFTVEFGLCKQNGELKAYGAGLLSSYGELLHSL-S
EEPEVRAFDPDTAAVQPYQDQTYQPVYFVSESFNDAKDKLRNYASRIQRPFSVKFDPYTLAIDVLDSPHTIQRSLEGVQDE
LHTLAHALS-
1LTZ
------------------------------------------------------PQPLDRYSAEDHATWATLYQRQCKLLP
GRACDEFLEGLERLEV----DADRVPDFNKLNEKLMAATGWKIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQE
PDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKAlGALPMLARLYWYTVEFGLINTPAGMRIYGAGILSSKSESIYCLdS
ASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQ----------------------------------------------
----------
unadjusted alignment of 1PHZ
REFERENCE
MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFF
THLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ
FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF
RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK
LATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVR
NFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK
PSI
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEE
EECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHH
HHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCC
EEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHH
HHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHH
HHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC
1PHZ
------------------GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFF
TYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQ
FADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGF
RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK
LATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCLSDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVR
TFAATIPRPFSVRYDPYTQRVEVLDNT-------------------------
PSI
------------------CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEE
EEECCCCCCCHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHH
HHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCC
EEEECCCCCCHHHHHHHHHCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHH
HHHHEEEEEEEEEEEECCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHH
HHHHHCCCCCCCCCCCCCCEEEECCCC-------------------------
unadjusted alignment of 1TOH
REFERENCE
MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFF
THLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ
FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF
RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK
LATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVR
NFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK
PSI
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEE
EECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHH
HHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCC
EEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHH
HHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHH
HHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC
1PHZ
------------------GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFF
TYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQ
FADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGF
RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEK
LATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCLSDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVR
TFAATIPRPFSVRYDPYTQRVEVLDNT-------------------------
PSI
------------------CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEE
EEECCCCCCCHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHH
HHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCC
EEEECCCCCCHHHHHHHHHCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHH
HHHHEEEEEEEEEEEECCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHH
HHHHHCCCCCCCCCCCCCCEEEECCCC-------------------------
unadjusted alignment of 1LTZ
REFERENCE
MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFF
THLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ
FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF
RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIE
KLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCL-SEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEK
VRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK
PSI
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEE
EECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHH
HHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCC
EEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCH-HHHH
HHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHH-CCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHH
HHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC
1LTZ
-----------------F--------------------------V-----V-----------------------------
--------PDITT-----RKNVG-----LSHDANDFTLP------QPLDR-------YSA--------------------
---------------------EDHATWATLYQRQCKLLPGRACDEFLEGLERLE----VDADRVPDFNKLNEKLMAATGW
KIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKALGALP
MLARLYWYTVEFGLINTPAGMRIYGAGILSSKSESIYCLDSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQL---
---FDAD----FAPLY------LQLAD-AQPWG--AGDIAP------DDL--VL
PSI
-----------------C--------------------------C-----C-----------------------------
--------CCCCC-----CCCCC-----CCCCCCCCCCC------CCCCC-------CCH--------------------
---------------------HHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCC----CCCCCCCCCHHHHHHHHHHHCC
EEEECCCCCCHHHHHHHHHCCCCCEEECCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCHHH
HHHHHHEEEEEEEEEECCCCEEEECCCCCCCCCCCCCCCCCCCCEEECCCHHHHHCCCCCCCCCCCCEEEECCHHHH---
---HHHC----CHHHH------HHHHH-CCCCC--CCCCCC------CCC--CC
adjusted alignment of 1LTZ
REFERENCE
PIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSR
DFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFG
LCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAAT
PSI
CCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHH
HHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEE
EECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHH
1LTZ
PQPLDRYSAEDHATWATLYQRQCKLLPGRACDEFLEGLERLEV----DADRVPDFNKLNEKLMAATGWKIVAVPGLIPDD
VFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKAlGALPMLARLYWYTVEF
GLINTPAGMRIYGAGILSSKSESIYCLdSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQLFDADFAPL
PSI
CCCCCCCCHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHCCEEEECCCCCCHHHHHH
HHHCCCCCEEECCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHEEEEEEEEEE
CCCCEEEECCCCCCCCCCCCCCCCCCCCEEECCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHCCHHHHHHH
adjusted alignment of 1PHZ
REFERENCE
GQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKIL
RHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVE
YMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGL
AFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGD
SIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYT
QRIEVLDNTQQ
PSI
CCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEEEECCCCCCHHHHHHHHHH
CCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCC
CCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHC
CCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCC
CEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCC
CEEEECCCHHH
1PHZ
GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNkDEYEFFTYLDKRTKPVLGSIIKSL
RNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVE
YTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGL
AFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKEG
DSIKAYGAGLLSSFGELQYCL-SDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDP
YTQRVEVLDNT
PSI
CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEEEEECCCCCCCHHHHHHHH
HCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCCCCC
CCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCCEEEECCCCCCHHHHHHHH
HCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCH-HHHHHHHHHEEEEEEEEEEEEC
CCCEEECCCCCCCCCCCCCCC-CCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHHHHHHHCCCCCCCCCCC
CCCEEEECCCC
adjusted alignment of 1TOH
REFERENCE
VPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSL
YKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMY
TPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYC
LSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSI
NSEIGILCSALQKIK
PSI
CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH
HCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCC
CCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHH
HCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHH
HHHHHHHHHHHHHHC
1TOH
VPWFPRKVSELDKC-----------DLDHPGFSDQVYRQRRKLIAEIAFQYKHGEPIPHVEYTAEEIATWKEVYVTLKGL
YATHACREHLEGFQLLERYCGYREDSIPQLEDVSRFLKERTGFQLRPVAGLLSARDFLASLAFRVFQCTQYIRHASSPMH
SPEPDCCHELLGHVPMLADRTFAQFSQDIGLASLGASD-EEIEKLSTVYWFTVEFGLCKQNGELKAYGAGLLSSYGELLH
SL-SEEPEVRAFDPDTAAVQPYQDQTYQPVYFVSESFNDAKDKLRNYASRIQRPFSVKFDPYTLAIDVLDSPHTIQRSLE
GVQDELHTLAHALSA
PSI
CCCCCCCCCCCCCC-----------CCCCCCCCHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH
HHHCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCEEEECCCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCC
CCCCCHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHHHHCEEEEEEEEEEEEECCCCEECCCCCCCCCCHHCCC
CCCCCCCCCCCHHHHHCCCCCCCCCCCEEEEECCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEECCCCHHHHHHHHHHH
HHHHHHHHHHHHHHC
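The step of deleting the unaligned ends of the reference sequence can be sketched in a few lines. This is only an illustration under the assumption that "unaligned ends" means the leading and trailing columns in which the template row consists of gaps; the function name `trim_unaligned_ends` is hypothetical, and the hand-adjusted alignments above may have been cut at slightly different positions.

```python
def trim_unaligned_ends(target_row, template_row, gap="-"):
    """Cut both rows to the span in which the template has residues."""
    if len(target_row) != len(template_row):
        raise ValueError("aligned rows must have the same length")
    covered = [i for i, c in enumerate(template_row) if c != gap]
    if not covered:
        raise ValueError("template row contains only gaps")
    start, end = covered[0], covered[-1] + 1
    return target_row[start:end], template_row[start:end]


# First and last rows of the unadjusted 1PHZ alignment shown above.
REF_FIRST = ("MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEEN"
             "DVNLTHIESRPSRLKKDEYEFF")
PHZ_FIRST = "-" * 18 + ("GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEEN"
                        "DINLTHIESRPSRLNKDEYEFF")
REF_LAST = "NFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK"
PHZ_LAST = "TFAATIPRPFSVRYDPYTQRVEVLDNT" + "-" * 25

if __name__ == "__main__":
    for ref, phz in ((REF_FIRST, PHZ_FIRST), (REF_LAST, PHZ_LAST)):
        trimmed_ref, trimmed_phz = trim_unaligned_ends(ref, phz)
        print(trimmed_ref)
        print(trimmed_phz)
```

Internal gaps are left untouched, so only the overhanging N- and C-terminal parts of the reference are removed.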
Homology modeling with Modeller
Modeller needs three files. The first one contains the alignment in PIR format (see PIR FORMAT). This file must contain at least the sequence of the target; the actual alignment can either be calculated by Modeller or be specified in this file. The second one is a Python script, which tells Modeller which steps to perform. There are some examples in /apps/modeller9.9/examples/automodel. The third file is the PDB file of the template. However, Python seems to be misconfigured on the virtual machines (at least on the Linux virtual machine with the non-sudo user).
- The fix we used for the Python installation is described in the software section.
- Even then, the Modeller modules could not be imported by Python, so it was necessary to reinstall Modeller. The steps for this are described in the software section.
In the automated workflow Modeller calculated the alignment itself (see basic Modeller tutorial). In the adjusted workflow we changed the alignment file to match our alignment.
We prepared three files for the modeling with one template (e.g. for 1PHZ).
- The alignment input file, which contains only the reference sequence: pah.ali. This file is the basis for Modeller's own alignment to the template.
>P1;PAH
sequence:reference::::::::
MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEEN
DVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKD
TVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPI
PRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQ
FLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGH
VPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLL
SSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATI
PRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK*
- We split the Python script into two parts so that the alignment file, which is created by Modeller in the first part and used in the second part, can be modified in between.
- The Python script which tells Modeller to create an alignment
from modeller import *
env = environ()
aln = alignment(env)
# read chain A of the template structure and add it to the alignment
mdl = model(env, file='../structures/1phz.pdb', model_segment=('FIRST:A','LAST:A'))
aln.append_model(mdl, align_codes='1phz', atom_files='../structures/1phz.pdb')
# add the target sequence and align it to the template
aln.append(file='pah.ali', align_codes='PAH')
aln.align2d()
aln.write(file='phz.ali', alignment_format='PIR')
- The Python script which tells Modeller to create a model
from modeller import *
from modeller.automodel import *
# this script runs on its own, so it has to create its own environment
env = environ()
a = automodel(env, alnfile='phz.ali',
              knowns='1phz', sequence='PAH',
              assess_methods=(assess.DOPE, assess.GA341))
# build exactly one model
a.starting_model = 1
a.ending_model = 1
a.make()
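The target-only alignment file pah.ali shown above has a simple layout: a `>P1;` line with the code, a line of ten colon-separated fields, and the sequence terminated by `*`. As a minimal sketch, such an entry could be generated as follows. The helper name `format_pir_entry` and the line width of 58 characters (taken from the listing above) are assumptions for illustration, not part of Modeller.

```python
def format_pir_entry(code, sequence, description="reference", width=58):
    """Return a target-only PIR entry in the layout of pah.ali."""
    header = f">P1;{code}"
    # Ten colon-separated fields; only the first two are filled in,
    # exactly as in the pah.ali listing.
    fields = "sequence:" + description + ":" * 8
    terminated = sequence + "*"
    body = [terminated[i:i + width] for i in range(0, len(terminated), width)]
    return "\n".join([header, fields, *body]) + "\n"


if __name__ == "__main__":
    # Only the first line of the reference sequence, to keep the demo short.
    demo = "MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEEN"
    print(format_pir_entry("PAH", demo), end="")
```

The code passed as the first argument has to match the `align_codes` and `sequence` arguments used in the Modeller scripts (here 'PAH').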
For the modeling with multiple templates we used the alignment file described above and a new script (see advanced Modeller tutorial). We used the structures of 1MLW, 1TOH and 1PHZ as templates.
from modeller import *
from modeller.automodel import *
env = environ()
aln = alignment(env)
path = '../structures/'
# the file paths also serve as alignment codes of the templates
l = (path+'1MLW'+'.pdb', path+'1TOH'+'.pdb', path+'1PHZ'+'.pdb')
for code in l:
    # read chain A of each template and add it to the alignment
    mdl = model(env, file=code, model_segment=('FIRST:A','LAST:A'))
    aln.append_model(mdl, align_codes=code, atom_files=code)
# add the target sequence and align it to the templates
aln.append(file='pah.ali', align_codes='PAH')
aln.align2d()
aln.write(file='mult.ali', alignment_format='PIR')
a = automodel(env, alnfile='mult.ali',
              knowns=l, sequence='PAH',
              assess_methods=(assess.DOPE,
                              assess.GA341))
# build exactly one model
a.starting_model = 1
a.ending_model = 1
a.make()
Homology modeling with Swissmodel
Standard workflow
The standard workflow of Swissmodel is the automated mode. For this mode only the UniProt accession number or the amino acid sequence of the target protein is required. As an optional parameter it is possible to enter the template structure as well. However, if this field is left blank Swissmodel will search automatically for a suitable template.
The input for all three models is as follows:
Category | Template | Target | Image |
---|---|---|---|
> 60% sequence identity | 1PHZ Chain: A | P00439 (Phenylalanine-4-hydroxylase) | |
> 40% sequence identity | 1TOH Chain: A | P00439 (Phenylalanine-4-hydroxylase) | |
< 40% sequence identity | 1LTZ Chain: A | P00439 (Phenylalanine-4-hydroxylase) |
Workflow with own alignment
The workflow of Swissmodel in which you can use your own alignment is the alignment mode. For this mode the alignment has to be specified. Swissmodel accepts the alignment in different formats; we chose FASTA. In a new window you have to specify which FASTA ID is the target and which is the template. For the template you have to specify the corresponding PDB ID with chain. Afterwards you have to check the updated alignment against the PDB structure.
The screenshots of the workflow for 1phz are shown below.
1. Specify the alignment | 2. Specify target and template with structure | 3. Checking of the adjusted alignment |
---|---|---|
Homology modelling with iTasser
iTasser is a prediction server which participated in several CASP competitions and claims to be the best-performing one. A job takes approximately 24 to 48 hours. The server seems to receive a lot of jobs, which is why only one job at a time may be submitted from the same IP address. Therefore it is probably not possible to run all six variants described in the task.
iTasser offers several options. For the automated workflow we just specified a template to be used (a screenshot can be seen in figure 1). For the workflow with our adjusted alignment we specified the template with alignment (a screenshot can be seen in figure 2).
The file needed for the workflow with the adjusted alignment contains the alignment in FASTA format and the ATOM coordinates of the template:
- >TARGET
GQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTN
IIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNY
RHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLR
PVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYI
EKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFN
DAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQ
>1phy:A
GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFFTYLDKRTKPVLGS
IIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNY
RHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLR
PVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EY
IEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCL-SDKPKLLPLELEKTACQEYSVTEFQPLYYVAES
FSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNT
ATOM 1 N GLY A 19 40.338 -7.649 17.712 1.00117.94 N
ATOM 2 CA GLY A 19 40.523 -8.622 16.594 1.00119.56 C
ATOM 3 C GLY A 19 39.585 -8.344 15.430 1.00119.47 C
ATOM 4 O GLY A 19 39.005 -9.256 14.849 1.00122.09 O
...
ATOM 3284 OG1 THR A 427 12.515 -6.231 40.981 1.00107.96 O
ATOM 3285 CG2 THR A 427 14.463 -5.654 39.679 1.00107.51 C
END
Model-refinement with 3D-JigSaw
Without a reference structure it is hard to say which predicted model is the best. There are methods and servers which try to rank several models. If their performance were perfect, protein modeling would be a solved problem, but it is sometimes far from perfect. One of these servers is 3D-JigSaw. In fact 3D-JigSaw is a complete protein modeling server which takes the models, tries to improve them and ranks them afterwards.
The ranking of the models needs a lot of resources on the JigSaw server. Our goal is to select five of the six models for each group. Therefore we try to remove models which are too similar to another model. To do so, we calculated the pairwise TM-score for the different models of one prediction class and kept only one member of a high-scoring pair.
Selection of models
1phz - sequence identity > 60 %
Model | modeller | modeller adjusted | swissmodel | swissmodel adjusted | iTasser | iTasser adjusted |
---|---|---|---|---|---|---|
modeller | 1.0000 | 0.0941 | 0.9773 | 0.0933 | 0.8911 | 0.0958 |
modeller adjusted | 0.0885 | 1.0000 | 0.0918 | 0.9808 | 0.0887 | 0.8141 |
swissmodel | 0.8845 | 0.0915 | 1.0000 | 0.0906 | 0.8883 | 0.0942 |
swissmodel adjusted | 0.0879 | 0.9808 | 0.0909 | 1.0000 | 0.0877 | 0.8181 |
iTasser | 0.8911 | 0.0937 | 0.9815 | 0.0932 | 1.0000 | 0.0959 |
iTasser adjusted | 0.0902 | 0.8141 | 0.0945 | 0.8181 | 0.0898 | 1.0000 |
The models of Modeller with adjusted alignment and Swissmodel with adjusted alignment are too similar (TM-score 0.9808). We decided to discard the Modeller model.
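The selection of the most similar pair can be reproduced from the table. The sketch below is an illustration only: because the table is not symmetric, it assumes that a pair is scored by the smaller of its two directional TM-scores, which is not stated in the protocol. The name `most_similar_pair` is hypothetical.

```python
from itertools import combinations

MODELS = ["modeller", "modeller adjusted", "swissmodel",
          "swissmodel adjusted", "iTasser", "iTasser adjusted"]

# Pairwise TM-scores of the 1phz models, copied row by row from the table.
TM_1PHZ = [
    [1.0000, 0.0941, 0.9773, 0.0933, 0.8911, 0.0958],
    [0.0885, 1.0000, 0.0918, 0.9808, 0.0887, 0.8141],
    [0.8845, 0.0915, 1.0000, 0.0906, 0.8883, 0.0942],
    [0.0879, 0.9808, 0.0909, 1.0000, 0.0877, 0.8181],
    [0.8911, 0.0937, 0.9815, 0.0932, 1.0000, 0.0959],
    [0.0902, 0.8141, 0.0945, 0.8181, 0.0898, 1.0000],
]


def most_similar_pair(names, matrix):
    """Return (name_a, name_b, score) for the most similar pair.

    Assumption: the similarity of a pair is the smaller of the two
    directional scores, so a pair only counts as redundant if both
    directions agree.
    """
    best = None
    for i, j in combinations(range(len(names)), 2):
        score = min(matrix[i][j], matrix[j][i])
        if best is None or score > best[2]:
            best = (names[i], names[j], score)
    return best


if __name__ == "__main__":
    print(most_similar_pair(MODELS, TM_1PHZ))
```

Dropping one member of the returned pair leaves five models, which is the number we wanted to submit per group.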
1toh - sequence identity > 40 %
Model | modeller | modeller adjusted | swissmodel | swissmodel adjusted | iTasser | iTasser adjusted |
---|---|---|---|---|---|---|
modeller | 1.0000 | 0.0941 | 0.7839 | 0.0929 | 0.5782 | 0.0910 |
modeller adjusted | 0.0759 | 1.0000 | 0.0928 | 0.9379 | 0.0812 | 0.7005 |
swissmodel | 0.5818 | 0.0926 | 1.0000 | 0.0893 | 0.6602 | 0.0878 |
swissmodel adjusted | 0.0750 | 0.9379 | 0.0895 | 1.0000 | 0.0792 | 0.7118 |
iTasser | 0.5782 | 0.1008 | 0.8902 | 0.0980 | 1.0000 | 0.0961 |
iTasser adjusted | 0.0733 | 0.7005 | 0.0881 | 0.7118 | 0.0779 | 1.0000 |
The models of Modeller with adjusted alignment and Swissmodel with adjusted alignment are too similar (TM-score 0.9379). We decided to discard the Modeller model.
1ltz - sequence identity < 40 %
Model | modeller | modeller adjusted | swissmodel | swissmodel adjusted | iTasser |
---|---|---|---|---|---|
modeller | 1.0000 | 0.0898 | 0.8383 | 0.0889 | 0.4521 |
modeller adjusted | 0.0589 | 1.0000 | 0.0661 | 0.9722 | 0.0592 |
swissmodel | 0.4601 | 0.0695 | 1.0000 | 0.0689 | 0.4566 |
swissmodel adjusted | 0.0577 | 0.9722 | 0.0656 | 1.0000 | 0.0601 |
iTasser | 0.4521 | 0.0941 | 0.8266 | 0.0951 | 1.0000 |
Evaluation of the calculated models
Selection of the reference structures
We had the following choice of reference structures for PAH:
Entry | Method | Resolution (Å) | Chain | Positions |
---|---|---|---|---|
1DMW | X-Ray | 2.00 | A | 118-424 |
1J8T | X-Ray | 1.70 | A | 103-427 |
1J8U | X-Ray | 1.50 | A | 103-427 |
1KW0 | X-Ray | 2.50 | A | 103-427 |
1LRM | X-Ray | 2.10 | A | 103-427 |
1MMK | X-Ray | 2.00 | A | 103-427 |
1MMT | X-Ray | 2.00 | A | 103-427 |
1PAH | X-Ray | 2.00 | A | 117-424 |
1TDW | X-Ray | 2.10 | A | 117-424 |
1TG2 | X-Ray | 2.20 | A | 117-424 |
2PAH | X-Ray | 3.10 | A/B | 118-452 |
3PAH | X-Ray | 2.00 | A | 117-424 |
4PAH | X-Ray | 2.00 | A | 117-424 |
5PAH | X-Ray | 2.10 | A | 117-424 |
6PAH | X-Ray | 2.15 | A | 117-424 |
None of these structures covers the whole PAH protein. In addition, no true apo structure is available: all structures have at least an Fe2+ ion bound. We therefore treated the structures with only Fe2+ bound as our apo structures.
Finally, we decided to select 1J8T (apo) and 1J8U (complexed). As mentioned before, our apo structure has Fe2+ bound, and our complexed structure has Fe2+ and BH4 (5,6,7,8-TETRAHYDROBIOPTERIN) bound. One reason for our decision was that both structures were solved by the same group, which suggests a more consistent methodology than two structures from different groups would have. Another reason is the resolution: these are the two best-resolved structures, with 1.5 Angstrom for 1J8U and 1.7 Angstrom for 1J8T. Finally, for easier comparison, both structures cover the same range of amino acids, from 103 to 427.
Numeric evaluation of the calculated models
Modeller
Description of the quantitative Modeller scores
molpdf (molecular PDF): This measure is the sum of all restraints and is the standard score of Modeller. The score is not absolute, which means it can only be used to rank models built from the same alignment. Lower means better for this score. (Source: [1])
DOPE (Discrete Optimized Protein Energy): "this statistical potential is based on an improved reference state that corresponds to non interacting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures". It is an absolute score. Lower means better for this score. (Source: [2])
GA341: "The method uses the percentage sequence identity between the template and the model as a parameter." The range is from 0 to 1. Higher means better. However, this method is not as good as the other described methods to judge the quality of a model. (Source: [3] and [4])
Results
We have chosen three scores to be calculated by Modeller: molpdf, DOPE and GA341
model | molpdf | DOPE | GA341 |
---|---|---|---|
1phz | 2568.91309 | -53912.11328 | 1.00000 |
1toh | 20481.25977 | -37609.82031 | 1.00000 |
1ltz | 6824.36182 | -37422.13672 | 0.98868 |
1phz with adjusted alignment | 2567.52441 | -49139.31250 | 1.00000 |
1toh with adjusted alignment | 1790.78723 | -38113.64453 | 1.00000 |
1ltz with adjusted alignment | 1197.55518 | -26429.44531 | 1.00000 |
multiple alignment | 61561.77344 | -29682.17578 | 0.00012 |
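Since DOPE is described as an absolute score, the models in the table can be ranked by it. The following sketch does this with the values copied from the table. The GA341 cutoff of 0.7 used to filter out unreliable models is an assumption for illustration, and the function name `rank_by_dope` is hypothetical.

```python
# (model, molpdf, DOPE, GA341) copied from the results table.
MODELLER_SCORES = [
    ("1phz", 2568.91309, -53912.11328, 1.00000),
    ("1toh", 20481.25977, -37609.82031, 1.00000),
    ("1ltz", 6824.36182, -37422.13672, 0.98868),
    ("1phz with adjusted alignment", 2567.52441, -49139.31250, 1.00000),
    ("1toh with adjusted alignment", 1790.78723, -38113.64453, 1.00000),
    ("1ltz with adjusted alignment", 1197.55518, -26429.44531, 1.00000),
    ("multiple alignment", 61561.77344, -29682.17578, 0.00012),
]


def rank_by_dope(scores, ga341_cutoff=0.7):
    """Model names sorted by DOPE (lowest first), skipping low GA341.

    Assumption: models with GA341 below the cutoff are treated as
    unreliable and left out of the ranking.
    """
    passing = [row for row in scores if row[3] >= ga341_cutoff]
    return [row[0] for row in sorted(passing, key=lambda row: row[2])]


if __name__ == "__main__":
    for rank, name in enumerate(rank_by_dope(MODELLER_SCORES), start=1):
        print(rank, name)
```

The raw DOPE score grows with the size of the model, and the models from the adjusted alignments cover fewer residues, so this ranking is only a rough comparison. molpdf is not used here because it is only comparable between models from the same alignment.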
Swissmodel
Description of the quantitative scores of Swissmodel
QMEAN Z-Score: The QMEAN Z-Score compares the QMEAN global score of our model with the scores of experimentally solved structures (e.g. by X-Ray) of approximately the same size (a difference of +/- 10% is allowed). Higher means better for this score. (Source: [5] and [6])
QMEANscore4: This is the QMEAN global score to judge the overall quality of the calculated model. The score values range from 0 to 1, higher means better. This score combines four sub-scores, each of which focuses on one aspect: C_beta interaction energy, all-atom pairwise energy, solvation energy and torsion angle energy. (Source: [7])
Estimated absolute model quality: The first plot in this column shows the QMEANscore4 of experimentally solved structures as grey and black circles. The red cross marks the position of our model in this plot. Hence, it is possible to compare our score to the scores of experimental structures of the same size. The second plot of this column is a density plot and shows the distribution of the QMEANscore4 of all experimental structures which were used to calculate the Z-Score. The red line marks the position of our model in this plot. (Source: [8])
Score components: This plot shows the Z-Scores of the different quality measures of the model. A large negative Z-Score indicates that the model is poor in this aspect. (Source: [9])
Coloring by residue error: The figure in this column shows the estimated residue error along the 3D structure of the modeled target sequence. Blue means that the residue error is smaller than 1 Angstrom and red parts indicate a residue error greater than 3.5 Angstrom. (Source: [10])
Residue error plot: "model energy profile with estimated residue errors along the sequence" (Source: [11])
C_beta interaction energy: This value assesses the long-range interactions by only using the C-beta atoms of each residue. The raw score is a pseudo statistical energy score (lower means better here) and the Z-Score compares the raw score of our model to experimental structures of the same size (higher means better here). (Source: [12] and [13])
All-atom pairwise energy: This value assesses the long-range interactions by using all atoms of each residue. The raw score is a pseudo statistical energy score (lower means better here) and the Z-Score compares the raw score of our model to experimental structures of the same size (higher means better here). (Source: [14])
Solvation energy: "A solvation potential investigates the burial status of the residues." The raw score is a pseudo statistical energy score (lower means better here) and the Z-Scores compares the raw score of our model to experimental structures of same size (higher means better here). (Source: [15])
Torsion angle energy: This energy score investigates the local geometry by a torsion angle potential over three consecutive residues. The raw score is a pseudo statistical energy score (lower means better here) and the Z-Scores compares the raw score of our model to experimental structures of same size (higher means better here). (Source: [16])
Anolea: "The atomic empirical mean force potential ANOLEA (Melo et al.) is used to assess packing quality of the models. The program performs energy calculations on a protein chain, evaluating the "Non- Local Environment" (NLE) of each heavy atom in the molecule. The y-axis of the plot represents the energy for each amino acid of the protein chain. Negative energy values (in green) represent favourable energy environment whereas positive values (in red) unfavourable energy environment for a given amino acid. " (Source: [17])
QMEAN: "is a composite scoring function for both the estimation of the global quality of the entire model as well as for the local per-residue analysis of different regions within a model." (Source: [18])
Gromos: "GROMOS (van Gunsteren et al.) is a general-purpose molecular dynamics computer simulation package for the study of biomolecular systems and can be applied to the analysis of conformations obtained by experiment or by computer simulation.
The y-axis of the plot represents the energy for each amino acid of the protein chain. Negative energy values (in green) represent favourable energy environment whereas positive values (in red) unfavourable energy environment for a given amino acid. " (Source: [19])
Automated Mode: Modelling with template structure 1phz_A (>60%)
QMEAN Z-Score: -0.828
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.715 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -157.98 | -0.16 |
All-atom pairwise energy | -12503.57 | -0.1 |
Solvation energy | -50.66 | 1.01 |
Torsion angle energy | -78.77 | -1.48 |
QMEAN4 score | 0.715 | -0.83 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
Automated Mode: Modelling with template structure 1toh_A (>40%)
QMEAN Z-Score: -2.745
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.604 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -78.67 | -1.16 |
All-atom pairwise energy | -7899.47 | -0.89 |
Solvation energy | -20.56 | -1.3 |
Torsion angle energy | -47.34 | -2.22 |
QMEAN4 score | 0.604 | -2.74 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
Automated Mode: Modelling with template structure 1ltz_A (<40%)
QMEAN Z-Score: -4.282
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.47 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -40.25 | -2.1 |
All-atom pairwise energy | -3528.81 | -2.29 |
Solvation energy | -15.22 | -1.18 |
Torsion angle energy | -4.99 | -3.78 |
QMEAN4 score | 0.47 | -4.28 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
Alignment Mode: Modeling with adjusted alignment 1phz_A (>60%)
QMEAN Z-Score: -2.786
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.6 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -87.10 | -1.26 |
All-atom pairwise energy | -8126.09 | -1.51 |
Solvation energy | -29.98 | -0.84 |
Torsion angle energy | -58.30 | -2.39 |
QMEAN4 score | 0.600 | -2.79 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
Alignment Mode: Modeling with adjusted alignment 1toh_A (>40%)
QMEAN Z-Score: -4.498
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.494 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -21.70 | -2.19 |
All-atom pairwise energy | -4082.14 | -2.32 |
Solvation energy | -3.04 | -3.04 |
Torsion angle energy | -34.43 | -2.83 |
QMEAN4 score | 0.494 | -4.50 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
Alignment Mode: Modeling with adjusted alignment 1ltz_A (<40%)
QMEAN Z-Score: -4.673
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.447 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -18.68 | -2.69 |
All-atom pairwise energy | -1884.30 | -2.99 |
Solvation energy | -9.58 | -1.72 |
Torsion angle energy | -7.34 | -3.55 |
QMEAN4 score | 0.447 | -4.67 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
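To compare the two Swissmodel modes at a glance, the QMEANscore4 values reported above can be put side by side. This is a minimal sketch using only the numbers from this section; the dictionary layout and the name `qmean_change` are assumptions for illustration.

```python
# QMEANscore4 per template, copied from the result blocks above.
QMEAN4 = {
    "1phz": {"automated": 0.715, "alignment": 0.600},
    "1toh": {"automated": 0.604, "alignment": 0.494},
    "1ltz": {"automated": 0.470, "alignment": 0.447},
}


def qmean_change(scores):
    """QMEANscore4 of alignment mode minus automated mode, per template."""
    return {
        template: round(modes["alignment"] - modes["automated"], 3)
        for template, modes in scores.items()
    }


if __name__ == "__main__":
    for template, change in qmean_change(QMEAN4).items():
        print(f"{template}: {change:+.3f}")
```

A negative value means that the model from our adjusted alignment scored lower than the model from the automated mode. Note that the adjusted alignments cover fewer residues, so the two models of one template are not the same size.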
iTasser
iTasser produces up to five models. For each model it calculates a C-Score. For the best model an estimation of the TM-Score and the RMSD with respect to the native structure are calculated.
"C-score is a confidence score for estimating the quality of predicted models by I-TASSER. It is calculated based on the significance of threading template alignments and the convergence parameters of the structure assembly simulations. C-score is typically in the range of [-5,2], where a C-score of higher value signifies a model with a high confidence and vice-versa." (source: scores by iTasser)
Model | C-Score | estimated TM-score | estimated RMSD |
---|---|---|---|
1phz | 0.150 | 0.73±0.11 | 6.8±4.0Å |
1toh | 0.074 | 0.72±0.11 | 6.9±4.1Å |
1ltz | -0.380 | 0.66±0.13 | 7.9±4.4Å |
1phz adjusted | 1.650 | 0.95±0.05 | 3.5±2.4Å |
1toh adjusted | 1.944 | 0.99±0.04 | 2.6±1.9Å |
1ltz adjusted | 2.053 | 0.99±0.04 | 1.7±1.5Å |
For the unadjusted alignments, the estimated TM-scores and RMSDs have such wide error ranges that their lower and upper bounds lie between a poor and a quite good model. It is hard to say whether a model is good from these vague estimations alone. For the adjusted alignments the estimated deviations are much smaller. According to these scores, the models based on the adjusted alignments are much better.
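The bounds discussed here follow directly from the table. The sketch below turns each "estimate ± deviation" entry for the TM-score into an interval. It assumes that the deviation can simply be added and subtracted, and it clips the result to the range 0 to 1, since a TM-score cannot exceed 1. The name `tm_interval` is hypothetical.

```python
# (estimated TM-score, deviation) copied from the iTasser table.
TM_ESTIMATES = {
    "1phz": (0.73, 0.11),
    "1toh": (0.72, 0.11),
    "1ltz": (0.66, 0.13),
    "1phz adjusted": (0.95, 0.05),
    "1toh adjusted": (0.99, 0.04),
    "1ltz adjusted": (0.99, 0.04),
}


def tm_interval(estimate, deviation):
    """Lower and upper bound of a TM-score estimate, clipped to [0, 1]."""
    lower = max(0.0, estimate - deviation)
    upper = min(1.0, estimate + deviation)
    return round(lower, 2), round(upper, 2)


if __name__ == "__main__":
    for model, (estimate, deviation) in TM_ESTIMATES.items():
        lower, upper = tm_interval(estimate, deviation)
        print(f"{model}: {lower:.2f} to {upper:.2f}")
```

The intervals of the unadjusted models are more than twice as wide as those of the adjusted models, which is what makes the former estimates hard to interpret.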
3D-JigSaw
3D-JigSaw works in iterations. In each iteration new models are calculated.
In its output 3D-JigSaw plots the energy of the calculated models across the generations (the energies seem to be normalized such that the lowest predicted energy is mapped to zero). One can therefore see in which generation the models reach zero energy. This can be regarded as a measure of how difficult it is to model the target sequence with the given models: the easier the target, the earlier this value is reached.
For each selected model in the final output, 3D-JigSaw gives the energy of the model itself, the coverage of the model and the target sequence, where the model starts in the target sequence and where it ends.
Additionally, 3D-JigSaw calculates the Ramachandran plot for each of the final models. This plot shows how often a residue could not be modeled such that its phi and psi angles lie in the allowed regions of the Ramachandran plot. Together with the predicted energy of the model, it lets the user estimate the quality of the model.
Within one run the energies of the final models are nearly the same. Comparing the different runs, the models of 1phz have lower energies than the models of 1toh, which in turn have lower energies than the models of 1ltz. This indicates that more accurate models could be calculated when the initial models were based on templates of higher sequence identity.
1phz - sequence identity > 60 %
MODEL | ENERGY | COVERAGE | START | END | RAMACHANDRAN PLOT |
---|---|---|---|---|---|
MODEL_1 | -638.18 | 1.00 | 1 | 452 | |
MODEL_2 | -637.27 | 1.00 | 1 | 452 | |
MODEL_3 | -634.89 | 1.00 | 1 | 452 | |
MODEL_4 | -634.77 | 0.96 | 19 | 452 | |
MODEL_5 | -632.78 | 1.00 | 1 | 452 |
1toh - sequence identity > 40 %
MODEL | ENERGY | COVERAGE | START | END | RAMACHANDRAN PLOT |
---|---|---|---|---|---|
MODEL_1 | -557.84 | 1.00 | 1 | 452 | |
MODEL_2 | -557.83 | 1.00 | 1 | 452 | |
MODEL_3 | -556.11 | 1.00 | 1 | 451 | |
MODEL_4 | -548.57 | 1.00 | 1 | 452 | |
MODEL_5 | -547.22 | 1.00 | 1 | 452 |
1ltz - sequence identity < 40 %
MODEL | ENERGY | COVERAGE | START | END | RAMACHANDRAN PLOT |
---|---|---|---|---|---|
MODEL_1 | -492.57 | 1.00 | 1 | 452 | |
MODEL_2 | -490.72 | 1.00 | 1 | 452 | |
MODEL_3 | -489.81 | 1.00 | 1 | 452 | |
MODEL_4 | -489.75 | 1.00 | 1 | 452 | |
MODEL_5 | -489.73 | 1.00 | 1 | 452 |
Comparison to experimental structure
To calculate the C-alpha RMSD we used DaliLite. Later we switched to the command-line tool sap (sap file1.pdb file2.pdb).
To calculate the TM-score we used the TM-score web service of the University of Michigan. Later we switched to the command-line tool TMS (TMS model.pdb native.pdb).
To calculate the RMSD within a 6 Å radius of the catalytic center, we first had to identify the catalytic center. We defined the center of the catalytic site as the position of the Fe2+ atom. We then extracted the residues within a 6 Å radius around this Fe2+ atom with the following steps:
- We opened the complexed or apo structure and one of the modeled structures in PyMOL.
- We aligned both structures to each other.
- We selected the Fe2+ atom of the apo/complexed structure and expanded this selection by 6 Å, by residue.
- We extracted the selected residues into two objects, so that each object contains only the residues of either the apo/complexed structure or the modeled structure.
- We saved both objects as separate PDB files.
- We used the rms.pl script to calculate the all-atom RMSD with the command "./rms.pl -out all first.pdb second.pdb". This script is already installed in /apps/bin and can be called on the command line as rms.pl.
We had a lot of models, therefore it was useful to write a program to do the evaluation automatically (sourcecode).
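As a rough illustration of how the 6 Å active-site comparison could be automated, here is a minimal Python sketch. It is not the program linked above and does not reproduce rms.pl. It assumes that the two structures are already superposed, that chain IDs and residue numbers agree between reference and model, and that the iron atom is named FE in the reference file. All function names (parse_atoms, find_iron, residues_within, site_rmsd) are hypothetical.

```python
import math


def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records using the fixed PDB column layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "hetero": line.startswith("HETATM"),
                "name": line[12:16].strip(),
                "chain": line[21],
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]),
                        float(line[38:46]),
                        float(line[46:54])),
            })
    return atoms


def find_iron(atoms):
    """Return the coordinates of the first atom named FE (assumed naming)."""
    for atom in atoms:
        if atom["name"] == "FE":
            return atom["xyz"]
    raise ValueError("no FE atom found")


def residues_within(atoms, center, radius=6.0):
    """Protein residues with at least one atom within `radius` of `center`."""
    return {(a["chain"], a["resseq"]) for a in atoms
            if not a["hetero"] and math.dist(a["xyz"], center) <= radius}


def site_rmsd(ref_atoms, model_atoms, radius=6.0):
    """All-atom RMSD over the residues near the iron of the reference.

    No fitting is done here: both structures must already be superposed.
    Atoms are matched by (chain, residue number, atom name).
    """
    site = residues_within(ref_atoms, find_iron(ref_atoms), radius)
    model = {(a["chain"], a["resseq"], a["name"]): a["xyz"]
             for a in model_atoms}
    squared = []
    for a in ref_atoms:
        key = (a["chain"], a["resseq"], a["name"])
        if (a["chain"], a["resseq"]) in site and key in model:
            squared.append(math.dist(a["xyz"], model[key]) ** 2)
    if not squared:
        raise ValueError("no matching atoms in the selected site")
    return math.sqrt(sum(squared) / len(squared))
```

With the file contents read into strings, a call such as site_rmsd(parse_atoms(ref_text), parse_atoms(model_text)) would give one value per reference/model pair. Selecting whole residues as soon as one atom lies within the radius mirrors the "expand by 6 Å, by residue" step described above.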
Modeller
Standard Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 0.9 | 0.9711 | 0.276 |
1PHZ Chain: A | 1J8U | Complexed | 0.9 | 0.6494 | 0.271 |
1TOH Chain: A | 1J8T | Apo | 1.9 | 0.6502 | 0.282 |
1TOH Chain: A | 1J8U | Complexed | 1.9 | 0.8850 | 0.242 |
1LTZ Chain: A | 1J8T | Apo | 2.3 | 0.6986 | 1.638 |
1LTZ Chain: A | 1J8U | Complexed | 2.3 | 0.6985 | 1.581 |
Multiple | 1J8T | Apo | 2.1 | 0.1886 | 3.814 |
Multiple | 1J8U | Complexed | 2.1 | 0.1879 | 3.809 |
Adjusted Alignment Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 1.0 | 0.1957 | 0.298 |
1PHZ Chain: A | 1J8U | Complexed | 1.0 | 0.1954 | 0.310 |
1TOH Chain: A | 1J8T | Apo | 1.1 | 0.1592 | 0.276 |
1TOH Chain: A | 1J8U | Complexed | 1.1 | 0.1592 | 0.328 |
1LTZ Chain: A | 1J8T | Apo | 1.7 | 0.1021 | 3.562 |
1LTZ Chain: A | 1J8U | Complexed | 1.7 | 0.1020 | 0.832 |
Swissmodel
Standard Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 0.9 | 0.7400 | 0.5162 |
1PHZ Chain: A | 1J8U | Complexed | 0.9 | 0.7408 | 0.5154 |
1TOH Chain: A | 1J8T | Apo | 1.3 | 0.8889 | 0.4616 |
1TOH Chain: A | 1J8U | Complexed | 1.2 | 0.8894 | 0.3361 |
1LTZ Chain: A | 1J8T | Apo | 2.3 | 0.8816 | 0.9225 |
1LTZ Chain: A | 1J8U | Complexed | 2.3 | 0.8814 | 0.9208 |
Adjusted Alignment Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 0.645 | 0.0880 | 0.355 |
1PHZ Chain: A | 1J8U | Complexed | 0.615 | 0.0878 | 0.267 |
1TOH Chain: A | 1J8T | Apo | 0.996 | 0.0894 | 0.227 |
1TOH Chain: A | 1J8U | Complexed | 0.993 | 0.0894 | 0.262 |
1LTZ Chain: A | 1J8T | Apo | 1.542 | 0.0782 | 1.204 |
1LTZ Chain: A | 1J8U | Complexed | 1.542 | 0.0790 | 3.839 |
iTasser
Standard Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 0.722 | 0.6596 | 0.241 |
1PHZ Chain: A | 1J8U | Complexed | 0.698 | 0.6604 | 0.279 |
1TOH Chain: A | 1J8T | Apo | 0.749 | 0.6571 | 0.299 |
1TOH Chain: A | 1J8U | Complexed | 0.731 | 0.6578 | 0.297 |
1LTZ Chain: A | 1J8T | Apo | 0.644 | 0.6606 | 0.374 |
1LTZ Chain: A | 1J8U | Complexed | 0.610 | 0.6620 | 0.351 |
Adjusted Alignment Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 0.829 | 0.0903 | 0.424 |
1PHZ Chain: A | 1J8U | Complexed | 0.816 | 0.0909 | 0.441 |
1TOH Chain: A | 1J8T | Apo | 0.720 | 0.0875 | 0.449 |
1TOH Chain: A | 1J8U | Complexed | 0.671 | 0.0873 | 0.416 |
1LTZ Chain: A | 1J8T | Apo | 1.382 | 0.0776 | 0.700 |
1LTZ Chain: A | 1J8U | Complexed | 1.372 | 0.0775 | 0.851 |
3D-JigSaw
1phz
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
Model 1 | 1J8T | Apo | 1.121 | 0.6466 | 0.237 |
Model 1 | 1J8U | Complexed | 1.106 | 0.6475 | 0.439 |
Model 2 | 1J8T | Apo | 1.055 | 0.6422 | 0.265 |
Model 2 | 1J8U | Complexed | 1.040 | 0.6428 | 0.310 |
Model 3 | 1J8T | Apo | 1.035 | 0.6340 | 0.261 |
Model 3 | 1J8U | Complexed | 1.025 | 0.6347 | 0.277 |
Model 4 | 1J8T | Apo | 1.069 | 0.6664 | 0.450 |
Model 4 | 1J8U | Complexed | 1.063 | 0.6670 | 0.512 |
Model 5 | 1J8T | Apo | 1.055 | 0.6422 | 0.251 |
Model 5 | 1J8U | Complexed | 1.040 | 0.6428 | 0.369 |
1toh
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
Model 1 | 1J8T | Apo | 5.443 | 0.5090 | 0.734 |
Model 1 | 1J8U | Complexed | 5.453 | 0.5095 | 0.766 |
Model 2 | 1J8T | Apo | 5.443 | 0.5090 | 0.734 |
Model 2 | 1J8U | Complexed | 5.453 | 0.5095 | 0.766 |
Model 3 | 1J8T | Apo | 4.832 | 0.4450 | 1.879 |
Model 3 | 1J8U | Complexed | 4.838 | 0.4456 | 0.276 |
Model 4 | 1J8T | Apo | 3.262 | 0.5717 | 0.432 |
Model 4 | 1J8U | Complexed | 3.258 | 0.5722 | 0.453 |
Model 5 | 1J8T | Apo | 3.137 | 0.5708 | 0.432 |
Model 5 | 1J8U | Complexed | 3.128 | 0.5715 | 0.453 |
1ltz
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
Model 1 | 1J8T | Apo | 2.538 | 0.6247 | 0.344 |
Model 1 | 1J8U | Complexed | 2.528 | 0.6255 | 0.361 |
Model 2 | 1J8T | Apo | 2.540 | 0.6244 | 0.358 |
Model 2 | 1J8U | Complexed | 2.531 | 0.6255 | 0.385 |
Model 3 | 1J8T | Apo | 2.540 | 0.6244 | 0.358 |
Model 3 | 1J8U | Complexed | 2.531 | 0.6255 | 0.384 |
Model 4 | 1J8T | Apo | 2.541 | 0.6244 | 0.349 |
Model 4 | 1J8U | Complexed | 2.531 | 0.6255 | 0.377 |
Model 5 | 1J8T | Apo | 2.540 | 0.6244 | 0.349 |
Model 5 | 1J8U | Complexed | 2.531 | 0.6255 | 0.377 |
Discussion
Our chosen templates were not easy targets for homology modelling. Phenylalanine hydroxylase contains two domains, Biopterin_H and ACT, and only 1phz contains both. All three templates have large gaps in the Biopterin_H domain; in 1ltz in particular, the first part of the domain seems to be missing.
Modeller was the fastest method and relatively easy to handle, but it has several drawbacks. The alignment produced by Modeller is sometimes far from good; for example, the Biopterin_H domain of 1ltz was split by Modeller. This is also reflected in the low TM-scores of 1ltz and 1toh. However, it was relatively easy to improve the alignments so that they became much more similar to the family prediction by PFAM. For 1phz the alignment was almost perfect and only small adjustments were necessary. For the other two templates we deleted large parts at the beginning and at the end of the target that were not covered by the templates. Most modeling methods tend to produce nonsense, especially at the beginning and end of a model, if the template provides no information there (one possible kind of nonsense is shown in Figure 63). All three adjusted alignments are shorter than the target sequence. The TM-score in the experimental validation is therefore not reliable, because it depends on the model length. Judged by the C-alpha RMSD and the RMSD of the residues around the Fe in the native structures, adjusting the alignments was very successful, at least for 1ltz and 1toh. In our opinion adjusting an alignment is almost always wise, but it is especially crucial for templates with low sequence identity. We also tried the automated workflow of Modeller with multiple templates. Even with templates of high sequence identity to the target, the resulting model was obviously not reasonable, which is why we did not use any of these multiple-template models.
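To illustrate why the TM-score is sensitive to model length, the following sketch evaluates the TM-score sum for one fixed superposition. It assumes normalisation by the length of the reference structure and the usual distance scale d0 = 1.24 (L - 15)^(1/3) - 1.8. The real TM-score program additionally searches for the superposition that maximises the sum, which is omitted here. The function name tm_score_fixed is hypothetical.

```python
def tm_score_fixed(distances, target_length):
    """TM-score sum for a given superposition.

    distances     -- C-alpha distances (in Angstrom) of the aligned residue pairs
    target_length -- number of residues in the reference structure
    """
    if target_length <= 15:
        raise ValueError("d0 is not defined for such short targets")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("d0 must be positive")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A model that matches every residue of a 452-residue reference perfectly.
full_model = tm_score_fixed([0.0] * 452, 452)

# A model that is equally perfect but covers only half of the reference:
# each aligned pair contributes at most 1, so the score cannot exceed 0.5.
half_model = tm_score_fixed([0.0] * 226, 452)
```

Because the sum runs only over aligned residues while the divisor is the full reference length, a truncated model is penalised even when the modeled part is accurate.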
Swissmodel was run in the server version. The modeling took less than ten minutes. Judged by the C-alpha RMSD and the RMSD of the residues around the Fe in the native structures, using the adjusted alignments seems to improve the modeling. Surprisingly, Swissmodel performed better than Modeller for the low-identity templates, but for 1phz it created a worse model than Modeller. Perhaps Swissmodel does too much in its automatic workflow and loses information from templates with high sequence identity.
iTasser ran for about two days per model, and users are only allowed to submit one request at a time, which made it tedious to use. iTasser seems to perform many processing steps of its own, which may be why the models became worse with our adjusted alignments. Overall, however, the models of the automated iTasser workflow are not that good either. The adjusted models of Modeller and Swissmodel are better in C-alpha RMSD and Fe-RMSD, and the TM-score of the iTasser models is always below that of the automated workflow models of Modeller and Swissmodel. This result was surprising, because iTasser is known from the CASP contest to be at the top of the automated prediction servers. Perhaps simple homology modeling with manually curated alignments is more powerful than these servers.
The results of 3D-JigSaw are quite disappointing. It looks as if 3D-JigSaw was influenced too much by the iTasser models. Perhaps 3D-JigSaw would give more reasonable results with another set of models. However, it was not able to distinguish the "good" models from the "bad" models in the sets, and it is therefore doubtful that it would perform better in practice. Users usually do not know how good or bad their models are, which is why they use such a server in the first place. The ranking of the chosen models cannot really be analyzed, because all the produced models are of almost the same quality. A useful feature of 3D-JigSaw is the Ramachandran plot, which is calculated for each model. With it one can check whether the phi and psi angles lie in the allowed regions, or how often the model violates the restrictions on these angles.
Qualitative comparison of the best models for each method
To select the top model of each method we had to create a ranking. The Modeller results were ranked by first sorting them in descending order by the GA341 column. We chose GA341 because it is an alignment-independent score and can thus be used to compare models from different alignments. However, this was not sufficient to create a total order, since several of our models had a GA341 of 1.0. To break these ties we sorted the models by their molpdf score (lowest first), which gave our final ranking.
The final ranking for Modeller is as follows:
Rank | Model | Molpdf | DOPE | GA341 |
---|---|---|---|---|
1 | 1ltz with adjusted alignment | 1197.55518 | -26429.44531 | 1 |
2 | 1toh with adjusted alignment | 1790.78723 | -38113.64453 | 1 |
3 | 1phz with adjusted alignment | 2567.52441 | -49139.3125 | 1 |
4 | 1phz | 2568.91309 | -53912.11328 | 1 |
5 | 1toh | 20481.25977 | -37609.82031 | 1 |
6 | 1ltz | 6824.36182 | -37422.13672 | 0.98868 |
7 | multiple alignment | 61561.77344 | -29682.17578 | 0.00012 |
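The two-level ordering described above (GA341 descending, ties broken by ascending molpdf) can be written as a single sort with a compound key. The following is a small sketch using the values from the table; the variable names are chosen for illustration only.

```python
# (model, molpdf, DOPE, GA341), values taken from the Modeller table above
models = [
    ("1phz", 2568.91309, -53912.11328, 1.0),
    ("1toh", 20481.25977, -37609.82031, 1.0),
    ("1ltz", 6824.36182, -37422.13672, 0.98868),
    ("multiple alignment", 61561.77344, -29682.17578, 0.00012),
    ("1phz with adjusted alignment", 2567.52441, -49139.3125, 1.0),
    ("1toh with adjusted alignment", 1790.78723, -38113.64453, 1.0),
    ("1ltz with adjusted alignment", 1197.55518, -26429.44531, 1.0),
]

# Primary key: GA341, highest first (negated for descending order).
# Secondary key: molpdf, lowest first, to break ties in GA341.
ranked = sorted(models, key=lambda m: (-m[3], m[1]))

for rank, (name, molpdf, dope, ga341) in enumerate(ranked, start=1):
    print(rank, name, molpdf, dope, ga341)
```

Sorting on a tuple keeps the tie-breaking rule explicit and avoids two separate sorting passes.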
For the ranking of our Swissmodel results we used the global QMEANscore4. We obtained the following ranking:
Rank | Model | QMEANscore4 |
---|---|---|
1 | 1phz_A | 0.715 |
2 | 1toh_A | 0.604 |
3 | adjusted alignment 1phz_A | 0.6 |
4 | adjusted alignment 1toh_A | 0.494 |
5 | 1ltz_A | 0.47 |
6 | adjusted alignment 1ltz_A | 0.447 |
To rank the resulting models of iTasser we used the C-Score. We received the following ranking:
Rank | Model | C-Score | estimated TM-score | estimated RMSD |
---|---|---|---|---|
1 | 1ltz adjusted | 2.053 | 0.99±0.04 | 1.7±1.5Å |
2 | 1toh adjusted | 1.944 | 0.99±0.04 | 2.6±1.9Å |
3 | 1phz adjusted | 1.65 | 0.95±0.05 | 3.5±2.4Å |
4 | 1phz | 0.15 | 0.73±0.11 | 6.8±4.0Å |
5 | 1toh | 0.074 | 0.72±0.11 | 6.9±4.1Å |
6 | 1ltz | -0.38 | 0.66±0.13 | 7.9±4.4Å |
The top-ranking model for Modeller is the one built on the 1ltz template with an adjusted alignment, for Swissmodel it is 1phz_A, and for iTasser it is again 1ltz with an adjusted alignment.
Surprisingly, the Swissmodel models ranked completely differently from the models of Modeller and iTasser. More precisely, the orders for iTasser and Modeller are exactly the same: the models with the adjusted alignments come first, followed by the models based on the native alignments. Another interesting observation for iTasser and Modeller is that the best-ranked model was not based on the template with the highest sequence identity to our target protein, but on the one with the lowest. Swissmodel, in contrast, assigned the best score to the model based on the template with the highest sequence identity to our target sequence, namely 1phz_A.
If we compare this ranking with the C-alpha RMSD values against our control structures 1J8T and 1J8U, we see that the scores of Modeller (molpdf and GA341) do not really reflect the actual similarity to the experimentally solved structure, as they should. For 1ltz_A with the adjusted alignment we get a C-alpha RMSD of 1.7 Å, which is only the fourth best when ranked by C-alpha RMSD. We may therefore conclude that these scores are not very useful for real-life predictions, because someone doing homology modelling normally has no experimental structure at hand to compare to and has to rely completely on the scores given by the method.
The QMEANscore4 seems to be more reliable when compared to the C-alpha RMSD: the top structure by QMEANscore4 is on the second rank when ranked by C-alpha RMSD values.
The C-Score given by iTasser seems to be even less reliable than the Modeller scores: the top structure by C-Score is on the last rank (rank 6) when ranked by C-alpha RMSD values.
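The rank comparison in the preceding paragraphs can be reproduced from the tables with a few lines of Python. This sketch uses the C-alpha RMSD values against the apo structure 1J8T from the tables above; the helper name rank_by_rmsd is hypothetical.

```python
def rank_by_rmsd(model, rmsd_by_model):
    """1-based rank of `model` when all models are sorted by ascending RMSD."""
    ordered = sorted(rmsd_by_model, key=rmsd_by_model.get)
    return ordered.index(model) + 1


# C-alpha RMSD against 1J8T (apo), taken from the comparison tables above
modeller = {
    "1phz": 0.9, "1toh": 1.9, "1ltz": 2.3, "multiple": 2.1,
    "1phz adjusted": 1.0, "1toh adjusted": 1.1, "1ltz adjusted": 1.7,
}
swissmodel = {
    "1phz": 0.9, "1toh": 1.3, "1ltz": 2.3,
    "1phz adjusted": 0.645, "1toh adjusted": 0.996, "1ltz adjusted": 1.542,
}
itasser = {
    "1phz": 0.722, "1toh": 0.749, "1ltz": 0.644,
    "1phz adjusted": 0.829, "1toh adjusted": 0.720, "1ltz adjusted": 1.382,
}

# RMSD rank of the model that each method's own score placed first
modeller_top = rank_by_rmsd("1ltz adjusted", modeller)
swissmodel_top = rank_by_rmsd("1phz", swissmodel)
itasser_top = rank_by_rmsd("1ltz adjusted", itasser)
```

A rank of 1 would mean that the method's own score picked the model closest to the experimental structure; larger values indicate a weaker agreement between score and C-alpha RMSD.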
Finally, we visualized the top structures of the three methods with PyMOL and aligned them to the reference structure 1J8T (see Figure 64). The first obvious observation is that the Swissmodel model has an extended chain that is not present in the models of iTasser and Modeller. This is reasonable, since the top Swissmodel structure used 1phz_A as a template, whereas iTasser and Modeller used 1ltz_A. The second observation is that the loop regions are modeled very differently. Some helices also seem to be more variable in their position than others. Overall, at least the catalytic center seems to be quite conserved in all three models.