Difference between revisions of "Task 4: Homology based structure predictions"
Revision as of 14:05, 10 June 2011
Task description
The full description of this task can be found here.
In this task we explore several methods of homology modelling. Compared with the number of known sequences, only a small number of protein structures are known. If one wants to predict the structure of a new protein (newly sequenced, a mutant, etc.), known structures can therefore be used to calculate a model of the unknown structure.
Homology modelling is based on an alignment between the target sequence and one or more template structures. The coordinates of the aligned template residues are used directly for the model, which is why it is important to select a good template and a good alignment.
Calculation of models
Overview of available homologous structures
Search
We used hhsearch with the standard parameters to find homologous structures of our protein. The following command was executed:
- ./hhsearch -i reference_pah_aa.fasta -d pdb70.db -b 500 -o hhsearch.out
We received the following hits:
No. | PDB ID | Description | Prob | E-Value | P-Value | Score | SS | Cols | Query HMM | Template HMM | Residues | Sequence Identity |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1phz_A | Protein (phenylalanine | 1 | 1 | 1 | 1084.4 | 0 | 429 | 1-429 | 1-429 | (429) | 92% |
2 | 1j8u_A | Phenylalanine-4-hydroxy | 1 | 1 | 1 | 894.5 | 0 | 325 | 103-427 | 1-325 | (325) | 100% |
3 | 1toh_A | Tyroh tyrosine hydroxy | 1 | 1 | 1 | 890.7 | 0 | 342 | 111-452 | 2-343 | (343) | 60% |
4 | 1mlw_A | Tryptophan 5-monooxygen | 1 | 1 | 1 | 804.2 | 0 | 300 | 116-415 | 2-301 | (301) | 66% |
5 | 1ltz_A | Phenylalanine-4-hydroxy | 1 | 1 | 1 | 504.9 | 0 | 265 | 144-414 | 2-269 | (297) | 30% |
6 | 2v27_A | Phenylalanine hydroxyla | 1 | 1 | 1 | 471.1 | 0 | 254 | 167-424 | 4-271 | (275) | 30% |
7 | 2qmx_A | Prephenate dehydratase; | 1 | 1 | 1 | 70.0 | 0 | 53 | 33-85 | 199-251 | (283) | 40% |
8 | 2qmw_A | PDT prephenate dehydra | 1 | 1 | 1 | 66.1 | 0 | 51 | 35-85 | 190-240 | (267) | 37% |
9 | 3luy_A | Probable chorismate mut | 1 | 1 | 1 | 66.0 | 0 | 53 | 33-85 | 207-259 | (329) | 28% |
10 | 1y7p_A | Hypothetical protein AF | 1 | 1 | 1 | 19.9 | 0 | 38 | 36-73 | 6-43 | (223) | 16% |
Template structure selection
We selected the following structures as our template structures:
- > 60% sequence identity: 1phz
- > 40% sequence identity: 1toh
- < 40% sequence identity: 1ltz
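The grouping of hits into the three identity classes can be sketched in a few lines of Python. This is only an illustration: the identities are copied from the hit table above, while the function names and the handling of the exact boundaries (60% falls into "> 40%", 40% into "< 40%") are our assumptions.

```python
# Sequence identities (in %) copied from the hhsearch hit table above.
HITS = {"1phz": 92, "1j8u": 100, "1toh": 60, "1mlw": 66, "1ltz": 30, "2v27": 30}


def identity_class(identity):
    """Return the template class for a sequence identity given in percent.

    Boundary handling is an assumption: exactly 60% counts as "> 40%",
    exactly 40% counts as "< 40%".
    """
    if identity > 60:
        return "> 60%"
    if identity > 40:
        return "> 40%"
    return "< 40%"


def group_by_class(hits):
    """Group PDB IDs by identity class, keeping the input order."""
    groups = {"> 60%": [], "> 40%": [], "< 40%": []}
    for pdb_id, identity in hits.items():
        groups[identity_class(identity)].append(pdb_id)
    return groups


for label, pdb_ids in group_by_class(HITS).items():
    print(label, pdb_ids)
```

Each class contains more than one candidate, so the final choice of one template per class (1phz, 1toh, 1ltz) remains a manual decision.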
Alignment Refinement
We used the reference sequence for a search in PFAM. Two PFAM domains were detected on the reference sequence: ACT and Biopterin_H. We then ran the sequences of the three proteins 1PHZ, 1TOH and 1LTZ against PFAM and used their alignments with the HMMs of ACT and Biopterin_H, together with the alignment of the reference sequence with the same HMMs, as seeds for the improved alignments. The crucial parts of the reference sequence according to the annotation in UniProt were already aligned by the seeds. We predicted the secondary structure of the three protein sequences (TODO ref... seems to be better to predict) and tried to extend the seeds. In 1PHZ a large gap could be filled because of high sequence identity. Afterwards we deleted the unaligned ends of the reference sequence to improve the resulting model. The essential part of the protein, the domains, should now be modelled better.
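Deleting the unaligned ends can be illustrated with a small Python helper. This is a minimal sketch, not the procedure we actually used (the trimming was done by hand); the function name and the gap character `-` are assumptions.

```python
def trim_unaligned_ends(reference, template, gap="-"):
    """Drop leading and trailing columns in which the template row is a gap.

    Both rows must come from the same pairwise alignment and therefore have
    the same length. Gaps inside the aligned region are kept.
    """
    if len(reference) != len(template):
        raise ValueError("aligned rows must have the same length")
    aligned = [i for i, residue in enumerate(template) if residue != gap]
    if not aligned:
        return "", ""
    start, end = aligned[0], aligned[-1] + 1
    return reference[start:end], template[start:end]


# Toy example (not real data): the reference overhangs on both sides.
print(trim_unaligned_ends("MSTAVLEN", "--TAV---"))
```

A single stray residue, such as the isolated K near the start of the unadjusted 1TOH row, would still count as aligned here, so such cases need manual inspection.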
Seeds by PFAM - ACT:
REFERENCE | SLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKkDEYEFFTHLD-KRSL |
---|---|
1PHZ | SLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNkDEYEFFTYL------ |
Seeds by PFAM - Biopterin_H
REFERENCE | PWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCL-SEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQK |
---|---|
1PHZ | PWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCL-SDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNT----------------------- |
1TOH | PWFPRKVSELDKC-----------DLDHPGFSDQVYRQRRKLIAEIAFQYKHGEPIPHVEYTAEEIATWKEVYVTLKGLYATHACREHLEGFQLLERYCGYREDSIPQLEDVSRFLKERTGFQLRPVAGLLSARDFLASLAFRVFQCTQYIRHASSPMHSPEPDCCHELLGHVPMLADRTFAQFSQDIGLASLGASD-EEIEKLSTVYWFTVEFGLCKQNGELKAYGAGLLSSYGELLHSL-SEEPEVRAFDPDTAAVQPYQDQTYQPVYFVSESFNDAKDKLRNYASRIQRPFSVKFDPYTLAIDVLDSPHTIQRSLEGVQDELHTLAHALS- |
1LTZ | ------------------------------------------------------PQPLDRYSAEDHATWATLYQRQCKLLPGRACDEFLEGLERLEV----DADRVPDFNKLNEKLMAATGWKIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKAlGALPMLARLYWYTVEFGLINTPAGMRIYGAGILSSKSESIYCLdSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQ-------------------------------------------------------- |
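To compare how closely a seed row matches the reference, the pairwise identity over the aligned columns can be computed. The following is a sketch under our own assumptions: columns with a gap in either row are ignored, and lower-case letters (as they appear in the PFAM seeds) are compared case-insensitively. This definition need not agree with the identity reported by hhsearch.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent identity over columns where both rows carry a residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either row
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy example (not real data): three of four compared columns are identical.
print(percent_identity("ACDE-", "ACDFG"))
```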
unadjusted alignment of 1TOH
REFERENCE | MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK |
---|---|
PSI | CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
1TOH | -------------K-------------------------------------------------------------------------------------------------------VPWFPRKVSELDKCD---L--------DHPGFSDQVYRQRRKLIAEIAFQYKHGEPIPHVEYTAEEIATWKEVYVTLKGLYATHACREHLEGFQLLERYCGYREDSIPQLEDVSRFLKERTGFQLRPVAGLLSARDFLASLAFRVFQCTQYIRHASSPMHSPEPDCCHELLGHVPMLADRTFAQFSQDIGLASLGASDEEIEKLSTVYWFTVEFGLCKQNGELKAYGAGLLSSYGELLHSLSEEPEVRAFDPDTAAVQPYQDQTYQPVYFVSESFNDAKDKLRNYASRIQRPFSVKFDPYTLAIDVLDSPHTIQRSLEGVQDELHTLAHALSAIS |
PSI | -------------C-------------------------------------------------------------------------------------------------------CCCCCCCCCCCCCCC---C--------CCCCCCHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCEEEECCCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHHHHCEEEEEEEEEEEEECCCCEECCCCCCCCCCHHCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCEEEEECCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEECCCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
unadjusted alignment of 1PHZ
REFERENCE | MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK |
---|---|
PSI | CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
1PHZ | ------------------GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFFTYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCLSDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNT------------------------- |
PSI | ------------------CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEEEEECCCCCCCHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCCEEEECCCCCCHHHHHHHHHCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHHHHHHEEEEEEEEEEEECCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEECCCC------------------------- |
unadjusted alignment of 1LTZ
REFERENCE | MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCL-SEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK |
---|---|
PSI | CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCH HHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHH CCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
1LTZ | -----------------F--------------------------V-----V-------------------------------------PDITT-----RKNVG-----LSHDANDFTLP------QPLDR-------YSA-----------------------------------------EDHATWATLYQRQCKLLPGRACDEFLEGLERLE----VDADRVPDFNKLNEKLMAATGWKIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKALGALPMLARLYWYTVEFGLINTPAGMRIYGAGILSSKSESIYCLDSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQL------FDAD----FAPLY------LQLAD-AQPWG--AGDIAP------DDL--VL |
PSI | -----------------C--------------------------C-----C-------------------------------------CCCCC-----CCCCC-----CCCCCCCCCCC------CCCCC-------CCH-----------------------------------------HHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCC----CCCCCCCCCHHHHHHHHHHHCCEEEECCCCCCHHHHHHHHHCCCCCEEECCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHEEEEEEEEEECCCCEEEECCCCCCCCCCCCCCCCCCCCEEECCCHHHHHCCCCCCCCCCCCEEEECCHHHH------HHHC----CHHHH------HHHHH-CCCCC--CCCCCC------CCC--CC |
adjusted alignment of 1LTZ
REFERENCE | PIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAAT |
---|---|
PSI | CCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHH |
1LTZ | PQPLDRYSAEDHATWATLYQRQCKLLPGRACDEFLEGLERLEV----DADRVPDFNKLNEKLMAATGWKIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKAlGALPMLARLYWYTVEFGLINTPAGMRIYGAGILSSKSESIYCLdSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQLFDADFAPL |
PSI | CCCCCCCCHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHCCEEEECCCCCCHHHHHHHHHCCCCCEEECCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHEEEEEEEEEECCCCEEEECCCCCCCCCCCCCCCCCCCCEEECCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHCCHHHHHHH |
adjusted alignment of 1PHZ
REFERENCE | GQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQ |
---|---|
PSI | CCCCCCCCCCCCCCCCEEEEEECCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHCCCCEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCCCEEEECCCHHH |
1PHZ | GQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNkDEYEFFTYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQI------LDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPD-EYIEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCL-SDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNT |
PSI | CCCCCCCCCCCCCCCEEEEEEEECCCCCHHHHHHHHHHHCCCEEEEEECCCCCCCCCEEEEEEEECCCCCCCHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCHHHHHHHHHHH------CCCCCCCCCCHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHCCCCEEEECCCCCCHHHHHHHHHCCCCCEECCCCCCCCCCCCCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCH-HHHHHHHHHEEEEEEEEEEEECCCCEEECCCCCCCCCCCCCCC-CCCCCCCCCCHHHHHCCCCCCCCCCCCEEEECCHHHHHHHHHHHHHHCCCCCCCCCCCCCCEEEECCCC |
adjusted alignment of 1TOH
REFERENCE | VPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK |
---|---|
PSI | CCCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHCCCCEECCCEEEECCCCCCCCCCCCHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHEEEEEEEEEECCCCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCHHHHHCCCCCCCCCCEEEEEECCHHHHHHHHHHHHHHCCCCCEEEECCCCCEEEECCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
1TOH | VPWFPRKVSELDKC-----------DLDHPGFSDQVYRQRRKLIAEIAFQYKHGEPIPHVEYTAEEIATWKEVYVTLKGLYATHACREHLEGFQLLERYCGYREDSIPQLEDVSRFLKERTGFQLRPVAGLLSARDFLASLAFRVFQCTQYIRHASSPMHSPEPDCCHELLGHVPMLADRTFAQFSQDIGLASLGASD-EEIEKLSTVYWFTVEFGLCKQNGELKAYGAGLLSSYGELLHSL-SEEPEVRAFDPDTAAVQPYQDQTYQPVYFVSESFNDAKDKLRNYASRIQRPFSVKFDPYTLAIDVLDSPHTIQRSLEGVQDELHTLAHALSA |
PSI | CCCCCCCCCCCCCC CCCCCCCCHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHCCCEEEECCCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHHHHCEEEEEEEEEEEEECCCCEECCCCCCCCCCHHCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCEEEEECCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCEECCCCHHHHHHHHHHHHHHHHHHHHHHHHHC |
Homology modelling with Modeller
Modeller needs two types of files to run. The first contains the alignment in PIR format (see PIR FORMAT); the second is a Python script that tells Modeller which steps to perform. There are some examples in /apps/modeller9.9/examples/automodel. However, Python seems to be misconfigured on the virtual machines (at least on the Linux virtual machine).
- The fix we used for the Python installation is described in the software section.
- The Modeller modules were still not importable by Python, so it was necessary to reinstall Modeller. The steps for this are described in the software section.
After that we tried to write an alignment file with the hhsearch alignments and a Python file (following the example /apps/modeller9.9/examples/automodel/model-default.py). However, Modeller seems to be very sensitive to residues that are missing in the coordinate section of the PDB file but are present in the sequence used in the alignment.
To avoid this problem there are at least two possibilities:
- Repair the PDB file with the script repairPDB, then map the sequence in the alignment onto the sequence in the coordinate section of the PDB file (a lot of work: it needs some kind of alignment and a PDB parser, neither of which is trivial).
- Let Modeller create the alignment on the basis of the repaired PDB file (with repairPDB).
We chose the second option, which is described very well in the Modeller tutorial (see basic modeller tutorial).
Using 1PHZ as an example, we prepared three files:
- The alignment file, which contains only the reference sequence: pah.ali
>P1;PAH
sequence:reference::::::::
MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEEN
DVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKD
TVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPI
PRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQ
FLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGH
VPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLL
SSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATI
PRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK*
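- The Python script, which tells Modeller to create an alignment
from modeller import *
env = environ()
aln = alignment(env)
# read chain A of the repaired template structure
mdl = model(env, file='../structures/1phz.pdb', model_segment=('FIRST:A','LAST:A'))
# add the template structure to the alignment
aln.append_model(mdl, align_codes='1phz', atom_files='../structures/1phz.pdb')
# add the target sequence from the PIR file
aln.append(file='pah.ali', align_codes='PAH')
# align the target sequence to the template structure
aln.align2d()
# write the resulting alignment in PIR format
aln.write(file='phz.ali', alignment_format='PIR')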
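- Then we tried to optimize the alignment (for the different criteria see below).
- The Python script, which tells Modeller to create a model
from modeller import *
from modeller.automodel import *
# the environment was missing in the original script
env = environ()
# template 1phz, target PAH, alignment from the previous step
a = automodel(env, alnfile='phz.ali',
              knowns='1phz', sequence='PAH',
              assess_methods=(assess.DOPE, assess.GA341))
# build exactly one model
a.starting_model = 1
a.ending_model = 1
a.make()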
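If more than one model is built (ending_model greater than 1), the models can be ranked by their DOPE score, where lower is better. The sketch below assumes, following our reading of the Modeller manual, that the list `a.outputs` contains one dictionary per model with the keys `name`, `failure` and `DOPE score`. To keep it runnable without Modeller it works on plain dictionaries; the example scores are made up.

```python
def best_by_dope(outputs):
    """Return the successfully built model with the lowest DOPE score."""
    built = [m for m in outputs if m.get("failure") is None]
    if not built:
        raise ValueError("no model was built successfully")
    return min(built, key=lambda m: m["DOPE score"])


# Hypothetical data in the assumed shape of automodel's `outputs` list;
# the scores are invented for illustration only.
example_outputs = [
    {"name": "PAH.B99990001.pdb", "failure": None, "DOPE score": -41000.0},
    {"name": "PAH.B99990002.pdb", "failure": None, "DOPE score": -42500.0},
    {"name": "PAH.B99990003.pdb", "failure": "error", "DOPE score": None},
]

print(best_by_dope(example_outputs)["name"])
```

With a real run, `best_by_dope(a.outputs)` would be called after `a.make()`.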
As already mentioned above, we took a closer look at the alignment:
- We removed gaps at the beginning and at the end of the alignment.
- We tried to shift the alignment such that the DSSP secondary structures of the reference (from an alignment to 2PAH in task 3) and of the template (new DSSP run) overlap better.
- We tried to match the corresponding annotated functional sites (PDB).
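The secondary structure criterion can be quantified as the fraction of aligned columns in which both rows show the same state. This is a sketch with an assumed function name; it expects two aligned strings over the same alphabet (for example H, E, C), so DSSP output would first have to be reduced to three states.

```python
def ss_agreement(ss_reference, ss_template, gaps="- "):
    """Fraction of aligned columns with identical secondary structure state.

    Columns with a gap (or blank) in either row are ignored.
    """
    if len(ss_reference) != len(ss_template):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    same = 0
    for ref_state, tmpl_state in zip(ss_reference, ss_template):
        if ref_state in gaps or tmpl_state in gaps:
            continue
        compared += 1
        if ref_state == tmpl_state:
            same += 1
    return same / compared if compared else 0.0


# Toy example (not real data): four of five compared columns agree.
print(ss_agreement("HHHEEC", "HHCE-C"))
```

Shifting the alignment and recomputing this value gives a simple way to check whether a manual adjustment actually improved the overlap.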
Homology modelling with Swissmodel
Standard workflow
The standard workflow of Swissmodel is the automated mode. This mode requires only the UniProt accession number or the amino acid sequence of the target protein. Optionally, a template structure can be specified as well; if this field is left blank, Swissmodel searches automatically for a suitable template.
The input for all three models is as follows:
Homology modelling with iTasser
iTasser is a prediction server that has participated in several CASP competitions and claims to be the best-performing one. The interface of the server is shown in Figure 1. A job takes approximately 24 to 48 hours. The server seems to receive a lot of jobs, which is why only one job at a time may be submitted from the same IP address. It is therefore probably not possible to run all six variants described in the task.
Evaluation of the calculated models
Selection of the reference structures
We had the following choice of reference structures for PAH:
Entry | Method | Resolution (Å) | Chain | Positions |
---|---|---|---|---|
1DMW | X-Ray | 2.00 | A | 118-424 |
1J8T | X-Ray | 1.70 | A | 103-427 |
1J8U | X-Ray | 1.50 | A | 103-427 |
1KW0 | X-Ray | 2.50 | A | 103-427 |
1LRM | X-Ray | 2.10 | A | 103-427 |
1MMK | X-Ray | 2.00 | A | 103-427 |
1MMT | X-Ray | 2.00 | A | 103-427 |
1PAH | X-Ray | 2.00 | A | 117-424 |
1TDW | X-Ray | 2.10 | A | 117-424 |
1TG2 | X-Ray | 2.20 | A | 117-424 |
2PAH | X-Ray | 3.10 | A/B | 118-452 |
3PAH | X-Ray | 2.00 | A | 117-424 |
4PAH | X-Ray | 2.00 | A | 117-424 |
5PAH | X-Ray | 2.10 | A | 117-424 |
6PAH | X-Ray | 2.15 | A | 117-424 |
None of these structures covers the whole PAH protein. In addition, no true apo structure is available: every structure has at least an Fe2+ ion bound. We therefore treated structures with only Fe2+ bound as apo structures.
Finally, we decided to select 1J8T (apo) and 1J8U (complexed). As mentioned before, our apo structure has Fe2+ bound, and our complexed structure has Fe2+ and BH4 (5,6,7,8-tetrahydrobiopterin) bound. One reason for this decision was that both structures were solved by the same group, which makes a consistent methodology more likely than with structures from two different groups. Another reason is the resolution: these are the two best-resolved structures, at 1.5 Å (1J8U) and 1.7 Å (1J8T). Finally, for easier comparison, both structures cover the same residue range, 103 to 427.
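The resolution and range arguments can be checked directly against the table. The snippet below is a small sketch that copies the table values into a list (the variable names are our own) and picks the two entries with the best resolution.

```python
# (entry, resolution in Å, first residue, last residue), copied from the table above
structures = [
    ("1DMW", 2.00, 118, 424), ("1J8T", 1.70, 103, 427), ("1J8U", 1.50, 103, 427),
    ("1KW0", 2.50, 103, 427), ("1LRM", 2.10, 103, 427), ("1MMK", 2.00, 103, 427),
    ("1MMT", 2.00, 103, 427), ("1PAH", 2.00, 117, 424), ("1TDW", 2.10, 117, 424),
    ("1TG2", 2.20, 117, 424), ("2PAH", 3.10, 118, 452), ("3PAH", 2.00, 117, 424),
    ("4PAH", 2.00, 117, 424), ("5PAH", 2.10, 117, 424), ("6PAH", 2.15, 117, 424),
]

# A smaller resolution value means a better resolved structure
best_two = sorted(structures, key=lambda s: s[1])[:2]
same_range = best_two[0][2:] == best_two[1][2:]

for entry, resolution, first, last in best_two:
    print(entry, resolution, f"{first}-{last}")
print("same residue range:", same_range)
```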
Numeric evaluation of the calculated models
Modeller
We have chosen three scores to be calculated by Modeller: molpdf, DOPE and GA341.
model | molpdf | DOPE | GA341 |
---|---|---|---|
1phz | 2568.91309 | -53912.11328 | 1.00000 |
1toh | 20481.25977 | -37609.82031 | 1.00000 |
1ltz | 6824.36182 | -37422.13672 | 0.98868 |
1phz with adjusted alignment | 2567.52441 | -49139.31250 | 1.00000 |
1toh with adjusted alignment | 1790.78723 | -38113.64453 | 1.00000 |
1ltz with adjusted alignment | 1197.55518 | -26429.44531 | 1.00000 |
multiple alignment | 61561.77344 | -29682.17578 | 0.00012 |
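To compare the rows, the models can be ranked by their DOPE score, where a lower (more negative) value is better. The following sketch copies the values from the table; the shortened model labels are our own. Note that the raw DOPE score is not normalized for model size, so a ranking across models of different length (for example after trimming the alignment ends) should only be read as a rough indication.

```python
# model label -> (molpdf, DOPE, GA341), copied from the table above
scores = {
    "1phz": (2568.91309, -53912.11328, 1.00000),
    "1toh": (20481.25977, -37609.82031, 1.00000),
    "1ltz": (6824.36182, -37422.13672, 0.98868),
    "1phz, adjusted alignment": (2567.52441, -49139.31250, 1.00000),
    "1toh, adjusted alignment": (1790.78723, -38113.64453, 1.00000),
    "1ltz, adjusted alignment": (1197.55518, -26429.44531, 1.00000),
    "multiple alignment": (61561.77344, -29682.17578, 0.00012),
}

# Sort ascending by DOPE: the most negative score comes first
ranked = sorted(scores, key=lambda name: scores[name][1])

for name in ranked:
    molpdf, dope, ga341 = scores[name]
    print(f"{name:26s} DOPE={dope:12.2f} GA341={ga341:.5f}")
```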
Swissmodel
Automated Mode: Modelling with template structure 1phz_A (>60%)
QMEAN Z-Score: -0.828
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.715 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -157.98 | -0.16 |
All-atom pairwise energy | -12503.57 | -0.1 |
Solvation energy | -50.66 | 1.01 |
Torsion angle energy | -78.77 | -1.48 |
QMEAN4 score | 0.715 | -0.83 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
Automated Mode: Modelling with template structure 1toh_A (>40%)
QMEAN Z-Score: -2.745
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.604 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -78.67 | -1.16 |
All-atom pairwise energy | -7899.47 | -0.89 |
Solvation energy | -20.56 | -1.3 |
Torsion angle energy | -47.34 | -2.22 |
QMEAN4 score | 0.604 | -2.74 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
Automated Mode: Modelling with template structure 1ltz_A (<40%)
QMEAN Z-Score: -4.282
QMEAN4 global scores:
QMEANscore4 | Estimated absolute model quality | Score components |
---|---|---|
0.47 |
Local scores:
Coloring by residue error | Residue error plot |
---|---|
Global scores: QMEAN4:
Scoring function term | Raw score | Z-score |
---|---|---|
C_beta interaction energy | -40.25 | -2.1 |
All-atom pairwise energy | -3528.81 | -2.29 |
Solvation energy | -15.22 | -1.18 |
Torsion angle energy | -4.99 | -3.78 |
QMEAN4 score | 0.47 | -4.28 |
Local Model Quality Estimation: Anolea / QMEAN / Gromos:
iTasser
Comparison to experimental structure
To calculate the C-alpha RMSD we used DaliLite.
To calculate the TM-Score we used the TM-score webservice from the University of Michigan.
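For orientation, the TM-score sums 1 / (1 + (d_i / d0)^2) over the aligned residue pairs and divides by the length L of the reference structure, with d0 = 1.24 * (L - 15)^(1/3) - 1.8. The sketch below evaluates this for one fixed superposition only; the real program additionally searches for the superposition that maximizes the score. The function names and the lower bound of 0.5 for d0 on very short chains are assumptions of this sketch, not part of the webservice.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 of the TM-score."""
    if target_length <= 15:
        return 0.5  # assumption: small floor, the formula is undefined here
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length: int) -> float:
    """TM-score for one fixed superposition.

    distances: C-alpha distances (in Å) of the aligned residue pairs.
    target_length: number of residues in the reference structure.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical example: a 325-residue reference of which only 200 residues
# are aligned, each pair 1.0 Å apart after superposition
print(round(tm_score([1.0] * 200, 325), 4))
```

Because the sum is divided by the full reference length, residues of the reference that are not covered by the model lower the score even if the covered part fits well.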
To calculate the RMSD within a 6 Å radius of the catalytic center, we first had to identify the catalytic center. We defined the center of the catalytic site as the position of the Fe2+ ion. We then extracted the residues within a 6 Å radius around this ion with the following steps:
- We opened the complexed or apo structure and one of the modeled structures in PyMOL.
- Then we aligned the two structures to each other.
- Then we selected the Fe2+ ion of the apo/complexed structure and expanded this selection by 6 Å, by residue.
- Then we extracted the selected residues into two objects, so that each object contains only the residues of either the apo/complexed structure or the modeled structure.
- Then we saved both objects as separate PDB files.
- Finally we used the rms.pl script to calculate the all-atom RMSD with the command "./rms.pl -out all first.pdb second.pdb".
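The steps above can also be sketched without PyMOL in plain Python. This is not the procedure we used and it does not reproduce rms.pl: it assumes that both structures are already superposed, that they share chain IDs and residue numbering, and that the iron is stored under the residue name FE2. All function names are our own.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    res_name: str
    chain: str
    res_seq: int
    xyz: tuple


def parse_atoms(pdb_text: str) -> list:
    """Read ATOM/HETATM records using the fixed PDB column layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append(Atom(line[12:16].strip(), line[17:20].strip(),
                              line[21], int(line[22:26]), xyz))
    return atoms


def find_center(atoms, res_name="FE2"):
    """Coordinates of the first atom with the given residue name (assumed: FE2)."""
    for atom in atoms:
        if atom.res_name == res_name:
            return atom.xyz
    raise ValueError(f"no atom with residue name {res_name}")


def residues_near(atoms, center, radius=6.0):
    """Residues (chain, number) with at least one atom within the radius."""
    return {(a.chain, a.res_seq) for a in atoms
            if math.dist(a.xyz, center) <= radius}


def all_atom_rmsd(atoms_a, atoms_b, residues):
    """RMSD over atoms of the selected residues present in both structures.
    No fitting is done; the structures must already be superposed."""
    coords_b = {(b.chain, b.res_seq, b.name): b.xyz for b in atoms_b}
    squared = [math.dist(a.xyz, coords_b[(a.chain, a.res_seq, a.name)]) ** 2
               for a in atoms_a
               if (a.chain, a.res_seq) in residues
               and (a.chain, a.res_seq, a.name) in coords_b]
    if not squared:
        raise ValueError("no common atoms in the selection")
    return math.sqrt(sum(squared) / len(squared))
```

With two PDB files read into strings, the call sequence would be `parse_atoms` on both, `find_center` and `residues_near` on the experimental structure, and then `all_atom_rmsd` on the two atom lists.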
Modeller
Standard Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 0.9 | 0.9711 | 4.2082 |
1PHZ Chain: A | 1J8U | Complexed | 0.9 | 0.9718 | - |
1TOH Chain: A | 1J8T | Apo | 1.9 | 0.8844 | 5.9302 |
1TOH Chain: A | 1J8U | Complexed | 1.9 | 0.8850 | 4.3264 |
1LTZ Chain: A | 1J8T | Apo | 2.3 | 0.6986 | 1.0011 |
1LTZ Chain: A | 1J8U | Complexed | 2.3 | 0.6985 | - |
Multiple | 1J8T | Apo | 2.1 | 0.1886 | - |
Multiple | 1J8U | Complexed | 2.1 | 0.1879 | - |
Adjusted Alignment Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 1.0 | 0.1957 | - |
1PHZ Chain: A | 1J8U | Complexed | 1.0 | 0.1954 | - |
1TOH Chain: A | 1J8T | Apo | 1.1 | 0.1592 | - |
1TOH Chain: A | 1J8U | Complexed | 1.1 | 0.1592 | - |
1LTZ Chain: A | 1J8T | Apo | 1.7 | 0.1021 | - |
1LTZ Chain: A | 1J8U | Complexed | 1.7 | 0.1020 | - |
Swissmodel
Standard Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | 0.9 | 0.7400 | 0.5162 |
1PHZ Chain: A | 1J8U | Complexed | 0.9 | 0.7408 | 0.5154 |
1TOH Chain: A | 1J8T | Apo | 1.3 | 0.8889 | 0.4616 |
1TOH Chain: A | 1J8U | Complexed | 1.2 | 0.8894 | 0.3361 |
1LTZ Chain: A | 1J8T | Apo | 2.3 | 0.8816 | 0.9225 |
1LTZ Chain: A | 1J8U | Complexed | 2.3 | 0.8814 | 0.9208 |
Adjusted Alignment Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | | | |
1PHZ Chain: A | 1J8U | Complexed | | | |
1TOH Chain: A | 1J8T | Apo | | | |
1TOH Chain: A | 1J8U | Complexed | | | |
1LTZ Chain: A | 1J8T | Apo | | | |
1LTZ Chain: A | 1J8U | Complexed | | | |
iTasser
Standard Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | | | |
1PHZ Chain: A | 1J8U | Complexed | | | |
1TOH Chain: A | 1J8T | Apo | | | |
1TOH Chain: A | 1J8U | Complexed | | | |
1LTZ Chain: A | 1J8T | Apo | | | |
1LTZ Chain: A | 1J8U | Complexed | | | |
Adjusted Alignment Workflow
Template Structure | Compared To | Apo/Complexed | C-alpha RMSD | TM score | All Atoms RMSD, 6A |
---|---|---|---|---|---|
1PHZ Chain: A | 1J8T | Apo | | | |
1PHZ Chain: A | 1J8U | Complexed | | | |
1TOH Chain: A | 1J8T | Apo | | | |
1TOH Chain: A | 1J8U | Complexed | | | |
1LTZ Chain: A | 1J8T | Apo | | | |
1LTZ Chain: A | 1J8U | Complexed | | | |