by Benjamin Drexler and Fabian Grandke

Introduction

In this task, we performed homology modelling of the protein α-galactosidase A with the programs MODELLER, SWISS-MODEL, iTasser and 3D-JIGSAW. Homology modelling relies on two assumptions. First, the structure of a protein is determined by its amino acid sequence. Second, the structure of a protein is more conserved than its amino acid sequence. Usually, homology modelling is applied to a protein whose structure is not known. In this case, several PDB structures of α-galactosidase A are available, so we can evaluate the models produced by the programs afterwards.

General

Template Selection

The following table lists the ten best hits of the HHpred search of Task 1. We used 3HG3 (97% identity), 1KTB (53%) and 3CC1 (26%) as templates for the modelling process. This selection covers a wide range of sequence identities and hence allows us to evaluate how sequence identity influences the quality of the models.

PDB-ID	Name	Probability	E-value	P-value	Identity	Template
> 60% sequence identity
3hg3_A	Alpha-galactosidase A	1.0	0	0	97%	x
> 40% sequence identity
1ktb_A	Alpha-N-acetylgalactosaminidase	1.0	0	0	53%	x
< 40% sequence identity
1uas_A	Alpha-galactosidase	1.0	0	0	39%
3lrk_A	Alpha-galactosidase 1	1.0	0	0	32%
3a5v_A	Alpha-galactosidase	1.0	0	0	35%
1szn_A	Alpha-galactosidase	1.0	0	0	34%
3a21_A	Putative secreted alpha-galactosidase	1.0	0	0	34%
3cc1_A	BH1870 protein	1.0	0	0	26%	x
3a24_A	Alpha-galactosidase	1.0	0	0	14%
1zy9_A	Alpha-galactosidase	1.0	2.2E-37	8.8E-42	14%
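The grouping of hits by sequence identity can be sketched in a few lines of Python. The identities are copied from the table; the helper names `identity_bin` and `group_hits` are hypothetical, and treating a hit with exactly 40% identity as part of the lowest class is an assumption of this sketch.

```python
# Sequence identities (in percent) of the ten HHpred hits from the table.
hits = {
    "3hg3_A": 97, "1ktb_A": 53, "1uas_A": 39, "3lrk_A": 32, "3a5v_A": 35,
    "1szn_A": 34, "3a21_A": 34, "3cc1_A": 26, "3a24_A": 14, "1zy9_A": 14,
}


def identity_bin(identity):
    """Return the identity class of a hit (hypothetical helper)."""
    if identity > 60:
        return "> 60%"
    if identity > 40:
        return "> 40%"
    # Assumption: exactly 40% also ends up in the lowest class.
    return "< 40%"


def group_hits(hits):
    """Group PDB identifiers by identity class, highest identity first."""
    groups = {"> 60%": [], "> 40%": [], "< 40%": []}
    for pdb_id, identity in sorted(hits.items(), key=lambda kv: -kv[1]):
        groups[identity_bin(identity)].append(pdb_id)
    return groups


groups = group_hits(hits)
for label, members in groups.items():
    print(label, members)
```

Picking one template per class, as done here with 3HG3, 1KTB and 3CC1, then amounts to choosing one entry from each list.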

Calculation of Models

MODELLER

MODELLER is a program that produces three-dimensional protein structures by homology (comparative) modelling. The user has to provide the sequence of the protein to be modelled and the structure and sequence of at least one related protein that serves as a template. MODELLER uses all atoms of the template protein except the hydrogen atoms. We used MODELLER as described in the tutorial Using Modeller for TASK 4. For this, we had to align both sequences and convert the alignment into PIR format. This alignment is given as input together with the template PDB file. Unfortunately, the input has to be provided as a Python file.

In addition to the pairwise approach, we used a multiple sequence alignment as the basis for the model. For this, we created an alignment of the sequences listed in the Multiple_Sequence_Alignments section of this page, added the target sequence and checked the result manually. The check showed that the sequences aligned very well in general, with the exception of 3LRK_A and 3CC1_A. These two sequences were therefore removed and the remaining sequences were realigned. Both the manually checked and the unchecked alignment were used as input for MODELLER.<ref name=modeller>http://salilab.org/modeller/</ref>
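The conversion of a pairwise alignment into PIR format can be illustrated with a small, self-contained Python sketch. It is not the script used for this task. The function `pir_record`, the toy sequences and the use of `FIRST`/`LAST` to select a whole chain are assumptions based on our reading of the MODELLER alignment format, in which the second line of each record has ten colon-separated fields.

```python
def pir_record(code, aligned_seq, is_template, pdb_code="", chain=""):
    """Format one alignment row as a PIR record (hypothetical helper)."""
    if is_template:
        # structureX marks an X-ray structure; FIRST/LAST are assumed to
        # select the complete chain of the template PDB file.
        fields = ["structureX", pdb_code, "FIRST", chain, "LAST", chain,
                  "", "", "", ""]
    else:
        # The target has no structure, so only its code is filled in.
        fields = ["sequence", code, "", "", "", "", "", "", "", ""]
    header = ":".join(fields)
    # The sequence is terminated by '*' and wrapped at 75 characters.
    body = aligned_seq + "*"
    lines = [body[i:i + 75] for i in range(0, len(body), 75)]
    return ">P1;{}\n{}\n{}\n".format(code, header, "\n".join(lines))


# Toy alignment rows, NOT the real GLA or 3HG3 sequences.
template = pir_record("3hg3", "AC-DE", True, pdb_code="3hg3", chain="A")
target = pir_record("GLA", "ACWDE", False)
print(template + target)
```

The two records written to one file form the alignment that is passed to MODELLER together with the template PDB file.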

Pairwise Alignments

In this section, we used a pairwise alignment between the template (i.e. 3HG3, 1KTB and 3CC1) and the target as the input for MODELLER.

Evaluation

Figure 1: Representation of the resulting models of MODELLER and the reference PDB structure 1R47. The models are in superposition to the reference structure (green) and are shown in cartoon representation. (A) The model is based on the PDB structure 3HG3 (red). (B) The model is based on the PDB structure 1KTB (blue). (C) The model is based on the PDB structure 3CC1 (magenta).

	Apo (1R46)				Complexed (1R47)
Template	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site
3HG3	0.141	0.2743	0.498	0.326	0.1413	0.2746	0.512	0.366
1KTB	0.140	0.2793	0.901	0.439	0.1413	0.2739	0.888	0.437
3CC1	0.1397	0.2670	2.864	3.436	0.1397	0.2665	2.853	3.405
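How the RMSD values in the table were obtained is not stated here. As a minimal sketch, assuming that model and reference are already superposed and that their atoms are paired one-to-one, the RMSD over all atoms or over a subset such as the catalytic site could be computed as follows. The function name `rmsd`, the `indices` parameter and the toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b, indices=None):
    """RMSD between two paired lists of (x, y, z) tuples.

    The structures are assumed to be superposed already; no fitting is done.
    `indices` optionally restricts the calculation to a subset of atoms.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    selected = list(range(len(coords_a)) if indices is None else indices)
    if not selected:
        raise ValueError("no atoms selected")
    total = 0.0
    for i in selected:
        total += sum((a - b) ** 2 for a, b in zip(coords_a[i], coords_b[i]))
    return math.sqrt(total / len(selected))


# Toy coordinates: only the third atom deviates (by 3 units along y).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]

print(rmsd(model, reference))          # all atoms
print(rmsd(model, reference, [0, 1]))  # subset, e.g. a catalytic site
```

Restricting the calculation to a subset explains why the global RMSD and the catalytic-site RMSD of the same model can differ considerably.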

Multiple Sequence Alignments

PDB-ID	Identity	Comment
3LX9_A	99%
3GXP_A	99%
3H53_A	99%
3HG3_A	97%
3IGU_A	54%
1KTB_A	53%
1UAS_A	39%
3LRK_A	34%	Removed because of low sequence identity; it caused large gaps in the alignment.
3CC1_A	28%	Removed because of low sequence identity; it caused large gaps in the alignment.
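The two criteria used for removing sequences, low identity and many gaps, can be quantified with a short Python sketch. The functions `percent_identity` and `gap_fraction` and the toy sequences are hypothetical. Identity is computed here over columns in which both sequences have a residue, which is only one of several common definitions and not necessarily the one behind the values in the table.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def gap_fraction(aligned_seq):
    """Fraction of alignment columns that are gaps in one sequence."""
    if not aligned_seq:
        return 0.0
    return aligned_seq.count("-") / len(aligned_seq)


# Toy rows of an alignment, NOT real GLA sequences.
target = "ACDE-FG"
hit = "ACDQWFG"
print(round(percent_identity(target, hit), 1))
print(round(gap_fraction(target), 3))
```

A sequence with a low identity to the target and a high gap fraction after alignment would be a candidate for removal before realigning.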

iTasser

As shown in Figure 2, iTasser takes an amino acid sequence as input and tries to retrieve template proteins from the PDB. In the next step, fragments from the templates are reassembled into a complete model. In the last step, the model is reassembled again, taking energy calculations into account. Additionally, the biological function is predicted, but that was not of interest for this task.<ref name=itasser1>http://zhanglab.ccmb.med.umich.edu/I-TASSER/about.html</ref>

Figure 2: A schematic representation of the I-TASSER protocol for protein structure and function predictions. The protein chains are colored from blue at the N-terminus to red at the C-terminus.<ref name=itasser2>Roy et al., I-TASSER: a unified platform for automated protein structure and function prediction, Nature Protocols, 2010</ref>

We used the iTasser-server in two different ways:

Standard parameters: the protein sequence is given as input and the program searches the PDB for templates. The proteins found are used to create a template for predicting the structure.
PDB-ID as input: a template PDB-ID is given as input together with the amino acid sequence. The program takes all available information into account and uses it to calculate the structure.

As the iTasser server has very limited capacity and only one job can be submitted at a time, the results of the second approach are not yet available. The standalone version is not an option, because it has a size of about 10 GB and does not work properly.

SWISS-MODEL

We used the SWISS-MODEL server with two different options:

Automated Mode: A template sequence is given as input. As no further information is given, the model is created directly from the amino acid sequence. This mode should only be used if the sequence identity between target and template is greater than 50%.
Aligned Mode: A pairwise alignment of template and target sequence is given as input. We created our alignments with the online version of ClustalW2 from EBI.

The following templates were selected:

Template	Identity	Z-score (Automated Mode)	Z-score (Aligned Mode)
3hg3_A	97%	0	-0.415
1ktb_A	53%	-2.742	-12.996
3cc1_A	26%	Error¹	-14.046

¹The sequences are too different to create a useful model (26% identity). In automated mode, a sequence identity of at least 50% is recommended.

Evaluation of Models

MODELLER

Numeric Evaluation

	Apo (1R46)			Complexed (1R47)
Template	TMS	RMSD	RMSD catalytic site	TMS	RMSD	RMSD catalytic site
3HG3	0.141	0.498	0.326	0.1413	0.512	0.366
1KTB	0.140	0.901	0.439	0.1413	0.888	0.437
3CC1	0.1397	2.864	3.436	0.1397	2.853	3.405
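For orientation, the TM-score (TMS) for a given superposition can be written down in a few lines. The sketch below assumes the standard definition by Zhang and Skolnick, in which each aligned residue pair at distance d contributes 1 / (1 + (d / d0)^2) and the sum is divided by a normalisation length. It does not search for the optimal superposition as the actual TM-score program does. The function name `tm_score`, the lower bound of 0.5 Å for d0 and the toy distances are assumptions of this sketch.

```python
def tm_score(distances, norm_length):
    """TM-score for a fixed superposition (no optimisation is performed).

    distances   -- distance in Angstrom for every aligned residue pair
    norm_length -- number of residues used for normalisation
    """
    if norm_length <= 15:
        raise ValueError("this sketch needs norm_length > 15")
    # Length-dependent distance scale d0.
    d0 = 1.24 * (norm_length - 15) ** (1.0 / 3.0) - 1.8
    # Assumption: keep d0 from becoming tiny or negative for short chains.
    d0 = max(d0, 0.5)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / norm_length


# Toy example: 100 perfectly matching residue pairs.
perfect = [0.0] * 100
print(tm_score(perfect, 100))  # normalised by the aligned length
print(tm_score(perfect, 400))  # normalised by a four times longer chain
```

The second call shows that the score depends on the length used for normalisation, so the same superposition yields a lower value when a longer chain is used as reference.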

Comparison to Experimental Structure

Representation of the resulting models of MODELLER and the reference PDB structure 1R47. The models are in superposition to the reference structure (green) and are shown in cartoon representation. (A) The model is based on the PDB structure 3HG3 (red). (B) The model is based on the PDB structure 1KTB (blue). (C) The model is based on the PDB structure 3CC1 (magenta).

iTasser

Numeric Evaluation

Comparison to Experimental Structure

SWISS-MODEL

Template 3HG3

Numeric Evaluation

	Apo (1R46)				Complexed (1R47)
Mode	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site
Aligned	0.1411	0.2729	0.485	0.279	0.1412	0.2731	0.489	0.290
Automated	0.1411	0.2729	0.485	0.277	0.1412	0.2731	0.489	0.291

Comparison to Experimental Structure

Representation of the resulting models of SWISS-MODEL which used the PDB structure 3HG3 as a template and the reference PDB structure 1R47. The models are in superposition to the reference structure (green) and are shown in cartoon representation. (A) The model (cyan) was built by using the aligned mode of SWISS-MODEL. (B) The model (yellow) was built by using the automated mode of SWISS-MODEL.

Template 1KTB

Numeric Evaluation

	Apo (1R46)				Complexed (1R47)
Mode	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site
Aligned	0.1598	0.2638	0.943	5.073	0.1606	0.2636	0.932	6.409
Automated	0.1361	0.2669	0.981	0.417	0.1368	0.2672	0.974	0.404

Comparison to Experimental Structure

Representation of the resulting models of SWISS-MODEL which used the PDB structure 1KTB as a template and the reference PDB structure 1R47. The models are in superposition to the reference structure (green) and are shown in cartoon representation. (A) The model (red) was built by using the aligned mode of SWISS-MODEL. (B) The model (blue) was built by using the automated mode of SWISS-MODEL.

Template 3CC1

Numeric Evaluation

	Apo (1R46)				Complexed (1R47)
Mode	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site	TMS (command line)	TMS (webserver)	RMSD	RMSD catalytic site
Aligned	0.1302	0.2436	3.279	7.107	0.1300	0.2442	3.802	7.357
Automated	N/A	N/A	N/A	N/A	N/A	N/A	N/A	N/A

Comparison to Experimental Structure

Representation of the resulting model of SWISS-MODEL which used the PDB structure 3CC1 as a template and the reference PDB structure 1R47. The model is in superposition to the reference structure (green) and is shown in cartoon representation. It was not possible to build a model with the automated mode of SWISS-MODEL and hence there is only a model of the aligned mode (magenta).

References
