Task 2: Alignments

Multiple sequence alignments

**Table 5:** List of all sequences that were used in the multiple sequence alignments.
< 30% sequence identity		> 60% sequence identity		whole range of sequence identity
AC Number	Sequence Identity	AC Number	Sequence Identity	AC Number	Sequence Identity	AC Number	Sequence Identity
F6JYA9	0.29	B4DDZ1	0.83	B4DDZ1	0.83	F6WCX4	0.34
D9J389	0.28	Q5EEZ1	0.76	Q5EEZ1	0.76	F6JYA9	0.29
D5MSB3	0.27	G3THV5	0.75	G1MBW1	0.73	D9J389	0.28
B3FRK2	0.20	G1MBW1	0.73	H0VAR7	0.72	D5MSB3	0.27
3ov6_A	0.22	H0VAR7	0.72	F1PX48	0.71	2zok_E	0.24
1p7k_L	0.21	F1PX48	0.71	G1T7D7	0.70	H0Y1D0	0.20
H0Y1D0	0.20	G1T7D7	0.70	G5BQE5	0.67	Q8SNJ4	0.22
B3FRK3	0.18	O35799	0.68	Q95IT9	0.38	B3FRK3	0.18
Q8HWL2	0.17	G5BQE5	0.67	2qrt_A	0.38	Q8HWL2	0.17
				Q8HX83	0.36

To assess how multiple sequence alignments differ between close and distant homologs, three groups of sequences were selected: one with sequences above 60% sequence identity to human HFE, one with sequences below 30% identity, and one with sequences covering the whole range of sequence identity. The selected sequences are listed in Table 5.
To ensure that sequences with known structures are also represented in the alignments, the sequences from the following PDB structures were included:

1p7k_L
3ov6_A
2qrt_A
2zok_E
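The grouping by sequence identity described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the function name `group_by_identity` and the label `intermediate` are assumptions, and the dictionary holds just a handful of the values from Table 5.

```python
# A subset of accession numbers and identities copied from Table 5
# (illustration only, not the full data set).
identities = {
    "B4DDZ1": 0.83,
    "Q5EEZ1": 0.76,
    "G5BQE5": 0.67,
    "Q95IT9": 0.38,
    "F6JYA9": 0.29,
    "Q8HWL2": 0.17,
}


def group_by_identity(identities, low=0.30, high=0.60):
    """Split accession numbers by sequence identity to the query.

    Hypothetical helper: entries below `low` go to "below_30", entries
    above `high` go to "above_60", everything else to "intermediate".
    """
    groups = {"below_30": [], "above_60": [], "intermediate": []}
    for accession, identity in identities.items():
        if identity < low:
            groups["below_30"].append(accession)
        elif identity > high:
            groups["above_60"].append(accession)
        else:
            groups["intermediate"].append(accession)
    return groups


groups = group_by_identity(identities)
print(groups)
```

A whole-range group would then be drawn from all three bins, so that it spans low, intermediate and high identities.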

ClustalW

ClustalW alignment of sequences from the below 30% identity group.

ClustalW alignment of sequences from the above 60% identity group.

ClustalW alignment of sequences from the whole range identity group.


MAFFT

MAFFT alignment of sequences from the below 30% identity group.

MAFFT alignment of sequences from the above 60% identity group.

MAFFT alignment of sequences from the whole range identity group.


T-Coffee

T-Coffee alignment of sequences from the below 30% identity group.

T-Coffee alignment of sequences from the above 60% identity group.

T-Coffee alignment of sequences from the whole range identity group.


Comparison

In the above 60% sequence identity group, both residues and gaps are well conserved, especially in the first third of the sequence. In the remaining two thirds, sequence conservation drops slightly but stays at a generally high level. In the below 30% identity group, the number of gaps inside the sequences is not much higher than in the above 60% group, but the overall residue conservation is considerably lower. Nevertheless, this group contains some very well conserved residues that might be functionally important. In the mixed sequence identity group, the gaps are less well conserved than in the other two groups, and residue conservation is even lower than in the below 30% identity group. This effect is probably amplified by the larger number of sequences in this group.
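Residue conservation of the kind compared here can be quantified per alignment column. The sketch below is one simple way to do so, under the assumption that a column's conservation is the fraction of sequences sharing its most common residue, with gaps counted as mismatches. The function names and the toy alignment are made up for illustration and are not the HFE alignments.

```python
from collections import Counter


def column_conservation(alignment):
    """Return one score per column: the fraction of sequences that carry
    the most common residue. Gaps ('-') are counted as mismatches."""
    n_sequences = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            # Column consists of gaps only.
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / n_sequences)
    return scores


def count_conserved_columns(alignment, threshold=1.0):
    """Count columns whose conservation score reaches `threshold`."""
    return sum(score >= threshold for score in column_conservation(alignment))


# Toy alignment (hypothetical); all rows must have the same length.
toy_alignment = [
    "MGP-RAR",
    "MGPQRAK",
    "MAP-RAR",
]

print(column_conservation(toy_alignment))
print(count_conserved_columns(toy_alignment))
```

Because the score divides by the number of sequences, a single divergent sequence lowers it less in a large group than in a small one, whereas the count of fully conserved columns can only drop as sequences are added.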

The three alignment programs yield comparable results, i.e. none of them is considerably better or worse than the other two. Nevertheless, some differences can be observed. The most notable one is that MAFFT yields far fewer conserved columns for the low sequence identity group than the other two programs. In addition, the positions of consecutive gap columns vary between the programs.
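Differences in gap placement can be made explicit by listing the runs of consecutive gap-containing columns in each alignment. The following is a minimal sketch: the helper `gap_column_runs` and the two toy alignments are hypothetical stand-ins for the output of two programs on the same sequences, not actual ClustalW, MAFFT or T-Coffee results.

```python
def gap_column_runs(alignment):
    """Return (start, end) pairs of consecutive columns that contain at
    least one gap. Indices are 0-based and `end` is exclusive."""
    runs = []
    start = None
    for i, column in enumerate(zip(*alignment)):
        has_gap = "-" in column
        if has_gap and start is None:
            start = i                      # a new run begins
        elif not has_gap and start is not None:
            runs.append((start, i))        # the current run ends
            start = None
    if start is not None:
        # The alignment ends inside a run of gap columns.
        runs.append((start, len(alignment[0])))
    return runs


# The same three toy sequences (MKLV, MKAALV, MKLV), aligned in two
# different ways, as two programs might do.
program_a = ["MK--LV", "MKAALV", "MK--LV"]
program_b = ["M--KLV", "MKAALV", "M--KLV"]

print(gap_column_runs(program_a))
print(gap_column_runs(program_b))
```

Both toy alignments contain the same number of gap columns and differ only in where the run is placed, which is the kind of difference described above.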
