Task 2: Alignments
Multiple sequence alignments
< 30% sequence identity | | > 60% sequence identity | | whole range of sequence identity | | | |
AC Number | Sequence Identity | AC Number | Sequence Identity | AC Number | Sequence Identity | AC Number | Sequence Identity |
F6JYA9 | 0.29 | B4DDZ1 | 0.83 | B4DDZ1 | 0.83 | F6WCX4 | 0.34 |
D9J389 | 0.28 | Q5EEZ1 | 0.76 | Q5EEZ1 | 0.76 | F6JYA9 | 0.29 |
D5MSB3 | 0.27 | G3THV5 | 0.75 | G1MBW1 | 0.73 | D9J389 | 0.28 |
B3FRK2 | 0.20 | G1MBW1 | 0.73 | H0VAR7 | 0.72 | D5MSB3 | 0.27 |
3ov6_A | 0.22 | H0VAR7 | 0.72 | F1PX48 | 0.71 | 2zok_E | 0.24 |
1p7k_L | 0.21 | F1PX48 | 0.71 | G1T7D7 | 0.70 | H0Y1D0 | 0.20 |
H0Y1D0 | 0.20 | G1T7D7 | 0.70 | G5BQE5 | 0.67 | Q8SNJ4 | 0.22 |
B3FRK3 | 0.18 | O35799 | 0.68 | Q95IT9 | 0.38 | B3FRK3 | 0.18 |
Q8HWL2 | 0.17 | G5BQE5 | 0.67 | 2qrt_A | 0.38 | Q8HWL2 | 0.17 |
| | | | Q8HX83 | 0.36 | | |
To assess how multiple sequence alignments differ between close and distant homologs, three groups of sequences were selected: one with more than 60% sequence identity to human HFE, one with less than 30% identity, and one covering the whole range of sequence identity. The selected sequences are listed in Table 5.
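The threshold-based part of this selection can be sketched in a few lines of Python. This is only an illustration: the dictionary holds a handful of entries copied from Table 5, and the function name `split_by_identity` and its `low`/`high` parameters are hypothetical, not part of any tool used here. The whole-range group was chosen to span both ends and is not derived by this sketch.

```python
# Sequence identities to human HFE for a few entries copied from Table 5.
identities = {
    "B4DDZ1": 0.83,
    "Q5EEZ1": 0.76,
    "F6WCX4": 0.34,
    "F6JYA9": 0.29,
    "D9J389": 0.28,
    "Q8HWL2": 0.17,
}


def split_by_identity(identities, low=0.30, high=0.60):
    """Return (low_group, high_group) as sorted lists of accession numbers.

    low_group holds entries with identity below `low`,
    high_group holds entries with identity above `high`.
    Entries between the two thresholds end up in neither group.
    """
    low_group = sorted(ac for ac, ident in identities.items() if ident < low)
    high_group = sorted(ac for ac, ident in identities.items() if ident > high)
    return low_group, high_group


low_group, high_group = split_by_identity(identities)
print("below 30%:", low_group)
print("above 60%:", high_group)
```

Entries such as F6WCX4 (0.34) fall between the thresholds and therefore appear only in the whole-range group.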
To ensure that sequences with known structures are also represented in the alignments, the sequences from the following PDB structures were included:
- 1p7k_L
- 3ov6_A
- 2qrt_A
- 2zok_E
ClustalW
<figtable id="clustalW">
</figtable>
MAFFT
<figtable id="mafft">
</figtable>
T-Coffee
<figtable id="tcoffee">
</figtable>
Comparison
In the group above 60% sequence identity, both residues and gaps are well conserved, especially in the first third of the sequence. In the remaining two thirds, sequence conservation drops slightly but stays at a generally high level. In the group below 30% identity, the number of gaps inside the sequences is not much higher than in the group above 60%, but the overall residue conservation is markedly lower. Nevertheless, this group contains some very well conserved residues that might be functionally important. In the mixed sequence identity group, the gaps are not as well conserved as in the other two groups, and residue conservation is even lower than in the group below 30% identity. This effect is probably amplified by the larger number of sequences in this group.
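Statements about conservation in the first third versus the rest of an alignment can be made more concrete with a per-column score. The following is a minimal sketch under stated assumptions: the alignment is given as a list of equal-length gapped strings, conservation is defined as the fraction of sequences carrying the most common residue in a column, and the toy sequences and function names are invented for illustration. It is not the measure used to produce the observations above.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences carrying the most common residue in each column.

    Gap characters ('-') never count as the conserved residue, so a column
    with many gaps scores low even if the remaining residues agree.
    """
    n_seqs = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / n_seqs)
    return scores


def mean_by_thirds(scores):
    """Average the per-column scores over the first, middle and last third.

    Assumes at least three columns, so that no third is empty.
    """
    n = len(scores)
    bounds = [0, n // 3, 2 * n // 3, n]
    return [sum(scores[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]


# Toy alignment, invented for illustration only.
toy = ["MGPRAR", "MGPKA-", "MGPRSR"]
scores = column_conservation(toy)
thirds = mean_by_thirds(scores)
print([round(t, 2) for t in thirds])
```

For a real alignment, the rows would be read from the output file of the alignment program instead of being typed in.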
The different alignment programs yield comparable results, i.e. none of them is considerably better or worse than the other two. Nevertheless, some differences can be observed. The most notable one is that MAFFT yields far fewer conserved columns for the low sequence identity group than the other programs. The positioning of consecutive gap columns also varies between the programs.
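Both differences, the number of conserved columns and the position of consecutive gap columns, can be counted directly from the alignments. The sketch below assumes each alignment is a list of equal-length gapped strings. The two toy alignments stand for the output of two hypothetical programs on the same three sequences and are not taken from the ClustalW, MAFFT or T-Coffee results.

```python
def count_conserved(alignment):
    """Count columns without gaps in which all sequences share one residue."""
    return sum(
        1 for col in zip(*alignment) if "-" not in col and len(set(col)) == 1
    )


def gap_runs(alignment):
    """Return (start, end) index pairs of consecutive columns containing gaps."""
    runs, start = [], None
    for i, col in enumerate(zip(*alignment)):
        if "-" in col:
            if start is None:
                start = i  # a new run of gap columns begins here
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:  # run extends to the last column
        runs.append((start, len(alignment[0]) - 1))
    return runs


# The same three toy sequences (MKLV, MKAALV, MRLV), aligned in two ways.
aln_a = ["MK--LV", "MKAALV", "MR--LV"]
aln_b = ["MKL--V", "MKAALV", "MRL--V"]

for name, aln in (("A", aln_a), ("B", aln_b)):
    print(name, count_conserved(aln), gap_runs(aln))
```

In this toy case, shifting the gap block by a single column removes one conserved column, which shows how gap placement alone can change the conserved-column count between programs.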