Difference between revisions of "Task 2: Multiple Sequence Alignment"
(→Results) |
|||
(65 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
+ | |||
+ | '''Sorry, we're behind schedule; this page will be filled with content as soon as possible.''' |
||
+ | |||
+ | |||
We researched the protein sequence of the branched-chain alpha-keto acid dehydrogenase complex subunit alpha (BCKDHA) with the following original sequence: |
We researched the protein sequence of the branched-chain alpha-keto acid dehydrogenase complex subunit alpha (BCKDHA) with the following original sequence: |
||
* BCKDHA |
* BCKDHA |
||
Line 12: | Line 16: | ||
+ | |||
+ | == Blast == |
||
To calculate the sequence alignments we used the blast and psiblast binaries from [ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ NCBI] (version 2.2.26+) |
To calculate the sequence alignments we used the blast and psiblast binaries from [ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ NCBI] (version 2.2.26+) |
||
+ | Because the standard BLAST run hit its limit of 250 matches, all of which were still highly significant (E-value < 1e-60), we increased the maximum number of target hits to 2000 and set an E-value threshold of 0.002. With these settings we found about 1550 matching alignments. |
||
+ | |||
+ | As the figure to the right shows [[File:Blast_SeqSim.png|thumb|400px|Distribution of sequence similarity for the BCKDHA BLAST query against the big80 database.]], most of the sequence alignments have a similarity between 15% and 40%. |
||
+ | [[File:Blast-evalue.jpg|400px|Distribution of evalues in BLAST.]] |
||
+ | |||
+ | == Psi-Blast == |
||
+ | == HHBlits == |
||
+ | = Multiple Sequence Alignment (MSA) = |
||
+ | In this task we produce MSAs from our database search results. The first step is to create representative datasets, the second is to build MSAs with different tools, and the last is to review the alignments in Jalview and compare the tools against each other. |
||
+ | == Dataset creation == |
||
+ | We chose the following sequences from the PSI-BLAST run with an E-value of E-10 and 10 iterations, trying to match the scheme given on the task page: |
||
+ | <table border=1> |
||
+ | <tr><td>Identifier</td><td>Identity</td><td>Organism</td><td>Description</td></tr> |
||
+ | <tr><td colspan=4 align=center>ref seq</td></tr> |
||
+ | <tr><td>P12694</td><td>100%</td><td>human</td><td>ODBA_HUMAN 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial</td></tr> |
||
+ | <tr><td colspan=4 align=center>high identity ( >60% )</td></tr> |
||
+ | <tr><td>B4DP47</td><td>90%</td><td>human</td><td>B4DP47_HUMAN Uncharacterized protein</td></tr> |
||
+ | <tr><td>H2L9X9</td><td>80%</td><td>Oryzias latipes</td><td>H2L9X9_ORYLA Uncharacterized protein</td></tr> |
||
+ | <tr><td>H2NYX7</td><td>90%</td><td>Pongo abelii</td><td>H2NYX7_PONAB Uncharacterized protein</td></tr> |
||
+ | <tr><td>G3RDZ3</td><td>99%</td><td>Gorilla gorilla gorilla</td><td>G3RDZ3_GORGO Uncharacterized protein</td></tr> |
||
+ | <tr><td>G7PXN6</td><td>92%</td><td>Macaca fascicularis</td><td>G7PXN6_MACFA Putative uncharacterized protein</td></tr> |
||
+ | <tr><td>C1BZX0</td><td>61%</td><td>Caligus clemensi</td><td>C1BZX0_9MAXI 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial</td></tr> |
||
+ | <tr><td>E9C2J8</td><td>62%</td><td>Capsaspora owczarzaki</td><td>E9C2J8_CAPO3 Branched chain keto acid dehydrogenase E1</td></tr> |
||
+ | <tr><td>A8XMR6</td><td>63%</td><td>Caenorhabditis briggsae</td><td>A8XMR6_CAEBR Putative uncharacterized protein</td></tr> |
||
+ | <tr><td>F1L131</td><td>61%</td><td>Ascaris suum</td><td>F1L131_ASCSU 2-oxoisovalerate dehydrogenase subunit alpha</td></tr> |
||
+ | <tr><td>B3S5B4</td><td>66%</td><td>Trichoplax adhaerens</td><td>B3S5B4_TRIAD Putative uncharacterized protein</td></tr> |
||
+ | <tr><td colspan=4 align=center>moderate identity ( >40% )</td></tr> |
||
+ | <tr><td>F7NS20</td><td>50%</td><td>Rheinheimera sp. A13L</td><td>F7NS20_9GAMM Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase component alpha subunit</td></tr> |
||
+ | <tr><td>A3JES2</td><td>50%</td><td>Marinobacter sp. ELB17</td><td>A3JES2_9ALTE 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, alpha subunit</td></tr> |
||
+ | <tr><td>F2GCR2</td><td>49%</td><td>Alteromonas macleodii (strain DSM 17117 / Deep ecotype)</td><td>F2GCR2_ALTMD Alpha keto acid dehydrogenase complex, E1 component, alpha subunit</td></tr> |
||
+ | <tr><td>E1VBY4</td><td>48%</td><td>Halomonas elongata</td><td>E1VBY4_HALED 3-methyl-2-oxobutanoate dehydrogenase (2-methylpropanoyl-transferring)</td></tr> |
||
+ | <tr><td>A9TD44</td><td>47%</td><td>Physcomitrella patens subsp. patens</td><td>A9TD44_PHYPA Predicted protein</td></tr> |
||
+ | <tr><td>E1ZBL6</td><td>51%</td><td>Chlorella variabilis</td><td>E1ZBL6_CHLVA Putative uncharacterized protein</td></tr> |
||
+ | <tr><td>F2Q107</td><td>51%</td><td>Trichophyton equinum</td><td>F2Q107_TRIEC 2-oxoisovalerate dehydrogenase subunit alpha</td></tr> |
||
+ | <tr><td>B6GYK7</td><td>52%</td><td>Penicillium chrysogenum</td><td>B6GYK7_PENCW Pc12g08790 protein</td></tr> |
||
+ | <tr><td>G2QH91</td><td>53%</td><td>Thielavia heterothallica</td><td>G2QH91_THIHA Dehydrogenase-like protein</td></tr> |
||
+ | <tr><td>G0S8V2</td><td>51%</td><td>Chaetomium thermophilum</td><td>G0S8V2_CHATD Alpha subunit-like protein</td></tr> |
||
+ | <tr><td colspan=4 align=center>whole range identity ( 0-100% )</td></tr> |
||
+ | <tr><td>F6EVQ9</td><td>39%</td><td>Sphingobium chlorophenolicum</td><td>F6EVQ9_SPHCR 3-methyl-2-oxobutanoate dehydrogenase</td></tr> |
||
+ | <tr><td>B1YII6</td><td>39%</td><td>Exiguobacterium sibiricum</td><td>B1YII6_EXIS2 Pyruvate dehydrogenase</td></tr> |
||
+ | <tr><td>Q9HN77</td><td>34%</td><td>Halobacterium salinarium</td><td>Q9HN77_HALSA Pyruvate dehydrogenase</td></tr> |
||
+ | <tr><td>G2QH91</td><td>53%</td><td>Thielavia heterothallica</td><td>G2QH91_THIHA Dehydrogenase-like protein</td></tr> |
||
+ | <tr><td>A9TD44</td><td>47%</td><td>Physcomitrella patens subsp. patens</td><td>A9TD44_PHYPA Predicted protein</td></tr> |
||
+ | <tr><td>G0S8V2</td><td>51%</td><td>Chaetomium thermophilum</td><td>G0S8V2_CHATD Alpha subunit-like protein</td></tr> |
||
+ | <tr><td>B4DP47</td><td>90%</td><td>human</td><td>B4DP47_HUMAN Uncharacterized protein</td></tr> |
||
+ | <tr><td>H2L9X9</td><td>80%</td><td>Oryzias latipes</td><td>H2L9X9_ORYLA Uncharacterized protein</td></tr> |
||
+ | <tr><td>H2NYX7</td><td>90%</td><td>Pongo abelii</td><td>H2NYX7_PONAB Uncharacterized protein</td></tr> |
||
+ | <tr><td>C1BZX0</td><td>61%</td><td>Caligus clemensi</td><td>C1BZX0_9MAXI 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial</td></tr> |
||
+ | </table> |
||
+ | |||
+ | == clustalW == |
||
+ | |||
+ | == T-Coffee == |
||
+ | == Muscle == |
||
+ | == Results == |
||
+ | Jalview representations of the alignments: |
||
+ | |||
+ | <table> |
||
+ | <tr rowspan=3><td> >60% fraction </td></tr> |
||
+ | <tr><td>[[File:Clustal high.jpg|800px|thumb|ClustalW MSA >60%]]</td></tr> |
||
+ | |||
+ | <tr><td>[[File:T-coffee high.jpg|800px|thumb|T-Coffee MSA >60%]]</td></tr> |
||
+ | |||
+ | <tr><td>[[File:Muscle high.jpg|800px|thumb|Muscle MSA >60%]]</td></tr> |
||
+ | |||
+ | <tr rowspan=3><td> >40% fraction </td></tr> |
||
+ | <tr><td>[[File:Clustal moderate.jpg|800px|thumb|ClustalW MSA >40%]]</td></tr> |
||
+ | |||
+ | <tr><td>[[File:T-coffee moderate.jpg|800px|thumb|T-Coffee MSA >40%]]</td></tr> |
||
+ | |||
+ | <tr><td>[[File:Muscle moderate.jpg|800px|thumb|Muscle MSA >40%]]</td></tr> |
||
+ | <tr rowspan=3><td> whole range fraction </td></tr> |
||
+ | <tr><td>[[File:Clustal whole.jpg|800px|thumb|ClustalW MSA 0-100%]]</td></tr> |
||
+ | |||
+ | <tr><td>[[File:T-coffee whole.jpg|800px|thumb|T-Coffee MSA 0-100%]]</td></tr> |
||
+ | |||
+ | <tr><td>[[File:Muscle whole.jpg|800px|thumb|Muscle MSA 0-100%]]</td></tr> |
||
+ | </table> |
||
+ | <table> |
||
+ | <tr><td> |
||
+ | <table border=1> |
||
+ | <tr><td>clustalw >60 ID</td><td>gaps</td></tr> |
||
+ | <tr><td>tr|C1BZX0|C1BZX0_9MAXI </td><td>84</td></tr><tr><td>sp|P12694|ODBA_HUMAN </td><td>70</td></tr><tr><td>tr|B3S5B4|B3S5B4_TRIAD </td><td>174</td></tr> |
||
+ | <tr><td>tr|H2L9X9|H2L9X9_ORYLA </td><td>66</td></tr> |
||
+ | <tr><td>tr|H2NYX7|H2NYX7_PONAB </td><td>52</td></tr><tr><td>tr|F1L131|F1L131_ASCSU </td><td>74</td></tr><tr><td>tr|B4DP47|B4DP47_HUMAN </td><td>67</td></tr> |
||
+ | <tr><td>tr|G7PXN6|G7PXN6_MACFA </td><td>32</td></tr> |
||
+ | <tr><td>tr|G3RDZ3|G3RDZ3_GORGO </td><td>90</td></tr> |
||
+ | <tr><td>tr|E9C2J8|E9C2J8_CAPO3 </td><td>78</td></tr> |
||
+ | <tr><td>tr|A8XMR6|A8XMR6_CAEBR </td><td>83</td></tr> |
||
+ | <tr><td>conserved >8</td><td>235</td></tr> |
||
+ | <tr><td>conserved >10</td><td>130</td></tr> |
||
+ | </table> |
||
+ | </td><td> |
||
+ | <table border=1> |
||
+ | <tr><td>t-coffee >60 ID</td><td>gaps</td></tr> |
||
+ | <tr><td>UniProt/Swiss-Prot|P12694|Q16034|Q16472|ODBA_HUMAN/1-445</td><td>80</td></tr><tr><td>UniProt/Swiss-Prot|C1BZX0|C1BZX0_9MAXI/1-431 </td><td>94</td></tr><tr><td>UniProt/Swiss-Prot|A8XMR6|A8XMR6_CAEBR/1-432 </td><td>93</td></tr><tr><td>UniProt/Swiss-Prot|B4DP47|E7EW46|B4DP47_HUMAN/1-448 </td><td>77</td></tr><tr><td>UniProt/Swiss-Prot|H2L9X9|H2L9X9_ORYLA/1-449 </td><td>76</td></tr><tr><td>UniProt/Swiss-Prot|F1L131|F1L131_ASCSU/1-441 </td><td>84</td></tr><tr><td>UniProt/Swiss-Prot|G3RDZ3|G3RDZ3_GORGO/1-425 </td><td>100</td></tr><tr><td>UniProt/Swiss-Prot|B3S5B4|B3S5B4_TRIAD/1-341 </td><td>184</td></tr><tr><td>UniProt/Swiss-Prot|G7PXN6|G7PXN6_MACFA/1-483 </td><td>42</td></tr><tr><td>UniProt/Swiss-Prot|H2NYX7|H2NYX7_PONAB/1-463 </td><td>62</td></tr><tr><td>UniProt/Swiss-Prot|E9C2J8|E9C2J8_CAPO3/1-437 </td><td>88</td></tr><tr><td>conserved >8</td><td>235</td></tr> |
||
+ | <tr><td>conserved >10</td><td>129</td></tr> |
||
+ | </table> |
||
+ | </td><td> |
||
+ | <table border=1> |
||
+ | <tr><td>muscle >60 ID</td><td>gaps</td></tr> |
||
+ | <tr><td>UniProt/Swiss-Prot|P12694|Q16034|Q16472|ODBA_HUMAN/1-445</td><td>75</td></tr><tr><td>UniProt/Swiss-Prot|C1BZX0|C1BZX0_9MAXI/1-431 </td><td>89</td></tr><tr><td>UniProt/Swiss-Prot|A8XMR6|A8XMR6_CAEBR/1-432 </td><td>88</td></tr><tr><td>UniProt/Swiss-Prot|B4DP47|E7EW46|B4DP47_HUMAN/1-448 </td><td>72</td></tr><tr><td>UniProt/Swiss-Prot|H2L9X9|H2L9X9_ORYLA/1-449 </td><td>71</td></tr><tr><td>UniProt/Swiss-Prot|F1L131|F1L131_ASCSU/1-441 </td><td>79</td></tr><tr><td>UniProt/Swiss-Prot|G3RDZ3|G3RDZ3_GORGO/1-425 </td><td>95</td></tr><tr><td>UniProt/Swiss-Prot|B3S5B4|B3S5B4_TRIAD/1-341 </td><td>179</td></tr><tr><td>UniProt/Swiss-Prot|G7PXN6|G7PXN6_MACFA/1-483 </td><td>37</td></tr><tr><td>UniProt/Swiss-Prot|H2NYX7|H2NYX7_PONAB/1-463 </td><td>57</td></tr><tr><td>UniProt/Swiss-Prot|E9C2J8|E9C2J8_CAPO3/1-437 </td><td>83</td></tr><tr><td>conserved >8</td><td>236</td></tr> |
||
+ | <tr><td>conserved >10</td><td>130</td></tr> |
||
+ | </table> |
||
+ | </td></tr></table> |
||
+ | <table><tr><td> |
||
+ | <table border=1> |
||
+ | <tr><td>clustalw >40 ID</td><td>gaps</td></tr> |
||
+ | <tr><td>tr|G0S8V2|G0S8V2_CHATD </td><td>29</td></tr><tr><td>sp|P12694|ODBA_HUMAN </td><td>92</td></tr><tr><td>tr|E1ZBL6|E1ZBL6_CHLVA </td><td>141</td></tr><tr><td>tr|A9TD44|A9TD44_PHYPA </td><td>70</td></tr><tr><td>tr|A3JES2|A3JES2_9ALTE </td><td>133</td></tr><tr><td>tr|F2Q107|F2Q107_TRIEC </td><td>90</td></tr><tr><td>tr|B6GYK7|B6GYK7_PENCW </td><td>89</td></tr><tr><td>tr|F2GCR2|F2GCR2_ALTMD </td><td>142</td></tr><tr><td>tr|G2QH91|G2QH91_THIHA </td><td>66</td></tr><tr><td>tr|F7NS20|F7NS20_9GAMM </td><td>143</td></tr><tr><td>tr|E1VBY4|E1VBY4_HALED </td><td>130</td></tr><tr><td>conserved >8</td><td>139</td></tr> |
||
+ | <tr><td>conserved >10</td><td>99</td></tr> |
||
+ | </table> |
||
+ | </td><td> |
||
+ | <table border=1> |
||
+ | <tr><td>t-coffee >40 ID</td><td>gaps</td></tr> |
||
+ | <tr><td>UniProt/Swiss-Prot|P12694|Q16034|Q16472|ODBA_HUMAN/1-445</td><td>109</td></tr><tr><td>UniProt/Swiss-Prot|G2QH91|G2QH91_THIHA/1-471 </td><td>83</td></tr><tr><td>UniProt/Swiss-Prot|F7NS20|F7NS20_9GAMM/1-394 </td><td>160</td></tr><tr><td>UniProt/Swiss-Prot|A3JES2|A3JES2_9ALTE/1-404 </td><td>150</td></tr><tr><td>UniProt/Swiss-Prot|B6GYK7|B6GYK7_PENCW/1-448 </td><td>106</td></tr><tr><td>UniProt/Swiss-Prot|E1ZBL6|E1ZBL6_CHLVA/1-396 </td><td>158</td></tr><tr><td>UniProt/Swiss-Prot|F2GCR2|F2GCR2_ALTMD/1-395 </td><td>159</td></tr><tr><td>UniProt/Swiss-Prot|F2Q107|F2Q107_TRIEC/1-447 </td><td>107</td></tr><tr><td>UniProt/Swiss-Prot|A9TD44|A9TD44_PHYPA/1-467 </td><td>87</td></tr><tr><td>UniProt/Swiss-Prot|E1VBY4|E1VBY4_HALED/1-407 </td><td>147</td></tr><tr><td>UniProt/Swiss-Prot|G0S8V2|G0S8V2_CHATD/1-508 </td><td>46</td></tr><tr><td>conserved >8</td><td>149</td></tr> |
||
+ | <tr><td>conserved >10</td><td>101</td></tr> |
||
+ | </table> |
||
+ | </td><td> |
||
+ | <table border=1> |
||
+ | <tr><td>muscle >40 ID</td><td>gaps</td></tr> |
||
+ | <tr><td>UniProt/Swiss-Prot|P12694|Q16034|Q16472|ODBA_HUMAN/1-445</td><td>93</td></tr><tr><td>UniProt/Swiss-Prot|G2QH91|G2QH91_THIHA/1-471 </td><td>67</td></tr><tr><td>UniProt/Swiss-Prot|F7NS20|F7NS20_9GAMM/1-394 </td><td>144</td></tr><tr><td>UniProt/Swiss-Prot|A3JES2|A3JES2_9ALTE/1-404 </td><td>134</td></tr><tr><td>UniProt/Swiss-Prot|B6GYK7|B6GYK7_PENCW/1-448 </td><td>90</td></tr><tr><td>UniProt/Swiss-Prot|E1ZBL6|E1ZBL6_CHLVA/1-396 </td><td>142</td></tr><tr><td>UniProt/Swiss-Prot|F2GCR2|F2GCR2_ALTMD/1-395 </td><td>143</td></tr><tr><td>UniProt/Swiss-Prot|F2Q107|F2Q107_TRIEC/1-447 </td><td>91</td></tr><tr><td>UniProt/Swiss-Prot|A9TD44|A9TD44_PHYPA/1-467 </td><td>71</td></tr><tr><td>UniProt/Swiss-Prot|E1VBY4|E1VBY4_HALED/1-407 </td><td>131</td></tr><tr><td>UniProt/Swiss-Prot|G0S8V2|G0S8V2_CHATD/1-508 </td><td>30</td></tr><tr><td>conserved >8</td><td>149</td></tr> |
||
+ | <tr><td>conserved >10</td><td>100</td></tr> |
||
+ | </table> |
||
+ | </td></tr></table> |
||
+ | <table><tr><td> |
||
+ | <table border=1> |
||
+ | <tr><td>clustalw whole range ID</td><td>gaps</td></tr> |
||
+ | <tr><td>sp|P12694|ODBA_HUMAN </td><td>133</td></tr><tr><td>tr|C1BZX0|C1BZX0_9MAXI </td><td>147</td></tr><tr><td>tr|G0S8V2|G0S8V2_CHATD </td><td>70</td></tr><tr><td>tr|H2L9X9|H2L9X9_ORYLA </td><td>129</td></tr><tr><td>tr|A9TD44|A9TD44_PHYPA </td><td>111</td></tr><tr><td>tr|H2NYX7|H2NYX7_PONAB </td><td>115</td></tr><tr><td>tr|B1YII6|B1YII6_EXIS2 </td><td>228</td></tr><tr><td>tr|B4DP47|B4DP47_HUMAN </td><td>130</td></tr><tr><td>tr|F6EVQ9|F6EVQ9_SPHCR </td><td>143</td></tr><tr><td>tr|Q9HN77|Q9HN77_HALSA </td><td>159</td></tr><tr><td>tr|G2QH91|G2QH91_THIHA </td><td>107</td></tr><tr><td>conserved >8</td><td>132</td></tr> |
||
+ | <tr><td>conserved >10</td><td>58</td></tr> |
||
+ | </table> |
||
+ | </td><td> |
||
+ | <table border=1> |
||
+ | <tr><td>t-coffee whole range ID</td><td>gaps</td></tr> |
||
+ | <tr><td>UniProt/Swiss-Prot|P12694|Q16034|Q16472|ODBA_HUMAN/1-445</td><td>170</td></tr><tr><td>UniProt/Swiss-Prot|G2QH91|G2QH91_THIHA/1-471 </td><td>144</td></tr><tr><td>UniProt/Swiss-Prot|B1YII6|B1YII6_EXIS2/1-350 </td><td>265</td></tr><tr><td>UniProt/Swiss-Prot|C1BZX0|C1BZX0_9MAXI/1-431 </td><td>184</td></tr><tr><td>UniProt/Swiss-Prot|B4DP47|E7EW46|B4DP47_HUMAN/1-448 </td><td>167</td></tr><tr><td>UniProt/Swiss-Prot|H2L9X9|H2L9X9_ORYLA/1-449 </td><td>166</td></tr><tr><td>UniProt/Swiss-Prot|F6EVQ9|F6EVQ9_SPHCR/1-435 </td><td>180</td></tr><tr><td>UniProt/Swiss-Prot|Q9HN77|Q9HN77_HALSA/1-419 </td><td>196</td></tr><tr><td>UniProt/Swiss-Prot|G0S8V2|G0S8V2_CHATD/1-508 </td><td>107</td></tr><tr><td>UniProt/Swiss-Prot|H2NYX7|H2NYX7_PONAB/1-463 </td><td>152</td></tr><tr><td>UniProt/Swiss-Prot|A9TD44|A9TD44_PHYPA/1-467 </td><td>148</td></tr><tr><td>conserved >8</td><td>138</td></tr> |
||
+ | <tr><td>conserved >10</td><td>60</td></tr> |
||
+ | </table> |
||
+ | </td><td> |
||
+ | <table border=1> |
||
+ | <tr><td>muscle whole range ID</td><td>gaps</td></tr> |
||
+ | <tr><td>UniProt/Swiss-Prot|P12694|Q16034|Q16472|ODBA_HUMAN/1-445</td><td>135</td></tr><tr><td>UniProt/Swiss-Prot|G2QH91|G2QH91_THIHA/1-471 </td><td>109</td></tr><tr><td>UniProt/Swiss-Prot|B1YII6|B1YII6_EXIS2/1-350 </td><td>230</td></tr><tr><td>UniProt/Swiss-Prot|C1BZX0|C1BZX0_9MAXI/1-431 </td><td>149</td></tr><tr><td>UniProt/Swiss-Prot|B4DP47|E7EW46|B4DP47_HUMAN/1-448 </td><td>132</td></tr><tr><td>UniProt/Swiss-Prot|H2L9X9|H2L9X9_ORYLA/1-449 </td><td>131</td></tr><tr><td>UniProt/Swiss-Prot|F6EVQ9|F6EVQ9_SPHCR/1-435 </td><td>145</td></tr><tr><td>UniProt/Swiss-Prot|Q9HN77|Q9HN77_HALSA/1-419 </td><td>161</td></tr><tr><td>UniProt/Swiss-Prot|G0S8V2|G0S8V2_CHATD/1-508 </td><td>72</td></tr><tr><td>UniProt/Swiss-Prot|H2NYX7|H2NYX7_PONAB/1-463 </td><td>117</td></tr><tr><td>UniProt/Swiss-Prot|A9TD44|A9TD44_PHYPA/1-467 </td><td>113</td></tr><tr><td>conserved >8</td><td>136</td></tr> |
||
+ | <tr><td>conserved >10</td><td>58</td></tr> |
||
+ | </table> |
||
+ | </td></tr></table> |
Latest revision as of 12:11, 11 May 2012
Sorry, we're behind schedule; this page will be filled with content as soon as possible.
We researched the protein sequence of the branched-chain alpha-keto acid dehydrogenase complex subunit alpha (BCKDHA) with the following original sequence:
- BCKDHA
>sp|P12694|ODBA_HUMAN 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial OS=Homo sapiens GN=BCKDHA PE=1 SV=2
MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAE
FIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILY
ESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYG
NISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGA
ASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG
NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHP
ISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQL
RKQQESLARHLQTYGEHYPLDHFDK
Blast
To calculate the sequence alignments we used the blast and psiblast binaries from NCBI (version 2.2.26+). Because the standard BLAST run hit its limit of 250 matches, all of which were still highly significant (E-value < 1e-60), we increased the maximum number of target hits to 2000 and set an E-value threshold of 0.002. With these settings we found about 1550 matching alignments.
As the figure to the right shows, most of the sequence alignments have a similarity between 15% and 40%.
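The filtering step described above can be sketched in Python. This is only an illustration under assumptions: it presumes the hits were saved in BLAST+ tabular format (`-outfmt 6`), which this page does not confirm, and the function names, the example rows and the 5% bin width are hypothetical. Note also that the tabular `pident` column reports percent identity, which is not the same quantity as the similarity plotted in the figure.

```python
from collections import Counter

# Hypothetical example rows in BLAST+ tabular layout (-outfmt 6):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
EXAMPLE_ROWS = [
    "BCKDHA\thitA\t38.50\t400\t200\t10\t1\t400\t1\t395\t1e-70\t250",
    "BCKDHA\thitB\t22.10\t380\t250\t15\t5\t380\t3\t370\t0.001\t45.0",
    "BCKDHA\thitC\t18.00\t100\t80\t2\t10\t110\t20\t118\t0.5\t30.1",
]


def parse_hits(lines, max_evalue=0.002):
    """Yield (subject_id, percent_identity, evalue) for hits at or below max_evalue."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        pident = float(fields[2])
        evalue = float(fields[10])
        if evalue <= max_evalue:
            yield fields[1], pident, evalue


def identity_histogram(hits, bin_width=5):
    """Count hits per identity bin; keys are the lower bound of each bin in percent."""
    counts = Counter()
    for _subject, pident, _evalue in hits:
        # Clamp so that exactly 100% falls into the last bin.
        lower = min(int(pident // bin_width) * bin_width, 100 - bin_width)
        counts[lower] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    kept = list(parse_hits(EXAMPLE_ROWS))
    print(len(kept), "hits pass the E-value threshold")
    print(identity_histogram(kept))
```

With real output, the list of example rows would be replaced by the lines of the saved result file.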
Psi-Blast
HHBlits
Multiple Sequence Alignment (MSA)
In this task we produce MSAs from our database search results. The first step is to create representative datasets, the second is to build MSAs with different tools, and the last is to review the alignments in Jalview and compare the tools against each other.
Dataset creation
We chose the following sequences from the PSI-BLAST run with an E-value of E-10 and 10 iterations, trying to match the scheme given on the task page:
Identifier | Identity | Organism | Description |
ref seq | |||
P12694 | 100% | human | ODBA_HUMAN 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial |
high identity ( >60% ) | |||
B4DP47 | 90% | human | B4DP47_HUMAN Uncharacterized protein |
H2L9X9 | 80% | Oryzias latipes | H2L9X9_ORYLA Uncharacterized protein |
H2NYX7 | 90% | Pongo abelii | H2NYX7_PONAB Uncharacterized protein |
G3RDZ3 | 99% | Gorilla gorilla gorilla | G3RDZ3_GORGO Uncharacterized protein |
G7PXN6 | 92% | Macaca fascicularis | G7PXN6_MACFA Putative uncharacterized protein |
C1BZX0 | 61% | Caligus clemensi | C1BZX0_9MAXI 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial |
E9C2J8 | 62% | Capsaspora owczarzaki | E9C2J8_CAPO3 Branched chain keto acid dehydrogenase E1 |
A8XMR6 | 63% | Caenorhabditis briggsae | A8XMR6_CAEBR Putative uncharacterized protein |
F1L131 | 61% | Ascaris suum | F1L131_ASCSU 2-oxoisovalerate dehydrogenase subunit alpha |
B3S5B4 | 66% | Trichoplax adhaerens | B3S5B4_TRIAD Putative uncharacterized protein |
moderate identity ( >40% ) | |||
F7NS20 | 50% | Rheinheimera sp. A13L | F7NS20_9GAMM Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase component alpha subunit |
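A3JES2 | 50% | Marinobacter sp. ELB17 | A3JES2_9ALTE 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, alpha subunit |
F2GCR2 | 49% | Alteromonas macleodii (strain DSM 17117 / Deep ecotype) | F2GCR2_ALTMD Alpha keto acid dehydrogenase complex, E1 component, alpha subunit |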
E1VBY4 | 48% | Halomonas elongata | E1VBY4_HALED 3-methyl-2-oxobutanoate dehydrogenase (2-methylpropanoyl-transferring) |
A9TD44 | 47% | Physcomitrella patens subsp. patens | A9TD44_PHYPA Predicted protein |
E1ZBL6 | 51% | Chlorella variabilis | E1ZBL6_CHLVA Putative uncharacterized protein |
F2Q107 | 51% | Trichophyton equinum | F2Q107_TRIEC 2-oxoisovalerate dehydrogenase subunit alpha |
B6GYK7 | 52% | Penicillium chrysogenum | B6GYK7_PENCW Pc12g08790 protein |
G2QH91 | 53% | Thielavia heterothallica | G2QH91_THIHA Dehydrogenase-like protein |
G0S8V2 | 51% | Chaetomium thermophilum | G0S8V2_CHATD Alpha subunit-like protein |
whole range identity ( 0-100% ) | |||
F6EVQ9 | 39% | Sphingobium chlorophenolicum | F6EVQ9_SPHCR 3-methyl-2-oxobutanoate dehydrogenase |
B1YII6 | 39% | Exiguobacterium sibiricum | B1YII6_EXIS2 Pyruvate dehydrogenase |
Q9HN77 | 34% | Halobacterium salinarium | Q9HN77_HALSA Pyruvate dehydrogenase |
G2QH91 | 53% | Thielavia heterothallica | G2QH91_THIHA Dehydrogenase-like protein |
A9TD44 | 47% | Physcomitrella patens subsp. patens | A9TD44_PHYPA Predicted protein |
G0S8V2 | 51% | Chaetomium thermophilum | G0S8V2_CHATD Alpha subunit-like protein |
B4DP47 | 90% | human | B4DP47_HUMAN Uncharacterized protein |
H2L9X9 | 80% | Oryzias latipes | H2L9X9_ORYLA Uncharacterized protein |
H2NYX7 | 90% | Pongo abelii | H2NYX7_PONAB Uncharacterized protein |
C1BZX0 | 61% | Caligus clemensi | C1BZX0_9MAXI 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial |
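The grouping used in the table can be written down as a small Python sketch. It assumes that the "moderate" group means more than 40% and at most 60% identity and that everything at or below 40% counts as "low"; the task page is not quoted here, so these boundaries and the function name are assumptions. The accessions and identities are taken from the table above.

```python
def assign_groups(hits, high=60.0, moderate=40.0):
    """Sort (accession, percent_identity) pairs into identity groups.

    The thresholds are assumptions: identity > high is "high",
    identity > moderate is "moderate", everything else is "low".
    """
    groups = {"high": [], "moderate": [], "low": []}
    for accession, identity in hits:
        if identity > high:
            groups["high"].append(accession)
        elif identity > moderate:
            groups["moderate"].append(accession)
        else:
            groups["low"].append(accession)
    return groups


# A few rows from the dataset table (accession, identity in percent).
SELECTED = [
    ("B4DP47", 90),
    ("C1BZX0", 61),
    ("F7NS20", 50),
    ("A9TD44", 47),
    ("F6EVQ9", 39),
    ("Q9HN77", 34),
]

if __name__ == "__main__":
    for name, members in assign_groups(SELECTED).items():
        print(name, members)
```

The "whole range" dataset in the table mixes members of all three groups, so it would be assembled by drawing from each list rather than from a single bin.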
clustalW
T-Coffee
Muscle
Results
Jalview representations of the alignments:
>60% fraction |
>40% fraction |
whole range fraction |
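The per-sequence gap counts tabulated for each alignment could be recomputed from an aligned FASTA export, as in the following Python sketch. How the counts on this page were actually obtained is not stated, so this is an assumption, and the function names and the toy alignment are hypothetical. As a sanity check on the tables: within one alignment, sequence length plus gap count should be constant, for example 445 + 80 = 431 + 94 = 525 columns for T-Coffee on the >60% set.

```python
def read_aligned_fasta(text):
    """Parse FASTA text into {name: aligned_sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # first word of the header is the name
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def gap_counts(records):
    """Return {name: number of '-' characters}; all rows must share one length."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        raise ValueError("sequences are not aligned to a common length")
    return {name: seq.count("-") for name, seq in records.items()}


# Hypothetical toy alignment, not taken from the real MSAs.
TOY_ALIGNMENT = """>seqA
MAV-AIA
>seqB
M--AAIA
>seqC
MAVAAIA
"""

if __name__ == "__main__":
    print(gap_counts(read_aligned_fasta(TOY_ALIGNMENT)))
```

The "conserved" rows of the tables are not reproduced here, because the page does not define how that score was computed.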