Protein structure prediction from evolutionary sequence variation (Phenylketonuria)

From Bioinformatikpedia
Revision as of 15:07, 15 June 2013 by Waldraffs (talk | contribs) (H-Ras)

Summary

...

Multiple Sequence Alignment

Lab journal
The multiple alignment of Ras (PF00071) was downloaded from Pfam.

For our protein, PAH, two domains are included: the ACT domain and the biopterin domain.

Calculate and analyze correlated mutations

Lab journal

Results

Freecontact takes a multiple sequence alignment and calculates, for each pair of residue positions, the mutual information (MI) score and the corrected norm (CN) contact score.

  • MI:...
  • CN:...
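To make the MI score more concrete, the following is a minimal sketch of plain mutual information between two alignment columns. It is not freecontact's implementation: the function name `mutual_information` and the toy alignment are hypothetical, and the sketch uses raw column frequencies without the sequence weighting, pseudocounts or gap handling that a real tool may apply.

```python
from collections import Counter
from math import log


def mutual_information(col_i, col_j):
    """Plain MI (in nats) between two alignment columns of equal length.

    Assumption: raw frequencies only, no weighting or pseudocounts.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_i)
    freq_i = Counter(col_i)            # single-column counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))  # joint counts of residue pairs
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: four sequences, three columns.
alignment = ["AKL", "AKV", "GRL", "GRV"]
columns = ["".join(seq[k] for seq in alignment) for k in range(3)]

print(mutual_information(columns[0], columns[1]))  # columns vary together
print(mutual_information(columns[0], columns[2]))  # columns vary independently
```

In the toy alignment, columns 1 and 2 always change together and therefore receive a high MI, while columns 1 and 3 are statistically independent and receive an MI of zero.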

H-Ras

<figtable id="cn-ras">

Ten results with highest CN for all and for extracted pairs
All pairs
pos1 aa1 pos2 aa2 MI CN
6 L 7 V 0.50 6.01
162 E 163 I 0.47 5.87
83 A 85 N 0.35 5.56
15 G 16 K 0.44 4.86
159 L 160 V 0.47 4.82
9 V 10 G 0.46 4.80
161 R 162 E 0.46 4.63
124 T 125 V 0.34 4.58
10 G 11 A 0.46 4.52
16 K 17 S 0.45 4.44
Extracted pairs
pos1 aa1 pos2 aa2 MI CN
11 A 92 D 0.32 3.40
81 V 116 N 0.24 3.00
87 T 129 Q 0.22 2.69
82 F 141 Y 0.25 2.53
84 I 115 G 0.16 2.53
19 L 81 V 0.14 2.50
82 F 115 G 0.14 2.42
10 G 16 K 0.39 2.26
130 A 141 Y 0.39 2.25
123 R 143 E 0.27 2.21
EVcouplings-extracted
pos1 aa1 pos2 aa2 MI DI
10 G 16 K 0.19 0.22
13 G 21 I 0.68 0.10
11 A 92 D 0.43 0.09
117 K 145 S 0.13 0.08
116 N 146 A 0.12 0.08
81 V 116 N 0.15 0.08
82 F 141 Y 0.32 0.07
35 T 60 G 0.08 0.06
130 A 141 Y 0.34 0.06
114 V 155 A 0.25 0.06

The ten freecontact results for H-Ras with the highest CN value, first for all pairs and second only for pairs separated by at least six residues. The third table lists the ten best residue pairs by DI score calculated with EVcouplings. In all three tables, columns one and two give the position and amino acid of the first residue, and columns three and four the position and amino acid of the second residue. The fifth column is the mutual information score (MI) and the last column the corrected norm contact score (CN); in the EVcouplings table the last column holds the DI score. </figtable>

<figure id="cn_distr_ras">

CN-score distribution of H-Ras calculated with freecontact for all (green) and extracted residue pairs with a distance of at least six amino acids (purple).

</figure>

Comparing the ten best CN scores for all residue pairs with those for pairs more than five amino acids apart, it is striking that the ten best of all pairs are adjacent or nearly adjacent in sequence (nine direct neighbors and one pair, 83-85, two positions apart) and reach almost twice the CN score, up to 6.01 versus 3.40 (<xr id="cn-ras"/>). For the extracted residue pairs the CN score ranges from -0.65 to 3.40. The distribution can be viewed in <xr id="cn_distr_ras"/>, which also shows the distribution for all residue pairs. The two distributions are very similar, although the range for all pairs extends up to 6.01. The DI scores calculated with EVcouplings range from 0.00 to 0.22. Five of the ten best extracted residue pairs are shared between freecontact and EVcouplings: 10(G)-16(K), 11(A)-92(D), 81(V)-116(N), 82(F)-141(Y) and 130(A)-141(Y).
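The sequence-separation filter behind the "extracted pairs" can be sketched as follows. The positions are copied from the H-Ras table above; the helper names and the threshold constant `MIN_SEPARATION = 6` are assumptions for illustration and not taken from the lab journal.

```python
# Top-ten H-Ras pairs by CN, positions copied from the table above.
all_pairs = [(6, 7), (162, 163), (83, 85), (15, 16), (159, 160),
             (9, 10), (161, 162), (124, 125), (10, 11), (16, 17)]
extracted_pairs = [(11, 92), (81, 116), (87, 129), (82, 141), (84, 115),
                   (19, 81), (82, 115), (10, 16), (130, 141), (123, 143)]

# Assumption: "more than five amino acids apart" means |i - j| >= 6.
MIN_SEPARATION = 6


def separation(pair):
    """Distance along the sequence between the two residues of a pair."""
    return abs(pair[1] - pair[0])


def is_long_range(pair, min_separation=MIN_SEPARATION):
    """True if the pair passes the sequence-separation filter."""
    return separation(pair) >= min_separation


print([separation(p) for p in all_pairs])        # separations of the top ten of all pairs
print([p for p in all_pairs if is_long_range(p)])  # none of them pass the filter
print(min(separation(p) for p in extracted_pairs))  # smallest separation among extracted pairs
```

None of the ten best pairs over all residues survives the filter, which is why the extracted list consists of entirely different pairs with lower CN scores.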


PAH

<figtable id="cn-pah">

Ten results with highest CN for all and for extracted pairs
All pairs
pos1 aa1 pos2 aa2 MI CN
217 C 218 G 0.77 3.43
402 F 403 A 0.75 3.25
428 Q 430 L 0.81 3.03
216 Y 217 C 1.12 2.92
174 I 175 P 0.73 2.91
120 W 121 F 0.69 2.91
403 A 404 A 1.02 2.83
447 A 448 L 0.64 2.82
428 Q 429 Q 0.92 2.82
429 Q 430 L 0.81 2.78
Extracted pairs
pos1 aa1 pos2 aa2 MI CN
342 A 354 L 0.70 2.46
122 P 128 L 0.69 2.40
122 P 129 D 0.69 2.29
257 G 264 H 1.17 2.27
131 F 137 S 1.09 2.25
282 D 290 H 0.18 2.19
151 D 157 R 0.69 2.18
192 K 221 E 0.94 2.17
264 H 277 Y 0.63 2.10
120 W 128 L 0.67 2.09
EVcouplings-extracted
pos1 aa1 pos2 aa2 MI DI
257 G 264 H 1.15 0.25
342 A 354 L 0.81 0.22
352 G 382 F 0.54 0.20
192 K 221 E 0.74 0.18
282 D 290 H 0.24 0.17
174 I 218 G 0.45 0.16
347 L 354 L 0.48 0.15
365 L 385 L 0.80 0.14
326 W 377 Y 0.36 0.13
235 Q 241 R 0.56 0.13

The ten freecontact results for the biopterin domain of PAH with the highest CN value, first for all pairs and second only for pairs separated by at least six residues. The third table lists the ten best DI scores, also for extracted pairs, calculated with EVcouplings. In all three tables, columns one and two give the position and amino acid of the first residue, and columns three and four the position and amino acid of the second residue. The fifth column is the mutual information score (MI) and the last column the corrected norm contact score (CN); in the EVcouplings table the last column holds the DI score. </figtable>

<figure id="cn_distr_pah">

CN-score distribution of the biopterin domain of PAH calculated with freecontact for all residue pairs (green) and for extracted residue pairs with a distance of at least six amino acids (purple).

</figure>


For the biopterin domain, the freecontact CN scores range from -1.04 to 3.44 for all residue pairs and from -1.04 to 2.46 for extracted pairs; the DI scores calculated with EVcouplings range from 0.00 to 0.25. <xr id="cn_distr_pah"/> shows the distribution of the CN scores for all and for extracted residue pairs. Comparing the ten best extracted residue pairs from freecontact with those from EVcouplings, four occur in both lists: 342(A)-354(L), 257(G)-264(H), 282(D)-290(H) and 192(K)-221(E).
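The overlap between the two top-ten lists can be checked with a few lines of Python. This is only an illustrative sketch: the positions are copied from the PAH table above, and the helper name `shared_pairs` is hypothetical.

```python
# Top-ten extracted PAH pairs, positions copied from the table above.
freecontact_top = [(342, 354), (122, 128), (122, 129), (257, 264), (131, 137),
                   (282, 290), (151, 157), (192, 221), (264, 277), (120, 128)]
evcouplings_top = [(257, 264), (342, 354), (352, 382), (192, 221), (282, 290),
                   (174, 218), (347, 354), (365, 385), (326, 377), (235, 241)]


def shared_pairs(first, second):
    """Pairs present in both lists, in the order of the first list.

    Pairs are compared regardless of residue order, so (i, j) equals (j, i).
    """
    second_set = {tuple(sorted(pair)) for pair in second}
    return [pair for pair in first if tuple(sorted(pair)) in second_set]


common = shared_pairs(freecontact_top, evcouplings_top)
print(len(common), common)
```

Sorting each pair before the comparison keeps the check robust if one tool reports a pair as (j, i) instead of (i, j).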

Discussion

Calculate structural model

Lab journal

H-Ras

<figure id="ras_evfold">

a) ...
b) ...
c) ...

</figure>

PAH

<figure id="biopterin_evfold">

a) ...
b) ...
c) ...

</figure>

References

<references/>