Protein structure prediction from evolutionary sequence variation (Phenylketonuria)

From Bioinformatikpedia
Revision as of 14:13, 15 June 2013 by Waldraffs (talk | contribs) (PAH)

Summary

...

Multiple Sequence Alignment

Lab journal
The multiple alignment of Ras (PF00071) was downloaded from Pfam.

For our protein, PAH, two domains are included: the ACT domain and the biopterin domain.

Calculate and analyze correlated mutations

Lab journal

Results

FreeContact is a program that takes a multiple sequence alignment and calculates, for each pair of residues, the mutual information score (MI) and the corrected norm contact score (CN).

  • MI:...
  • CN:...
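As a rough illustration of what a mutual information score measures, the following sketch computes the plain MI between two alignment columns from their observed residue frequencies. It is not FreeContact's implementation: the function name `mutual_information` and the toy alignment are hypothetical, and refinements a real tool may apply (sequence weighting, gap handling, pseudocounts) are deliberately left out.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Plain mutual information (in nats) between two alignment columns.

    Assumption: every sequence counts equally and no pseudocounts are added.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_i)
    freq_i = Counter(col_i)             # single-column residue counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))  # joint counts of residue pairs

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: four sequences, three columns.
toy_alignment = ["AKL", "AKV", "GRL", "GRV"]
columns = list(zip(*toy_alignment))

mi_coupled = mutual_information(columns[0], columns[1])      # columns vary together
mi_independent = mutual_information(columns[0], columns[2])  # columns vary independently
```

In the toy alignment the first two columns always change together, so their MI is high, while the first and third columns are independent, so their MI is zero.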

H-Ras

<figtable id="cn-ras">

Ten results with highest CN for all and for extracted pairs

All pairs
pos1  aa1  pos2  aa2  MI    CN
6     L    7     V    0.50  6.01
162   E    163   I    0.47  5.87
83    A    85    N    0.35  5.56
15    G    16    K    0.44  4.86
159   L    160   V    0.47  4.82
9     V    10    G    0.46  4.80
161   R    162   E    0.46  4.63
124   T    125   V    0.34  4.58
10    G    11    A    0.46  4.52
16    K    17    S    0.45  4.44

Extracted pairs
pos1  aa1  pos2  aa2  MI    CN
11    A    92    D    0.32  3.40
81    V    116   N    0.24  3.00
87    T    129   Q    0.22  2.69
82    F    141   Y    0.25  2.53
84    I    115   G    0.16  2.53
19    L    81    V    0.14  2.50
82    F    115   G    0.14  2.42
10    G    16    K    0.39  2.26
130   A    141   Y    0.39  2.25
123   R    143   E    0.27  2.21

The ten FreeContact results for H-Ras with the highest CN values, first for all residue pairs and second only for pairs separated by at least five residues. In both tables, columns one and two give the position and amino acid of the first residue, and columns three and four give the position and amino acid of the second residue. Column five is the mutual information score (MI) and column six is the corrected norm contact score (CN). </figtable>

<figure id="cn_distr_ras">

...

</figure>

Comparing the ten best CN scores for all pairs with those for residue pairs separated by more than five amino acids, it is remarkable that the ten best pairs overall are all direct or near neighbors in sequence (nine are adjacent and one, 83/85, is two positions apart) and that their CN scores are almost twice as high. For the extracted residue pairs, the CN score ranges from -0.65 to 3.40. The distribution can be viewed in <xr id="cn_distr_ras"/>, which also shows the distribution for all residue pairs. The two distributions are very similar, but the range for all pairs extends up to 6.01.
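The extraction of distant pairs described above can be sketched in a few lines. This is only an illustration under stated assumptions: the input is taken to be whitespace-separated rows in the six-column layout of the tables (pos1, aa1, pos2, aa2, MI, CN), which is not necessarily the raw FreeContact output format, and "more than five amino acids" is read as a sequence separation strictly greater than five. The names `Pair`, `parse_pairs` and `top_pairs` are hypothetical.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    pos1: int
    aa1: str
    pos2: int
    aa2: str
    mi: float
    cn: float


def parse_pairs(text):
    """Parse rows of the form 'pos1 aa1 pos2 aa2 MI CN' (assumed layout)."""
    pairs = []
    for line in text.strip().splitlines():
        f = line.split()
        pairs.append(Pair(int(f[0]), f[1], int(f[2]), f[3], float(f[4]), float(f[5])))
    return pairs


def top_pairs(pairs, min_separation=0, n=10):
    """Return the n pairs with the highest CN whose separation exceeds min_separation."""
    kept = [p for p in pairs if abs(p.pos2 - p.pos1) > min_separation]
    return sorted(kept, key=lambda p: p.cn, reverse=True)[:n]


# A few rows copied from the H-Ras tables as sample input.
rows = """
6 L 7 V 0.50 6.01
83 A 85 N 0.35 5.56
11 A 92 D 0.32 3.40
81 V 116 N 0.24 3.00
10 G 16 K 0.39 2.26
"""

pairs = parse_pairs(rows)
all_ranked = top_pairs(pairs)                    # no separation filter
long_range = top_pairs(pairs, min_separation=5)  # only pairs more than five apart
```

With the sample rows, the unfiltered ranking is led by the neighboring pair 6/7, while the filtered ranking starts with 11/92, mirroring the difference between the two tables.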

PAH

<figtable id="cn-pah">

Ten results with highest CN for all and for extracted pairs

All pairs
pos1  aa1  pos2  aa2  MI    CN
217   C    218   G    0.77  3.43
402   F    403   A    0.75  3.25
428   Q    430   L    0.81  3.03
216   Y    217   C    1.12  2.92
174   I    175   P    0.73  2.91
120   W    121   F    0.69  2.91
403   A    404   A    1.02  2.83
447   A    448   L    0.64  2.82
428   Q    429   Q    0.92  2.82
429   Q    430   L    0.81  2.78

Extracted pairs
pos1  aa1  pos2  aa2  MI    CN
342   A    354   L    0.70  2.46
122   P    128   L    0.69  2.40
122   P    129   D    0.69  2.29
257   G    264   H    1.17  2.27
131   F    137   S    1.09  2.25
282   D    290   H    0.18  2.19
151   D    157   R    0.69  2.18
192   K    221   E    0.94  2.17
264   H    277   Y    0.63  2.10
120   W    128   L    0.67  2.09

EVcouplings
pos1  aa1  pos2  aa2  MI    CN
257   G    264   H    1.15  0.25
342   A    354   L    0.81  0.22
352   G    382   F    0.54  0.20
192   K    221   E    0.74  0.18
282   D    290   H    0.24  0.17
174   I    218   G    0.45  0.16
347   L    354   L    0.48  0.15
365   L    385   L    0.80  0.14
326   W    377   Y    0.36  0.13
235   Q    241   R    0.56  0.13

The ten FreeContact results for the biopterin domain of PAH with the highest CN values, first for all residue pairs and second only for pairs separated by at least five residues. The third table, headed EVcouplings, uses the same column layout. In all tables, columns one and two give the position and amino acid of the first residue, and columns three and four give the position and amino acid of the second residue. Column five is the mutual information score (MI) and column six is the corrected norm contact score (CN). </figtable>
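One simple way to compare the extracted FreeContact pairs with the EVcouplings pairs is to look at which residue pairs occur in both top-ten lists. The sketch below does this with plain set operations on the positions copied from the tables above; it only compares positions and makes no assumption about how the scores of the two methods relate. The variable names are hypothetical.

```python
# (pos1, pos2) of the ten extracted FreeContact pairs for PAH, copied from the table.
freecontact_top = {
    (342, 354), (122, 128), (122, 129), (257, 264), (131, 137),
    (282, 290), (151, 157), (192, 221), (264, 277), (120, 128),
}

# (pos1, pos2) of the ten pairs listed under EVcouplings, copied from the table.
evcouplings_top = {
    (257, 264), (342, 354), (352, 382), (192, 221), (282, 290),
    (174, 218), (347, 354), (365, 385), (326, 377), (235, 241),
}

# Pairs found in both lists, and pairs unique to each list.
shared = sorted(freecontact_top & evcouplings_top)
only_freecontact = sorted(freecontact_top - evcouplings_top)
only_evcouplings = sorted(evcouplings_top - freecontact_top)

for pos1, pos2 in shared:
    print(f"{pos1}-{pos2} appears in both lists")
```

Because both tables list the lower position first, the tuples can be compared directly; for input where the order is not guaranteed, each pair would first need to be normalized, for example with `tuple(sorted(pair))`.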


Distribution of CN-Scores: <figure id="cn_distr_ras">

...

</figure>

Discussion

Calculate structural model

Lab journal

H-Ras

<figure id="ras_evfold">

a) ...
b) ...
c) ...

</figure>

PAH

<figure id="biopterin_evfold">

a) ...
b) ...
c) ...

</figure>

References

<references/>