Task 6 (MSUD)

From Bioinformatikpedia
Revision as of 18:22, 14 June 2013 by Schillerl (talk | contribs) (BCKDHA)


Lab journal



From the freecontact output, only residue pairs that are separated by at least 5 residues in the sequence were extracted. Pairs that are close in sequence are naturally coupled, but these couplings are not interesting because they do not help to predict the 3D structure, in which residues that are far apart in the 1D sequence interact. The CN scores of the extracted pairs range from -0.87 to 5.28. The following diagram shows the distribution of the scores:

Figure: distribution of the freecontact CN scores for BCKDHA (MSUD BCKDHA freecontact CN score distribution.png)
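The sequence-separation filter described above could be sketched roughly as follows. This is only an illustration, not the script used for this journal: it assumes that each line of the freecontact output has six whitespace-separated columns (residue number, amino acid, residue number, amino acid, MI score, CN score), and it reads "separated by at least 5 residues" as |i - j| >= 5. The names `Contact`, `parse_contacts` and `filter_by_separation` are hypothetical, and the third example line is made up to show a pair that gets discarded.

```python
from typing import NamedTuple


class Contact(NamedTuple):
    i: int      # position of the first residue
    aa_i: str   # amino acid of the first residue
    j: int      # position of the second residue
    aa_j: str   # amino acid of the second residue
    mi: float   # MI score
    cn: float   # CN score


def parse_contacts(lines):
    """Parse lines in the assumed six-column layout; skip anything else."""
    contacts = []
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue
        i, aa_i, j, aa_j, mi, cn = fields
        contacts.append(Contact(int(i), aa_i, int(j), aa_j, float(mi), float(cn)))
    return contacts


def filter_by_separation(contacts, min_sep=5):
    """Keep only pairs whose sequence separation is at least min_sep (assumed |i - j| >= min_sep)."""
    return [c for c in contacts if abs(c.i - c.j) >= min_sep]


raw_lines = [
    "247 H 291 Y 0.69 5.28",
    "236 F 264 C 0.45 4.02",
    "10 A 12 G 0.50 2.00",   # made-up near-in-sequence pair, removed by the filter
]
kept = filter_by_separation(parse_contacts(raw_lines))
for c in kept:
    print(c.i, c.aa_i, c.j, c.aa_j, c.cn)
```

If the intended meaning is that at least 5 residues lie between the two positions, the comparison would have to be changed to `abs(c.i - c.j) > min_sep`.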

Most scores lie between -1 and 1, so pairs with a score greater than 1 are considered high scoring and were used for further analysis. A high-scoring pair was counted as a true positive (TP) if the minimum distance between the two residues in the reference structure (taken over all pairs of atoms) is below 5 Å, and as a false positive (FP) otherwise. There are 94 TPs among the 194 high-scoring pairs (about 48%). The correlation between TP status and CN score is 0.31, i.e. a pair with a higher CN score is more likely to be a TP. The following table shows the ten highest-scoring pairs:

Residue 1   AA 1   Residue 2   AA 2   MI score   CN score   TP/FP
247         H      291         Y      0.69       5.28       TP
236         F      264         C      0.45       4.02       TP
266         N      327         E      0.31       3.67       TP
239         G      270         A      0.35       3.48       TP
296         I      312         E      0.44       3.37       FP
316         R      324         F      0.60       3.34       TP
144         S      235         Y      0.48       3.17       TP
300         G      330         T      0.27       3.13       TP
261         I      317         A      0.25       3.06       TP
301         N      333         I      0.87       2.77       FP
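The TP/FP labelling and the correlation mentioned above could be computed along the following lines. This is a minimal sketch under stated assumptions: atom coordinates are given as plain (x, y, z) tuples in Å (how they are read from the reference structure is left out), and the correlation is taken to be the Pearson correlation between the CN score and a 0/1 TP indicator, which the text does not state explicitly. The function names and the toy coordinates are hypothetical; the CN scores and labels in the usage part are the ten rows of the table, so the resulting value is not expected to match the 0.31 reported for all 194 pairs.

```python
import math


def min_atom_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom of residue A and any atom of residue B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def is_true_positive(atoms_a, atoms_b, cutoff=5.0):
    """A pair counts as TP if its minimum atom-atom distance is below the cutoff (in Å)."""
    return min_atom_distance(atoms_a, atoms_b) < cutoff


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Toy coordinates (made up) for three residues.
res_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
res_b = [(5.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
res_c = [(20.0, 0.0, 0.0)]
print(is_true_positive(res_a, res_b))  # closest atoms are 3.5 Å apart
print(is_true_positive(res_a, res_c))  # closest atoms are 18.5 Å apart

# CN scores and TP (1) / FP (0) labels of the ten pairs in the table.
cn_scores = [5.28, 4.02, 3.67, 3.48, 3.37, 3.34, 3.17, 3.13, 3.06, 2.77]
tp_labels = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
r = pearson(cn_scores, tp_labels)
print(round(r, 2))
```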

In a second step, the 300 highest-scoring couplings were taken (300 = alignment length). For each residue, the scores of all of these couplings in which it takes part were summed and normalized by the average score of these couplings. The residues with the highest values (clearly separated from the rest) are Thr 338, Phe 130 and Tyr 158. Interestingly, according to UniProt, Tyr 158 lies in the thiamine pyrophosphate binding region, and there are some modified residues (phosphoserine) near position 338.
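The per-residue score from this second step could be sketched like this. It is only an illustration of the summation and normalization as described in the text, assuming couplings are given as (i, j, CN score) tuples; the function name `residue_scores` and the toy couplings are made up and are not taken from the BCKDHA data.

```python
from collections import defaultdict


def residue_scores(couplings, top_n):
    """Sum the CN scores of the top_n couplings per residue and divide by their mean score.

    couplings: iterable of (i, j, cn_score) tuples.
    Returns a dict mapping residue position to its normalized score.
    """
    top = sorted(couplings, key=lambda c: c[2], reverse=True)[:top_n]
    if not top:
        return {}
    mean_score = sum(cn for _, _, cn in top) / len(top)
    totals = defaultdict(float)
    for i, j, cn in top:
        # Each coupling contributes its score to both residues involved.
        totals[i] += cn
        totals[j] += cn
    return {res: total / mean_score for res, total in totals.items()}


# Toy couplings (made up); residue 1 takes part in two of the top three.
toy_couplings = [(1, 10, 4.0), (1, 20, 2.0), (5, 30, 3.0), (7, 40, 0.5)]
scores = residue_scores(toy_couplings, top_n=3)
for res, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(res, round(score, 2))
```

For the analysis described here, `top_n` would be set to the alignment length (300), and residues whose normalized score stands out from the rest would be inspected further.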
