Difference between revisions of "Sequence-based mutation analysis BCKDHA"

From Bioinformatikpedia

Revision as of 21:14, 23 June 2011


A protocol describing all steps for running the programs was created.

Subset of mutations

Position | Reference residue | Reference properties | Reference structure | Mutated residue | Mutated properties | Mutated structure | Structural difference | Secondary structure
29 | G | tiny, small | | E | charged, polar | | | C
125 | Q | amide, polar | | E | charged, polar | | |
166 | Y | hydrophobic, aromatic, polar | BCKDHA Y121N Y.png | N | amide, polar, small | BCKDHA Y121N N.png | BCKDHA Y121N.png |
249 | G | tiny, small | BCKDHA G204S G.png | S | polar, small, tiny, hydroxylic | BCKDHA G204S S.png | BCKDHA G204S.png |
264 | C | sulphur containing, hydrophobic, tiny, small, polar | BCKDHA C219W C.png | W | hydrophobic, aromatic, polar | BCKDHA C219W W.png | BCKDHA C219W.png |
265 | R | charged, positive (basic), polar | BCKDHA R220W R.png | W | hydrophobic, aromatic, polar | BCKDHA R220W W.png | BCKDHA R220W.png |
326 | I | aliphatic, hydrophobic | BCKDHA I281T I.png | T | hydroxylic, hydrophobic, small, polar | BCKDHA I281T T.png | BCKDHA I281T.png |
361 | I | aliphatic, hydrophobic | BCKDHA I316V I.png | V | aliphatic, hydrophobic, small | BCKDHA I316V V.png | BCKDHA I316V.png |
409 | F | aromatic, hydrophobic | BCKDHA F364C F.png | C | sulphur containing, hydrophobic, tiny, small, polar | BCKDHA F364C C.png | BCKDHA F364C.png |
438 | Y | hydrophobic, aromatic, polar | BCKDHA Y393N Y.png | N | amide, polar, small | BCKDHA Y393N N.png | BCKDHA Y393N.png |

Annotation: H = helix, E = beta-sheet, C = coil

To visualize the mutations in the three-dimensional protein structure, the PDB entry for BCKDHA, 1U5B, was used. As the PDB file only contains coordinates for the protein itself, the signal peptide (the first 45 amino acids) is not included. Therefore the first mutation at position 29, which lies in the signal peptide, could not be visualized. The image names in the table above appear to use positions shifted by 45 (for example, Y166N is shown as Y121N), which matches a numbering without the signal peptide.
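The renumbering implied here can be sketched in a few lines. The snippet assumes that the structure-based numbering is simply the sequence position minus the 45 signal-peptide residues, which agrees with the image names in the table (Y166N is shown as Y121N). The function name is hypothetical, and whether the residue numbers inside 1U5B follow exactly this scheme is an assumption.

```python
# Length of the signal peptide, as stated in the text.
SIGNAL_PEPTIDE_LENGTH = 45


def to_structure_position(sequence_position, offset=SIGNAL_PEPTIDE_LENGTH):
    """Map a precursor sequence position to a position without the signal peptide.

    Returns None when the position lies inside the signal peptide and
    therefore has no counterpart in the structure.
    """
    if sequence_position <= offset:
        return None
    return sequence_position - offset


# The ten mutation positions analysed on this page.
positions = [29, 125, 166, 249, 264, 265, 326, 361, 409, 438]

for position in positions:
    mapped = to_structure_position(position)
    if mapped is None:
        print(f"{position}: inside the signal peptide, not in the structure")
    else:
        print(f"{position}: shifted position {mapped}")
```

Position 29 is reported as lying in the signal peptide, and all other positions are shifted by 45.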

The protocol describes in detail how we used PyMOL to visualize the mutations.

Position AA1/AA2 BLOSUM62 PAM1 PAM250
score worst score worst score worst
29 G/E -2 -4 (I, L) 7 0 (I, W, Y) 9 2 (W)
125 Q/E 2 -3 (C, F, I) 27 0 (F, W, Y) 7 1 (C, F, W)
166 Y/N -2 -3 (D, G, P) 3 0 (R, D, Q, G, K, M, P) 2 1 (A, R, D, Q, E, G, K, P)
249 G/S 0 -4 (I, L) 21 0 (I, W, Y) 11 2 (W)
264 C/W -2 -4 (E) 0 0 (N, D, Q, E, G, L, K, M, F, W) 1 1 (R, N, D, Q, E, L, K, M, F, W)
265 R/W -3 -3 (W, V, F, I, C) 8 0 (D, E, G, Y) 7 1 (F)
326 I/T -1 -4 (G) 7 0 (G, A, P, W) 4 1 (W)
361 I/V 3 -4 (G) 33 0 (G, H, P, W) 9 1 (W)
409 F/C -2 -4 (P) 0 0 (D, C, Q, E, K, P, V) 1 1 (R, D, C, Q, E, G, K, P)
438 Y/N -2 -3 (D, G, P) 3 0 (R, D, Q, G, K, M, P) 2 1 (A, R, D, Q, E, G, K, P)
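One way to read the table is to compare each observed score with the "worst" value next to it. The sketch below does this for the BLOSUM62 columns only, using the numbers copied from the table. It assumes that "worst" is the lowest score the reference residue can reach in the matrix; the helper names are hypothetical.

```python
# (position, substitution, BLOSUM62 score, BLOSUM62 worst score),
# copied from the table above.
BLOSUM62_ROWS = [
    (29, "G/E", -2, -4),
    (125, "Q/E", 2, -3),
    (166, "Y/N", -2, -3),
    (249, "G/S", 0, -4),
    (264, "C/W", -2, -4),
    (265, "R/W", -3, -3),
    (326, "I/T", -1, -4),
    (361, "I/V", 3, -4),
    (409, "F/C", -2, -4),
    (438, "Y/N", -2, -3),
]


def margin_to_worst(score, worst):
    """Distance between the observed score and the worst possible one."""
    return score - worst


def substitutions_at_worst(rows):
    """Return the substitutions whose score equals the worst score."""
    return [sub for _, sub, score, worst in rows if score == worst]


for position, substitution, score, worst in BLOSUM62_ROWS:
    margin = margin_to_worst(score, worst)
    print(f"{position} {substitution}: score {score}, margin to worst {margin}")

print("At worst score:", substitutions_at_worst(BLOSUM62_ROWS))
```

A margin of 0 means the substitution is as unfavourable as the matrix allows for that reference residue.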



To obtain a human-readable PSSM file, PSI-BLAST was run using the following command:

blastpgp -i sequence.fasta -j 5 -d /data/blast/nr/nr -C profile.ckp -u 1 -J T

Multiple Sequence Alignment

To find sequences homologous to BCKDHA, we used BLAST. It found 250 homologous sequences, 25 of which are from mammals.

ID Accession Entry name
sp P11178 ODBA_BOVIN
sp P12694 ODBA_HUMAN
sp P50136 ODBA_MOUSE
sp P11960 ODBA_RAT
tr B2R8A9 B2R8A9_HUMAN
tr Q658P7 Q658P7_HUMAN
tr E7EW46 E7EW46_HUMAN
tr B4DP47 B4DP47_HUMAN
tr Q59EI3 Q59EI3_HUMAN
tr F1N5F2 F1N5F2_BOVIN
tr B1PK12 B1PK12_PIG
tr F1PI86 F1PI86_CANFA
tr Q3U3J1 Q3U3J1_MOUSE
tr Q99L69 Q99L69_MOUSE
tr Q5EB89 Q5EB89_RAT

Multiple Alignment of the homologous sequences of BCKDHA with CLUSTALW

With these 25 results we built a multiple alignment using CLUSTALW. The alignment of all mammalian homologs was poor because of the sequences "Q6ZSA3" and "E2RPW4", which are much longer than the others. We therefore removed these two sequences and realigned the remaining ones.

With this new multiple alignment we analyzed how well the 10 mutation positions are conserved.

Position Conservation (wildtype) Conservation (mutant)
29 0.72 0
125 0.96 0
166 1 0
249 1 0
264 1 0
265 1 0
326 1 0
361 0.92 0
409 0.92 0
438 0.92 0

The results show that the wildtype amino acids at the observed positions are well conserved: the conservation is at least 0.92 at nine of the ten positions. Only at position 29 is the conservation of glycine lower, at about 72%, which is still fairly well conserved. None of the mutant amino acids occurs in the alignment at its position (conservation 0). Well-conserved regions of a protein are probably important for its structure and function. Because all ten positions are well conserved, mutations at these positions could be damaging and could have a large impact on the protein and its function.
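The conservation values can be reproduced from an alignment with a short script. The following is a minimal sketch, assuming that conservation means the fraction of aligned sequences that carry a given residue in a given alignment column (gaps count as sequences without that residue). The toy alignment and the function name are hypothetical and are not taken from the CLUSTALW output.

```python
def column_conservation(alignment, column, residue):
    """Fraction of aligned sequences that have `residue` at the 0-based `column`."""
    if not alignment:
        raise ValueError("alignment is empty")
    column_residues = [sequence[column] for sequence in alignment]
    return column_residues.count(residue) / len(column_residues)


# Hypothetical toy alignment: four aligned sequences of equal length.
toy_alignment = [
    "MGQ",
    "MGQ",
    "MAQ",
    "M-E",
]

# Column 1 holds G, G, A and a gap.
wildtype = column_conservation(toy_alignment, 1, "G")
mutant = column_conservation(toy_alignment, 1, "E")
print(f"wildtype G: {wildtype:.2f}, mutant E: {mutant:.2f}")
```

For a real alignment, the column index has to be the alignment column that corresponds to the sequence position in the human sequence, because gaps shift the indices.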


To run SNAP we used the command:

snapfun -i BCKDHA.fasta -m mutations.txt -o SNAP.out

nsSNP Prediction Reliability Index Expected Accuracy
G29E Neutral 0 53%
Q125E Non-neutral 1 63%
Y166N Non-neutral 2 70%
G249S Neutral 1 60%
C264W Non-neutral 4 82%
R265W Non-neutral 4 82%
I326T Non-neutral 3 78%
I361V Neutral 4 85%
F409C Non-neutral 4 82%
Y438N Non-neutral 4 82%

In a second SNAP run, each of the ten chosen positions was mutated to every other possible amino acid.
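The list of mutations for such an exhaustive run can be generated rather than typed by hand. The sketch below builds all 19 substitutions for each of the ten positions in the notation of the nsSNP column (for example G29E). Whether SNAP expects exactly this notation, one mutation per line, in its mutation file is an assumption; the names in the code are hypothetical.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# (reference residue, position) for the ten chosen positions,
# taken from the SNAP table above.
REFERENCE_RESIDUES = [
    ("G", 29), ("Q", 125), ("Y", 166), ("G", 249), ("C", 264),
    ("R", 265), ("I", 326), ("I", 361), ("F", 409), ("Y", 438),
]


def all_substitutions(reference_residues):
    """Return every single substitution, e.g. 'G29A', for each position."""
    mutations = []
    for reference, position in reference_residues:
        for alternative in AMINO_ACIDS:
            if alternative != reference:
                mutations.append(f"{reference}{position}{alternative}")
    return mutations


mutations = all_substitutions(REFERENCE_RESIDUES)
print(len(mutations))      # 10 positions x 19 alternatives = 190
print("\n".join(mutations[:5]))
```

Writing the list to a file, one entry per line, would then give a candidate input for the `-m` option, subject to the format assumption stated above.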


The following table displays the SIFT results. The threshold for intolerance is 0.05.

The amino acids are colored in the following way:

  • nonpolar
  • uncharged polar
  • basic
  • acidic

Capital letters: amino acids that appear in the alignment

Lower-case letters: amino acids that result from prediction

Seq Rep: fraction of sequences that contain one of the basic (i.e. standard) amino acids at this position

Position | Reference AA | Mutated AA | Predict Not Tolerated | Seq Rep | Predict Tolerated | SIFT Matrix (BCKDHA aa.PNG) | Prediction
29 | G | E | | 0.37 | wcmPdIGnqRhVTkeSFLAy | BCKDHA 29.PNG | tolerated
125 | Q | E | ywvtsrpnmlkihgfedca | 0.98 | Q | BCKDHA 125.PNG | not tolerated
166 | Y | N | cpdmeqkngrtisval | 1.00 | FHYW | BCKDHA 166.PNG | not tolerated
249 | G | S | whyfimrqnlckdvtps | 1.00 | EGA | BCKDHA 249.PNG | not tolerated
264 | C | W | ywvtsrqpnmlkihgfeda | 0.98 | C | BCKDHA 264.PNG | not tolerated
265 | R | W | ywvtsqpnmlkihgfedca | 1.00 | R | BCKDHA 265.PNG | not tolerated
326 | I | T | hdwpneqcrsgkytaM | 1.00 | FLVI | BCKDHA 326.PNG | not tolerated
361 | I | V | hwdgnqryekspcfmtlA | 1.00 | VI | BCKDHA 361.PNG | tolerated
409 | F | C | hndkrqgecpstamvwiy | 1.00 | LF | BCKDHA 409.PNG | not tolerated
438 | Y | N | wvtsrqpnmlkihgfedca | 1.00 | Y | BCKDHA 438.PNG | not tolerated

The only substitutions that SIFT predicts not to affect protein function are G29E and I361V. The first substitution may be tolerated because position 29 lies in the signal peptide and is not part of the mature protein. The second tolerated exchange is from isoleucine to valine, which are both branched-chain amino acids. These two amino acids are quite similar in structure and physicochemical properties, so an exchange can be tolerated.
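The prediction column of the SIFT table can be derived from the two amino acid lists in each row: a substitution is tolerated if the mutated amino acid is in the "Predict Tolerated" list and not tolerated if it is in the "Predict Not Tolerated" list. The sketch below is one way to check this, with the lists copied from the table. It ignores letter case, and the function and variable names are hypothetical.

```python
# position: (reference AA, mutated AA, predicted not tolerated, predicted tolerated)
# All strings are copied from the SIFT table above.
SIFT_ROWS = {
    29: ("G", "E", "", "wcmPdIGnqRhVTkeSFLAy"),
    125: ("Q", "E", "ywvtsrpnmlkihgfedca", "Q"),
    166: ("Y", "N", "cpdmeqkngrtisval", "FHYW"),
    249: ("G", "S", "whyfimrqnlckdvtps", "EGA"),
    264: ("C", "W", "ywvtsrqpnmlkihgfeda", "C"),
    265: ("R", "W", "ywvtsqpnmlkihgfedca", "R"),
    326: ("I", "T", "hdwpneqcrsgkytaM", "FLVI"),
    361: ("I", "V", "hwdgnqryekspcfmtlA", "VI"),
    409: ("F", "C", "hndkrqgecpstamvwiy", "LF"),
    438: ("Y", "N", "wvtsrqpnmlkihgfedca", "Y"),
}


def sift_call(mutated, not_tolerated, tolerated):
    """Look up the mutated amino acid in the two lists, ignoring letter case."""
    amino_acid = mutated.upper()
    if amino_acid in tolerated.upper():
        return "tolerated"
    if amino_acid in not_tolerated.upper():
        return "not tolerated"
    return "unknown"


calls = {
    position: sift_call(mutated, not_tolerated, tolerated)
    for position, (_, mutated, not_tolerated, tolerated) in SIFT_ROWS.items()
}
tolerated_positions = [p for p, call in calls.items() if call == "tolerated"]
print(tolerated_positions)
```

The lookup marks positions 29 and 361 as tolerated, in agreement with the prediction column.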


Position AA1/AA2 HumDiv HumVar
prediction Score Sensitivity Specificity prediction Score Sensitivity Specificity
29 G/E benign 0.025 0.96 0.80 benign 0.018 0.96 0.52
125 Q/E possibly damaging 0.759 0.85 0.93 benign 0.285 0.87 0.75
166 Y/N probably damaging 0.997 0.40 0.98 probably damaging 0.964 0.59 0.93
249 G/S benign 0.145 0.93 0.86 benign 0.292 0.86 0.75
264 C/W probably damaging 1.000 0.00 1.00 probably damaging 1.000 0.00 1.00
265 R/W probably damaging 1.000 0.00 1.00 probably damaging 1.000 0.00 1.00
326 I/T probably damaging 0.997 0.40 0.98 probably damaging 0.998 0.16 0.99
361 I/V benign 0.039 0.95 0.82 benign 0.178 0.89 0.70
409 F/C probably damaging 0.998 0.27 0.99 probably damaging 0.939 0.64 0.92
438 Y/N probably damaging 1.000 0.00 1.00 probably damaging 0.987 0.49 0.96

