Sequence-based mutation analysis BCKDHA

From Bioinformatikpedia
Revision as of 18:01, 23 June 2011 by Demel (talk | contribs) (SIFT)

General

A protocol was created that describes all steps for running the programs.

Subset of mutations

Position | Reference residue | Reference properties | Mutated residue | Mutated properties | Secondary structure
29 | G | tiny, small | E | charged, polar | C
125 | Q | amide, polar | E | charged, polar | C
166 | Y | hydrophobic, aromatic, polar | N | amide, polar, small | H
249 | G | tiny, small | S | polar, small, tiny, hydroxylic | H
264 | C | sulphur containing, hydrophobic, tiny, small, polar | W | hydrophobic, aromatic, polar | E
265 | R | charged, positive (basic), polar | W | hydrophobic, aromatic, polar | E
326 | I | aliphatic, hydrophobic | T | hydroxylic, hydrophobic, small, polar | E
361 | I | aliphatic, hydrophobic | V | aliphatic, hydrophobic, small | H
409 | F | aromatic, hydrophobic | C | sulphur containing, hydrophobic, tiny, small, polar | C
438 | Y | hydrophobic, aromatic, polar | N | amide, polar, small | C

Annotation: H = helix, E = beta-sheet, C = coil

To visualize the mutations in the three-dimensional protein structure, the PDB entry for BCKDHA, 1U5B, was used. As the PDB file only contains coordinates for the protein itself, the signal peptide (the first 45 amino acids) is not part of the structure. Therefore the first mutation, at position 29, which lies in the signal peptide, could not be visualized.
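The check for which mutations can be shown in the structure can be sketched as follows. This is a minimal illustration, not the procedure from the protocol: the helper names are hypothetical, and the code assumes that the residue numbering in 1U5B starts at 1 directly after the 45-residue signal peptide, which should be verified against the PDB file before use.

```python
# Length of the signal peptide as stated in the text.
SIGNAL_PEPTIDE_LENGTH = 45

# The ten mutations from the table above, in precursor numbering.
MUTATIONS = ["G29E", "Q125E", "Y166N", "G249S", "C264W",
             "R265W", "I326T", "I361V", "F409C", "Y438N"]


def parse_mutation(mutation):
    """Split a string such as 'G29E' into ('G', 29, 'E')."""
    return mutation[0], int(mutation[1:-1]), mutation[-1]


def mature_position(position, offset=SIGNAL_PEPTIDE_LENGTH):
    """Return the position in the mature chain, or None inside the signal peptide.

    ASSUMPTION: the structure is numbered from 1 after the cleaved peptide.
    """
    return position - offset if position > offset else None


mapped = {m: mature_position(parse_mutation(m)[1]) for m in MUTATIONS}

for mutation, position in mapped.items():
    if position is None:
        print(f"{mutation}: inside the signal peptide, not in the structure")
    else:
        print(f"{mutation}: residue {position} under the assumed numbering")
```

Under this assumption only G29E is reported as missing from the structure, in line with the text.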

The protocol describes in detail how we used PyMOL to visualize our mutations.

Position | AA1/AA2 | BLOSUM62 score | BLOSUM62 worst | PAM1 score | PAM1 worst | PAM250 score | PAM250 worst
29 | G/E | -2 | -4 (I, L) | 7 | 0 (I, W, Y) | 9 | 2 (W)
125 | Q/E | 2 | -3 (C, F, I) | 27 | 0 (F, W, Y) | 7 | 1 (C, F, W)
166 | Y/N | -2 | -3 (D, G, P) | 3 | 0 (R, D, Q, G, K, M, P) | 2 | 1 (A, R, D, Q, E, G, K, P)
249 | G/S | 0 | -4 (I, L) | 21 | 0 (I, W, Y) | 11 | 2 (W)
264 | C/W | -2 | -4 (E) | 0 | 0 (N, D, Q, E, G, L, K, M, F, W) | 1 | 1 (R, N, D, Q, E, L, K, M, F, W)
265 | R/W | -3 | -3 (W, V, F, I, C) | 8 | 0 (D, E, G, Y) | 7 | 1 (F)
326 | I/T | -1 | -4 (G) | 7 | 0 (G, A, P, W) | 4 | 1 (W)
361 | I/V | 3 | -4 (G) | 33 | 0 (G, H, P, W) | 9 | 1 (W)
409 | F/C | -2 | -4 (P) | 0 | 0 (D, C, Q, E, K, P, V) | 1 | 1 (R, D, C, Q, E, G, K, P)
438 | Y/N | -2 | -3 (D, G, P) | 3 | 0 (R, D, Q, G, K, M, P) | 2 | 1 (A, R, D, Q, E, G, K, P)
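The "score" and "worst" columns for BLOSUM62 can be reproduced with a small lookup, sketched below. The sketch is only an illustration and not the method used for the table: the function names are hypothetical, and the seven rows were transcribed by hand from the standard BLOSUM62 matrix for the reference residues that occur in the table, so they should be checked against an official copy of the matrix before being reused.

```python
# Column order of the standard BLOSUM62 matrix.
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"

# Rows for the reference residues in the table only (hand-transcribed,
# ASSUMPTION: verify against an official BLOSUM62 file).
BLOSUM62_ROWS = {
    "G": [0, -2, 0, -1, -3, -2, -2, 6, -2, -4, -4, -2, -3, -3, -2, 0, -2, -2, -3, -3],
    "Q": [-1, 1, 0, 0, -3, 5, 2, -2, 0, -3, -2, 1, 0, -3, -1, 0, -1, -2, -1, -2],
    "Y": [-2, -2, -2, -3, -2, -1, -2, -3, 2, -1, -1, -2, -1, 3, -3, -2, -2, 2, 7, -1],
    "C": [0, -3, -3, -3, 9, -3, -4, -3, -3, -1, -1, -3, -1, -2, -3, -1, -1, -2, -2, -1],
    "R": [-1, 5, 0, -2, -3, 1, 0, -2, 0, -3, -2, 2, -1, -3, -2, -1, -1, -3, -2, -3],
    "I": [-1, -3, -3, -3, -1, -3, -3, -4, -3, 4, 2, -3, 1, 0, -3, -2, -1, -3, -1, 3],
    "F": [-2, -3, -3, -3, -2, -3, -3, -3, -1, 0, 0, -3, 0, 6, -4, -2, -2, 1, 3, -1],
}


def substitution_score(ref, mut):
    """Score for replacing residue `ref` by residue `mut`."""
    return BLOSUM62_ROWS[ref][AA_ORDER.index(mut)]


def worst_substitutions(ref):
    """Lowest score in the row of `ref` and the residues that reach it."""
    row = BLOSUM62_ROWS[ref]
    lowest = min(row)
    return lowest, [aa for aa, s in zip(AA_ORDER, row) if s == lowest]


for pair in ["G/E", "Q/E", "Y/N", "G/S", "C/W", "R/W", "I/T", "I/V", "F/C"]:
    ref, mut = pair.split("/")
    lowest, residues = worst_substitutions(ref)
    print(pair, substitution_score(ref, mut), lowest, ", ".join(residues))
```

The same two functions would work for PAM1 and PAM250 if the corresponding rows were supplied.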


BLOSUM62
PAM1
PAM250

PSSM

To obtain a human-readable PSSM file, PSI-BLAST was run with the following command:

blastpgp -i sequence.fasta -j 5 -d /data/blast/nr/nr -C profile.ckp -u 1 -J T

Multiple Sequence Alignment

To find sequences homologous to BCKDHA we used BLAST. It found 250 homologous sequences, 25 of which are from mammals.

ID Accession Entry name
sp P11178 ODBA_BOVIN
sp P12694 ODBA_HUMAN
sp Q8HXY4 ODBA_MACFA
sp P50136 ODBA_MOUSE
sp A5A6H9 ODBA_PANTR
sp P11960 ODBA_RAT
tr Q6ZSA3 Q6ZSA3_HUMAN
tr E7ESE6 E7ESE6_HUMAN
tr B2R8A9 B2R8A9_HUMAN
tr Q658P7 Q658P7_HUMAN
tr E7EW46 E7EW46_HUMAN
tr B4DP47 B4DP47_HUMAN
tr Q59EI3 Q59EI3_HUMAN
tr F1N5F2 F1N5F2_BOVIN
tr B1PK12 B1PK12_PIG
tr E2RPW4 E2RPW4_CANFA
tr B2LSM3 B2LSM3_SHEEP
tr F1RHA0 F1RHA0_PIG
tr F1PI86 F1PI86_CANFA
tr D2HMT3 D2HMT3_AILME
tr Q2TBT9 Q2TBT9_BOVIN
tr Q3U3J1 Q3U3J1_MOUSE
tr Q99L69 Q99L69_MOUSE
tr Q5EB89 Q5EB89_RAT
tr B1WBN3 B1WBN3_RAT
Multiple Alignment of the homologous sequences of BCKDHA with CLUSTALW

With these 25 results we built a multiple alignment using CLUSTALW. The alignment of all mammalian homologues was poor because of the sequences "Q6ZSA3" and "E2RPW4", which are much longer than the others. We therefore removed these two sequences and realigned the remaining ones.
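Removing the two long sequences before realignment can be sketched as below. This is an illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the headers are assumed to follow the UniProt style `db|accession|entry_name`, and the sequences in the demo are short placeholders rather than real protein sequences.

```python
# Accessions named in the text as disturbing the alignment.
EXCLUDED = {"Q6ZSA3", "E2RPW4"}


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def accession(header):
    """Take the middle field of 'sp|P12694|ODBA_HUMAN' style headers."""
    parts = header.split("|")
    return parts[1] if len(parts) >= 3 else header.split(" ")[0]


def drop_records(records, excluded):
    """Keep only records whose accession is not in `excluded`."""
    return [(h, s) for h, s in records if accession(h) not in excluded]


def to_fasta(records, width=60):
    """Format records as FASTA text with wrapped sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Placeholder input, NOT real sequences.
demo = """>sp|P12694|ODBA_HUMAN
MAVAIAAA
>tr|Q6ZSA3|Q6ZSA3_HUMAN
MAVAIAAAMMMMMMMM
>tr|E2RPW4|E2RPW4_CANFA
MKKKKKKKKKKK
>sp|P50136|ODBA_MOUSE
MAVAI
"""

kept = drop_records(read_fasta(demo), EXCLUDED)
print(to_fasta(kept), end="")
```

The filtered FASTA text could then be passed to CLUSTALW for the second alignment.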

SNAP

To run SNAP we used the command:

snapfun -i BCKDHA.fasta -m mutations.txt -o SNAP.out

nsSNP | Prediction | Reliability Index | Expected Accuracy
G29E | Neutral | 0 | 53%
Q125E | Non-neutral | 1 | 63%
Y166N | Non-neutral | 2 | 70%
G249S | Neutral | 1 | 60%
C264W | Non-neutral | 4 | 82%
R265W | Non-neutral | 4 | 82%
I326T | Non-neutral | 3 | 78%
I361V | Neutral | 4 | 85%
F409C | Non-neutral | 4 | 82%
Y438N | Non-neutral | 4 | 82%

In a second SNAP run, each of the ten chosen positions was mutated to every possible substitution.
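The mutation list for such an exhaustive run could be generated as sketched below. The function name is hypothetical, and writing one mutation per line in the form `G29E` is an assumption about the input format expected by snapfun, which should be checked against its documentation.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Reference residues at the ten chosen positions (from the tables above).
REFERENCE = {29: "G", 125: "Q", 166: "Y", 249: "G", 264: "C",
             265: "R", 326: "I", 361: "I", 409: "F", 438: "Y"}


def all_substitutions(reference):
    """List every non-identical substitution at every given position."""
    return [f"{ref}{pos}{alt}"
            for pos, ref in sorted(reference.items())
            for alt in AMINO_ACIDS
            if alt != ref]


mutations = all_substitutions(REFERENCE)

# ASSUMPTION: one mutation per line; could be saved as mutations.txt.
mutation_file_text = "\n".join(mutations) + "\n"

print(len(mutations), "mutations, first five:", mutations[:5])
```

Each position contributes 19 substitutions, so the ten positions give 190 entries in total.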

SIFT

Position Reference AA Mutated AA SIFT prediction SIFT Matrix Prediction
Predict Not Tolerated Seq Rep Predict Tolerated A C D E F G H I K L M N P Q R S T V W Y

Polyphen2

Position | AA1/AA2 | HumDiv prediction | HumDiv score | HumDiv sensitivity | HumDiv specificity | HumVar prediction | HumVar score | HumVar sensitivity | HumVar specificity
29 | G/E | benign | 0.025 | 0.96 | 0.80 | benign | 0.018 | 0.96 | 0.52
125 | Q/E | possibly damaging | 0.759 | 0.85 | 0.93 | benign | 0.285 | 0.87 | 0.75
166 | Y/N | probably damaging | 0.997 | 0.40 | 0.98 | probably damaging | 0.964 | 0.59 | 0.93
249 | G/S | benign | 0.145 | 0.93 | 0.86 | benign | 0.292 | 0.86 | 0.75
264 | C/W | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 1.000 | 0.00 | 1.00
265 | R/W | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 1.000 | 0.00 | 1.00
326 | I/T | probably damaging | 0.997 | 0.40 | 0.98 | probably damaging | 0.998 | 0.16 | 0.99
361 | I/V | benign | 0.039 | 0.95 | 0.82 | benign | 0.178 | 0.89 | 0.70
409 | F/C | probably damaging | 0.998 | 0.27 | 0.99 | probably damaging | 0.939 | 0.64 | 0.92
438 | Y/N | probably damaging | 1.000 | 0.00 | 1.00 | probably damaging | 0.987 | 0.49 | 0.96
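The PolyPhen-2 calls can be compared with the SNAP predictions listed earlier on this page, as in the sketch below. It is only an illustration: the function names are hypothetical, and treating both "possibly damaging" and "probably damaging" as equivalent to SNAP's "Non-neutral" is an assumption made for the comparison, not a rule stated by either tool.

```python
# Predictions copied from the SNAP and PolyPhen-2 tables on this page.
SNAP = {"G29E": "Neutral", "Q125E": "Non-neutral", "Y166N": "Non-neutral",
        "G249S": "Neutral", "C264W": "Non-neutral", "R265W": "Non-neutral",
        "I326T": "Non-neutral", "I361V": "Neutral", "F409C": "Non-neutral",
        "Y438N": "Non-neutral"}

HUMDIV = {"G29E": "benign", "Q125E": "possibly damaging",
          "Y166N": "probably damaging", "G249S": "benign",
          "C264W": "probably damaging", "R265W": "probably damaging",
          "I326T": "probably damaging", "I361V": "benign",
          "F409C": "probably damaging", "Y438N": "probably damaging"}

# HumVar differs from HumDiv only for Q125E.
HUMVAR = dict(HUMDIV, Q125E="benign")


def polyphen_is_damaging(label):
    """ASSUMPTION: anything other than 'benign' counts as damaging."""
    return label != "benign"


def disagreements(snap, polyphen):
    """Mutations where SNAP and PolyPhen-2 give opposite calls."""
    return [m for m in snap
            if (snap[m] == "Non-neutral") != polyphen_is_damaging(polyphen[m])]


print("SNAP vs HumDiv:", disagreements(SNAP, HUMDIV))
print("SNAP vs HumVar:", disagreements(SNAP, HUMVAR))
```

With this mapping, the only mutation on which the tools differ is Q125E, and only for the HumVar model.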


