Task 9 Lab Journal (MSUD)
Selection of structure model
As a trade-off between resolution and sequence completeness, we chose the PDB structure 2BFF as the structural model for BCKDHA.
Visualization of mutant structures
To map the mutation positions onto the PDB structure 2BFF, we aligned the SEQRES sequence of 2BFF to the BCKDHA reference sequence (NP_000700.1) using the Needleman-Wunsch algorithm. The SEQRES sequence starts at reference residue 46, so 45 must be subtracted from each reference position to obtain the position in the PDB sequence. The alignment is shown below:
NP_000700.1   1 MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDD 50
                                                             |||||
SEQUENCE      1 ---------------------------------------------SSLDD 5

NP_000700.1  51 KPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKE 100
                ||||||||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE      6 KPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKE 55

NP_000700.1 101 KVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDN 150
                ||||||||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE     56 KVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDN 105

NP_000700.1 151 TDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKER 200
                ||||||||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE    106 TDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKER 155

NP_000700.1 201 HFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGF 250
                ||||||||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE    156 HFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGF 205

NP_000700.1 251 NFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG 300
                ||||||||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE    206 NFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG 255

NP_000700.1 301 NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDE 350
                ||||||||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE    256 NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDE 305

NP_000700.1 351 VNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERK 400
                |.||||||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE    306 VGYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERK 355

NP_000700.1 401 PKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK 445
                |||||||||||||||||||||||||||||||||||||||||||||
SEQUENCE    356 PKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK 400
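The position shift can be read directly from the alignment. The following is a minimal sketch, assuming (as is the case in the alignment above) that the only gaps are the leading ones in the PDB row; the function names are illustrative and not part of any script used in this task.

```python
def leading_gap_offset(aligned_pdb: str) -> int:
    """Count the leading gap characters in the aligned PDB sequence,
    i.e. the number of reference residues missing at the N-terminus."""
    return len(aligned_pdb) - len(aligned_pdb.lstrip("-"))


def to_pdb_position(ref_position: int, offset: int) -> int:
    """Convert a 1-based reference position to a 1-based PDB sequence position."""
    pdb_position = ref_position - offset
    if pdb_position < 1:
        raise ValueError(
            f"reference position {ref_position} is not covered by the structure"
        )
    return pdb_position


# PDB row of the first alignment block: 45 gaps followed by SSLDD.
aligned_pdb_block = "-" * 45 + "SSLDD"
offset = leading_gap_offset(aligned_pdb_block)

# Reference positions of the five mutations used in this task.
for ref_pos in (82, 222, 264, 346, 361):
    print(ref_pos, "->", to_pdb_position(ref_pos, offset))
```

If the alignment contained internal gaps, a fixed offset would no longer be valid and the position would have to be counted along the alignment instead.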
Create mutated structures (SCWRL)
With repairPDB the PDB file of 2BFF was repaired and the sequence was extracted:
/opt/SS12-Practical/scripts/repairPDB 2BFF.pdb > 2BFF_repaired.pdb
/opt/SS12-Practical/scripts/repairPDB 2BFF.pdb -seq > 2BFF_sequence.txt
The mutations were introduced into the sequence with the following script (located at /mnt/home/student/schillerl/MasterPractical/task9/create_mutated_sequences.py):
<source lang=python>
"""
Create mutated sequences for a sequence extracted from pdb file.

Usage: python create_mutated_sequences.py <sequence file> <pdb sequence file> [mutations]
    <sequence file>      reference sequence fasta file
                         (mutation numbers correspond to this sequence)
    <pdb sequence file>  sequence in pdb file (extracted with repairPDB -seq option)
    mutations            list of mutations

One output file for each mutation will be created (named like pdb sequence file
extended with mutation identifier), all residues lower case except for mutated.

Example for usage:
python create_mutated_sequences.py refseq_BCKDHA_protein.fasta 2BFF_sequence.txt M82L A222T C264W R346H I361V

@author: Laura Schiller
"""

import sys

from Bio import SeqIO, pairwise2
# Bio.SubsMat is only available in older Biopython releases
from Bio.SubsMat import MatrixInfo

# reference sequence as a plain string for the alignment
ref_seq = str(SeqIO.read(sys.argv[1], "fasta").seq)
mutations = sys.argv[3:]

# the pdb sequence is expected on the first line of the file
with open(sys.argv[2]) as pdb_seq_file:
    pdb_seq = pdb_seq_file.readline().strip()

# pairwise alignment
matrix = MatrixInfo.blosum62
gap_open = -10
gap_extend = -0.5
alignment = pairwise2.align.globalds(ref_seq, pdb_seq, matrix, gap_open, gap_extend)
ref_seq_aligned = alignment[0][0]
pdb_seq_aligned = alignment[0][1]

for mutation in mutations:
    mut_pos = int(mutation[1:-1])
    old_aa = mutation[0]
    new_aa = mutation[-1]

    # determine corresponding position in pdb sequence
    pos = 0
    for i in range(len(ref_seq_aligned)):
        if ref_seq_aligned[i] != '-':
            pos += 1
        if pos == mut_pos:
            break
    # the wild type residue in the pdb sequence must match the mutation identifier
    assert pdb_seq_aligned[i] == old_aa

    # all residues lower case, only the mutated residue upper case
    mutated_seq = pdb_seq_aligned.lower()
    mutated_seq = mutated_seq[0:i] + new_aa + mutated_seq[i+1:]

    # e.g. 2BFF_sequence.txt -> 2BFF_sequence_M82L.txt
    out_name = sys.argv[2].split(".")[0] + "_" + mutation + "." + sys.argv[2].split(".")[-1]
    with open(out_name, "w") as out_file:
        # remove alignment gaps before writing
        out_file.write(mutated_seq.replace('-', '') + "\n")
</source>
For each mutated sequence, a structure was created with SCWRL:
<source lang=bash>
for mutation in M82L A222T C264W R346H I361V; do
/opt/SS12-Practical/scwrl4/Scwrl4 -i 2BFF_repaired.pdb -s 2BFF_sequence_${mutation}.txt -o 2BFF_${mutation}_scwrl_model.pdb
done</source>
The mutant structures are located at /mnt/home/student/schillerl/MasterPractical/task9/scwrl/.
Energy comparisons
FoldX
We adapted the example files shipped with FoldX to evaluate the energies of the mutant structures in batch mode. Because we are interested in the effect of single point mutations on protein structure and function, we listed the 5 chosen mutations in 5 rows, each ending with a semicolon. As before, the mutation positions had to be converted to the positions in the PDB structure.
List of individual mutants (Chain id must be assigned after WT residue):
MA37L;
AA177T;
CA219W;
RA301H;
IA316V;
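The conversion from the reference notation to the list above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the chain ID `A` and the offset of 45 are taken from the list and the alignment in this journal.

```python
import re


def to_foldx_mutation(mutation: str, chain: str = "A", offset: int = 45) -> str:
    """Turn e.g. 'M82L' (reference numbering) into 'MA37L;':
    wild-type residue, chain ID, PDB position, mutant residue, semicolon."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation identifier: {mutation!r}")
    wild_type, position, mutant = match.groups()
    return f"{wild_type}{chain}{int(position) - offset}{mutant};"


mutations = ["M82L", "A222T", "C264W", "R346H", "I361V"]
lines = [to_foldx_mutation(m) for m in mutations]

# One mutant per row, as in the list of individual mutants.
print("\n".join(lines))
```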
Minimise
Hydrogens and waters were removed from the structures:
/opt/SS12-Practical/scripts/repairPDB <input pdb file> -noh -jprot > <output pdb file>
The structures were minimised in 5 successive rounds (the output of one run was used as the input for the next):
/opt/SS12-Practical/minimise/minimise <input pdb file> <output pdb file>
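The chaining of the 5 rounds can be sketched as below. The sketch only builds the command lines and does not run anything; the input file name and the `_min<n>` naming scheme are assumptions made for illustration, not the names actually used in this task.

```python
def minimise_commands(start_pdb, prefix, rounds=5,
                      binary="/opt/SS12-Practical/minimise/minimise"):
    """Build one command per round; each round reads the previous round's output."""
    commands = []
    current = start_pdb
    for n in range(1, rounds + 1):
        output = f"{prefix}_min{n}.pdb"  # hypothetical naming scheme
        commands.append([binary, current, output])
        current = output
    return commands


# Hypothetical input: a structure already cleaned with repairPDB -noh -jprot.
for command in minimise_commands("2BFF_clean.pdb", "2BFF"):
    print(" ".join(command))
```

Each command could then be executed in order, for example with `subprocess.run(command, check=True)`.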
Minimised structures can be found at /mnt/home/student/schillerl/MasterPractical/task9/minimise/. Output was generated only for the wild-type and FoldX structures; for the structures created with SCWRL, Minimise neither produced output nor calculated energies.