Lab Journal - Task 9 (PAH)

From Bioinformatikpedia
Revision as of 13:47, 11 August 2013 by Worfk (talk | contribs) (foldX)

structure selection

The resolution, chain and covered positions of each PAH structure are listed in the UniProt entry P00439 itself. The R-factors can be found in the PDBsum entries, and the pH values in the method section of the PDB entries.

To check whether a structure contains a gap, we first downloaded the PDB file in text format from the PDB website and then used the following Unix shell commands:

grep "^ATOM" 1DMW.pdb > 1DMW.txt
cut -c 23-27 1DMW.txt | uniq > 1DMW_res.txt

Then we had to check whether the residue numbers are consecutive. For this, we wrote a Python script, which can be invoked as follows:

python test.py
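The contents of test.py are not shown here. A minimal sketch of such a check, assuming standard fixed-column PDB ATOM records (residue number in columns 23-26), could look like the following. The function names residue_numbers and find_gaps are hypothetical and not taken from the original script.

```python
def residue_numbers(pdb_lines):
    """Return the residue numbers of all ATOM records, in file order,
    with consecutive repeats collapsed (like `uniq`)."""
    numbers = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Columns 23-26 (1-based) hold the residue sequence number.
        number = int(line[22:26])
        if not numbers or numbers[-1] != number:
            numbers.append(number)
    return numbers


def find_gaps(numbers):
    """Return (previous, next) pairs where the numbering is not consecutive."""
    return [(a, b) for a, b in zip(numbers, numbers[1:]) if b != a + 1]
```

With a file on disk, find_gaps(residue_numbers(open("1DMW.pdb"))) returns an empty list if there is no gap. Note that this sketch ignores insertion codes and would also report a "gap" at the boundary between two chains.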

The coverage in percent was calculated as in the example below:

  • P00439 has a sequence length of 452 aa
  • 1DMW has 424 - 117 = 307 residues
  • coverage: 307 / (452 / 100) = 67.92%

=> 1DMW covers 67.92% of the PAH (P00439) sequence.
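The same calculation can be written as a small Python helper. This is only an illustrative sketch; the function name coverage_percent is hypothetical.

```python
def coverage_percent(covered_residues, sequence_length):
    """Percentage of the full sequence that is covered by the structure."""
    return 100.0 * covered_residues / sequence_length


# Values from the 1DMW example: 307 covered residues, 452 aa in P00439.
print(round(coverage_percent(307, 452), 2))  # 67.92
```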

SCWRL

Before generating the mutations with SCWRL, we first had to extract the sequence from the PDB file. For this, we used the repairPDB script with the following command:

/opt/SS12_Practical/scripts/repairPDB 1J8U.pdb -seq > 1J8U_seq.txt

Afterwards, we had to convert the uppercase letters to lowercase, except at the positions to be mutated. For this purpose, we wrote a Python script: up2low.py ... Then a mutated structure can be generated with SCWRL for each sequence file as shown below:

/opt/SS12-Practical/scwrl4/Scwrl4 -i 1J8U.pdb -s 1J8U_seq_mut1.txt -o 1J8U_mut1.pdb
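The case conversion described above could look roughly like the sketch below. It is not the original up2low.py. It assumes the sequence is a plain string of one-letter codes, and the function name mark_mutation and its parameters are hypothetical. As far as we understand SCWRL4, lowercase letters in the sequence file keep the side chain from the input structure, so only the uppercase position is rebuilt.

```python
def mark_mutation(sequence, index, new_residue):
    """Lowercase the whole sequence and put new_residue, in upper case,
    at the given 0-based index of the sequence string."""
    if not 0 <= index < len(sequence):
        raise IndexError("index outside of sequence")
    lowered = sequence.lower()
    return lowered[:index] + new_residue.upper() + lowered[index + 1:]


# Toy example: mutate the 4th letter of a short sequence to H.
print(mark_mutation("GASQ", 3, "H"))  # gasH
```

The index refers to the position in the extracted sequence string, not to the PDB residue number, so the residue number has to be shifted by the number of the first residue in the structure.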

foldX

For foldX, we used the files from the example on the foldX web server for the approach "Multiple mutations using individual list". We left the run.txt file unchanged and only modified the following files:

list.txt:

1J8U.pdb

A list of all PDB files to use for the mutation; in our case only the 1J8U PDB file.

individual_list.txt:

QA172H;
AA259V;
TA266A;
FA392S;
PA416Q;

An individual list with all mutations to apply. Every line stands for one run. We wanted only one mutation per run, so we only had to write one mutation in each line, but it would be possible to write more than one point mutation per line, separated by commas. The first letter of every mutation stands for the amino acid in the unmutated structure, the second for the chain (always A in our case), the number gives the position in the structure, and the last letter is the amino acid to mutate to. Every line has to end with a semicolon.
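To make the line format concrete, the following sketch parses one line of individual_list.txt into its parts. It only illustrates the format described above and is not part of foldX; the function name parse_individual_line and the regular expression are assumptions.

```python
import re

# wild-type residue, chain identifier, position, mutant residue
MUTATION = re.compile(r"^([A-Z])([A-Za-z0-9])(-?\d+)([A-Z])$")


def parse_individual_line(line):
    """Split one line such as 'QA172H;' into (wild, chain, position, mutant) tuples."""
    line = line.strip()
    if not line.endswith(";"):
        raise ValueError("line must end with a semicolon: %r" % line)
    mutations = []
    for token in line[:-1].split(","):
        match = MUTATION.match(token.strip())
        if match is None:
            raise ValueError("not a valid mutation: %r" % token)
        wild, chain, position, mutant = match.groups()
        mutations.append((wild, chain, int(position), mutant))
    return mutations


print(parse_individual_line("QA172H;"))  # [('Q', 'A', 172, 'H')]
```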

After generating all required files, we wanted to run foldX on the biolab computers, but the licence there was no longer valid. So we downloaded a version of foldX into our own directory and generated the mutation files with the following command:

/mnt/home/student/worfk/Masterpractical/Task09/foldX/FoldX.linux64 -runfile run.txt

The list.txt, individual_list.txt, run.txt and 1J8U.pdb files have to be in the same directory, together with the rotabase.txt file.
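Before starting a run, the presence of these files can be checked with a few lines of Python. This is just a convenience sketch; the function name missing_files is hypothetical, and the file names are the ones listed above.

```python
import os

REQUIRED = ("list.txt", "individual_list.txt", "run.txt", "1J8U.pdb", "rotabase.txt")


def missing_files(directory, required=REQUIRED):
    """Return the required file names that are not present in the directory."""
    return [name for name in required
            if not os.path.isfile(os.path.join(directory, name))]
```

Calling missing_files(".") in the working directory returns an empty list when everything needed for the run is in place.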

minimise

...