Lab Journal - Task 9 (PAH)

From Bioinformatikpedia
Revision as of 12:27, 11 August 2013 by Worfk (talk | contribs) (structure selection)

structure selection

The resolution, the chain and the covered positions of each PAH structure are listed in the UniProt entry P00439 itself. The R-factor of each structure can be found in its PDBsum entry, and the pH value in the method section of its PDB entry.

To check whether a structure contains a gap, we first downloaded the PDB file in text format from the PDB website and then used the following Unix shell commands:

grep "^ATOM" 1DMW.pdb > 1DMW.txt
cut -c 23-27 1DMW.txt | uniq > 1DMW_res.txt

Next, we had to check whether the extracted residue numbers are consecutive. For this, we wrote a Python script, which can be invoked as follows:

python test.py
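The script itself is not reproduced here. A minimal sketch of such a consecutiveness check could look like the following. It is not the actual test.py: the function name find_gaps and the sample values are assumptions made for illustration. The input is assumed to be the lines produced by the cut command above (residue number plus an optional insertion code letter), and a restart of the numbering, for example at a new chain, is not reported as a gap.

```python
def find_gaps(residue_fields):
    """Return (last_before_gap, first_after_gap) pairs for non-consecutive residues.

    residue_fields: iterable of strings such as " 118 " or "  10A",
    i.e. columns 23-27 of ATOM records (number + optional insertion code).
    """
    numbers = []
    for field in residue_fields:
        field = field.strip()
        if not field:
            continue
        # Drop a trailing insertion code letter, if present.
        digits = field.rstrip("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
        numbers.append(int(digits))

    gaps = []
    for prev, cur in zip(numbers, numbers[1:]):
        # A jump of more than one residue number indicates missing residues.
        if cur - prev > 1:
            gaps.append((prev, cur))
    return gaps


# Hypothetical sample data, not taken from 1DMW.
sample = [" 118 ", " 119 ", " 120 ", " 125 ", " 126 "]
print(find_gaps(sample))  # [(120, 125)]
```

To run it on real data, the lines of 1DMW_res.txt could be passed in, for example with find_gaps(open("1DMW_res.txt")).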

The coverage in percent was calculated as in the following example:

  • P00439 has a sequence length of 452 aa
  • 1DMW has 424 - 117 = 307 residues
  • coverage: 307 / 452 × 100 = 67.92%

=> 1DMW has a coverage of 67.92% of the PAH (P00439) sequence!
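The same arithmetic can be written as a small helper. This is only a sketch; the function name coverage_percent is an assumption. It takes the residue count as given above and does not subtract residues missing inside the covered range.

```python
def coverage_percent(covered_residues, sequence_length):
    """Percentage of the full-length sequence covered by a structure."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 100.0 * covered_residues / sequence_length


# Values from the 1DMW example: 307 covered residues, 452 aa in P00439.
print(round(coverage_percent(307, 452), 2))  # 67.92
```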

SCWRL

Before generating the mutations with SCWRL, we first had to extract the sequence from the PDB file. For this, we used the repairPDB script with the following command:

/opt/SS12_Practical/scripts/repairPDB 1J8U.pdb -seq > 1J8U_seq.txt

Afterwards, we had to convert the uppercase letters to lowercase, except at the positions of the mutations of interest. For this purpose, we wrote a Python script: up2low.py ... Then, we could generate a mutated structure with SCWRL for each sequence file, as shown below:

/opt/SS12-Practical/scwrl4/Scwrl4 -i 1J8U.pdb -s 1J8U_seq_mut1.txt -o 1J8U_mut1.pdb
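The preparation of the sequence files can be illustrated with a short sketch. This is not the actual up2low.py; the function name mask_except_mutation, the 0-based position argument and the sample sequence are assumptions. As far as we understand the SCWRL4 sequence file convention, residues written in lowercase keep their side-chain coordinates from the input structure, so only the uppercase (mutated) residue is rebuilt. Note that the index refers to the sequence extracted from the PDB file, not to the UniProt numbering.

```python
def mask_except_mutation(sequence, position, new_residue):
    """Lowercase the whole sequence and place the mutant residue in uppercase.

    sequence:    one-letter sequence extracted from the PDB file
    position:    0-based index into that sequence (assumption of this sketch)
    new_residue: one-letter code of the amino acid to introduce
    """
    if not 0 <= position < len(sequence):
        raise ValueError("position outside sequence")
    if len(new_residue) != 1:
        raise ValueError("new_residue must be a single letter")
    lowered = sequence.lower()
    return lowered[:position] + new_residue.upper() + lowered[position + 1:]


# Hypothetical toy sequence, not the real 1J8U sequence.
print(mask_except_mutation("MSTAV", 2, "K"))  # msKav
```

The returned string could then be written to a file such as 1J8U_seq_mut1.txt and passed to SCWRL with the -s option as in the command above.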

foldX

...