Gaucher Disease: Task 03 - Lab Journal
ReProf takes a FASTA sequence or a PSI-BLAST PSSM as input, PsiPred takes a FASTA sequence, and the DSSP server needs a PDB file because it assigns secondary structure from the 3D coordinates of the atoms. Predictions were made for the four proteins listed below, including the protein that causes Gaucher's disease. Where several PDB structures were available, the one covering the largest part of the UniProt sequence with the highest similarity was chosen. For glucosylceramidase the structure 1OGS was used (as in Task 2).
|UniProt entry||Protein name||Origin||Length||PDB entry||Method||Resolution (Å)||Chain||Positions|
|Q9X0E6||Divalent-cation tolerance protein CutA||bacterium Thermotoga maritima||101||1VHF||X-ray||1.54||A||2-101|
|Q08209||Serine/threonine-protein phosphatase 2B catalytic subunit alpha isoform, EC=3.1.3.16||human||521||1AUI||X-ray||2.10||A||1-521|
The Phenylketonuria group's script filter_secStruc.pl was used to reduce the secondary structure assignments to a three-state code: E, H and L. In DSSP output, irregular regions are encoded as "-". The precision between two output secondary structure strings was then calculated with the group's second script, SecStrucComparison.jar.
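The reduction and comparison steps can be illustrated with a minimal Python sketch. This is not the code of filter_secStruc.pl or SecStrucComparison.jar. The eight-to-three state mapping (H, G, I to H; E, B to E; everything else to L) is a commonly used convention and only an assumption here, and the per-residue agreement shown is one plausible reading of the "precision" those scripts report. The function names and the example strings are made up for illustration.

```python
# Assumed reduction of DSSP's eight states to three; the mapping used by
# filter_secStruc.pl is not documented here and may differ.
DSSP_TO_THREE = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
}


def reduce_to_three_states(dssp_string):
    """Map a DSSP string to E/H/L; anything unmapped (T, S, '-', ' ') becomes L."""
    return "".join(DSSP_TO_THREE.get(state, "L") for state in dssp_string)


def agreement(predicted, reference):
    """Fraction of positions at which both strings carry the same state."""
    if len(predicted) != len(reference):
        raise ValueError("strings must have equal length")
    if not predicted:
        raise ValueError("strings must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


if __name__ == "__main__":
    dssp = "-EEEE-TTHHHHGGG-"       # made-up DSSP string, not real data
    predicted = "LEEEELLLHHHHHLLL"  # made-up three-state prediction
    reference = reduce_to_three_states(dssp)
    print(reference)
    print(agreement(predicted, reference))
```

Both strings have to cover the same residues for the comparison to be meaningful, so positions missing from the PDB structure would need to be removed from the prediction first.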
First, PSSMs from different PSI-BLAST runs (all combinations of database big_80/swissprot, 2/3 iterations and E-value cutoff 2E-3/10E-10/10E-20) were tested on the shortest protein, Q9X0E6. The run parameters yielding the best precision compared to PsiPred and DSSP were then chosen. The best parameters were big_80, 3 iterations and an E-value cutoff of 10E-10; these were then applied to create PSSMs for the other proteins. (The table where the results for all parameters are summarized can be seen here: