Difference between revisions of "Gaucher Disease: Task 03 - Sequence-based predictions"
Revision as of 22:38, 27 May 2013
Secondary Structure
In this task, secondary structure is predicted with ReProf and PsiPred and compared to the DSSP structure assignment. ReProf takes a FASTA sequence or a PSI-BLAST PSSM as input, PsiPred takes a FASTA sequence, and the DSSP server needs a PDB file because it works on the 3D coordinates of the atoms. The predictions were made for the four proteins listed below, including the protein that causes Gaucher's disease. If several PDB structures were available, the one covering the largest part of the UniProt sequence with the highest similarity was chosen. For glucosylceramidase, the structure 1OGS was used (as in task 2).
UniProt entry | Protein name | Origin | Length | PDB entry | Method | Resolution (Å) | Chain | Positions |
---|---|---|---|---|---|---|---|---|
P10775 | Ribonuclease inhibitor | pig | 456 AA | 2BNH | X-ray | 2.30 | A | 1-456 |
Q9X0E6 | Divalent-cation tolerance protein CutA | bacterium Thermotoga maritima | 101 AA | 1VHF | X-ray | 1.54 | A | 2-101 |
Q08209 | Serine/threonine-protein phosphatase 2B catalytic subunit alpha isoform, EC=3.1.3.16 | human | 521 AA | 1AUI | X-ray | 2.10 | A | 1-521 |
P04062 | Glucosylceramidase/acid-beta-glucosidase, EC=3.2.1.45 | human | 536 AA | 1OGS | X-ray | 2.00 | A/B | 40-536 |
The Phenylketonuria group's script filter_secStruc.pl was used to extract the secondary structures in the three-state code E, H and L. In the DSSP output, irregular regions are encoded as "-". The precision between two secondary structure strings was then calculated with the second script of the Phenylketonuria group, SecStrucComparison.jar.
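The comparison step described above can be illustrated with a minimal Python sketch. It is not the Phenylketonuria group's code: the reduction of the DSSP states to E, H and L shown here (H/G/I to H, E/B to E, everything else to L) is a commonly used convention and only an assumption about what filter_secStruc.pl does, and `agreement` simply reports the fraction of identical positions, which may differ from the exact precision definition in SecStrucComparison.jar.

```python
# Assumed reduction of DSSP's eight states to three states (hypothetical,
# not taken from filter_secStruc.pl): helices -> H, strands -> E, rest -> L.
DSSP_TO_THREE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(dssp: str) -> str:
    """Map a DSSP state string (including '-') to the E/H/L code."""
    return "".join(DSSP_TO_THREE.get(state, "L") for state in dssp)


def agreement(predicted: str, reference: str) -> float:
    """Fraction of positions where both three-state strings agree."""
    if len(predicted) != len(reference):
        raise ValueError("strings must have the same length")
    if not predicted:
        return 0.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


if __name__ == "__main__":
    reference = reduce_dssp("HGI-EBTS")   # toy DSSP string, not real data
    print(reference)
    print(agreement("HHHLEELL", reference))
```

Both inputs must cover the same residues, so a prediction for the full UniProt sequence would first have to be trimmed to the positions resolved in the PDB chain.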
First, PSSMs from different PSI-BLAST runs (all combinations of database big_80/swissprot, 2/3 iterations and E-value cutoff 2E-3/10E-10/10E-20) were tested on the shortest protein, Q9X0E6. The run parameters that yielded the best precision compared to PsiPred and DSSP were then chosen. The best parameters were big_80, 3 iterations and an E-value cutoff of 10E-10; they were applied to create the PSSMs for the other proteins. (The table summarizing the results for all parameters can be found here: /mnt/home/student/kalemanovm/master_practical/Assignment3_SequenceBasedPredictions/SecondaryStructure/reprof_out/parsedSecStr/README.Q9X0E6.psiblast_param.precision.)
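The parameter search described above covers 2 x 2 x 3 = 12 PSI-BLAST settings. The following sketch only enumerates that grid and picks the setting with the highest precision from a results dictionary; the function names are hypothetical, no PSI-BLAST call is made, and the precision values would have to come from the actual runs.

```python
from itertools import product

# Parameter values as listed in the text.
DATABASES = ["big_80", "swissprot"]
ITERATIONS = [2, 3]
E_VALUES = ["2E-3", "10E-10", "10E-20"]


def parameter_grid():
    """Return every (database, iterations, e-value) combination."""
    return list(product(DATABASES, ITERATIONS, E_VALUES))


def best_setting(precision_by_setting):
    """Return the setting with the highest precision.

    precision_by_setting maps (database, iterations, e-value) to a
    precision measured on the test protein (Q9X0E6 in the text).
    """
    return max(precision_by_setting, key=precision_by_setting.get)


if __name__ == "__main__":
    grid = parameter_grid()
    print(len(grid))
    for setting in grid:
        print(setting)
```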
TODO: What features are predicted? Discuss the results for your protein and the example proteins. Using the predictions, what could you learn about your protein and the example proteins? Compare to the available knowledge in UniProt, PDB, DisProt, OPM, PDBTM, Pfam...
Disorder
Transmembrane Helices
Four proteins, including the protein that causes Gaucher's disease, were analysed with respect to transmembrane helices. The prediction tools used differ in the features they report. While Polyphobius only distinguishes between residues that are part of a transmembrane helix and residues that lie inside or outside of the cytoplasm, Memsat-SVM also predicts re-entrant helices and pore-lining helices. Because pore-lining helices are also transmembrane helices, this kind of helix is detected by both prediction tools. For re-entrant helices the two programs differ. In general, a transmembrane helix crosses the membrane, so that the two ends of the helix lie on different sides of the membrane. In contrast, both ends of a re-entrant helix lie on the same side of the membrane. Memsat-SVM can predict re-entrant helices, but Polyphobius either treats such a helix as an ordinary transmembrane helix that crosses the membrane (seen for Q9YDF8) or ignores it (seen for P47863). As a consequence of a re-entrant helix, the two tools may also predict the N-terminus or the C-terminus on different sides of the membrane, and some helices may be predicted with the opposite orientation within the membrane.
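The effect of a re-entrant helix on the predicted topology can be made concrete with a small sketch. It assumes a simplified two-sided model in which every transmembrane helix switches the side of the chain and every re-entrant helix leaves it unchanged; the function names and the "TM"/"RE" labels are hypothetical, and the helix orders are taken from the Q9YDF8 table further down.

```python
def other_side(side: str) -> str:
    """Return the opposite side of the membrane."""
    return "extracellular" if side == "cytoplasmic" else "cytoplasmic"


def c_terminal_side(n_terminal_side: str, segments: list[str]) -> str:
    """Follow the chain from the N-terminus through all helices.

    "TM" = membrane-spanning helix (chain changes side),
    "RE" = re-entrant helix (chain returns to the same side).
    """
    side = n_terminal_side
    for kind in segments:
        if kind == "TM":
            side = other_side(side)
        elif kind != "RE":
            raise ValueError(f"unknown segment type: {kind}")
    return side


# Q9YDF8 as reported in the table: Memsat-SVM predicts five TM helices,
# one re-entrant helix (188-217) and one more TM helix; Polyphobius
# predicts seven TM helices.
memsat_q9ydf8 = ["TM"] * 5 + ["RE", "TM"]
polyphobius_q9ydf8 = ["TM"] * 7

if __name__ == "__main__":
    print(c_terminal_side("cytoplasmic", memsat_q9ydf8))         # cytoplasmic
    print(c_terminal_side("extracellular", polyphobius_q9ydf8))  # cytoplasmic
```

Under this model the two predictions agree on the C-terminus only because they start from different N-terminal sides: counting the re-entrant helix as a seventh transmembrane helix turns an even number of crossings into an odd one.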
Human Glucosylceramidase
This protein is not a membrane protein and is located on the extracellular side of the membrane, as documented in OPM. For the same reason there is no entry in the PDBTM, as this database only contains membrane proteins. The Polyphobius prediction leads to the same result. In addition, Polyphobius predicted the signal peptide (including the N/H/C-regions). Memsat-SVM detected a false positive transmembrane helix. As glucosylceramidase cleaves lipids of cell membranes, the active site of the enzyme may be mistaken for a transmembrane helix.
Comparison of TMH for Glucosylceramidase (P04062, human)
Feature | Memsat SVM (prediction) | Polyphobius (prediction) | OPM (assignment) | PDBTM (assignment) |
---|---|---|---|---|
# of TMH | 1 | - | - | - |
TMH topology | 456-471 | - | - | - |
N-terminal | extracellular | extracellular | extracellular | - |
C-terminal | cytoplasmic | extracellular | extracellular | - |
Signal peptide | 1-34 | 1-40 | - | - |
Re-entrant helix | - | - | - | - |
Pore-lining helix | 1 | - | - | - |
Graphical position | - | | | |
More information | P04062 | P04062 | 1OGS | 1OGS is not in the PDBTM |
Aeropyrum pernix Voltage-gated potassium channel
Comparison of TMH for Voltage-gated potassium channel (Q9YDF8, Aeropyrum pernix)
Feature | Memsat SVM (prediction) | Polyphobius (prediction) | OPM (assignment) | PDBTM (assignment) |
---|---|---|---|---|
# of TMH | 6 | 7 | 5 | 4 |
TMH topology | 43-59, 72-90, 101-118, 128-143, 163-184, 221-245 | 42-60, 68-88, 108-129, 137-157, 163-184, 196-213, 224-244 | 25-46, 55-78, 86-97, 100-107, 117-148 | 27-50, 55-75, 88-107, 118-142 |
N-terminal | cytoplasmic | extracellular | cytoplasmic | cytoplasmic |
C-terminal | cytoplasmic | cytoplasmic | cytoplasmic | cytoplasmic |
Signal peptide | - | - | | |
Re-entrant helix | 188-217 | - | | |
Pore-lining helix | 1 | - | | |
Graphical position | | | | |
More information | Q9YDF8 | Q9YDF8 | 1ORS | 1ORS |
Rat Aquaporin 4
Comparison of TMH for Aquaporin 4 (P47863, rat)
Feature | Memsat SVM (prediction) | Polyphobius (prediction) | OPM (assignment) | PDBTM (assignment) |
---|---|---|---|---|
# of TMH | 6 | 6 | 8 (per chain) | 8 (per chain) |
TMH topology | 35-56, 71-89, 113-136, 157-178, 190-205, 232-252 | 34-58, 70-91, 115-136, 156-177, 188-208, 231-252 | 34-56, 70-88, 98-107, 112-136, 156-178, 189-203, 214-223, 231-252 | 39-55, 72-89, 95-106, 116-133, 158-177, 188-205, 209-222, 231-248 |
N-terminal | extracellular | cytoplasmic | cytoplasmic | cytoplasmic |
C-terminal | extracellular | cytoplasmic | cytoplasmic | cytoplasmic |
Signal peptide | 1-20 | | | |
Re-entrant helix | 93-109, 209-225 | 95-106, 209-222 | | |
Pore-lining helix | 4 | | | |
Graphical position | | | | |
More information | P47863 | P47863 | 2D57 | 2D57 |
Human D3 dopamine receptor
Comparison of TMH for D3 dopamine receptor (P35462, human)
Feature | Memsat SVM (prediction) | Polyphobius (prediction) | OPM (assignment) | PDBTM (assignment) |
---|---|---|---|---|
# of TMH | 6 | 7 | 7 | 7 |
TMH topology | 32-55, 65-88, 101-129, 151-169, 188-209, 331-354 | 30-55, 66-88, 105-126, 150-170, 188-212, 329-352, 367-386 | 34-52, 67-91, 101-126, 150-170, 187-209, 330-351, 363-386 | 35-52, 68-84, 109-123, 152-166, 191-206, 334-347, 368-382 |
N-terminal | extracellular | extracellular | extracellular | extracellular |
C-terminal | extracellular | cytoplasmic | cytoplasmic | cytoplasmic |
Signal peptide | 1-29 | - | | |
Re-entrant helix | - | - | - | |
Pore-lining helix | 1 | | | |
Graphical position | | | | |
More information | P35462 | P35462 | 3PBL | 3PBL |
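A comparison of predicted and assigned helices such as the one in the D3 dopamine receptor table can be quantified with a simple overlap count. The sketch below uses the residue ranges from that table and counts an OPM helix as found if any predicted helix shares at least one residue with it; this lenient overlap criterion and the function names are assumptions made for illustration, not part of the original analysis.

```python
Segment = tuple[int, int]  # inclusive (start, end) residue positions


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two inclusive residue ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def matched_reference_helices(predicted: list[Segment],
                              reference: list[Segment]) -> int:
    """Count reference helices that overlap at least one predicted helix."""
    return sum(any(overlaps(ref, pred) for pred in predicted)
               for ref in reference)


# Residue ranges for P35462 as listed in the table above.
opm = [(34, 52), (67, 91), (101, 126), (150, 170),
       (187, 209), (330, 351), (363, 386)]
memsat_svm = [(32, 55), (65, 88), (101, 129), (151, 169),
              (188, 209), (331, 354)]
polyphobius = [(30, 55), (66, 88), (105, 126), (150, 170),
               (188, 212), (329, 352), (367, 386)]

if __name__ == "__main__":
    print(matched_reference_helices(memsat_svm, opm))    # 6 of 7
    print(matched_reference_helices(polyphobius, opm))   # 7 of 7
```

With these ranges the only OPM helix without a Memsat-SVM counterpart is the last one (363-386), which fits the C-terminus being predicted on the extracellular side by Memsat-SVM.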
Signal Peptides
For the following proteins, the signal peptides and their cleavage sites were predicted with SignalP:
- Glucosylceramidase (P04062, human)
- Serum albumin (P02768, human)
- Lysosome-associated membrane glycoprotein 1 (P11279, human)
- Aquaporin 4 (P47863, rat)
The four eukaryotic proteins were also looked up in the Signal Peptide Database to compare the entries with the results of the prediction.
Glucosylceramidase (P04062)
For glucosylceramidase, the SignalP prediction differs from the database entry.
In the database the protein has a signal peptide of 39 residues. A signal peptide is characterized by high hydrophobicity in its core region, followed by the cleavage site[1]. In particular, the higher hydrophobicity of residues 18-23 and 27-34 points to a signal peptide (green area in the hydrophobicity image).
MEFSSPSREECPKPLSRVSIMAGSLTGLLLLQAVSWASG
However, SignalP predicts no signal peptide. In the visualisation of the different scores below, the green signal peptide score shows the most likely position of a signal peptide: the green line is higher for the first 39 residues than for the later residues. But the calculated D-score of the detected peptide is 0.37, which lies below the threshold (0.5), so the peptide is rejected as a signal peptide. These residues are not only defined as a signal peptide by the database, but were also detected, with a slight deviation, by the transmembrane helix predictors MemsatSVM (residues 1-34) and Polyphobius (residues 1-40).
SignalP result for P04062: The green line represents the signal peptide score (S-score). The higher the score, the higher the probability that a residue is part of a signal peptide. A high raw cleavage site score (C-score) marks the residue directly after the cleavage site. The blue line shows a combination of the C-score and the S-score.
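The hydrophobicity argument made for the 39-residue signal peptide of glucosylceramidase can be explored with a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle scale and a window of 7 residues; both are assumptions, since the scale and window behind the database's hydrophobicity image are not stated, so the profile need not reproduce the green area exactly.

```python
# Kyte-Doolittle hydropathy values per amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every window of the given length."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Signal peptide of P04062 as given in the Signal Peptide Database entry.
signal_peptide = "MEFSSPSREECPKPLSRVSIMAGSLTGLLLLQAVSWASG"

if __name__ == "__main__":
    window = 7
    profile = window_hydropathy(signal_peptide, window)
    start = profile.index(max(profile))          # 0-based window start
    print(f"most hydrophobic window: residues {start + 1}-{start + window}")
    for i, value in enumerate(profile, start=1):
        print(i, round(value, 2))
```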
Serum albumin (P02768)
The signal peptide consists of residues 1-18; it is predicted by SignalP and also documented in the Signal Peptide Database:
MKWVTFISLLFLFSSAYS
The images below show a clear prediction of the signal peptide: a high S-score for the signal peptide region and a D-score of 0.85, far above the threshold. The cleavage site is predicted between residues 18 and 19. The database shows high hydrophobicity for residues 6-14, which also marks the region as a signal peptide.
SignalP result for P02768: The green line represents the signal peptide score (S-score). The higher the score, the higher the probability that a residue is part of a signal peptide. A high raw cleavage site score (C-score) marks the residue directly after the cleavage site. The blue line shows a combination of the C-score and the S-score.
Lysosome-associated membrane glycoprotein 1 (P11279)
For lysosome-associated membrane glycoprotein 1 the scores are even higher than for serum albumin. The signal peptide consists of 28 residues:
MAAPGSARRPLLLLLLLLLLGLMHCASA
The database shows a large hydrophobic region of 17 residues. At the end of the protein, a transmembrane helix with a length of 23 residues that ends in the cytoplasm is documented in the database entry. The SignalP prediction gives a D-score of 0.95 for the detected signal peptide. The cleavage site is predicted between residues 28 and 29 (ASA-AM).
SignalP result for P11279: The green line represents the signal peptide score (S-score). The higher the score, the higher the probability that a residue is part of a signal peptide. A high raw cleavage site score (C-score) marks the residue directly after the cleavage site. The blue line shows a combination of the C-score and the S-score.
Aquaporin 4 (P47863)
The rat protein has no entry in the Signal Peptide Database, as no signal peptide is known for it. The visualised results of the prediction show at first sight that the protein does not have a signal peptide. All scores are lower than 0.21, which is far below the threshold for signal peptides.
SignalP result for P47863: The green line represents the signal peptide score (S-score). The higher the score, the higher the probability that a residue is part of a signal peptide. A high raw cleavage site score (C-score) marks the residue directly after the cleavage site. The blue line shows a combination of the C-score and the S-score. The threshold is marked by a red dotted line.
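The D-score decisions reported in this section can be summarised in a few lines. The sketch below applies the threshold of 0.5 quoted above to the three D-scores given in the text; P47863 is left out because only an upper bound on its scores (0.21) is reported, and the function name is hypothetical rather than part of SignalP.

```python
D_SCORE_THRESHOLD = 0.5  # threshold quoted in the text

# D-scores reported in this section, keyed by UniProt accession.
d_scores = {
    "P04062": 0.37,
    "P02768": 0.85,
    "P11279": 0.95,
}


def has_signal_peptide(d_score: float,
                       threshold: float = D_SCORE_THRESHOLD) -> bool:
    """A peptide is accepted only if its D-score exceeds the threshold."""
    return d_score > threshold


calls = {accession: has_signal_peptide(score)
         for accession, score in d_scores.items()}

if __name__ == "__main__":
    for accession, call in calls.items():
        label = "signal peptide" if call else "no signal peptide"
        print(accession, d_scores[accession], label)
```

For P04062 the call is negative although the database documents a signal peptide, which is the discrepancy discussed above.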
GO Terms
Discussion
Other available methods
Prediction of | Tool |
---|---|
secondary structure | GOR |
disorder | DISOPRED2 |
transmembrane helices | MEMSAT3, TMHMM, PredictProtein, DAS, HMMTOP, TMpred |
signal peptides | PrediSi, Polyphobius, MemsatSVM, SIGCLEAVE, ANTHEPROT, Signal Find Server, SPD, SPEPlip, SOSUIsignal |
GO terms | |
What else can be predicted from protein sequence alone
- Fold recognition (profile based pGenTHREADER and rapid GenTHREADER)
- Fold domain recognition (pDomTHREADER)
- Protein domain prediction (DomPred)
- Homology modelling (BioSerf v2.0)
- Function prediction (eukaryotic function: FFPred v2.0)
- Prediction of TM topology and helix packing (SVM-based MEMPACK)
http://bioinf.cs.ucl.ac.uk/psipred/