Prediction of transmembrane alpha-helices and signal peptides BACR_HALSA

From Bioinformatikpedia

TMHMM

First of all, we used TMHMM to predict the transmembrane helices in this protein.

Figure 1: Prediction of TMHMM for the transmembrane helices of BACR_HALSA
start position end position location
1 22 outside
23 42 TM Helix
43 54 inside
55 77 TM Helix
78 91 outside
92 114 TM Helix
115 120 inside
121 143 TM Helix
144 147 outside
148 170 TM Helix
171 189 inside
190 212 TM Helix
213 262 outside

TMHMM predicts six transmembrane helices for BACR_HALSA, as shown in Figure 1. We then compared the TMHMM prediction with the known transmembrane helices of BACR_HALSA:

Figure 2: Comparison between the known transmembrane helices and the TMHMM result.

The prediction is very good at the beginning of the protein, as shown in Figure 2: predicted and known helices overlap almost completely. At the end of the protein, however, TMHMM misses one transmembrane helix, so it predicts only six helices where the protein actually has seven. This is a serious error, because the number of helices matters for the function of the protein. Apart from the missing helix, the TMHMM prediction is quite good.
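The helix count can be read directly off the TMHMM table. The following is a minimal sketch, assuming the table rows are copied by hand into a Python list; the names `TMHMM_SEGMENTS`, `helices` and `is_contiguous` are illustrative and not part of TMHMM.

```python
# Segments copied from the TMHMM table above: (start, end, location).
TMHMM_SEGMENTS = [
    (1, 22, "outside"), (23, 42, "TM helix"), (43, 54, "inside"),
    (55, 77, "TM helix"), (78, 91, "outside"), (92, 114, "TM helix"),
    (115, 120, "inside"), (121, 143, "TM helix"), (144, 147, "outside"),
    (148, 170, "TM helix"), (171, 189, "inside"), (190, 212, "TM helix"),
    (213, 262, "outside"),
]


def helices(segments):
    """Return the (start, end) pairs of all segments labelled as TM helix."""
    return [(s, e) for s, e, loc in segments if loc == "TM helix"]


def is_contiguous(segments):
    """Check that each segment starts right after the previous one ends."""
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(segments, segments[1:]))


tm = helices(TMHMM_SEGMENTS)
helix_lengths = [e - s + 1 for s, e in tm]
last_start, last_end, last_loc = TMHMM_SEGMENTS[-1]

print("number of predicted helices:", len(tm))
print("helix lengths:", helix_lengths)
print("segments contiguous:", is_contiguous(TMHMM_SEGMENTS))
print("final segment:", last_loc, last_end - last_start + 1, "residues")
```

With the table values this reports six helices and a final "outside" segment of 50 residues (213 to 262). The other predictors used on this page place the seventh helix inside that region.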

Phobius and PolyPhobius

Next, we used Phobius and PolyPhobius, which predict both transmembrane helices and signal peptides.

Figure 3: Prediction of Phobius for the transmembrane helices and signal peptides of BACR_HALSA
Figure 4: Prediction of PolyPhobius for the transmembrane helices and signal peptides of BACR_HALSA
Phobius PolyPhobius
start position end position prediction start position end position prediction
Signal peptide prediction
No prediction available
Transmembrane helices prediction
23 42 TM helix 22 43 TM helix
43 53 inside 44 54 inside
54 76 TM helix 55 77 TM helix
77 95 outside 78 94 outside
96 114 TM helix 95 114 TM helix
115 120 inside 115 120 inside
121 142 TM helix 121 141 TM helix
143 147 outside 142 147 outside
148 169 TM helix 148 166 TM helix
170 189 inside 167 186 inside
190 212 TM helix 187 205 TM helix
213 217 outside 206 215 outside
218 237 TM helix 216 237 TM helix
238 262 inside 238 262 inside

Neither method predicts a signal peptide (compare Figure 3 and Figure 4), but both recognize that this protein is a transmembrane protein with seven helices. The two predictions differ only at the helix boundaries, mostly by 1 to 3 residues; the largest difference is at the end of the sixth helix (212 for Phobius, 205 for PolyPhobius).
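The size of the boundary differences between the two predictions can be checked with a few lines of Python. This is only a sketch: the helix boundaries are copied by hand from the table above, the helices are paired simply by their order in the sequence, and `boundary_shifts` is a hypothetical helper name.

```python
# Helix boundaries copied from the Phobius / PolyPhobius table above.
PHOBIUS = [(23, 42), (54, 76), (96, 114), (121, 142),
           (148, 169), (190, 212), (218, 237)]
POLYPHOBIUS = [(22, 43), (55, 77), (95, 114), (121, 141),
               (148, 166), (187, 205), (216, 237)]


def boundary_shifts(helices_a, helices_b):
    """Pair helices by order and return absolute start/end differences."""
    if len(helices_a) != len(helices_b):
        raise ValueError("both predictions must contain the same number of helices")
    return [(abs(a_start - b_start), abs(a_end - b_end))
            for (a_start, a_end), (b_start, b_end) in zip(helices_a, helices_b)]


shifts = boundary_shifts(PHOBIUS, POLYPHOBIUS)
for number, (d_start, d_end) in enumerate(shifts, start=1):
    print(f"helix {number}: start differs by {d_start}, end differs by {d_end}")

largest = max(max(pair) for pair in shifts)
print("largest boundary difference:", largest)
```

For the values in the table, all boundaries differ by at most 3 residues except the end of the sixth helix, which differs by 7.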

To evaluate the predictions, we compared them with the known transmembrane helices (Figure 5 and Figure 6).

Comparison with the real structure of the protein:

Figure 5: Comparison between the prediction of Phobius and the real protein
Figure 6: Comparison between the prediction of PolyPhobius and the real protein



OCTOPUS and SPOCTOPUS



Next, we used OCTOPUS and SPOCTOPUS to predict transmembrane helices and the signal peptide.

Figure 7: Prediction of OCTOPUS for the transmembrane helices of BACR_HALSA
Figure 8: Prediction of SPOCTOPUS for the transmembrane helices of BACR_HALSA
OCTOPUS SPOCTOPUS
start position end position prediction start position end position prediction
1 22 outside 1 22 outside
23 43 TM helix 23 43 TM helix
44 54 inside 44 54 inside
55 75 TM helix 55 75 TM helix
76 95 outside 76 95 outside
96 116 TM helix 96 116 TM helix
117 121 inside 117 120 inside
122 142 TM helix 121 141 TM helix
143 147 outside 142 147 outside
148 168 TM helix 148 168 TM helix
169 185 inside 169 185 inside
186 206 TM helix 186 206 TM helix
207 216 outside 207 216 outside
217 237 TM helix 217 237 TM helix
238 262 inside 238 262 inside

Both methods give nearly identical results (compare Figure 7 and Figure 8). They differ only around the fourth helix, which OCTOPUS places at 122 to 142 and SPOCTOPUS at 121 to 141. Both predict seven transmembrane helices, which is a very good result.
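To see exactly which residues the two methods label differently, the segment table can be expanded into one label per residue. The sketch below assumes the table rows are typed in by hand; the one-letter codes and the function name `to_topology_string` are chosen here for illustration and are not the output format of either tool.

```python
# Segments copied from the OCTOPUS / SPOCTOPUS table above.
# Codes: "o" = outside, "M" = TM helix, "i" = inside.
OCTOPUS = [(1, 22, "o"), (23, 43, "M"), (44, 54, "i"), (55, 75, "M"),
           (76, 95, "o"), (96, 116, "M"), (117, 121, "i"), (122, 142, "M"),
           (143, 147, "o"), (148, 168, "M"), (169, 185, "i"), (186, 206, "M"),
           (207, 216, "o"), (217, 237, "M"), (238, 262, "i")]
SPOCTOPUS = [(1, 22, "o"), (23, 43, "M"), (44, 54, "i"), (55, 75, "M"),
             (76, 95, "o"), (96, 116, "M"), (117, 120, "i"), (121, 141, "M"),
             (142, 147, "o"), (148, 168, "M"), (169, 185, "i"), (186, 206, "M"),
             (207, 216, "o"), (217, 237, "M"), (238, 262, "i")]


def to_topology_string(segments):
    """Expand (start, end, code) segments into one code per residue."""
    return "".join(code * (end - start + 1) for start, end, code in segments)


octopus_topology = to_topology_string(OCTOPUS)
spoctopus_topology = to_topology_string(SPOCTOPUS)

# 1-based positions at which the two per-residue labels disagree.
differing = [pos for pos, (a, b)
             in enumerate(zip(octopus_topology, spoctopus_topology), start=1)
             if a != b]

print("sequence length:", len(octopus_topology))
print("residues labelled differently:", differing)
```

With the table values, only residues 121 and 142 receive different labels, that is, the two ends of the fourth helix.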

Comparison with the real structure of the protein:

Figure 9: Comparison between the prediction of OCTOPUS and the real protein
Figure 10: Comparison between the prediction of SPOCTOPUS and the real protein


Next, we compared the predictions of these two methods with the real structure of the protein. As Figure 9 and Figure 10 show, the predictions and the real structure agree for most of the sequence.

TargetP

All of our proteins come from human or archaea, so we only used the non-plant option of TargetP.


Location Probability
mitochondrial targeting SP 0.019
secretory pathway SP 0.897
other 0.562

TargetP predicts that this protein contains a secretory pathway signal peptide. The probability assigned to this signal peptide is very high (0.897), but the result is wrong, because BACR_HALSA is a transmembrane protein.
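How clear the TargetP decision is depends on the margin between the best and the second-best value, not only on the best value. The sketch below picks the winning location and derives a reliability class from that margin. The thresholds (class 1 for a margin above 0.8, then one class per step of 0.2) are an assumption taken from the output description of TargetP 1.1, and `reliability_class` is a hypothetical helper, not part of TargetP.

```python
# Values copied from the TargetP table above (non-plant option).
TARGETP_SCORES = {
    "mitochondrial targeting": 0.019,
    "secretory pathway": 0.897,
    "other": 0.562,
}


def reliability_class(scores):
    """Return (winner, margin, class) from the two highest scores.

    Thresholds are an assumption based on the TargetP 1.1 output description:
    class 1 for a margin above 0.8, then one class per 0.2 step down to class 5.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (winner, best), (_, second) = ranked[0], ranked[1]
    margin = best - second
    for rc, threshold in enumerate((0.8, 0.6, 0.4, 0.2), start=1):
        if margin > threshold:
            return winner, margin, rc
    return winner, margin, 5


winner, margin, rc = reliability_class(TARGETP_SCORES)
print(f"predicted location: {winner}")
print(f"margin over runner-up: {margin:.3f}")
print(f"reliability class: {rc}")
```

For the values in the table, the secretory pathway wins with a margin of 0.335 over "other", which under the assumed thresholds corresponds to reliability class 4. The call is therefore less clear-cut than the value of 0.897 alone suggests.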

SignalP

For our analysis we used both the hidden Markov model based and the neural network based prediction of SignalP.
The neural network based prediction reports three scores: the S-score, which is the score for the signal peptide; the C-score, which is the score for the cleavage site; and the Y-score, which combines the S-score and the C-score. The Y-score is used to predict the cleavage site, because it is more precise than the C-score alone.
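The way the Y-score combines the two other scores can be sketched as follows. The formula used here, Y_i = sqrt(C_i * delta_i) with delta_i being the drop of the mean S-score across position i, is an assumption based on the definition in the original SignalP publication. The window size `d`, the function name `y_scores` and the input values are hypothetical toy choices and not SignalP output for BACR_HALSA.

```python
import math


def y_scores(c_scores, s_scores, d=3):
    """Combine C- and S-scores into a Y-score per position (0-based lists).

    Y_i = sqrt(C_i * delta_i), where delta_i is the mean S-score of the d
    positions before i minus the mean S-score of the d positions from i on.
    Positions without a full window on both sides, or with a non-positive
    delta, get a Y-score of 0.0 (a simplification made for this sketch).
    """
    n = len(c_scores)
    result = [0.0] * n
    for i in range(d, n - d + 1):
        before = sum(s_scores[i - d:i]) / d
        after = sum(s_scores[i:i + d]) / d
        delta = before - after
        if delta > 0:
            result[i] = math.sqrt(c_scores[i] * delta)
    return result


# Hypothetical toy scores, NOT SignalP output for BACR_HALSA.
s = [0.9] * 6 + [0.1] * 6
c = [0.0, 0.0, 0.1, 0.1, 0.7, 0.1, 0.6, 0.1, 0.0, 0.0, 0.0, 0.0]

y = y_scores(c, s, d=3)
best_c = max(range(len(c)), key=c.__getitem__)
best_y = max(range(len(y)), key=y.__getitem__)
print("position with the highest C-score (0-based):", best_c)
print("position with the highest Y-score (0-based):", best_y)
```

In this toy input the highest C-score lies inside the region of high S-scores, while the highest Y-score lies at the second C-score peak, where the S-score drops. This illustrates why a combined score can locate the cleavage site more precisely than the C-score alone.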

BACR_HALSA is an archaeal protein. SignalP only offers prediction methods for eukaryotes and for bacteria (gram-positive and gram-negative). Therefore, we ran all three prediction methods and compared the results with the real protein.

eukaryotes

Result of the neural network

Signal peptide Cleavage site
start position end position start position end position prediction
1 38 38 39 signal peptide


Result of the hidden Markov model

prediction signal peptide probability signal anchor probability cleavage site start cleavage site end
signal peptide 0.017 0.859 15 16
Figure 11: Result of the SignalP method based on the neural network for BACR_HALSA with the prediction method for eukaryotes
Figure 12: Result of the SignalP method based on the hidden Markov model for BACR_HALSA with the prediction method for eukaryotes



gram-negative bacteria

Result of the neural network

Signal peptide Cleavage site
start position end position start position end position prediction
1 42 42 43 no signal peptide

Result of the hidden Markov model

prediction signal peptide probability signal anchor probability cleavage site start cleavage site end
Non-secretory protein 0.000 0.000
Figure 13: Result of the SignalP method based on the neural network for BACR_HALSA with the prediction method for gram-negative bacteria
Figure 14: Result of the SignalP method based on the hidden Markov model for BACR_HALSA with the prediction method for gram-negative bacteria



gram-positive bacteria

Result of the neural network

Signal peptide Cleavage site
start position end position start position end position prediction
1 33 33 34 no signal peptide

Result of the hidden Markov model

prediction signal peptide probability signal anchor probability cleavage site start cleavage site end
Non-secretory protein 0.000 0.000
Figure 15: Result of the SignalP method based on the neural network for BACR_HALSA with the prediction method for gram-positive bacteria
Figure 16: Result of the SignalP method based on the hidden Markov model for BACR_HALSA with the prediction method for gram-positive bacteria


Only the eukaryotic prediction method predicts a signal peptide (Figure 11 and Figure 12), whereas both bacterial methods (Figure 13 to Figure 16) predict that this protein has no signal peptide. On the other hand, only the eukaryotic method assigns a high signal anchor probability (0.859 in the hidden Markov model), which fits, because BACR_HALSA is a transmembrane protein. Therefore, the eukaryotic prediction method seems to suit BACR_HALSA best.
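The comparison of the three hidden Markov model results can be made explicit by labelling each result according to its larger probability. This is a minimal sketch: the probabilities are copied from the tables above, while the function `classify` and its cutoff of 0.5 are assumptions made for illustration and not SignalP's own decision rule.

```python
# Probabilities copied from the SignalP hidden Markov model tables above:
# organism group -> (signal peptide probability, signal anchor probability).
HMM_RESULTS = {
    "eukaryotes": (0.017, 0.859),
    "gram-negative bacteria": (0.000, 0.000),
    "gram-positive bacteria": (0.000, 0.000),
}


def classify(sp_probability, anchor_probability, cutoff=0.5):
    """Label a result by its larger probability.

    The 0.5 cutoff is an assumption made for this sketch, not a SignalP setting.
    """
    if max(sp_probability, anchor_probability) < cutoff:
        return "non-secretory protein"
    if anchor_probability > sp_probability:
        return "signal anchor"
    return "signal peptide"


labels = {group: classify(*probs) for group, probs in HMM_RESULTS.items()}
for group, label in labels.items():
    print(f"{group}: {label}")
```

Under these assumptions, only the eukaryotic model yields the label "signal anchor", while both bacterial models yield "non-secretory protein".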