Transmembrane signal peptide general

From Bioinformatikpedia

TMHMM (Transmembrane Helices Hidden Markov Model)

Authors: E. L. L. Sonnhammer, G. von Heijne, and A. Krogh
Year: 1998
Source: [A hidden Markov model for predicting transmembrane helices in protein sequences]

Description

TMHMM is a prediction method for transmembrane helices in proteins that is based on a Hidden Markov Model (HMM). The HMM distinguishes three main types of location (core, cap, loop) and consists of seven states (cytoplasmic loop, cytoplasmic cap, helix core, non-cytoplasmic cap, short non-cytoplasmic loop, long non-cytoplasmic loop and globular domain).

Prediction

For a given protein sequence in FASTA format, the method searches for the best path through the Hidden Markov Model. There are two output formats, a short one and a long one. The long format provides additional statistical information (e.g. the expected number of amino acids in transmembrane helices).
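The search for a best path through an HMM can be illustrated with Viterbi decoding, the standard algorithm for finding the most probable state path. The following is a minimal sketch on a toy two-state model: the states, the set of hydrophobic residues and all probabilities are invented for illustration and are not the seven-state architecture or the parameters of TMHMM, and the decoder actually used by TMHMM may differ.

```python
import math

# Toy model (all values are illustrative assumptions, not TMHMM parameters).
# "-" = loop, "M" = membrane helix.
STATES = ("-", "M")
HYDROPHOBIC = set("AILMFVWC")
START = {"-": 0.9, "M": 0.1}
TRANS = {"-": {"-": 0.9, "M": 0.1}, "M": {"-": 0.1, "M": 0.9}}
# Probability that a state emits a hydrophobic residue.
P_HYDROPHOBIC = {"-": 0.2, "M": 0.8}


def emission(state, residue):
    """Emission probability of a residue, reduced to hydrophobic or not."""
    p = P_HYDROPHOBIC[state]
    return p if residue in HYDROPHOBIC else 1.0 - p


def viterbi(sequence):
    """Return the most probable state path as a string of state labels."""
    if not sequence:
        return ""
    # scores[s] = best log-probability of any path ending in state s
    scores = {
        s: math.log(START[s]) + math.log(emission(s, sequence[0]))
        for s in STATES
    }
    backpointers = []
    for residue in sequence[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev]
                + math.log(TRANS[prev][s])
                + math.log(emission(s, residue))
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


print(viterbi("KKKK" + "L" * 10 + "KKKK"))
```

Working in log space avoids numerical underflow on long sequences, which matters for real proteins with hundreds of residues.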

Input

The method only needs the protein sequence in FASTA format for the prediction.
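Since all tools on this page take FASTA input, it can help to check a file before submitting it. The following is a minimal sketch of a FASTA reader; the function name `read_fasta` is hypothetical and the parser handles only the basic layout (a `>` header line followed by one or more sequence lines).

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


example = ">p1 test\nMKT\nLLV\n\n>p2\nAAA"
print(read_fasta(example))
```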


Phobius and PolyPhobius

  • Phobius:

Authors: Lukas Käll, Anders Krogh and Erik L. L. Sonnhammer
Year: 2004
Source: [A Combined Transmembrane Topology and Signal Peptide Prediction Method]

  • PolyPhobius:

Authors: Lukas Käll, Anders Krogh and Erik Sonnhammer
Year: 2005
Source: [An HMM posterior decoder for sequence feature prediction that includes homology information]

Description

Phobius and PolyPhobius are combined methods that predict both transmembrane helices and signal peptides. Both are based on a Hidden Markov Model and combine the approaches of TMHMM and SignalP. Their basis is the HMM from TMHMM, extended with an additional start state for signal peptides. The difference between the two is that PolyPhobius also uses homology information for the prediction.

Input

We used the [Webserver] for Phobius and PolyPhobius, so it was only necessary to paste the protein sequence in FASTA format.

Output

The server outputs a text file with the predicted position and type of the signal peptide and the positions of the transmembrane helices. It also outputs a detailed file with the probability of each residue being located in a transmembrane helix or a signal peptide, as well as a picture of the prediction.
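Per-residue probabilities like those in the detailed file can be turned into candidate segments by thresholding. The sketch below assumes the probabilities have already been read into a plain list of floats (the layout of the server's file is not reproduced here); the function name and the cutoff of 0.5 are illustrative assumptions, not part of Phobius.

```python
def segments_above(probabilities, threshold=0.5, min_length=1):
    """Return 1-based inclusive (start, end) runs where p >= threshold."""
    segments = []
    start = None
    for i, p in enumerate(probabilities, start=1):
        if p >= threshold and start is None:
            start = i  # a new run begins here
        elif p < threshold and start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(probabilities) - start + 1 >= min_length:
        segments.append((start, len(probabilities)))
    return segments


print(segments_above([0.1, 0.9, 0.8, 0.2, 0.7]))
```

Such a simple cutoff is only a way to inspect the probabilities; the segments reported by the server come from its own decoding and need not coincide with thresholded runs.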


OCTOPUS and SPOCTOPUS

  • OCTOPUS:

Authors: Håkan Viklund and Arne Elofsson
Year: 2008
Source: [OCTOPUS: Improving topology prediction by two-track ANN-based preference scores and an extended topological grammar.]

  • SPOCTOPUS:

Authors: Håkan Viklund, Andreas Bernsel, Marcin Skwark and Arne Elofsson
Year: 2008
Source: [SPOCTOPUS: A combined predictor of signal peptides and membrane protein topology.]

Description

OCTOPUS is a method based on neural networks and Hidden Markov Models. To make a prediction, a multiple sequence alignment is first generated with BLAST. Next, the algorithm calculates a PSSM profile and a raw sequence profile, and both profiles are used as input for the neural networks. These networks (one for the PSSM profile and one for the raw sequence profile) predict the preference of each residue to be located in a transmembrane helix or not. The outputs of the networks are used as input for a Hidden Markov Model, which generates the final prediction.
OCTOPUS only predicts transmembrane helices, whereas SPOCTOPUS can also predict signal peptides. Both methods are based on the same algorithm described above.
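To make the notion of a sequence profile concrete, the sketch below computes per-column relative residue frequencies from a small alignment. It assumes that a raw sequence profile means exactly such column frequencies with gaps ignored; this is an interpretation for illustration, and the function name and example alignment are hypothetical rather than taken from OCTOPUS.

```python
from collections import Counter


def column_frequencies(alignment, gap="-"):
    """Per-column relative residue frequencies, ignoring gap characters."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        total = sum(counts.values())
        # A column consisting only of gaps yields an empty distribution.
        profile.append(
            {res: n / total for res, n in counts.items()} if total else {}
        )
    return profile


for column in column_frequencies(["AL-", "AV-", "GL-"]):
    print(column)
```

A PSSM differs from such a profile in that it stores log-odds scores relative to background frequencies instead of plain frequencies.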

Input

We used the [Webserver] for our predictions. The server is very easy to use because it has only one input field, into which you paste your protein sequence in FASTA format. You then choose between OCTOPUS and SPOCTOPUS and the prediction starts.

Output

The [Webserver] gives three files as output. The first file contains the exact probabilities for each residue to be located inside, outside or in a transmembrane helix (nnprf file). The second file contains the result of the prediction (topo file), and the last file visualizes the prediction (png file).
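A predicted topology is convenient to work with once it has been converted into segments. The sketch below assumes the topology has been extracted from the topo file as a string with one label per residue, with `M` marking membrane residues and `i`/`o` marking inside and outside; the exact file layout and label alphabet are assumptions here and should be checked against the actual server output.

```python
from itertools import groupby


def label_runs(topology):
    """Return (label, start, end) runs with 1-based inclusive positions."""
    runs = []
    position = 1
    for label, group in groupby(topology):
        length = sum(1 for _ in group)
        runs.append((label, position, position + length - 1))
        position += length
    return runs


def helices(topology, membrane_label="M"):
    """Return (start, end) positions of all membrane segments."""
    return [
        (start, end)
        for label, start, end in label_runs(topology)
        if label == membrane_label
    ]


print(label_runs("iiMMMoo"))
print(helices("iiMMMooMMi"))
```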


TargetP

Authors: Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne
Year: 1997
Source: [Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites.]

Description

The TargetP method is based on two different neural networks. The input for the first network is the protein sequence in FASTA format, and this network recognizes the cleavage site. The resulting part of the protein is passed to the second network, which distinguishes between signal peptides and non-signal peptides and also determines the type of signal peptide.
The user can specify whether the prediction should be made for a plant or a non-plant sequence.

Input

We used the [TargetP webserver] for our analysis and pasted our sequence in FASTA format into the sequence field.

Output

As output you get one file that shows the probability for each type of signal peptide. Because exact values are reported, you can decide for yourself whether a probability is high enough to trust the prediction.
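Deciding whether a probability is high enough can be made explicit with a cutoff. The sketch below assumes the reported values have been copied into a dictionary by hand; the function name, the example scores and the cutoff of 0.7 are illustrative assumptions and not recommendations from TargetP.

```python
def call_localization(scores, min_score=0.7):
    """Return the best-scoring class, or None if it is below min_score."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None


# Hypothetical scores for one protein.
print(call_localization({"SP": 0.91, "mTP": 0.05, "other": 0.04}))
```

Returning `None` for a weak best score keeps uncertain predictions visible instead of silently assigning them to a class.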


SignalP

Authors: Henrik Nielsen and Anders Krogh
Year: 1998
Source: [Prediction of signal peptides and signal anchors by a hidden Markov model.]

Description

The SignalP method is based on two Hidden Markov Models. The first model has defined states for the different parts of the signal peptide and is used for the signal peptide prediction. The second model is used to distinguish between signal peptides and signal anchors in order to improve the prediction accuracy.
SignalP lets the user predict specifically for eukaryotes, Gram-negative bacteria or Gram-positive bacteria to obtain a more precise prediction.

Input

We used the [Webserver] for our prediction, so it was only necessary to paste the protein sequence in FASTA format.

Output

The server gives a detailed output with the probability of each residue for the different locations. At the end of the file there is a short prediction summary, which reports the prediction result, the signal peptide probability and some other statistical measures.
