Sequence-based analyses of ARS A
Additional Proteins
The following proteins are additionally used for the prediction of transmembrane alpha-helices and signal peptides and for the prediction of GO terms:
BACR
BACR_HALSA is a bacterial membrane protein...
Type | Position | Description |
Topological domain | 14 – 23 | Extracellular |
Transmembrane | 24 – 42 | Helical; Name=Helix A |
Topological domain | 43 – 56 | Cytoplasmic |
Transmembrane | 57 – 75 | Helical; Name=Helix B |
Topological domain | 76 – 91 | Extracellular |
Transmembrane | 92 – 109 | Helical; Name=Helix C |
Topological domain | 110 – 120 | Cytoplasmic |
Transmembrane | 121 – 140 | Helical; Name=Helix D |
Topological domain | 141 – 147 | Extracellular |
Transmembrane | 148 – 167 | Helical; Name=Helix E |
Topological domain | 168 – 185 | Cytoplasmic |
Transmembrane | 186 – 204 | Helical; Name=Helix F |
Topological domain | 205 – 216 | Extracellular |
Transmembrane | 217 – 236 | Helical; Name=Helix G |
Topological domain | 237 – 262 | Cytoplasmic |
RET 4
- RET4_HUMAN is a human retinol-binding protein. It delivers retinol from the liver stores to the peripheral tissues. Defects can cause night vision problems.
Type | Position | Description |
Signal peptide | 1 - 18 | |
Chain | 19 - 201 | Retinol-binding protein 4 |
Chain | 19 - 200 | Plasma retinol-binding protein (1-182) |
Chain | 19 - 199 | Plasma retinol-binding protein (1-181) |
Chain | 19 - 197 | Plasma retinol-binding protein (1-179) |
Chain | 19 - 194 | Plasma retinol-binding protein (1-176) |
INSL 5
- INSL5_HUMAN is a human insulin-like peptide. It consists of two chains and may have a role in gut contractility or in thymic development and regulation.
Type | Position | Description |
Signal peptide | 1 - 22 | |
Peptide | 23 - 46 | Insulin-like peptide INSL5 B chain |
Propeptide | 49 - 114 | Connecting peptide |
Peptide | 115 - 135 | Insulin-like peptide INSL5 A chain |
LAMP 1
- LAMP1_HUMAN is a human membrane glycoprotein. It presents carbohydrate ligands to selectins.
Type | Position | Description |
Signal peptide | 1 - 28 | |
Chain | 29 - 417 | Lysosome-associated membrane glycoprotein 1 |
A 4
- A4_HUMAN is a human cell surface receptor involved in neurite growth, neuronal adhesion and axonogenesis. It is implicated in Alzheimer's disease and amyloidosis.
Secondary Structure Prediction
Program | #TP | #FP | accuracy |
PSI-PRED | 374 | 133 | 0.74 |
Jpred | 359 | 148 | 0.71 |
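The accuracy column appears to be the fraction of correctly predicted residues, TP / (TP + FP). The following Python sketch reproduces that arithmetic from the counts in the table. The definition is an assumption inferred from the numbers, and the helper name `accuracy` is ours.

```python
def accuracy(tp: int, fp: int) -> float:
    """Fraction of residues whose predicted state matches the reference."""
    return tp / (tp + fp)

# Counts taken from the table above: (TP, FP) per program.
counts = {"PSI-PRED": (374, 133), "Jpred": (359, 148)}

for program, (tp, fp) in counts.items():
    # Both programs are scored over the same 507 positions.
    print(f"{program}: {tp + fp} residues, accuracy {accuracy(tp, fp):.2f}")
```

Both rows sum to 507 positions, and the rounded values (0.74 and 0.71) agree with the table.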
DSSP: mmmmmmmmmmmmmmmmmmCCCEEEEEEECCCCCCCCHHHCCCCCCCHHHHHHHHCCEEECCEECCCCCHHHHHHHHHHCCCHHHHCC
JPRED: CCHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCCHHHHHHHHCCCEECCCCCCCCCCHHHHHHHHHCCCCCCCCC
PSI-PRED: CCHHHHHHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCC
DSSP: CCCCCCCCECCECCCCCCCHHHHHHCCCCEEEEEECCCCECCHHHCCCHHHHCCCEEEECCCCCCCCECCCCEEECCCEECCCCECC
JPRED: CCCCCCCCCCCCCCCCCCHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
PSI-PRED: CCCCCCCCCCCCCCCCCCCHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
DSSP: CCCCCCEEECCEEEEECCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH
JPRED: CCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHH
PSI-PRED: CCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHH
DSSP: HHHHHHCCCHHHEEEEEEECCCCCHHHHHHCCCCCCCCCCCCCCCHHHHECCCEEECCCCCCCEEECCCEEHHHHHHHHHHHHCCCC
JPRED: HHHHHHCCCCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCECCCCCCCCHHHHHHHHHCCCC
PSI-PRED: HHHHHHCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCEECCCHHHHHHHHHHHHHHCCCC
DSSP: CCCCCCCCCCHHHHHCCCCCCCCEEEECCCCCCCCCCCCEEEECCEEEEEEECCCHHHCCCCCHHHCCCCCCEEEEEEEEEECCCCC
JPRED: CCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCEEEEECCCEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCEECCCCCC
PSI-PRED: CCCCCCCCCCHHHHCCCCCCCCCEEEECCCCCCCCCCEEEEEECCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCC
DSSP: CCCCCCCCCmmmCCHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHCECHHHCCCCCCCCCCCCCCCCECmmmm
JPRED: CCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
PSI-PRED: CCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
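A per-residue comparison like the alignment above can be scored with a few lines of Python. This is only a sketch: it assumes that the lowercase `m` in the DSSP rows marks positions without a DSSP assignment, and whether such positions are skipped or counted as errors is a choice left to the `skip` parameter. The function name and the toy strings are hypothetical.

```python
def q3_agreement(reference: str, prediction: str, skip: str = "m") -> float:
    """Share of aligned positions where the prediction equals the reference.

    Positions whose reference symbol occurs in `skip` are ignored.
    """
    if len(reference) != len(prediction):
        raise ValueError("strings must be aligned to the same length")
    pairs = [(r, p) for r, p in zip(reference, prediction) if r not in skip]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(r == p for r, p in pairs)
    return matches / len(pairs)

# Toy strings (not taken from the alignment above).
reference = "mmCCHHHHEE"
prediction = "CCCCHHHCEE"
print(q3_agreement(reference, prediction))           # 0.875 (7 of 8 positions)
print(q3_agreement(reference, prediction, skip=""))  # 0.7 (7 of 10 positions)
```

To score a whole protein, the rows of each program would first be concatenated into one string per method.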
DSSP
Prediction of Disordered Regions
DISOPRED
POODLE
IUPRED
Meta-Disorder
Prediction of transmembrane alpha-helices and signal peptides
SignalP
ARS A | A4 | RET4 | INSL5 | LAMP1 | BACR |
TMHMM
TMHMM predicts transmembrane helices (TMH) using a Hidden Markov Model (HMM). The protein model used by TMHMM essentially consists of seven different states. Globular domains can occur on the cytoplasmic and the non-cytoplasmic side. On the cytoplasmic side, globular domains are linked to loops, which are in turn linked to cytoplasmic caps. These caps are followed by the helix core, and there is again a cap on the non-cytoplasmic side. The non-cytoplasmic caps are linked to globular domains by either short or long non-cytoplasmic loops.
TMHMM outputs the most likely structure of the protein according to the above model. It also reports the orientation (cytoplasmic or non-cytoplasmic side) of the N-terminus. The output consists of a plot, which shows the different states along the protein, and some additional statistics <ref> http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output </ref>:
- The number of predicted transmembrane helices.
- The expected number of amino acids in transmembrane helices. If this number is larger than 18 it is very likely to be a transmembrane protein (OR have a signal peptide).
- The expected number of amino acids in transmembrane helices in the first 60 amino acids of the protein. If this number is more than a few, you should be warned that a predicted transmembrane helix in the N-term could be a signal peptide.
- The total probability that the N-term is on the cytoplasmic side of the membrane.
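The rules of thumb attached to these statistics can be written down as a small Python sketch. The class and function names are hypothetical. The cutoff for "more than a few" residues in the first 60 positions and the 0.5 cutoff on the N-terminal probability are assumptions chosen for illustration; only the threshold of 18 comes from the list above.

```python
from dataclasses import dataclass

@dataclass
class TmhmmSummary:
    n_helices: int      # number of predicted transmembrane helices
    exp_aa: float       # expected number of amino acids in TM helices
    exp_first60: float  # same, restricted to the first 60 residues
    prob_n_in: float    # probability that the N-terminus is cytoplasmic

def interpret(s: TmhmmSummary, first60_cutoff: float = 10.0) -> list[str]:
    """Turn the summary statistics into plain-text notes."""
    notes = []
    if s.exp_aa > 18:
        notes.append("likely transmembrane protein (or signal peptide)")
    if s.exp_first60 > first60_cutoff:  # assumed meaning of "more than a few"
        notes.append("N-terminal helix may be a signal peptide")
    side = "cytoplasmic" if s.prob_n_in >= 0.5 else "non-cytoplasmic"
    notes.append(f"N-terminus probably {side}")
    return notes

# Made-up values for illustration, not results for any protein on this page.
example = TmhmmSummary(n_helices=1, exp_aa=22.5, exp_first60=0.1, prob_n_in=0.2)
for note in interpret(example):
    print(note)
```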
Discussion
- ARS A: outside 1-507 (all residues)
- A4_HUMAN: The topology is given below
Description | Position |
outside | 1-700 |
TMhelix | 701-723 |
inside | 724-770 |
- INSL5_HUMAN: outside 1-135 (all residues)
- LAMP1_HUMAN: TMHMM reports "POSSIBLE N-term signal sequence"
Description | Position |
inside | 1-10 |
TMhelix | 11-33 |
outside | 34-383 |
TMhelix | 384-406 |
inside | 407-417 |
- RET4_HUMAN: outside 1-201 (all residues)
- BACR (sp_P02945_BACR_HALSA): TMHMM reports "POSSIBLE N-term signal sequence"
Description | Position |
outside | 1-22 |
TMhelix | 23-42 |
inside | 43-54 |
TMhelix | 55-77 |
outside | 78-91 |
TMhelix | 92-114 |
inside | 115-120 |
TMhelix | 121-143 |
outside | 144-147 |
TMhelix | 148-170 |
inside | 171-189 |
TMhelix | 190-212 |
outside | 213-262 |
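The TMHMM segments can be compared with the annotated features listed in the protein tables at the top of the page. The sketch below matches each annotated BACR helix to the predicted helix with the largest overlap. The function names and the minimum overlap of 5 residues are arbitrary assumptions for illustration.

```python
# Annotated helices A-G from the BACR feature table (1-based, inclusive).
ANNOTATED = {"A": (24, 42), "B": (57, 75), "C": (92, 109), "D": (121, 140),
             "E": (148, 167), "F": (186, 204), "G": (217, 236)}
# TMhelix segments from the TMHMM output for BACR.
PREDICTED = [(23, 42), (55, 77), (92, 114), (121, 143), (148, 170), (190, 212)]

def overlap(a, b):
    """Number of residues shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def match_helices(annotated, predicted, min_overlap=5):
    """Map each annotated helix to its best-overlapping prediction, or None."""
    result = {}
    for name, seg in annotated.items():
        best = max(predicted, key=lambda p: overlap(seg, p))
        result[name] = best if overlap(seg, best) >= min_overlap else None
    return result

matches = match_helices(ANNOTATED, PREDICTED)
missed = [name for name, hit in matches.items() if hit is None]
print(missed)  # ['G']

# LAMP1: annotated signal peptide versus the two predicted helices.
LAMP1_SIGNAL = (1, 28)
LAMP1_HELICES = [(11, 33), (384, 406)]
print([overlap(LAMP1_SIGNAL, h) for h in LAMP1_HELICES])  # [18, 0]
```

With these inputs, helices A to F each have a predicted counterpart, while helix G (217-236) falls into the region that TMHMM labels "outside" (213-262). For LAMP1, the first predicted helix (11-33) shares 18 residues with the annotated signal peptide (1-28), which is consistent with the "POSSIBLE N-term signal sequence" warning.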
Phobius and Polyphobius
OCTOPUS and SPOCTOPUS
OCTOPUS uses a combination of a Hidden Markov Model and a neural network to predict the topology of a transmembrane protein. It uses BLAST to create a sequence profile, which is then used by the neural network to predict the preference of each amino acid to be located within a transmembrane (M), interface (I), close loop (L) or globular loop (G) region, and on the inside (i) or outside (o) of the membrane. These scores are then passed to the HMM, which predicts the final states.
SPOCTOPUS extends the OCTOPUS algorithm with a preprocessing step. OCTOPUS does not predict signal peptides. The N-terminal targeting sequences mainly consist of hydrophobic residues, and thus their properties strongly resemble those of transmembrane helices. Not considering signal peptides in the prediction often leads to a false prediction of a transmembrane helix at the N-terminus. Therefore SPOCTOPUS adds the prediction of signal peptide preference scores within the first 70 amino acids of the protein. The exact location of a potential signal peptide is then predicted by an HMM, as in OCTOPUS.
Protein | OCTOPUS | SPOCTOPUS |
ARS A | ||
A4 | ||
RET4 | ||
INSL5 | ||
LAMP1 | ||
BACR | ||
TargetP
TargetP is used to predict the cellular localization of a protein. It combines the two methods ChloroP and SignalP. The following targeting sequences can be identified:
- chloroplast transit peptide (cTP)
- mitochondrial targeting peptide (mTP)
- secretory pathway signal peptide (SP)
TargetP uses a neural network to calculate a score for each of the above subcellular targets and predicts the location with the highest score. In our case all proteins are predicted to be targeted to the secretory pathway (S). Results are shown below. Note that cTP is not included in our predictions, as we only considered eukaryotic and bacterial proteins. Also note that TargetP is trained on eukaryotic proteins, so the prediction for the bacterial protein BACR is not meaningful: localization and secretion pathways differ completely between eukaryotes and bacteria (e.g. bacteria have no endoplasmic reticulum, Golgi apparatus or lysosomes). Nevertheless, we included it in our analysis to see whether TargetP finds any localization sequence in it or predicts "-" (= no localization signal found).
Protein | mTP | SP | other | prediction |
ARS A | 0.070 | 0.926 | 0.054 | S |
A4_HUMAN | 0.035 | 0.937 | 0.084 | S |
INSL5_HUMAN | 0.074 | 0.899 | 0.037 | S |
LAMP1_HUMAN | 0.043 | 0.953 | 0.017 | S |
RET4_HUMAN | 0.242 | 0.928 | 0.020 | S |
BACR (bacterial) | 0.019 | 0.897 | 0.562 | S |
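The decision rule described above (pick the class with the highest score) can be replayed on the table with a short Python sketch. The margin between the best and the second-best score is added here as a rough indicator of how clear the call is. It is our own illustration and not a value reported in the table. The label "M" for mTP is an assumption; "S" and "-" follow the text.

```python
# Scores copied from the table above.
SCORES = {
    "ARS A":            {"mTP": 0.070, "SP": 0.926, "other": 0.054},
    "A4_HUMAN":         {"mTP": 0.035, "SP": 0.937, "other": 0.084},
    "INSL5_HUMAN":      {"mTP": 0.074, "SP": 0.899, "other": 0.037},
    "LAMP1_HUMAN":      {"mTP": 0.043, "SP": 0.953, "other": 0.017},
    "RET4_HUMAN":       {"mTP": 0.242, "SP": 0.928, "other": 0.020},
    "BACR (bacterial)": {"mTP": 0.019, "SP": 0.897, "other": 0.562},
}
LABELS = {"mTP": "M", "SP": "S", "other": "-"}

def predict(scores):
    """Return (label, margin): top class and its lead over the runner-up."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, s1), (_, s2) = ranked[0], ranked[1]
    return LABELS[top], round(s1 - s2, 3)

results = {protein: predict(s) for protein, s in SCORES.items()}
for protein, (label, margin) in results.items():
    print(f"{protein:18s} {label}  margin {margin:.3f}")

# Protein with the smallest margin, i.e. the least clear-cut call.
least_clear = min(results, key=lambda p: results[p][1])
print(least_clear)  # BACR (bacterial)
```

All six proteins come out as S. The margins for the five human proteins are at least 0.686, whereas BACR leads by only 0.335 because of its high "other" score.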
Discussion
All proteins are assigned to the secretory pathway.
- Arylsulfatase A is a lysosomal enzyme. Therefore, the prediction is correct, as lysosomal proteins are guided there by the secretory pathway, via the endoplasmic reticulum and the Golgi apparatus.
- coming
- coming
- coming
- coming
- As described above, BACR is a bacterial protein. TargetP also assigns this protein to the secretory pathway, which makes no sense because bacterial protein secretion differs from eukaryotic secretion. However, the prediction is much less clear-cut in this case than for the other proteins: the "other" class, meaning that no targeting sequence is found in the protein, receives a considerably high score (0.562), so the assignment to S is more questionable here.
SignalP
sudo /apps/signalp-3.0/signalp -t gram- ../BACR.fasta > BACR.signalp
sudo /apps/signalp-3.0/signalp -t euk ../ARSA.fasta > ARSA.signalp
sudo /apps/signalp-3.0/signalp -t euk ../A4.fasta > A4.signalp
sudo /apps/signalp-3.0/signalp -t euk ../LAMP1.fasta > LAMP1.signalp
sudo /apps/signalp-3.0/signalp -t euk ../INSL5.fasta > INSL5.signalp
sudo /apps/signalp-3.0/signalp -t euk ../RET4.fasta > RET4.signalp
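The six calls differ only in the organism type and the file name, so they can be generated from a small table. The Python sketch below only builds and prints the command lines and does not run SignalP. The binary path, the `-t` values and the file layout are copied from the commands above. The helper name `signalp_command` is hypothetical, and the `sudo` prefix is left out of the sketch.

```python
import shlex

SIGNALP = "/apps/signalp-3.0/signalp"

# Organism type passed to -t for each protein, as in the commands above.
ORGANISM_TYPE = {
    "BACR": "gram-",
    "ARSA": "euk",
    "A4": "euk",
    "LAMP1": "euk",
    "INSL5": "euk",
    "RET4": "euk",
}

def signalp_command(protein: str, fasta_dir: str = "..") -> list[str]:
    """Argument list for one SignalP run on <fasta_dir>/<protein>.fasta."""
    return [SIGNALP, "-t", ORGANISM_TYPE[protein], f"{fasta_dir}/{protein}.fasta"]

for protein in ORGANISM_TYPE:
    # The redirect target mirrors the naming used above: <protein>.signalp
    print(shlex.join(signalp_command(protein)), ">", f"{protein}.signalp")
```

Keeping the organism type in one mapping makes it harder to run the bacterial protein with the eukaryotic setting by accident.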