Difference between revisions of "Sequence-based analyses of ARS A"

From Bioinformatikpedia

Revision as of 11:25, 26 May 2011

Additional Proteins

The following proteins are additionally used for the prediction of transmembrane alpha-helices and signal peptides and for the prediction of GO terms:

Secondary Structure Prediction

DSSP

Prediction of Disordered Regions

DISOPRED

POODLE

IUPRED

Meta-Disorder

Prediction of transmembrane alpha-helices and signal peptides

SignalP

TMHMM

TMHMM predicts transmembrane helices (TMHs) using a hidden Markov model (HMM). The TMHMM model describes a protein with essentially seven different types of states. Globular domains can occur on both the cytoplasmic and the non-cytoplasmic side. On the cytoplasmic side, globular domains are linked to loops, which are in turn linked to cytoplasmic caps. These caps are followed by the helix core, which ends in another cap on the non-cytoplasmic side. The non-cytoplasmic caps are linked to globular domains by either short or long non-cytoplasmic loops.
TMHMM outputs the most likely structure of the protein according to the above model. It also reports the orientation of the N-terminus (cytoplasmic or non-cytoplasmic side). The output consists of a plot, which shows the different states along the protein graphically, and some additional statistics <ref> http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output </ref>:

  • The number of predicted transmembrane helices.
  • The expected number of amino acids in transmembrane helices. If this number is larger than 18 it is very likely to be a transmembrane protein (OR have a signal peptide).
  • The expected number of amino acids in transmembrane helices in the first 60 amino acids of the protein. If this number is more than a few, you should be warned that a predicted transmembrane helix in the N-term could be a signal peptide.
  • The total probability that the N-term is on the cytoplasmic side of the membrane.
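The two rules of thumb above can be sketched in a few lines of Python. This is only an illustration, not part of TMHMM: the cutoff of 18 is taken from the list above, whereas the cutoff of 10 for the first 60 residues is an assumption (the text only says "more than a few"), and the names `TmhmmSummary` and `interpret` are hypothetical. The input values are those reported for our proteins.

```python
from dataclasses import dataclass

# 18 is the cutoff quoted in the TMHMM guide text above.
TM_PROTEIN_MIN_EXP_AA = 18.0
# ASSUMPTION: the text only says "more than a few"; 10 is an illustrative cutoff.
SIGNAL_WARNING_MIN_EXP_FIRST60 = 10.0


@dataclass
class TmhmmSummary:
    protein: str
    n_helices: int       # number of predicted TMHs
    exp_aa: float        # expected number of AAs in TMHs
    exp_first60: float   # expected number of AAs in TMHs, first 60 positions


def interpret(summary):
    """Return (likely_tm_or_signal_peptide, possible_n_term_signal_peptide)."""
    likely_tm = summary.exp_aa > TM_PROTEIN_MIN_EXP_AA
    signal_warning = summary.exp_first60 > SIGNAL_WARNING_MIN_EXP_FIRST60
    return likely_tm, signal_warning


SUMMARIES = [
    TmhmmSummary("ARS A", 0, 2.65106, 2.63079),
    TmhmmSummary("A4_HUMAN", 1, 22.72525, 0.0027),
    TmhmmSummary("LAMP1_HUMAN", 2, 44.89582, 22.24286),
    TmhmmSummary("BACR", 6, 140.4032, 26.1196),
]

if __name__ == "__main__":
    for s in SUMMARIES:
        likely_tm, warning = interpret(s)
        print(s.protein, "TM protein or signal peptide:", likely_tm,
              "| possible N-term signal peptide:", warning)
```

With the assumed cutoff, only LAMP1_HUMAN and BACR trigger the N-terminal warning.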


Protein       Predicted TMHs  Expected AAs in TMHs  Expected AAs in TMHs (first 60 positions)  P(N-term on cytoplasmic side)
ARS A         0               2.65106               2.63079                                    0.12149
A4_HUMAN      1               22.72525              0.0027                                     0.00015
INSL5_HUMAN   0               0.50415               0.50415                                    0.03772
LAMP1_HUMAN   2               44.89582              22.24286                                   0.99287
RET4_HUMAN    0               0.01196               0.01179                                    0.01909
BACR          6               140.4032              26.1196                                    0.01887


Discussion
  • ARS A: outside 1 507 (all residues)
Graphical output of TMHMM for ARS A
  • A4_HUMAN: The topology is given below

outside 1 700
TMhelix 701 723
inside 724 770

Graphical output of TMHMM for A4_HUMAN
  • INSL5_HUMAN: outside 1 135 (all residues)
Graphical output of TMHMM for INSL5_HUMAN
  • LAMP1_HUMAN: POSSIBLE N-term signal sequence. The topology is given below
Graphical output of TMHMM for LAMP1_HUMAN

sp_P11279_LAMP1_HUMAN TMHMM2.0 inside 1 10
sp_P11279_LAMP1_HUMAN TMHMM2.0 TMhelix 11 33
sp_P11279_LAMP1_HUMAN TMHMM2.0 outside 34 383
sp_P11279_LAMP1_HUMAN TMHMM2.0 TMhelix 384 406
sp_P11279_LAMP1_HUMAN TMHMM2.0 inside 407 417

  • RET4_HUMAN: outside 1 201 (all residues)
Graphical output of TMHMM for RET4_HUMAN
  • BACR: The topology is given below
Graphical output of TMHMM for BACR

# sp_P02945_BACR_HALSA POSSIBLE N-term signal sequence
sp_P02945_BACR_HALSA TMHMM2.0 outside 1 22
sp_P02945_BACR_HALSA TMHMM2.0 TMhelix 23 42
sp_P02945_BACR_HALSA TMHMM2.0 inside 43 54
sp_P02945_BACR_HALSA TMHMM2.0 TMhelix 55 77
sp_P02945_BACR_HALSA TMHMM2.0 outside 78 91
sp_P02945_BACR_HALSA TMHMM2.0 TMhelix 92 114
sp_P02945_BACR_HALSA TMHMM2.0 inside 115 120
sp_P02945_BACR_HALSA TMHMM2.0 TMhelix 121 143
sp_P02945_BACR_HALSA TMHMM2.0 outside 144 147
sp_P02945_BACR_HALSA TMHMM2.0 TMhelix 148 170
sp_P02945_BACR_HALSA TMHMM2.0 inside 171 189
sp_P02945_BACR_HALSA TMHMM2.0 TMhelix 190 212
sp_P02945_BACR_HALSA TMHMM2.0 outside 213 262
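Topology lines like the ones listed in this discussion can be read programmatically. The following is a minimal sketch that assumes each line has exactly five whitespace-separated fields (identifier, program version, location, start, end) and that comment lines start with "#". The names `Segment`, `parse_topology`, `helices` and `is_contiguous` are hypothetical helpers, not part of TMHMM.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    protein: str
    location: str  # "inside", "outside" or "TMhelix"
    start: int     # first residue of the segment (1-based, inclusive)
    end: int       # last residue of the segment (inclusive)


def parse_topology(text):
    """Parse TMHMM topology lines; skip comments, blanks and malformed lines."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 5:
            continue
        protein, _version, location, start, end = fields
        segments.append(Segment(protein, location, int(start), int(end)))
    return segments


def helices(segments):
    """Return only the predicted transmembrane helices."""
    return [s for s in segments if s.location == "TMhelix"]


def is_contiguous(segments):
    """Check that consecutive segments cover the sequence without gaps."""
    return all(b.start == a.end + 1 for a, b in zip(segments, segments[1:]))


LAMP1 = """\
sp_P11279_LAMP1_HUMAN TMHMM2.0 inside 1 10
sp_P11279_LAMP1_HUMAN TMHMM2.0 TMhelix 11 33
sp_P11279_LAMP1_HUMAN TMHMM2.0 outside 34 383
sp_P11279_LAMP1_HUMAN TMHMM2.0 TMhelix 384 406
sp_P11279_LAMP1_HUMAN TMHMM2.0 inside 407 417
"""

if __name__ == "__main__":
    segs = parse_topology(LAMP1)
    for h in helices(segs):
        print(h.protein, h.start, h.end, "length", h.end - h.start + 1)
```

For LAMP1_HUMAN this yields two helices of 23 residues each, with the N-terminus on the inside.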

Phobius

Polyphobius

TargetP

TargetP is used to predict the cellular localization of a protein. It combines the two methods ChloroP and SignalP. The following targeting sequences can be identified:

  • chloroplast transit peptide (cTP)
  • mitochondrial targeting peptide (mTP)
  • secretory pathway signal peptide (SP)

TargetP uses a neural network to calculate a score for each of the above subcellular targets and predicts the location with the highest score. In our case all proteins are predicted to be targeted to the secretory pathway (S). Results are shown below. Note that cTP is not included in our predictions, as we only considered eukaryotic and bacterial proteins. Also note that TargetP is trained on eukaryotic proteins; hence the prediction for the protein "BACR", which is bacterial, is not meaningful, because the pathways of localization and secretion differ completely between eukaryotes and bacteria (e.g. bacteria do not have an endoplasmic reticulum, Golgi apparatus or lysosomes). Nevertheless, we included it in our analysis to see whether TargetP finds any localization sequence in it or predicts "-" (= no localization signal found).

Protein           mTP    SP     other  prediction
ARS A             0.070  0.926  0.054  S
A4_HUMAN          0.035  0.937  0.084  S
INSL5_HUMAN       0.074  0.899  0.037  S
LAMP1_HUMAN       0.043  0.953  0.017  S
RET4_HUMAN        0.242  0.928  0.020  S
BACR (bacterial)  0.019  0.897  0.562  S
Discussion

All proteins are assigned to the secretory pathway.

  • Arylsulfatase A is a lysosomal enzyme. Therefore, the prediction is correct, as lysosomal proteins are guided there by the secretory pathway, via the endoplasmic reticulum and the Golgi apparatus.
  • coming
  • coming
  • coming
  • coming
  • As described above, BACR is a bacterial protein. TargetP also assigns this protein to the secretory pathway, which makes no sense, as bacterial protein secretion differs from eukaryotic secretion. However, the prediction is much less clear-cut in this case than for the other proteins: the "other" class, meaning that no targeting sequence is found in the protein, receives a considerably high score (0.562), so the assignment to S is more questionable here.
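The "highest score wins" rule, and why the BACR call is weaker than the others, can be illustrated with a short Python sketch. It is not TargetP itself: the function `predict` and the margin (difference between the best and second-best score) are illustrative assumptions, and "-" is used for the "other" class as in the text above. The scores are those from the results table.

```python
# Label per score column; "-" stands for "no localization signal found".
LABELS = {"mTP": "M", "SP": "S", "other": "-"}

# Scores copied from the TargetP results table.
SCORES = {
    "ARS A":       {"mTP": 0.070, "SP": 0.926, "other": 0.054},
    "A4_HUMAN":    {"mTP": 0.035, "SP": 0.937, "other": 0.084},
    "INSL5_HUMAN": {"mTP": 0.074, "SP": 0.899, "other": 0.037},
    "LAMP1_HUMAN": {"mTP": 0.043, "SP": 0.953, "other": 0.017},
    "RET4_HUMAN":  {"mTP": 0.242, "SP": 0.928, "other": 0.020},
    "BACR":        {"mTP": 0.019, "SP": 0.897, "other": 0.562},
}


def predict(scores):
    """Return (label of the highest score, margin to the second-highest score)."""
    best = max(scores, key=scores.get)
    ranked = sorted(scores.values(), reverse=True)
    margin = round(ranked[0] - ranked[1], 3)
    return LABELS[best], margin


if __name__ == "__main__":
    for protein, scores in SCORES.items():
        label, margin = predict(scores)
        print(f"{protein:12s} prediction={label} margin={margin:.3f}")
```

All six proteins come out as S, but BACR has by far the smallest margin (0.335), which matches the observation that its assignment is the least certain.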

SignalP

SignalP 3.0 was run with the Gram-negative model for BACR and with the eukaryotic model for all other proteins:

sudo /apps/signalp-3.0/signalp -t gram- ../BACR.fasta > BACR.signalp
sudo /apps/signalp-3.0/signalp -t euk ../ARSA.fasta > ARSA.signalp
sudo /apps/signalp-3.0/signalp -t euk ../A4.fasta > A4.signalp
sudo /apps/signalp-3.0/signalp -t euk ../LAMP1.fasta > LAMP1.signalp
sudo /apps/signalp-3.0/signalp -t euk ../INSL5.fasta > INSL5.signalp
sudo /apps/signalp-3.0/signalp -t euk ../RET4.fasta > RET4.signalp
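The six calls above differ only in the protein name and the organism type, so they could also be generated from a small table. The sketch below is one possible way to do this in Python. The install path, the `-t` values and the file names are taken from the commands above, while `build_job` and `run_all` are hypothetical helper names, `sudo` is omitted, and nothing is executed unless `run_all()` is called on a machine where SignalP is installed.

```python
import subprocess
from pathlib import Path

# Path and option values as used in the commands above.
SIGNALP = "/apps/signalp-3.0/signalp"

# Organism type passed to -t for each input file (file name without .fasta).
ORGANISM_TYPE = {
    "BACR": "gram-",
    "ARSA": "euk",
    "A4": "euk",
    "LAMP1": "euk",
    "INSL5": "euk",
    "RET4": "euk",
}


def build_job(name, fasta_dir=Path("..")):
    """Return (argument list, output path) for one SignalP run."""
    fasta = (fasta_dir / f"{name}.fasta").as_posix()
    argv = [SIGNALP, "-t", ORGANISM_TYPE[name], fasta]
    return argv, Path(f"{name}.signalp")


def run_all():
    """Run SignalP for every protein and write stdout to <name>.signalp."""
    for name in ORGANISM_TYPE:
        argv, out_path = build_job(name)
        with open(out_path, "w") as out:
            subprocess.run(argv, stdout=out, check=True)


if __name__ == "__main__":
    # Only print the jobs; call run_all() to actually execute them.
    for name in ORGANISM_TYPE:
        argv, out_path = build_job(name)
        print(" ".join(argv), ">", out_path)
```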

Prediction of GO Terms

GOPET

Pfam

ProtFun 2.2