Sequence-based analyses of ARS A
Additional Proteins
The following proteins are additionally used for the prediction of transmembrane alpha-helices and signal peptides and for the prediction of GO terms:
BACR
BACR_HALSA is a bacterial membrane protein...
Type | Position | Description |
Topological domain | 14 – 23 | Extracellular |
Transmembrane | 24 – 42 | Helical; Name=Helix A |
Topological domain | 43 – 56 | Cytoplasmic |
Transmembrane | 57 – 75 | Helical; Name=Helix B |
Topological domain | 76 – 91 | Extracellular |
Transmembrane | 92 – 109 | Helical; Name=Helix C |
Topological domain | 110 – 120 | Cytoplasmic |
Transmembrane | 121 – 140 | Helical; Name=Helix D |
Topological domain | 141 – 147 | Extracellular |
Transmembrane | 148 – 167 | Helical; Name=Helix E |
Topological domain | 168 – 185 | Cytoplasmic |
Transmembrane | 186 – 204 | Helical; Name=Helix F |
Topological domain | 205 – 216 | Extracellular |
Transmembrane | 217 – 236 | Helical; Name=Helix G |
Topological domain | 237 – 262 | Cytoplasmic |
RET 4
- RET4_HUMAN is a human retinol-binding protein. It delivers retinol from the liver stores to the peripheral tissues. Defects can cause night vision problems.
no regions available
INSL 5
- INSL5_HUMAN is a human insulin-like peptide. It consists of two chains and may have a role in gut contractility or in thymic development and regulation.
no regions available
LAMP 1
- LAMP1_HUMAN is a human membrane glycoprotein. It presents carbohydrate ligands to selectins.
Type | Position | Description |
Topological Domain | 29 - 382 | Lumenal |
Transmembrane | 383 - 405 | Helical |
Topological Domain | 406 - 417 | Cytoplasmic |
Region | 29 - 194 | First lumenal domain |
Region | 195 - 227 | Hinge |
Region | 228 - 382 | Second lumenal domain |
A 4
- A4_HUMAN is a human cell surface receptor involved in neurite growth, neuronal adhesion and axonogenesis. It is implicated in Alzheimer's disease and amyloidosis.
Type | Position | Description |
Topological domain | 18 - 699 | Extracellular |
Transmembrane | 700 - 723 | Helical |
Topological domain | 724 - 770 | Cytoplasmic |
Domain | 291 - 341 | BPTI / Kunitz inhibitor |
Region | 96 - 110 | Heparin-binding |
Region | 181 - 188 | Zinc-binding |
Region | 391 - 423 | Heparin-binding |
Region | 491 - 522 | Heparin-binding |
Region | 523 - 540 | Collagen-binding |
Region | 732 - 751 | Interaction with G(o)-alpha |
Motif | 724 - 734 | Basolateral sorting signal |
Motif | 759 - 762 | NPXY motif; contains endocytosis signal |
Compositional bias | 230 - 260 | Asp/Glu-rich (acidic) |
Compositional bias | 274 - 280 | Poly-Thr |
Secondary Structure Prediction
PSI-PRED
PSI-PRED builds a sequence profile from a PSI-BLAST search and feeds it into a feed-forward neural network. The output of this network then serves as input to a second network, which yields the final prediction. The average Q3 score reached by PSI-PRED is 80.3 %. <ref name="psipred">Jones, D. T. "Protein secondary structure prediction based on position-specific scoring matrices." J Mol Biol, 1999</ref>
Jpred
Jpred also uses a neural network to predict secondary structure. The prediction relies on the Jnet algorithm, which takes either a multiple sequence alignment or a single sequence as input. If a single sequence is passed to the program, Jpred additionally uses sequence profiles derived from a PSI-BLAST search. It reaches an average Q3 score of up to 81.5 %. <ref name="jpred">Cole, C., Barber, J. D. and Barton, G. J. "The Jpred 3 secondary structure prediction server." Nucleic Acids Res, 2008</ref>
DSSP
DSSP is a database of secondary structure assignments for all proteins in the PDB. It is based on a method that takes the 3D coordinates of a protein and assigns a hierarchical definition of secondary structure elements to it. <ref name="dssp">Kabsch, W. and Sander, C. "Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features." Biopolymers, 1983</ref>
Results and Discussion
We predicted the secondary structure of Arylsulfatase A with PSI-PRED and Jpred3 using their web server interfaces. In addition, we downloaded the DSSP secondary structure assignment. DSSP assigns a hierarchical definition of secondary structure, so the assignment contains more structural classes than the three-class prediction (H=helix, E=sheet, C=coil) of PSI-PRED and Jpred. To compare the predictions with the DSSP assignment, we converted the DSSP output classes to the three-class alphabet using a Perl script. The following table lists the DSSP classes, their descriptions and the three-class code each one was converted to.
DSSP class | Description | 3-letter class |
H | Helix | H |
G | 3-10 Helix | H |
I | Pi-Helix | H |
B | single bridge | E |
E | beta sheet | E |
T | turn | C |
S | bend | C |
(blank) | coil | C |
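The Perl script used for the conversion is not shown here. As a minimal sketch of the same mapping in Python, the table above can be written as a lookup. The function name `reduce_dssp`, the fallback to coil for unlisted characters and the pass-through of the missing-residue marker "m" are assumptions made for illustration, not part of the original script.

```python
# Mapping taken from the table above. Treating any unlisted character
# as coil is an assumption of this sketch.
DSSP_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "B": "E", "E": "E",             # bridges and strands
    "T": "C", "S": "C", " ": "C",   # turn, bend, blank
}


def reduce_dssp(assignment, missing="m"):
    """Convert a DSSP string to the H/E/C alphabet.

    Positions carrying the (assumed) missing-residue marker are kept as they are.
    """
    return "".join(
        missing if state == missing else DSSP_TO_3.get(state, "C")
        for state in assignment
    )


print(reduce_dssp("mmHGGI BEETS "))  # mmHHHHCEEECCC
```

Keeping the "m" marker in the reduced string makes it possible to decide later whether missing residues should count in the accuracy calculation.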
Both methods yield similar predictions. The following figure shows a schematic representation of the predictions. It also marks the true positives, i.e. residues for which the method predicted the same class that DSSP assigned, in green.
The actual predictions and the DSSP assignment are listed below. Missing residues in the DSSP output are marked by an "m".
mmmmmmmmmmmmmmmmmmCCCEEEEEEECCCCCCCCHHHCCCCCCCHHHHHHHHCCEEECCEECCCCCHHHHHHHHHHCCCHHHHCC (DSSP)
CCHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCCHHHHHHHHCCCEECCCCCCCCCCHHHHHHHHHCCCCCCCCC (JPRED)
CCHHHHHHHHHHHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCCCC (PSI-PRED)
CCCCCCCCECCECCCCCCCHHHHHHCCCCEEEEEECCCCECCHHHCCCHHHHCCCEEEECCCCCCCCECCCCEEECCCEECCCCECC (DSSP)
CCCCCCCCCCCCCCCCCCHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC (JPRED)
CCCCCCCCCCCCCCCCCCCHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC (PSI-PRED)
CCCCCCEEECCEEEEECCCHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHH (DSSP)
CCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHH (JPRED)
CCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHH (PSI-PRED)
HHHHHHCCCHHHEEEEEEECCCCCHHHHHHCCCCCCCCCCCCCCCHHHHECCCEEECCCCCCCEEECCCEEHHHHHHHHHHHHCCCC (DSSP)
HHHHHHCCCCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCCCCCECCCCCCCCHHHHHHHHHCCCC (JPRED)
HHHHHHCCCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCEECCCHHHHHHHHHHHHHHCCCC (PSI-PRED)
CCCCCCCCCCHHHHHCCCCCCCCEEEECCCCCCCCCCCCEEEECCEEEEEEECCCHHHCCCCCHHHCCCCCCEEEEEEEEEECCCCC (DSSP)
CCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCEEEEECCCEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCEECCCCCC (JPRED)
CCCCCCCCCCHHHHCCCCCCCCCEEEECCCCCCCCCCEEEEEECCCEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCC (PSI-PRED)
CCCCCCCCCmmmCCHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHCECHHHCCCCCCCCCCCCCCCCECmmmm (DSSP)
CCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC (JPRED)
CCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC (PSI-PRED)
Both methods perform well on most of the protein, with an overall accuracy of 74 % for PSI-PRED and 71 % for Jpred3. The accuracy (Q3) of this prediction is thus about 6 (PSI-PRED) to 10 (Jpred3) percentage points lower than the average Q3 scores reported in the original publications. Both methods predict the wrong secondary structure for the region around positions 110-200. DSSP assigns very short helices and beta strands in this region, which are perhaps too short to be predicted properly. It is also remarkable that the confidence scores within this wrongly predicted region are as high as for the rest of the protein sequence.
Program | #TP | #FP | accuracy |
PSI-PRED | 374 | 133 | 0.74 |
Jpred | 359 | 148 | 0.71 |
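The accuracy column is the fraction of residues whose predicted class agrees with the DSSP class. A minimal sketch of this calculation follows; the function name `q3` and the `skip_missing` switch are assumptions for illustration. The counts in the table sum to 507 for both programs, so the original evaluation appears to have counted every position, which corresponds to `skip_missing=False` here.

```python
def q3(predicted, observed, missing="m", skip_missing=True):
    """Fraction of positions where the predicted class equals the observed class.

    If skip_missing is True, positions marked as missing in the observed
    string are left out of the comparison (an assumption of this sketch).
    """
    if len(predicted) != len(observed):
        raise ValueError("strings must have equal length")
    pairs = [
        (p, o) for p, o in zip(predicted, observed)
        if not (skip_missing and o == missing)
    ]
    if not pairs:
        raise ValueError("no residues to compare")
    return sum(p == o for p, o in pairs) / len(pairs)


# Toy example: 6 matches out of 7 comparable positions.
print(q3("HHHCCEEE", "HHHHCEEm"))

# Arithmetic check of the table: accuracy = TP / (TP + FP).
for name, tp, fp in [("PSI-PRED", 374, 133), ("Jpred", 359, 148)]:
    print(name, round(tp / (tp + fp), 2))
```

The last loop reproduces the values 0.74 and 0.71 given in the table.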
Prediction of Disordered Regions
Three different servers were used to predict disordered regions in ARSA, but no region was predicted consistently by all three methods.
DISOPRED
DISOPRED predictions for a false positive rate threshold of: 2%

conf: 930000000000012210000000000000000000000000000000000000000000
pred: *...........................................................
  AA: MGAPRSLLLALAAGLAVARPPNIVLIFADDLGYGDLGCYGHPSSTTPNLDQLAAGGLRFT
              10        20        30        40        50        60

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: DFYVPVSLCTPSRAALLTGRLPVRMGMYPGVLVPSSRGGLPLEEVTVAEVLAARGYLTGM
              70        80        90       100       110       120

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: AGKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIP
             130       140       150       160       170       180

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: LLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAE
             190       200       210       220       230       240

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: RSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRC
             250       260       270       280       290       300

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: GKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAALAGAPLPNVTLDGFDLSP
             310       320       330       340       350       360

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: LLLGTGKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTADPACHASSSL
             370       380       390       400       410       420

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: TAHEPPLLYDLSKDPGENYNLLGGVAGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARG
             430       440       450       460       470       480

conf: 000000000000000002571699999
pred: ......................*****
  AA: EDPALQICCHPGCTPRPACCHCPDPHA
             490       500

Asterisks (*) represent disorder predictions and dots (.) prediction of order. The confidence estimates give a rough indication of the probability that each residue is disordered.
As can be seen, only the first residue and the last five residues are predicted to be disordered. The confidence values clearly favour order for almost the whole sequence: apart from a few low values near the N-terminus, there is uncertainty only for the last ten residues.
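Reading the disordered stretches off the "pred" lines can be automated. The following is a minimal sketch, assuming the "pred" string of each output block is available together with the residue number of its first position; the function name `disordered_regions` is hypothetical.

```python
def disordered_regions(pred, start=1, marker="*"):
    """Return (first, last) residue numbers for every run of `marker` in `pred`."""
    regions = []
    run_start = None
    for offset, symbol in enumerate(pred):
        position = start + offset
        if symbol == marker and run_start is None:
            run_start = position                      # a new run begins
        elif symbol != marker and run_start is not None:
            regions.append((run_start, position - 1))  # the run ended one residue earlier
            run_start = None
    if run_start is not None:                          # run reaches the end of the string
        regions.append((run_start, start + len(pred) - 1))
    return regions


# First and last block of the DISOPRED output; the blocks in between contain only dots.
first_block = "*" + "." * 59
last_block = "." * 22 + "*" * 5
print(disordered_regions(first_block, start=1))    # [(1, 1)]
print(disordered_regions(last_block, start=481))   # [(503, 507)]
```

The same function could be applied to thresholded POODLE or IUPred scores once they are converted to a string of markers, which would make the three methods directly comparable.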
POODLE
POODLE predicts many disordered residues. Depending on the threshold, one can identify 6 or more disordered regions.
IUPred
All three prediction options were tried and are illustrated below. In general, IUPred did not predict any disordered region with a "Disorder tendency" above 0.6, except one very short region around residue 415 with the "long disorder" option.
long disorder
The main profile of our server is to predict context-independent global disorder that encompasses at least 30 consecutive residues of predicted disorder. For this application the sequential neighbourhood of 100 residues is considered. <ref name="IUPred"> http://iupred.enzim.hu/Help.html</ref>
short disorder
It uses a parameter set suited for predicting short, probably context-dependent, disordered regions, such as missing residues in the X-ray structure of an otherwise globular protein. For this application the sequential neighbourhood of 25 residues is considered. As chain termini of globular proteins are often disordered in X-ray structures, this is taken into account by an end-adjustment parameter which favors disorder prediction at the ends. <ref name="IUPred"> http://iupred.enzim.hu/Help.html</ref>
structured domains
The dependable identification of ordered regions is a crucial step in target selection for structural studies and structural genomics projects. Finding putative structured domains suitable for structure determination is another potential application of this server. In this case the algorithm takes the energy profile and finds continuous regions confidently predicted ordered. Neighbouring regions close to each other are merged, while regions shorter than the minimal domain size of at least 30 residues are ignored. When this prediction type is selected, the region(s) predicted to correspond to structured/globular domains are returned. <ref name="IUPred"> http://iupred.enzim.hu/Help.html</ref>
Meta-Disorder
PredictProtein requires registration. Our registration attempt failed with the error "username does not exist!", so no Meta-Disorder prediction could be obtained.
Prediction of transmembrane alpha-helices and signal peptides
The prediction of membrane proteins and their topology is very important because the experimental structure determination of these proteins is quite challenging. It is very difficult to determine the structure because membrane-mimetic environments can induce non-native structures and thus lead to a wrong structural model of the protein. <ref>Cross, Timothy, Mukesh Sharma, Myunggi Yi, Huan-Xiang Zhou (2010). "Influence of Solubilizing Environments on Membrane Protein Structures"</ref>
SignalP
SignalP uses a neural network and a HMM to calculate three different scores <ref name="signalp">Bendtsen, J. D. and Nielsen, H. and von Heijne, G. and Brunak, S.. "[Improved prediction of signal peptides: SignalP 3.0.]". J Mol Biol, 2004</ref>:
- S-score (= signal peptide score): High values indicate the presence of a signal peptide in the sequence.
- C-Score (=raw cleavage site score): This score is used to recognize the cleavage site.
- Y-score (= combined cleavage site score): This score optimizes the prediction of the cleavage site by considering the C-score and the S-score simultaneously. A cleavage site is predicted where the C-score is high and the S-score drops steeply from high to low values.
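The idea behind combining the two scores can be illustrated with a small sketch. It assumes that the combined score is the geometric mean of the C-score and the local drop of the S-score, averaged over a window before and after each position; the function name `y_scores`, the window size and the clipping of negative values are assumptions made for illustration and need not match the SignalP implementation.

```python
import math


def y_scores(c_scores, s_scores, window=3):
    """Combine per-residue C-scores with the local drop of the S-score.

    Illustrative assumption: Y_i = sqrt(C_i * drop_i), where drop_i is
    mean(S[i-window:i]) - mean(S[i:i+window]); negative values are clipped to 0.
    """
    if len(c_scores) != len(s_scores):
        raise ValueError("score lists must have equal length")
    result = []
    for i, c in enumerate(c_scores):
        before = s_scores[max(0, i - window):i]
        after = s_scores[i:i + window]
        if not before or not after:
            result.append(0.0)   # no slope can be computed at the very start
            continue
        drop = sum(before) / len(before) - sum(after) / len(after)
        result.append(math.sqrt(max(c, 0.0) * max(drop, 0.0)))
    return result


# Toy data: two equally high C-score peaks (index 2 and 4),
# but the S-score only falls from 0.9 to 0.1 at index 4.
s = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
c = [0.1, 0.1, 0.8, 0.1, 0.8, 0.1, 0.1, 0.1]
ys = y_scores(c, s)
print(ys.index(max(ys)))  # 4
```

In the toy data the raw C-score cannot decide between the two peaks, whereas the combined score selects the peak that coincides with the drop of the S-score.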
We executed SignalP with the following commands:
sudo /apps/signalp-3.0/signalp -t gram- ../BACR.fasta > BACR.signalp
sudo /apps/signalp-3.0/signalp -t euk ../ARSA.fasta > ARSA.signalp
sudo /apps/signalp-3.0/signalp -t euk ../A4.fasta > A4.signalp
sudo /apps/signalp-3.0/signalp -t euk ../LAMP1.fasta > LAMP1.signalp
sudo /apps/signalp-3.0/signalp -t euk ../INSL5.fasta > INSL5.signalp
sudo /apps/signalp-3.0/signalp -t euk ../RET4.fasta > RET4.signalp
The graphical output of the method is shown below:
ARS A | A4 | RET4 | INSL5 | LAMP1 | BACR |
Discussion
The cleavage sites and signal peptides of all proteins are predicted correctly compared to the UniProtKB annotation. In general, the output of the neural network gives a more distinct prediction of the different regions. According to UniProtKB, the bacterial membrane protein BACR does not contain a signal peptide, and SignalP does not predict one, but the S-score is very high between positions 20 and 40, which is a transmembrane helix. This is due to the similar properties of signal peptides and transmembrane helices, which both exhibit a bias towards hydrophobic amino acids. Because the characteristics of a cleavage site are missing, however, SignalP does not predict a signal peptide here. This shows that the program is able to distinguish properly between transmembrane helices and signal peptides.
TMHMM
TMHMM predicts transmembrane helices (TMH) using a Hidden Markov Model (HMM). The protein model used by TMHMM essentially consists of seven different states. Globular domains can occur on the cytoplasmic and the non-cytoplasmic side. On the cytoplasmic side, globular domains are linked to loops, which are in turn linked to cytoplasmic caps. These caps are followed by the helix core, and there is another cap on the non-cytoplasmic side. The non-cytoplasmic caps are linked to globular domains by either short or long non-cytoplasmic loops.
TMHMM outputs the most likely structure of the protein according to the above model. It also includes the orientation (cytoplasmic or non-cytoplasmic side) of the N-terminus. The output consists of a plot, which shows the different states along the protein graphically, and some additional statistics <ref> http://www.cbs.dtu.dk/services/TMHMM-2.0/TMHMM2.0.guide.html#output </ref>:
- The number of predicted transmembrane helices.
- The expected number of amino acids in transmembrane helices. If this number is larger than 18 it is very likely to be a transmembrane protein (OR have a signal peptide).
- The expected number of amino acids in transmembrane helices in the first 60 amino acids of the protein. If this number is more than a few, you should be warned that a predicted transmembrane helix in the N-term could be a signal peptide.
- The total probability that the N-term is on the cytoplasmic side of the membrane.
Discussion
- ARS A: All amino acids are predicted to be "outside" the membrane, which is consistent with the UniProtKB annotation, as ARS A is not a membrane protein. The graphical output of TMHMM shows that the probability of a transmembrane helix is elevated at the start of the protein, which is due to the hydrophobicity of the signal peptide.
- A4_HUMAN: This protein contains exactly one transmembrane helix, located at positions 700-723. TMHMM predicts the transmembrane helix at 701-723, which is quite satisfying. The predicted topology is given below:
Description | Position |
outside | 1-700 |
TMhelix | 701-723 |
inside | 724-770 |
- INSL5_HUMAN: This protein does not contain any transmembrane helices and none are predicted by TMHMM.
- LAMP1_HUMAN: TMHMM predicts a possible N-terminal signal sequence and two potential transmembrane helices. LAMP1 indeed contains an N-terminal signal peptide, but the prediction of the first transmembrane helix is false. This false positive prediction overlaps by about 50% with the signal peptide, which is located at positions 1-22. However, the second predicted TM-helix overlaps almost completely with the transmembrane helix annotated in UniProtKB.
Description | Position |
inside | 1-10 |
TMhelix | 11-33 |
outside | 34-383 |
TMhelix | 384-406 |
inside | 407-417 |
- RET4_HUMAN: No TM-helices are predicted, which coincides with the annotation.
- BACR: A possible N-terminal signal sequence is predicted, which is false. BACR contains 7 TM-helices, but TMHMM predicts only 6, which overlap well with the annotation. The prediction misses the last TM-helix in the protein. Despite the missed prediction, the graphical output shows that the probability of a transmembrane helix is quite high in this region.
Description | Position |
outside | 1-22 |
TMhelix | 23-42 |
inside | 43-54 |
TMhelix | 55-77 |
outside | 78-91 |
TMhelix | 92-114 |
inside | 115-120 |
TMhelix | 121-143 |
outside | 144-147 |
TMhelix | 148-170 |
inside | 171-189 |
TMhelix | 190-212 |
outside | 213-262 |
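The comparison between annotation and prediction can be made explicit by counting shared residues per helix. The sketch below uses the annotated BACR helices from the UniprotKB table at the top of this page and the TMHMM helices from the table above. The function names and the minimum overlap of 5 residues for calling a helix "found" are assumptions chosen for illustration.

```python
# Annotated helices A-G of BACR and the six helices predicted by TMHMM.
annotated = [(24, 42), (57, 75), (92, 109), (121, 140), (148, 167), (186, 204), (217, 236)]
predicted = [(23, 42), (55, 77), (92, 114), (121, 143), (148, 170), (190, 212)]


def overlap(a, b):
    """Number of residues shared by two inclusive segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def match_helices(annotated, predicted, min_overlap=5):
    """For each annotated helix, return (helix, best prediction or None, shared residues).

    min_overlap is a hypothetical cut-off, not a value used by TMHMM.
    """
    matches = []
    for helix in annotated:
        best = max(predicted, key=lambda p: overlap(helix, p), default=None)
        shared = overlap(helix, best) if best is not None else 0
        if shared >= min_overlap:
            matches.append((helix, best, shared))
        else:
            matches.append((helix, None, 0))
    return matches


matches = match_helices(annotated, predicted)
for helix, best, shared in matches:
    print(helix, best, shared)
```

With these numbers, helices A to F are each matched with 15 to 20 shared residues, while helix G (217-236) has no counterpart in the prediction.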
Phobius and Polyphobius
Phobius combines SignalP-HMM and TMHMM for the prediction of transmembrane proteins and their topology. The Hidden Markov Models of both programs are simply joined via the last state of SignalP-HMM and the non-cytoplasmic loop state of TMHMM. This is justified because most signal peptides are located at the non-cytoplasmic side of the membrane, but it also limits the detection of proteins with the opposite location. <ref name="phobius">Kall, L., Krogh, A. and Sonnhammer, E. L. "A combined transmembrane topology and signal peptide prediction method." J Mol Biol, 2004</ref>
Polyphobius extends the approach of Phobius by incorporating information from homologs using global alignments. <ref name="polyphobius">Kall, L. and Krogh, A. and Sonnhammer, E. L.. "[An HMM posterior decoder for sequence feature prediction that includes homology information.]". Bioinformatics, 2005</ref>
Discussion
- ARS A: The signal peptide predicted by Phobius is too long (1-28), whereas the prediction of Polyphobius is slightly too short (1-16). According to the annotation, ARS A contains a signal peptide at positions 1-18. Neither method predicts a transmembrane helix.
- A4_HUMAN: Both methods correctly predict the location of the signal peptide. The prediction of the transmembrane helix only misses the first amino acid. This prediction is quite good.
- INSL5_HUMAN: Phobius and Polyphobius correctly predict the signal peptide from position 1-22 and no TM-helix.
- LAMP1_HUMAN: Both methods correctly predict the location of the signal peptide. The prediction of the transmembrane helix contains an additional amino acid at the start.
- RET4_HUMAN: Phobius and Polyphobius correctly predict the signal peptide from position 1-18 and no TM-helix.
- BACR: Both methods predict 7 transmembrane helices, which almost perfectly overlap with the annotation.
Phobius and Polyphobius yield very similar results. The only improvement for Polyphobius - which can be seen from our analysed proteins - is in the prediction of the location of the signal peptide for ARS A.
OCTOPUS and SPOCTOPUS
OCTOPUS uses a combination of a Hidden Markov Model and neural network to predict the topology of a transmembrane protein. It uses BLAST to create a sequence profile, which is then used by the neural network to predict the preference of each amino acid to be located within a transmembrane (M), interface (I), close loop (L) or globular loop (G) environment, and on the inside (i) or outside (o) of the membrane. These scores are then passed to the HMM, which predicts the final states.
SPOCTOPUS extends the OCTOPUS algorithm with a preprocessing step. OCTOPUS does not predict signal peptides. N-terminal targeting sequences mainly consist of hydrophobic residues, so their properties strongly resemble those of transmembrane helices. Ignoring signal peptides in the prediction often leads to a false prediction of a transmembrane helix at the N-terminus. Therefore, SPOCTOPUS adds the prediction of signal peptide preference scores within the first 70 amino acids of the protein. The exact location of a potential signal peptide is then predicted by an HMM in OCTOPUS.
Protein | OCTOPUS | SPOCTOPUS |
ARS A | ||
A4 | ||
RET4 | ||
INSL5 | ||
LAMP1 | ||
BACR |
Discussion
The difference between OCTOPUS and SPOCTOPUS can be clearly seen in the predictions. As mentioned above, OCTOPUS does not include the prediction of signal peptides and thus confounds signal peptides with TM-helices.
TargetP
TargetP is used to predict the cellular localization of a protein. It combines the two methods ChloroP and SignalP. The following targeting sequences can be identified:
- chloroplast transit peptide (cTP)
- mitochondrial targeting peptide (mTP)
- secretory pathway signal peptide (SP)
TargetP uses a neural network to calculate a score for each of the above subcellular targets and outputs these scores. It finally predicts the location with the highest score. In our case, all proteins are predicted to be targeted to the secretory pathway (S). Results are shown below. Note that cTP is not included in our predictions, as we only considered non-plant (human and bacterial) proteins. Also note that TargetP is trained on eukaryotic proteins, so the prediction for the bacterial protein "BACR" is not meaningful: the pathways of localization and secretion differ completely between eukaryotes and bacteria (e.g. bacteria do not have an endoplasmic reticulum, Golgi apparatus or lysosomes). Nevertheless, we included it in our analysis to see whether TargetP finds any localization sequence in it or predicts "-" (= no localization signal found).
Protein | mTP | SP | other | prediction |
ARS A | 0.070 | 0.926 | 0.054 | S |
A4_HUMAN | 0.035 | 0.937 | 0.084 | S |
INSL5_HUMAN | 0.074 | 0.899 | 0.037 | S |
LAMP1_HUMAN | 0.043 | 0.953 | 0.017 | S |
RET4_HUMAN | 0.242 | 0.928 | 0.020 | S |
BACR (bacterial) | 0.019 | 0.897 | 0.562 | S |
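The decision rule described above (take the class with the highest score) can be sketched as follows, using the scores from the table. The function name `call_location` and the reported margin over the runner-up are illustrative additions; the margin is used here only as a rough indication of how clear-cut a call is.

```python
# Scores copied from the table above.
scores = {
    "ARS A":       {"mTP": 0.070, "SP": 0.926, "other": 0.054},
    "A4_HUMAN":    {"mTP": 0.035, "SP": 0.937, "other": 0.084},
    "INSL5_HUMAN": {"mTP": 0.074, "SP": 0.899, "other": 0.037},
    "LAMP1_HUMAN": {"mTP": 0.043, "SP": 0.953, "other": 0.017},
    "RET4_HUMAN":  {"mTP": 0.242, "SP": 0.928, "other": 0.020},
    "BACR":        {"mTP": 0.019, "SP": 0.897, "other": 0.562},
}


def call_location(row):
    """Return the highest-scoring class and its margin over the second-best class."""
    ranked = sorted(row.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (_, second_score) = ranked[0], ranked[1]
    return best, round(best_score - second_score, 3)


for protein, row in scores.items():
    print(protein, *call_location(row))
```

All six proteins come out as SP, but the margin for BACR (0.335) is much smaller than for the human proteins (0.686 to 0.910).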
Discussion
All proteins are assigned to the secretory pathway.
- Arylsulfatase A is a lysosomal enzyme. Therefore, the prediction is correct, as lysosomal proteins are guided there by the secretory pathway, via the endoplasmic reticulum and the Golgi apparatus.
- As described above, BACR is a bacterial protein. TargetP assigns this protein to the secretory pathway as well, which makes no sense because bacterial protein secretion differs from eukaryotic secretion. However, the prediction is much less clear-cut in this case than for the other proteins: the "other" class, meaning that no targeting sequence is found in the protein, gets a considerably high score (0.562), so the assignment to S is more questionable here.
Prediction of GO Terms
GOPET
GO-Terms for 6 different proteins were predicted. The results are shown below. Bold entries are GO-Terms which are really connected to the protein. <ref>http://www.ebi.ac.uk/QuickGO/</ref>
A4
GOid | Confidence | GO term |
GO:0004866 | 87% | endopeptidase inhibitor activity |
GO:0004867 | 86% | serine-type endopeptidase inhibitor activity |
GO:0030568 | 83% | plasmin inhibitor activity |
GO:0030304 | 83% | trypsin inhibitor activity |
GO:0030414 | 82% | peptidase inhibitor activity |
GO:0005488 | 79% | binding |
GO:0005515 | 74% | protein binding |
GO:0046872 | 73% | metal ion binding |
GO:0003677 | 71% | DNA binding |
GO:0008201 | 70% | heparin binding |
GO:0008270 | 69% | zinc ion binding |
GO:0005507 | 69% | copper ion binding |
GO:0005506 | 67% | iron ion binding |
ARS A
GOid | Confidence | GO term |
GO:0003824 | 97% | catalytic activity |
GO:0016787 | 96% | hydrolase activity |
GO:0008484 | 95% | sulfuric ester hydrolase activity |
GO:0004065 | 92% | arylsulfatase activity |
GO:0004098 | 89% | cerebroside-sulfatase activity |
GO:0003943 | 83% | N-acetylgalactosamine-4-sulfatase activity |
GO:0004773 | 82% | steryl-sulfatase activity |
GO:0004423 | 82% | iduronate-2-sulfatase activity |
GO:0008449 | 82% | N-acetylglucosamine-6-sulfatase activity |
GO:0047753 | 82% | choline-sulfatase activity |
GO:0018741 | 81% | alkyl sulfatase activity |
GO:0046872 | 63% | metal ion binding |
GO:0016250 | 61% | N-sulfoglucosamine sulfohydrolase activity |
BACR_HALSA
GOid | Confidence | GO term |
GO:0005216 | 77% | ion channel activity |
GO:0008020 | 75% | G-protein coupled photoreceptor activity |
GO:0015078 | 60% | hydrogen ion transmembrane transporter activity |
INSL 5
GOid | Confidence | GO term |
GO:0005179 | 80% | hormone activity |
LAMP 1
GOid | Confidence | GO term |
GO:0004812 | 60% | aminoacyl-tRNA ligase activity |
GO:0005524 | 60% | ATP binding |
RET 4
GOid | Confidence | GO term |
GO:0005488 | 90% | binding |
GO:0005501 | 81% | retinoid binding |
GO:0008289 | 80% | lipid binding |
GO:0019841 | 78% | retinol binding |
GO:0005215 | 78% | transporter activity |
GO:0016918 | 78% | retinal binding |
GO:0005319 | 69% | lipid transporter activity |
GO:0008035 | 60% | high-density lipoprotein particle binding |
Pfam
A4
ARS A
BACR HALSA
INSL 5
LAMP 1
RET 4
ProtFun 2.2
A4
############## ProtFun 2.2 predictions ##############

>sp_P05067_A

# Functional category                 Prob      Odds
  Amino_acid_biosynthesis             0.020    0.921
  Biosynthesis_of_cofactors           0.261    3.623
  Cell_envelope                    => 0.804   13.186
  Cellular_processes                  0.053    0.730
  Central_intermediary_metabolism     0.184    2.920
  Energy_metabolism                   0.023    0.259
  Fatty_acid_metabolism               0.016    1.265
  Purines_and_pyrimidines             0.417    1.716
  Regulatory_functions                0.013    0.084
  Replication_and_transcription       0.029    0.109
  Translation                         0.027    0.613
  Transport_and_binding               0.827    2.016

# Enzyme/nonenzyme                    Prob      Odds
  Enzyme                           => 0.392    1.368
  Nonenzyme                           0.608    0.852

# Enzyme class                        Prob      Odds
  Oxidoreductase (EC 1.-.-.-)         0.024    0.114
  Transferase (EC 2.-.-.-)            0.208    0.603
  Hydrolase (EC 3.-.-.-)              0.190    0.600
  Lyase (EC 4.-.-.-)                  0.020    0.430
  Isomerase (EC 5.-.-.-)              0.010    0.324
  Ligase (EC 6.-.-.-)                 0.048    0.946

# Gene Ontology category              Prob      Odds
  Signal_transducer                   0.126    0.586
  Receptor                            0.036    0.211
  Hormone                             0.001    0.206
  Structural_protein               => 0.034    1.205
  Transporter                         0.024    0.222
  Ion_channel                         0.009    0.162
  Voltage-gated_ion_channel           0.002    0.108
  Cation_channel                      0.010    0.215
  Transcription                       0.043    0.335
  Transcription_regulation            0.018    0.143
  Stress_response                     0.076    0.862
  Immune_response                     0.016    0.183
  Growth_factor                       0.005    0.372
  Metal_ion_transport                 0.009    0.020
//
ARS A
############## ProtFun 2.2 predictions ##############

>sp_P15289_A

# Functional category                 Prob      Odds
  Amino_acid_biosynthesis             0.015    0.669
  Biosynthesis_of_cofactors           0.048    0.668
  Cell_envelope                    => 0.804   13.186
  Cellular_processes                  0.027    0.373
  Central_intermediary_metabolism     0.404    6.416
  Energy_metabolism                   0.050    0.555
  Fatty_acid_metabolism               0.028    2.138
  Purines_and_pyrimidines             0.404    1.662
  Regulatory_functions                0.013    0.081
  Replication_and_transcription       0.021    0.080
  Translation                         0.032    0.717
  Transport_and_binding               0.821    2.002

# Enzyme/nonenzyme                    Prob      Odds
  Enzyme                           => 0.540    1.886
  Nonenzyme                           0.460    0.644

# Enzyme class                        Prob      Odds
  Oxidoreductase (EC 1.-.-.-)         0.063    0.304
  Transferase (EC 2.-.-.-)            0.062    0.180
  Hydrolase (EC 3.-.-.-)              0.313    0.987
  Lyase (EC 4.-.-.-)                  0.038    0.803
  Isomerase (EC 5.-.-.-)              0.010    0.321
  Ligase (EC 6.-.-.-)                 0.017    0.326

# Gene Ontology category              Prob      Odds
  Signal_transducer                   0.206    0.965
  Receptor                            0.111    0.652
  Hormone                             0.002    0.323
  Structural_protein                  0.005    0.177
  Transporter                         0.025    0.229
  Ion_channel                         0.009    0.154
  Voltage-gated_ion_channel           0.003    0.139
  Cation_channel                      0.010    0.215
  Transcription                       0.037    0.287
  Transcription_regulation            0.018    0.142
  Stress_response                  => 0.102    1.158
  Immune_response                     0.022    0.259
  Growth_factor                       0.005    0.391
  Metal_ion_transport                 0.009    0.020
//
BACR HALSA
############## ProtFun 2.2 predictions ##############

>sp_P02945_B

# Functional category                 Prob      Odds
  Amino_acid_biosynthesis             0.033    1.495
  Biosynthesis_of_cofactors           0.186    2.589
  Cell_envelope                       0.029    0.483
  Cellular_processes                  0.051    0.694
  Central_intermediary_metabolism     0.045    0.711
  Energy_metabolism                   0.138    1.537
  Fatty_acid_metabolism               0.016    1.265
  Purines_and_pyrimidines             0.302    1.244
  Regulatory_functions                0.013    0.080
  Replication_and_transcription       0.019    0.073
  Translation                         0.059    1.339
  Transport_and_binding            => 0.791    1.929

# Enzyme/nonenzyme                    Prob      Odds
  Enzyme                              0.199    0.696
  Nonenzyme                        => 0.801    1.122

# Enzyme class                        Prob      Odds
  Oxidoreductase (EC 1.-.-.-)         0.114    0.549
  Transferase (EC 2.-.-.-)            0.031    0.091
  Hydrolase (EC 3.-.-.-)              0.057    0.180
  Lyase (EC 4.-.-.-)                  0.020    0.430
  Isomerase (EC 5.-.-.-)              0.010    0.321
  Ligase (EC 6.-.-.-)                 0.017    0.326

# Gene Ontology category              Prob      Odds
  Signal_transducer                   0.258    1.205
  Receptor                            0.355    2.087
  Hormone                             0.001    0.206
  Structural_protein                  0.006    0.200
  Transporter                      => 0.440    4.036
  Ion_channel                         0.010    0.169
  Voltage-gated_ion_channel           0.004    0.172
  Cation_channel                      0.078    1.689
  Transcription                       0.026    0.205
  Transcription_regulation            0.028    0.226
  Stress_response                     0.012    0.139
  Immune_response                     0.011    0.128
  Growth_factor                       0.010    0.727
  Metal_ion_transport                 0.049    0.106
//
INSL 5
############## ProtFun 2.2 predictions ##############

>sp_Q9Y5Q6_I

# Functional category                 Prob      Odds
  Amino_acid_biosynthesis             0.011    0.484
  Biosynthesis_of_cofactors           0.040    0.558
  Cell_envelope                    => 0.756   12.393
  Cellular_processes                  0.033    0.448
  Central_intermediary_metabolism     0.048    0.755
  Energy_metabolism                   0.036    0.397
  Fatty_acid_metabolism               0.016    1.265
  Purines_and_pyrimidines             0.144    0.592
  Regulatory_functions                0.014    0.087
  Replication_and_transcription       0.020    0.075
  Translation                         0.032    0.735
  Transport_and_binding               0.834    2.033

# Enzyme/nonenzyme                    Prob      Odds
  Enzyme                              0.209    0.729
  Nonenzyme                        => 0.791    1.109

# Enzyme class                        Prob      Odds
  Oxidoreductase (EC 1.-.-.-)         0.056    0.268
  Transferase (EC 2.-.-.-)            0.031    0.091
  Hydrolase (EC 3.-.-.-)              0.062    0.195
  Lyase (EC 4.-.-.-)                  0.020    0.430
  Isomerase (EC 5.-.-.-)              0.010    0.321
  Ligase (EC 6.-.-.-)                 0.017    0.327

# Gene Ontology category              Prob      Odds
  Signal_transducer                   0.374    1.746
  Receptor                            0.128    0.750
  Hormone                          => 0.247   37.936
  Structural_protein                  0.001    0.041
  Transporter                         0.025    0.228
  Ion_channel                         0.010    0.168
  Voltage-gated_ion_channel           0.003    0.131
  Cation_channel                      0.010    0.215
  Transcription                       0.054    0.425
  Transcription_regulation            0.091    0.724
  Stress_response                     0.099    1.128
  Immune_response                     0.178    2.090
  Growth_factor                       0.061    4.379
  Metal_ion_transport                 0.009    0.020
//
LAMP 1
############## ProtFun 2.2 predictions ##############

>sp_P11279_L

# Functional category                 Prob      Odds
  Amino_acid_biosynthesis             0.011    0.484
  Biosynthesis_of_cofactors           0.053    0.735
  Cell_envelope                    => 0.804   13.186
  Cellular_processes                  0.027    0.373
  Central_intermediary_metabolism     0.138    2.188
  Energy_metabolism                   0.037    0.411
  Fatty_acid_metabolism               0.016    1.265
  Purines_and_pyrimidines             0.533    2.195
  Regulatory_functions                0.015    0.090
  Replication_and_transcription       0.019    0.073
  Translation                         0.027    0.613
  Transport_and_binding               0.834    2.033

# Enzyme/nonenzyme                    Prob      Odds
  Enzyme                              0.276    0.965
  Nonenzyme                        => 0.724    1.014

# Enzyme class                        Prob      Odds
  Oxidoreductase (EC 1.-.-.-)         0.039    0.187
  Transferase (EC 2.-.-.-)            0.046    0.134
  Hydrolase (EC 3.-.-.-)              0.058    0.184
  Lyase (EC 4.-.-.-)                  0.020    0.430
  Isomerase (EC 5.-.-.-)              0.010    0.321
  Ligase (EC 6.-.-.-)                 0.017    0.326

# Gene Ontology category              Prob      Odds
  Signal_transducer                   0.396    1.849
  Receptor                            0.282    1.659
  Hormone                             0.001    0.206
  Structural_protein                  0.011    0.408
  Transporter                         0.024    0.222
  Ion_channel                         0.008    0.147
  Voltage-gated_ion_channel           0.002    0.111
  Cation_channel                      0.010    0.215
  Transcription                       0.032    0.247
  Transcription_regulation            0.018    0.142
  Stress_response                     0.246    2.795
  Immune_response                  => 0.371    4.368
  Growth_factor                       0.013    0.956
  Metal_ion_transport                 0.009    0.020
//
RET 4
############## ProtFun 2.2 predictions ##############

>sp_P02753_R

# Functional category                 Prob      Odds
  Amino_acid_biosynthesis             0.017    0.751
  Biosynthesis_of_cofactors           0.044    0.610
  Cell_envelope                    => 0.804   13.186
  Cellular_processes                  0.075    1.021
  Central_intermediary_metabolism     0.197    3.128
  Energy_metabolism                   0.043    0.475
  Fatty_acid_metabolism               0.016    1.265
  Purines_and_pyrimidines             0.275    1.131
  Regulatory_functions                0.013    0.080
  Replication_and_transcription       0.022    0.084
  Translation                         0.032    0.721
  Transport_and_binding               0.800    1.951

# Enzyme/nonenzyme                    Prob      Odds
  Enzyme                           => 0.544    1.900
  Nonenzyme                           0.456    0.639

# Enzyme class                        Prob      Odds
  Oxidoreductase (EC 1.-.-.-)         0.095    0.458
  Transferase (EC 2.-.-.-)            0.038    0.109
  Hydrolase (EC 3.-.-.-)              0.235    0.742
  Lyase (EC 4.-.-.-)               => 0.059    1.264
  Isomerase (EC 5.-.-.-)              0.010    0.321
  Ligase (EC 6.-.-.-)                 0.017    0.326

# Gene Ontology category              Prob      Odds
  Signal_transducer                   0.202    0.942
  Receptor                            0.147    0.862
  Hormone                             0.004    0.667
  Structural_protein                  0.002    0.058
  Transporter                         0.025    0.232
  Ion_channel                         0.016    0.288
  Voltage-gated_ion_channel           0.003    0.148
  Cation_channel                      0.010    0.215
  Transcription                       0.027    0.207
  Transcription_regulation            0.025    0.196
  Stress_response                     0.161    1.829
  Immune_response                  => 0.239    2.813
  Growth_factor                       0.023    1.617
  Metal_ion_transport                 0.009    0.020
//
References
<references />