Difference between revisions of "Task 3: Sequence-based predictions"
Revision as of 22:36, 29 May 2011
Contents
- 1 Task description
- 2 Task 3.1: Secondary structure prediction
- 3 Task 3.2: Prediction of disordered regions
- 4 Task 3.3: Prediction of transmembrane alpha-helices and signal peptides
- 5 Task 3.4: Prediction of GO terms
Task description
The full description of this task can be found here.
Task 3.1: Secondary structure prediction
Task 3.2: Prediction of disordered regions
Task 3.3: Prediction of transmembrane alpha-helices and signal peptides
Annotated sequence features
PAH
The phenylalanine-4-hydroxylase has no annotated signal peptide or transmembrane helices.
BACR_HALSA
The bacteriorhodopsin has the following annotated propeptide and transmembrane helices (no signal peptide is annotated):
Position | Feature Name | Description |
---|---|---|
1 - 13 | Propeptide | |
14 - 23 | Topological domain | Extracellular |
24 - 42 | Transmembrane | Helical; Name=Helix A |
43 - 56 | Topological domain | Cytoplasmic |
57 - 75 | Transmembrane | Helical; Name=Helix B |
76 - 91 | Topological domain | Extracellular |
92 - 109 | Transmembrane | Helical; Name=Helix C |
110 - 120 | Topological domain | Cytoplasmic |
121 - 140 | Transmembrane | Helical; Name=Helix D |
141 - 147 | Topological domain | Extracellular |
148 - 167 | Transmembrane | Helical; Name=Helix E |
168 - 185 | Topological domain | Cytoplasmic |
186 - 204 | Transmembrane | Helical; Name=Helix F |
205 - 216 | Topological domain | Extracellular |
217 - 236 | Transmembrane | Helical; Name=Helix G |
237 - 262 | Topological domain | Cytoplasmic |
RET4_HUMAN
The retinol-binding protein 4 has the following annotated signal peptide (no transmembrane helices are annotated):
Position | Feature Name | Description |
---|---|---|
1 - 18 | Signal peptide | |
INSL5_HUMAN
The Insulin-like peptide INSL5 has the following annotated signal peptide (no transmembrane helices are annotated):
Position | Feature Name | Description |
---|---|---|
1 - 22 | Signal peptide | |
LAMP1_HUMAN
The lysosome-associated membrane glycoprotein 1 has the following annotated signal peptide and transmembrane helices:
Position | Feature Name | Description |
---|---|---|
1 - 28 | Signal peptide | |
29 - 382 | Topological domain | Lumenal |
383 - 405 | Transmembrane | Helical |
406 - 417 | Topological domain | Cytoplasmic |
A4_HUMAN
The Amyloid beta A4 protein has the following annotated signal peptide and transmembrane helices:
Position | Feature Name | Description |
---|---|---|
1 - 17 | Signal peptide | |
18 - 699 | Topological domain | Extracellular |
700 - 723 | Transmembrane | Helical |
724 - 770 | Topological domain | Cytoplasmic |
General Questions to prediction of transmembrane alpha-helices and signal peptides
Why is the prediction of transmembrane helices and signal peptides grouped together here?
Methods that predict only transmembrane helices often misclassify signal peptides as transmembrane helices, because both consist mainly of hydrophobic residues. Such false predictions lead to an incorrect topology and can therefore lead to a wrongly annotated protein function. To avoid this, most recent methods couple transmembrane helix prediction with signal peptide prediction.
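The confusion described above can be illustrated with a hydrophobicity scan. The following is a minimal sketch, not any of the predictors discussed on this page: it uses the Kyte-Doolittle hydropathy scale with the commonly used window of 19 residues and threshold of 1.6, and the two sequences are invented toy examples, not real proteins. A scan of this kind flags the hydrophobic stretch in both sequences and cannot tell whether it is an N-terminal signal peptide or an internal transmembrane helix.

```python
# Kyte-Doolittle hydropathy scale (higher value = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, width=19):
    """Mean hydropathy of every full window of `width` residues."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + width]) / width
            for i in range(len(scores) - width + 1)]


def hydrophobic_windows(seq, width=19, threshold=1.6):
    """1-based start positions of windows whose mean exceeds the threshold."""
    return [start + 1
            for start, mean in enumerate(window_hydropathy(seq, width))
            if mean > threshold]


# Toy sequences, invented for illustration only (assumption, not real data).
HYDROPHOBIC_STRETCH = "LLALLLAVLLAALLAGAAA"
signal_like = "MKR" + HYDROPHOBIC_STRETCH + "DEKSTNQDEKSTNQDEKSTNQ"
helix_like = "DEKSTNQDEKSTNQDEKSTNQ" + HYDROPHOBIC_STRETCH + "RKDEKSTNQ"

# Both sequences contain windows above the threshold.
print(hydrophobic_windows(signal_like))
print(hydrophobic_windows(helix_like))
```

Only the position of the stretch differs between the two toy sequences, which is the kind of context that combined predictors exploit.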
Description of different signal peptides
Signal peptides for import into the endoplasmic reticulum (ER)
Import into the ER is usually required for the secretory pathway (the export of proteins out of the cell). The import can occur either co-translationally (the nascent protein chain is translocated while still attached to the ribosome) or post-translationally (only the fully synthesized protein is transported to the ER). In both cases the SEC pathway is mostly used.
Co-translational transport to the ER is mediated by the signal recognition particle (SRP). The SRP recognizes the N-terminal signal sequence of the nascent polypeptide chain and directs it to the ER membrane, where the complex of SRP, polypeptide chain and ribosome is recognized by the membrane-bound signal recognition particle receptor (SR). After this recognition the polypeptide chain is imported into the ER lumen via the SEC channel in an ATP-dependent process.
Post-translational import into the ER lumen is mediated by chaperones, which guide the polypeptide chain to the SEC channel, through which it is imported in an ATP-dependent process.
Proteins can be imported not only into the ER lumen but also into the ER membrane. So far, five types of import into the ER membrane are known.
Type 1 requires an N-terminal signal sequence and an internal stop-transfer anchor sequence, which becomes the segment inserted into the membrane.
Types 2 and 3 do not require an N-terminal signal sequence; only an internal signal-anchor sequence is required. They differ in the position of the positively charged residues: in type 2 these lie before the signal-anchor sequence (on its N-terminal side), in type 3 after it (on its C-terminal side). The charged residues of a transmembrane protein always remain in the cytosol. Thus, type 2 proteins have their N-terminus in the cytosol, whereas type 3 proteins have their C-terminus in the cytosol.
Type 4-A and 4-B insertion is also known as multipass membrane insertion. Unlike proteins imported via types 1, 2 and 3, which have a single transmembrane helix, these proteins have several. They therefore contain multiple internal stop-transfer anchor sequences and internal signal-anchor sequences. In type 4-A both the N- and the C-terminus are located in the cytosol, whereas type 4-B import results in an N-terminus residing in the ER lumen and a C-terminus residing in the cytosol.
In addition to the N-terminal import of transmembrane proteins, C-terminal import is also possible. These proteins are necessarily imported post-translationally, because a C-terminal targeting signal only becomes available once translation is complete.
Signal peptides for import into the mitochondrion
There are several destinations within the mitochondrion: proteins can be translocated to the matrix, the outer membrane, the inner membrane or the intermembrane space.
Proteins destined for the mitochondrial matrix have an N-terminal matrix-targeting sequence. Import into the matrix is assisted by chaperones (Hsc70), which guide the protein to the import pore complex of the mitochondrion. Translocation across the outer membrane is carried out by the TOM complex, and the subsequent translocation across the inner membrane by the TIM complex. After successful import into the matrix, the signal sequence is cleaved off by proteolytic enzymes.
Import into the inner membrane can occur in three ways. The first is the TIM22 pathway; proteins using this pathway need internal targeting sequences. The second is stop-transfer import, for which proteins need a stop-transfer sequence and an N-terminal matrix-targeting sequence. The third is called conservative sorting; proteins using this pathway also have an N-terminal targeting sequence and, in addition, internal Oxa1-targeting sequences, which are recognized by the Oxa1 proteins that insert them into the membrane.
Proteins imported to the outer membrane of a mitochondrion usually have PORTA domains which are recognized by the TOB/SAM complex.
Signal peptides for import into the chloroplast
Proteins destined for the chloroplast can target different parts of it, for example the stroma, the inner and outer membrane, the thylakoid membrane or the thylakoid lumen.
Usually these proteins have an N-terminal targeting sequence.
Signal peptides for import into the peroxisome
Peroxisomal proteins can be imported into the lumen or into the membrane. Proteins imported into the lumen have either a peroxisomal targeting signal at the C-terminus (PTS1) or a targeting sequence close to the N-terminus (PTS2). Proteins imported into the membrane can have an internal membrane peroxisomal targeting signal (mPTS). However, not all membrane proteins have an mPTS; those that lack it are imported into the ER, from where they bud off together with the mature peroxisome.
Signal peptides for import into the nucleus and export from the nucleus
Proteins that are imported into the nucleus require a nuclear localisation signal (NLS), which is recognized by importin. The NLS-containing protein is then imported into the nucleoplasm via the nuclear pore complex (NPC).
Proteins that are exported from the nucleus require a nuclear export signal (NES), which is recognized and bound by exportin. In addition to exportin, a second component, Ran·GTP, is required to mediate the export through the NPC.
TMHMM
Details of the method
Author: Sonnhammer, von Heijne & Krogh
Year: 1998
Reference: PubMed
Description
This method is based on a hidden Markov model (HMM). The authors modeled the 'grammar' of transmembrane proteins in order to predict their topology more accurately than methods that rely, for example, only on propensity values and do not consider the topological constraints of this class of proteins.
For each feature, TMHMM defines one or more HMM states that represent it. The transmembrane helix, for example, is modeled by three submodels: the helix core and a cap at either end of the helix, each lying partly in the membrane and partly in the cytoplasmic or non-cytoplasmic region. In addition to this helix model there are submodels for the cytoplasmic loop, the non-cytoplasmic loop and the globular region. Each submodel corresponds to one or more HMM states: the globular submodel consists of a single state, whereas the helix core and the caps are modeled by multiple states.
The 'grammar' is built into the HMM by defining which transitions from one submodel to another are possible. For example, a cytoplasmic loop can only be followed by a cytoplasmic cap and then the helix core, after which, via the non-cytoplasmic cap, either a short or a long non-cytoplasmic loop follows, and so on.
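The effect of such a grammar can be sketched at the level of per-residue labels. The code below is a simplified illustration and not the TMHMM state machine: it assumes only three labels (`i` for inside loop, `M` for membrane helix, `o` for outside loop) and checks the two constraints that follow from the grammar, namely that two loops are always separated by a helix and that every helix connects opposite sides of the membrane. The function names are hypothetical.

```python
from itertools import groupby


def region_order(labels):
    """Collapse per-residue labels such as 'iiMMMoo' into ['i', 'M', 'o']."""
    return [key for key, _ in groupby(labels)]


def obeys_grammar(labels):
    """Return True if the label string is a topologically possible path."""
    if set(labels) - set("iMo"):
        raise ValueError("labels may only contain 'i', 'M' and 'o'")
    regions = region_order(labels)
    # An inside loop can never be directly followed by an outside loop
    # (or vice versa): a helix has to lie in between.
    for prev, cur in zip(regions, regions[1:]):
        if "M" not in (prev, cur):
            return False
    # A helix flanked by loops on both ends must cross the membrane,
    # so the two flanking loops have to be on different sides.
    for k in range(1, len(regions) - 1):
        if regions[k] == "M" and regions[k - 1] == regions[k + 1]:
            return False
    return True


print(obeys_grammar("iiiMMMMoooMMMMiii"))  # two helices, sides alternate
print(obeys_grammar("iiiMMMMiii"))         # helix returns to the same side
```

An HMM enforces the same constraints implicitly: a path that violates them contains a transition with probability zero and can therefore never be the predicted one.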
Predicted features
This method predicts the transmembrane helices and whether each remaining segment lies in the cytoplasm (inside) or outside of it (outside).
Required information for the prediction
Users only need the amino acid sequence of their query protein. The transition and emission probabilities are derived from 160 transmembrane protein sequences.
Execution
Before we could execute TMHMM we had to change all occurrences of "/usr/local/bin/" to "/usr/bin" in these files: tmhmm, tmhmm.ORIG and tmhmmformat.pl
Then we executed the following command to retrieve the results for all sequences:
- tmhmm all.fa > task_33/tmhmm_out.txt
Results and discussion
PAH
Position | Feature Name |
---|---|
1 - 452 | outside |
BACR_HALSA
Position | Feature Name |
---|---|
1 - 22 | outside |
23 - 42 | TMhelix |
43 - 54 | inside |
55 - 77 | TMhelix |
78 - 91 | outside |
92 - 114 | TMhelix |
115 - 120 | inside |
121 - 143 | TMhelix |
144 - 147 | outside |
148 - 170 | TMhelix |
171 - 189 | inside |
190 - 212 | TMhelix |
213 - 262 | outside |
RET4_HUMAN
Position | Feature Name |
---|---|
1 - 201 | outside |
INSL5_HUMAN
Position | Feature Name |
---|---|
1 - 135 | outside |
LAMP1_HUMAN
Position | Feature Name |
---|---|
1 - 10 | inside |
11 - 33 | TMhelix |
34 - 383 | outside |
384 - 406 | TMhelix |
407 - 417 | inside |
A4_HUMAN
Position | Feature Name |
---|---|
1 - 700 | outside |
701 - 723 | TMhelix |
724 - 770 | inside |
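The TMHMM segments above can be compared with the annotated features by checking which helices overlap. The following sketch uses only the positions listed in the tables on this page; the helper names `overlap` and `unmatched` are hypothetical, and counting any overlap of at least one residue as a match is an assumption made for illustration.

```python
# Annotated transmembrane helices (from the annotation tables on this page).
ANNOTATED_TM = {
    "BACR_HALSA": [(24, 42), (57, 75), (92, 109), (121, 140),
                   (148, 167), (186, 204), (217, 236)],
    "LAMP1_HUMAN": [(383, 405)],
    "A4_HUMAN": [(700, 723)],
}

# Helices predicted by TMHMM (from the result tables on this page).
PREDICTED_TM = {
    "BACR_HALSA": [(23, 42), (55, 77), (92, 114), (121, 143),
                   (148, 170), (190, 212)],
    "LAMP1_HUMAN": [(11, 33), (384, 406)],
    "A4_HUMAN": [(701, 723)],
}

# Annotated signal peptides of the two membrane proteins that have one.
ANNOTATED_SIGNAL = {"LAMP1_HUMAN": (1, 28), "A4_HUMAN": (1, 17)}


def overlap(a, b):
    """Number of residues shared by two inclusive (start, end) segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def unmatched(reference, candidates):
    """Segments in `reference` that overlap none of the `candidates`."""
    return [seg for seg in reference
            if all(overlap(seg, other) == 0 for other in candidates)]


for name in ANNOTATED_TM:
    missed = unmatched(ANNOTATED_TM[name], PREDICTED_TM[name])
    extra = unmatched(PREDICTED_TM[name], ANNOTATED_TM[name])
    print(name, "missed:", missed, "extra:", extra)

# BACR_HALSA: missed [(217, 236)], i.e. helix G is not predicted.
# LAMP1_HUMAN: extra [(11, 33)], which shares 18 residues with the
# annotated signal peptide (1 - 28).
print(overlap((11, 33), ANNOTATED_SIGNAL["LAMP1_HUMAN"]))  # 18
```

Under these assumptions, TMHMM misses helix G of bacteriorhodopsin and predicts the signal peptide of LAMP1_HUMAN as an additional transmembrane helix, while the signal peptide of A4_HUMAN is not mispredicted.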
Phobius
Details of the method
Author: Käll, Krogh, Sonnhammer
Year: 2004
Reference: PubMed
Description
Phobius is an HMM-based method that predicts transmembrane helices as well as N-terminal signal peptides. More precisely, it merges the HMMs of TMHMM and SignalP into a single HMM. This was done to overcome a common problem of transmembrane helix prediction: signal peptides are often wrongly predicted as transmembrane helices. The complete architecture can be seen in the figure.
Predicted features
Phobius predicts transmembrane helices, signal peptides and the topology of the loops (whether they are inside the cytoplasm or not).
Required information for the prediction
Users only have to enter the amino acid sequence of their query protein in FASTA format.
Execution
Results and discussion
PAH
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
PolyPhobius
Details of the method
Author: Käll L, Krogh A, Sonnhammer EL
Year: 2005
Reference: PubMed
Description
PolyPhobius is also based on an HMM that constrains the possible transitions from one state to another in order to reflect the 'grammar' of transmembrane proteins. Unlike the ordinary Phobius, however, it also uses homologous sequences of the query sequence to make the prediction more accurate.
To do so, it calculates the posterior label probability (PLP) for each sequence position, each label (e.g. transmembrane helix, in, out) and each homologous sequence. The PLP is defined as "the probability of a label at a certain position in the sequence, given the sequence and the model" (quoted from "Käll L, Krogh A, Sonnhammer EL. An HMM posterior decoder for sequence feature prediction that includes homology information Bioinformatics. 2005 Jun;21 Suppl 1:i251-7."). Then a multiple sequence alignment (MSA) of all homologous sequences is built, and an average PLP is calculated for each position in the MSA. These average PLPs are then used by the optimal accuracy algorithm to predict the most likely sequence of states for the query sequence and thus the topology of the transmembrane helices.
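The averaging step can be sketched as follows. This is a minimal illustration under stated assumptions, not the PolyPhobius implementation: the mean is unweighted, sequences with a gap in a column are simply skipped for that column, and the function name `column_average_plp` as well as the toy alignment and probabilities are invented.

```python
def column_average_plp(aligned, plps, labels):
    """Average the per-residue PLPs of aligned sequences column by column.

    aligned: gapped sequences of equal length ('-' marks a gap)
    plps:    for each sequence, one dict {label: probability} per
             ungapped residue, in sequence order
    labels:  the labels to average, e.g. ("M", "i", "o")
    """
    cursors = [0] * len(aligned)  # next ungapped residue of each sequence
    averaged = []
    for col in range(len(aligned[0])):
        totals = dict.fromkeys(labels, 0.0)
        count = 0
        for s, row in enumerate(aligned):
            if row[col] == "-":
                continue  # assumption: gaps do not contribute
            residue_plp = plps[s][cursors[s]]
            cursors[s] += 1
            for label in labels:
                totals[label] += residue_plp[label]
            count += 1
        averaged.append(
            {label: totals[label] / count for label in labels}
            if count else None
        )
    return averaged


# Toy input, invented for illustration.
aligned = ["AC-", "A-G"]
plps = [
    [{"M": 1.0, "o": 0.0}, {"M": 0.5, "o": 0.5}],  # residues A, C
    [{"M": 0.0, "o": 1.0}, {"M": 0.0, "o": 1.0}],  # residues A, G
]
print(column_average_plp(aligned, plps, ("M", "o")))
```

The averaged values of the columns occupied by the query sequence would then be passed on to the decoder.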
Predicted features
This method predicts the same features as the ordinary Phobius, which means transmembrane helices, the signal peptide and whether the connecting loops of transmembrane helices are inside or outside.
Required information for the prediction
Users need the amino acid sequence of their protein in FASTA format. Optionally, the homologous sequences can be specified manually. Otherwise PolyPhobius searches for homologous sequences itself using BLAST.
Execution
Results and discussion
PAH
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
OCTOPUS
Details of the method
Author: Viklund H, Elofsson A.
Year: 2008
Reference: Bioinformatics
Description

OCTOPUS combines two techniques to predict the topology of transmembrane proteins: artificial neural networks (ANNs) and a hidden Markov model (HMM). In a first step, BLAST searches for homologous sequences of the input FASTA sequence. A multiple sequence alignment is built from the homologous sequences found, and a raw sequence profile and a PSSM-based sequence profile are extracted from it. These profiles are used as input for two sets of ANNs.
The first set contains four separate ANNs, which predict the residue preference for M (membrane), I (interface), L (loop) and G (globular). To smooth the predictions for G and M, the output of this first layer of ANNs is used as input for a second ANN. The second set of ANNs predicts the preference of each residue for the inside or the outside.
Finally, the outputs of these two sets of ANNs are used to parameterize the OCTOPUS HMM for the actual topology prediction. This HMM is needed to model the 'grammar' of transmembrane proteins, which simply means that only certain state transitions are allowed. For example, from the transmembrane state the only allowed transition is into the loop state, and so on.
The state sequence that best fits the input sequence is then calculated with the Viterbi algorithm.
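The decoding step can be illustrated with a generic Viterbi implementation in log space. This is a sketch of the algorithm only, not the OCTOPUS HMM: the two states (`L` for loop, `M` for membrane), the two observation symbols (`h` for hydrophobic, `p` for polar) and all probabilities are invented toy values. Forbidden transitions of a 'grammar' would be represented by a probability of zero.

```python
import math


def logs(probabilities):
    """Convert a dict of probabilities to log space (0 becomes -inf)."""
    return {key: math.log(p) if p > 0 else float("-inf")
            for key, p in probabilities.items()}


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Most probable state path for a non-empty observation sequence."""
    # best[t][s]: log-probability of the best path ending in state s at t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy model, invented for illustration.
states = ("L", "M")
log_start = logs({"L": 0.5, "M": 0.5})
log_trans = {"L": logs({"L": 0.8, "M": 0.2}),
             "M": logs({"L": 0.2, "M": 0.8})}
log_emit = {"L": logs({"h": 0.1, "p": 0.9}),
            "M": logs({"h": 0.9, "p": 0.1})}

print("".join(viterbi("pphhhhpp", states, log_start, log_trans, log_emit)))
# LLMMMMLL
```

Working with log-probabilities avoids numerical underflow on long sequences, and a transition with probability zero becomes `-inf`, so paths that use it are never selected.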
Predicted features
Predicted features are inside/outside (i/o), transmembrane (M), TM hairpin (H), reentrant (R) or membrane dip (D)
Required information for the prediction
Only the amino acid sequence of the user's protein is required.
Execution
Results and discussion
PAH
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
SPOCTOPUS
Details of the method
Author: Viklund H, Bernsel A, Skwark M, Elofsson A.
Year: 2008
Reference: Bioinformatics
Description
SPOCTOPUS works the same way as OCTOPUS does. The only difference is that it includes a signal peptide prediction.
Predicted features
Predicted features are signal peptide, inside/outside (i/o), transmembrane (M), TM hairpin (H), reentrant (R) or membrane dip (D)
Required information for the prediction
Only the amino acid sequence of the query protein is required as input.
Execution
Results and discussion
PAH
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
SignalP
Details of the method
Author: Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and Gunnar von Heijne.
Year: 1997
Reference: PubMed
Description
This predictor combines two methods: the first is a neural network, the second a hidden Markov model.
There are two neural networks: one predicts whether the first n amino acids belong to a signal peptide, and the other predicts the exact position of the cleavage site.
In a later version of SignalP, a hidden Markov model (HMM) was added to predict signal peptides. Its prediction is completely independent of the neural network prediction. This HMM models the N-terminal region of a signal peptide as well as the region surrounding the cleavage site.
Predicted features
Predicts the presence of signal peptidase I cleavage sites and whether the first n residues belong to a signal peptide.
Required information for the prediction
The amino acid sequence of the protein and whether this protein is from a eukaryote, gram-negative bacteria or gram-positive bacteria.
Execution
Before we could execute SignalP on our virtual machine we had to change the path of the signalp file to /apps/signalp-3.0
Then we executed for each protein the following commands:
- signalp -format short -t euk PAH.fa > task_33/signalp_pah_out
- signalp -format short -t euk A4_HUMAN.fa > task_33/signalp_a4_human_out
- signalp -format short -t gram- BACR_HALSA.fa > task_33/signalp_bacr_halsa_out
- signalp -format short -t euk LAMP1_HUMAN.fa > task_33/signalp_lamp1_human_out
- signalp -format short -t euk RET4_HUMAN.fa > task_33/signalp_ret4_human_out
- signalp -format short -t euk INSL5_HUMAN.fa > task_33/signalp_insl5_human_out
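The six calls above differ only in the protein name and the organism type, so they can also be generated in a loop. The following sketch assumes that `signalp` is on the `PATH` and that one FASTA file per protein exists in the working directory, as in the commands above; the function names are hypothetical, and the flags are copied unchanged from those commands.

```python
import subprocess
from pathlib import Path

# Organism type passed to -t for each protein (taken from the commands above).
ORGANISM_TYPE = {
    "PAH": "euk",
    "A4_HUMAN": "euk",
    "BACR_HALSA": "gram-",
    "LAMP1_HUMAN": "euk",
    "RET4_HUMAN": "euk",
    "INSL5_HUMAN": "euk",
}


def signalp_command(protein):
    """Build the argument list for one SignalP call."""
    return ["signalp", "-format", "short",
            "-t", ORGANISM_TYPE[protein], f"{protein}.fa"]


def output_path(protein, out_dir="task_33"):
    """Output file name following the pattern used above."""
    return Path(out_dir) / f"signalp_{protein.lower()}_out"


def run_all(out_dir="task_33"):
    """Run SignalP for every protein and store its standard output."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for protein in ORGANISM_TYPE:
        with open(output_path(protein, out_dir), "w") as handle:
            subprocess.run(signalp_command(protein), stdout=handle, check=True)


print(" ".join(signalp_command("BACR_HALSA")))
```

Calling `run_all()` then produces the same six output files as the individual commands.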
Results and discussion
PAH
Both SignalP methods (HMM and NN) predicted that there is no cleavage site for a signal peptide.
BACR_HALSA
Both SignalP methods (HMM and NN) predicted that there is no cleavage site for a signal peptide.
RET4_HUMAN
INSL5_HUMAN
Both SignalP methods (HMM and NN) predicted the cleavage site at position 23.
LAMP1_HUMAN
Both SignalP methods (HMM and NN) predicted the cleavage site at position 29.
A4_HUMAN
Both SignalP methods (HMM and NN) predicted the cleavage site at position 18.
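These predictions can be checked against the annotated signal peptides listed at the beginning of Task 3.3. The sketch below assumes that the reported cleavage position is the first residue after the signal peptide, so that an annotated signal peptide 1 - 22 corresponds to a cleavage position of 23; this interpretation and the function name `agrees` are assumptions. RET4_HUMAN is left out because no SignalP result is reported for it above.

```python
# Last residue of the annotated signal peptide (from the annotation tables).
ANNOTATED_SP_END = {
    "RET4_HUMAN": 18,
    "INSL5_HUMAN": 22,
    "LAMP1_HUMAN": 28,
    "A4_HUMAN": 17,
}

# Cleavage positions reported above (None = no cleavage site predicted).
PREDICTED_CLEAVAGE = {
    "PAH": None,
    "BACR_HALSA": None,
    "INSL5_HUMAN": 23,
    "LAMP1_HUMAN": 29,
    "A4_HUMAN": 18,
}


def agrees(protein):
    """True if the prediction is consistent with the annotation."""
    annotated_end = ANNOTATED_SP_END.get(protein)
    predicted = PREDICTED_CLEAVAGE[protein]
    if annotated_end is None or predicted is None:
        # Consistent only if neither source reports a signal peptide.
        return annotated_end is None and predicted is None
    # Assumption: the cleavage position is the first mature residue.
    return predicted == annotated_end + 1


for protein in PREDICTED_CLEAVAGE:
    print(protein, agrees(protein))
```

Under this assumption all five reported predictions are consistent with the annotation.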
TargetP
Details of the method
Author: Olof Emanuelsson, Henrik Nielsen, Søren Brunak and Gunnar von Heijne.
Year: 2000
Reference: PubMed
Description
TargetP's predictions are based on trained neural networks arranged in two layers. The first layer consists of three neural networks, which predict whether the sequence carries a chloroplast targeting sequence (cTP), a mitochondrial targeting sequence (mTP) or a signal peptide (SP). The output of this first layer is used as input for the second-layer neural network, which makes the final prediction. A decision unit then checks whether the cutoffs are met. The output is one of four classes (cTP, mTP, SP or other) together with a reliability class (RC), which indicates how certain the prediction is.
For non-plant proteins the cTP prediction is not applied, since these organisms have no chloroplasts.
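The decision step can be sketched as follows. The code assumes the reliability classes as described in the TargetP output documentation, where RC depends on the difference between the highest and the second-highest score (RC 1 for a difference above 0.8, down to RC 5 for a difference below 0.2), and that a winning score below a user-defined cutoff is reported as `*`. The function names and the scores are invented for illustration.

```python
def reliability_class(scores):
    """RC from the gap between the best and the second-best score."""
    ranked = sorted(scores.values(), reverse=True)
    diff = ranked[0] - ranked[1]
    for rc, lower_bound in ((1, 0.8), (2, 0.6), (3, 0.4), (4, 0.2)):
        if diff > lower_bound:
            return rc
    return 5


def decide(scores, cutoffs=None):
    """Return (predicted class, RC); '*' if the winner misses its cutoff."""
    winner = max(scores, key=scores.get)
    rc = reliability_class(scores)
    if cutoffs and scores[winner] < cutoffs.get(winner, 0.0):
        return "*", rc
    return winner, rc


# Toy scores for a non-plant protein (no cTP score), invented values.
print(decide({"mTP": 0.05, "SP": 0.95, "other": 0.03}))  # ('SP', 1)
print(decide({"mTP": 0.40, "SP": 0.50, "other": 0.10}))  # ('SP', 5)
```

A low RC value therefore means that one class clearly dominates, whereas RC 5 marks a prediction in which two classes score almost equally.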
Predicted features
Predicts the localization to the following targets: chloroplast, mitochondrion, ER/golgi/secreted, and "other".
Required information for the prediction
The amino acid sequence of the protein and whether this protein is from a plant or non-plant organism.
Execution
Results and discussion
PAH
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
Task 3.4: Prediction of GO terms
Annotated sequence features
PAH
The phenylalanine-4-hydroxylase has the following annotated GO terms:
Class | GO Identifier | GO Name |
---|---|---|
Function | GO:0003824 | catalytic activity |
Function | GO:0004497 | monooxygenase activity |
Function | GO:0004505 | phenylalanine 4-monooxygenase activity |
Function | GO:0005506 | iron ion binding |
Component | GO:0005829 | cytosol |
Process | GO:0006558 | L-phenylalanine metabolic process |
Process | GO:0006559 | L-phenylalanine catabolic process |
Process | GO:0006571 | tyrosine biosynthetic process |
Process | GO:0008152 | metabolic process |
Process | GO:0008652 | cellular amino acid biosynthetic process |
Process | GO:0009072 | aromatic amino acid family metabolic process |
Function | GO:0016491 | oxidoreductase activity |
Function | GO:0016597 | amino acid binding |
Function | GO:0016714 | oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced pteridine as one donor, and incorporation of one atom of oxygen |
Process | GO:0018126 | protein hydroxylation |
Process | GO:0034641 | cellular nitrogen compound metabolic process |
Process | GO:0042136 | neurotransmitter biosynthetic process |
Process | GO:0042423 | catecholamine biosynthetic process |
Process | GO:0042558 | pteridine-containing compound metabolic process |
Function | GO:0042803 | protein homodimerization activity |
Process | GO:0046146 | tetrahydrobiopterin metabolic process |
Function | GO:0046872 | metal ion binding |
Function | GO:0048037 | cofactor binding |
Process | GO:0055114 | oxidation-reduction process |
BACR_HALSA
The bacteriorhodopsin has the following annotated GO terms:
Class | GO Identifier | GO Name |
---|---|---|
Function | GO:0004872 | receptor activity |
Function | GO:0005216 | ion channel activity |
Component | GO:0005886 | plasma membrane |
Process | GO:0006810 | transport |
Process | GO:0006811 | ion transport |
Process | GO:0007602 | phototransduction |
Function | GO:0009881 | photoreceptor activity |
Process | GO:0015992 | proton transport |
Component | GO:0016020 | membrane |
Component | GO:0016021 | integral to membrane |
Process | GO:0018298 | protein-chromophore linkage |
Process | GO:0050896 | response to stimulus |
RET4_HUMAN
The retinol-binding protein 4 has the following annotated GO terms:
Class | GO Identifier | GO Name |
---|---|---|
Process | GO:0001654 | eye development |
Function | GO:0005215 | transporter activity |
Function | GO:0005488 | binding |
Function | GO:0005501 | retinoid binding |
Function | GO:0005515 | protein binding |
Component | GO:0005576 | extracellular region |
Component | GO:0005615 | extracellular space |
Process | GO:0006094 | gluconeogenesis |
Process | GO:0006810 | transport |
Process | GO:0007283 | spermatogenesis |
Process | GO:0007507 | heart development |
Process | GO:0007601 | visual perception |
Process | GO:0008584 | male gonad development |
Process | GO:0009790 | embryo development |
Function | GO:0016918 | retinal binding |
Function | GO:0019841 | retinol binding |
Process | GO:0030277 | maintenance of gastrointestinal epithelium |
Process | GO:0030324 | lung development |
Process | GO:0032024 | positive regulation of insulin secretion |
Process | GO:0032526 | response to retinoic acid |
Process | GO:0032868 | response to insulin stimulus |
Function | GO:0034632 | retinol transporter activity |
Process | GO:0034633 | retinol transport |
Process | GO:0042572 | retinol metabolic process |
Process | GO:0042574 | retinal metabolic process |
Process | GO:0042593 | glucose homeostasis |
Process | GO:0045471 | response to ethanol |
Process | GO:0048562 | embryonic organ morphogenesis |
Process | GO:0048706 | embryonic skeletal system development |
Process | GO:0048738 | cardiac muscle tissue development |
Process | GO:0048807 | female genitalia morphogenesis |
Process | GO:0050896 | response to stimulus |
Process | GO:0050908 | detection of light stimulus involved in visual perception |
Process | GO:0051024 | positive regulation of immunoglobulin secretion |
Process | GO:0060041 | retina development in camera-type eye |
Process | GO:0060044 | negative regulation of cardiac muscle cell proliferation |
Process | GO:0060059 | embryonic retina morphogenesis in camera-type eye |
Process | GO:0060065 | uterus development |
Process | GO:0060068 | vagina development |
Process | GO:0060157 | urinary bladder development |
Process | GO:0060347 | heart trabecula formation |
INSL5_HUMAN
The insulin-like peptide INSL5 has the following annotated GO terms:
Class | GO Identifier | GO Name |
---|---|---|
Function | GO:0005179 | hormone activity |
Component | GO:0005575 | cellular_component |
Component | GO:0005576 | extracellular region |
Process | GO:0008150 | biological_process |
LAMP1_HUMAN
The lysosome-associated membrane glycoprotein 1 has the following annotated GO terms:
Class | GO Identifier | GO Name |
---|---|---|
Component | GO:0005624 | membrane fraction |
Component | GO:0005764 | lysosome |
Component | GO:0005765 | lysosomal membrane |
Component | GO:0005768 | endosome |
Component | GO:0005770 | late endosome |
Component | GO:0005771 | multivesicular body |
Component | GO:0005886 | plasma membrane |
Component | GO:0005887 | integral to plasma membrane |
Process | GO:0006914 | autophagy |
Component | GO:0009897 | external side of plasma membrane |
Component | GO:0009986 | cell surface |
Component | GO:0010008 | endosome membrane |
Component | GO:0016020 | membrane |
Component | GO:0016021 | integral to membrane |
Component | GO:0031982 | vesicle |
Component | GO:0042383 | sarcolemma |
Component | GO:0042470 | melanosome |
A4_HUMAN
The amyloid beta A4 protein has the following annotated GO terms:
Class | GO Identifier | GO Name |
---|---|---|
Process | GO:0000085 | G2 phase of mitotic cell cycle |
Process | GO:0001967 | suckling behavior |
Process | GO:0002576 | platelet degranulation |
Function | GO:0003677 | DNA binding |
Function | GO:0004867 | serine-type endopeptidase inhibitor activity |
Function | GO:0005102 | receptor binding |
Function | GO:0005488 | binding |
Function | GO:0005515 | protein binding |
Component | GO:0005576 | extracellular region |
Component | GO:0005624 | membrane fraction |
Component | GO:0005737 | cytoplasm |
Component | GO:0005794 | Golgi apparatus |
Component | GO:0005886 | plasma membrane |
Component | GO:0005887 | integral to plasma membrane |
Component | GO:0005905 | coated pit |
Process | GO:0006378 | mRNA polyadenylation |
Process | GO:0006417 | regulation of translation |
Process | GO:0006468 | protein phosphorylation |
Process | GO:0006878 | cellular copper ion homeostasis |
Process | GO:0006897 | endocytosis |
Process | GO:0006915 | apoptosis |
Process | GO:0006917 | induction of apoptosis |
Process | GO:0007155 | cell adhesion |
Process | GO:0007176 | regulation of epidermal growth factor receptor activity |
Process | GO:0007219 | Notch signaling pathway |
Process | GO:0007409 | axonogenesis |
Process | GO:0007596 | blood coagulation |
Process | GO:0007617 | mating behavior |
Process | GO:0007626 | locomotory behavior |
Process | GO:0008088 | axon cargo transport |
Function | GO:0008201 | heparin binding |
Process | GO:0008219 | cell death |
Process | GO:0008344 | adult locomotory behavior |
Process | GO:0008542 | visual learning |
Component | GO:0009986 | cell surface |
Process | GO:0010466 | negative regulation of peptidase activity |
Process | GO:0010952 | positive regulation of peptidase activity |
Component | GO:0016020 | membrane |
Component | GO:0016021 | integral to membrane |
Process | GO:0016199 | axon midline choice point recognition |
Process | GO:0016322 | neuron remodeling |
Process | GO:0016358 | dendrite development |
Function | GO:0016504 | peptidase activator activity |
Component | GO:0019717 | synaptosome |
Process | GO:0030168 | platelet activation |
Process | GO:0030198 | extracellular matrix organization |
Function | GO:0030414 | peptidase inhibitor activity |
Component | GO:0030424 | axon |
Process | GO:0030900 | forebrain development |
Component | GO:0031093 | platelet alpha granule lumen |
Process | GO:0031175 | neuron projection development |
Component | GO:0031410 | cytoplasmic vesicle |
Component | GO:0031594 | neuromuscular junction |
Function | GO:0033130 | acetylcholine receptor binding |
Process | GO:0035235 | ionotropic glutamate receptor signaling pathway |
Component | GO:0035253 | ciliary rootlet |
Process | GO:0040014 | regulation of multicellular organism growth |
Function | GO:0042802 | identical protein binding |
Component | GO:0043005 | neuron projection |
Component | GO:0043197 | dendritic spine |
Component | GO:0043198 | dendritic shaft |
Component | GO:0043231 | intracellular membrane-bounded organelle |
Process | GO:0045087 | innate immune response |
Component | GO:0045177 | apical part of cell |
Component | GO:0045202 | synapse |
Process | GO:0045665 | negative regulation of neuron differentiation |
Process | GO:0045931 | positive regulation of mitotic cell cycle |
Process | GO:0045944 | positive regulation of transcription from RNA polymerase II promoter |
Function | GO:0046872 | metal ion binding |
Component | GO:0048471 | perinuclear region of cytoplasm |
Process | GO:0048669 | collateral sprouting in absence of injury |
Process | GO:0050803 | regulation of synapse structure and activity |
Process | GO:0050885 | neuromuscular process controlling balance |
Process | GO:0051124 | synaptic growth at neuromuscular junction |
Component | GO:0051233 | spindle midzone |
Process | GO:0051402 | neuron apoptosis |
Function | GO:0051425 | PTB domain binding |
Process | GO:0051563 | smooth endoplasmic reticulum calcium ion homeostasis |
GOPET
Details of the method
Author: Vinayagam A, König R, Moormann J, Schubert F, Eils R, Glatting KH, Suhai S
Year: 2004
Reference: PubMed
Description
The prediction of GO terms is based on a support vector machine (SVM). The SVM was trained on 39,740 selected GO-annotated cDNA sequences. For each training sequence, all annotated GO terms are extracted. In the next step, homologous sequences are searched with BLAST using an E-value cutoff of < 0.01. The sequences that fulfill this condition are used to extract attributes, including sequence similarity measures such as E-value, bit score, identity, coverage score and alignment length, as well as GO-term frequency, GO-term relationships between homologues, the level of annotation within the GO hierarchy and the annotation quality of the homologues.
These attributes are then assigned to each GO term found for the training sequence, and the SVM is trained on the GO terms together with their associated attributes.
After training, the SVM can predict GO terms for unknown cDNA or protein sequences in the same fashion.
Predicted features
GOPET predicts the GO term together with a confidence value.
Required information for the prediction
The cDNA or amino acid sequence of the protein is required.
Execution
Results and discussion
PAH
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
Pfam
Details of the method
Author: Wellcome Trust Sanger Institute and Howard Hughes Janelia Farm Research Campus
Year: latest release in March 2011
Reference: Oxford Journals
Description
Pfam is a database of protein families. To build a family, a seed alignment of homologous sequences that all belong to the same family is created. This alignment is used to build a profile hidden Markov model (HMM), which then represents the family. These profile HMMs can be used to search a query sequence or a sequence database for significant family matches. The tool used for all of this is HMMER3.
Predicted features
Pfam predicts protein families.
Required information for the prediction
The amino acid sequence of the protein.
Execution
Results and discussion
PAH
BACR_HALSA
RET4_HUMAN
INSL5_HUMAN
LAMP1_HUMAN
A4_HUMAN
ProtFun 2.2
Details of the method
Author: L. Juhl Jensen, R. Gupta, N. Blom, D. Devos, J. Tamames, C. Kesmir, H. Nielsen, H. H. Stærfeldt, K. Rapacki, C. Workman, C. A. F. Andersen, S. Knudsen, A. Krogh, A. Valencia and S. Brunak.
Year: 2002
Reference: PubMed
Description
The prediction of GO terms is based on a neural network. The training set was obtained by looking up protein families and their assigned GO terms in the InterPro database and then mapping these InterPro domain matches to SWISS-PROT and TrEMBL to get the actual sequences. To avoid over-fitting, a homology reduction was performed afterwards. Then a set of 16 features was derived for each sequence, including propeptide cleavage site predictions and subcellular compartment predictions from TargetP.
The neural network was then trained to find the best weight for each feature and GO term. After extensive training, however, the authors found that the method gives reliable predictions for only 14 GO categories, so only these are predicted by the neural network.
Predicted features
ProtFun predicts the cellular role, whether the protein is an enzyme, the enzyme class and the Gene Ontology category. The predicted Gene Ontology categories are:
- Signal transducer
- Receptor
- Hormone
- Structural protein
- Transporter
- Ion channel
- Voltage-gated ion channel
- Cation channel
- Transcription
- Transcription regulation
- Stress response
- Immune response
- Growth factor
- Metal ion transport
Required information for the prediction
Only the amino acid sequence of the protein is required.
Execution