Difference between revisions of "Sequence-based predictions TSD"

From Bioinformatikpedia

Revision as of 15:09, 15 May 2012

Thor: He's my brother

Natasha Romanoff: He killed 80 people in 2 days

Thor: ...He's adopted

Unless noted otherwise, the sequence used for all predictions is the HEXA reference sequence (UniProt P06865). A protocol for this task can be found here.

Secondary structure

Proteins: Ribonuclease inhibitor P10775, CutA Q9X0E6, CAM-PRP catalytic subunit Q08209
Ribonuclease inhibitor and CutA are located in the cytoplasm, whereas the CAM-PRP catalytic subunit is located in the nucleus.

Disorder

Transmembrane helices

Proteins: Dopamine D3 receptor P35462 , KvAP Q9YDF8 , AQP-4 P47863
Dopamine D3 receptor, KvAP and AQP-4 are multi-pass membrane proteins. <figtable id="tab:gopetgo">

{|
! colspan="8" | Positions of transmembrane helices
|-
| Drd3 || 33–35 || 66–88 || 105–126 || 150–170 || 188–212 || 330–351 || 367–388
|-
| KvAP || 39–63 || 68–92 || 109–125 || 129–145 || 160–184 || 222–253
|-
| AQP-4 || 37–57 || 65–85 || 116–136 || 156–176 || 185–205 || 232–252
|}

Table TODO: Transmembrane regions assigned in UniProt. </figtable>
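
A quick sanity check on the ranges in the table above is to compute the length of each annotated segment. The following is a minimal sketch; the ranges are copied from the table, while the function names and the 15-residue cut-off are assumptions made for illustration and do not come from UniProt or the task protocol.

```python
# Ranges copied from the table above (as listed on this page).
TM_REGIONS = {
    "Drd3": [(33, 35), (66, 88), (105, 126), (150, 170),
             (188, 212), (330, 351), (367, 388)],
    "KvAP": [(39, 63), (68, 92), (109, 125), (129, 145),
             (160, 184), (222, 253)],
    "AQP-4": [(37, 57), (65, 85), (116, 136), (156, 176),
              (185, 205), (232, 252)],
}

# Assumed cut-off, chosen for illustration only.
MIN_PLAUSIBLE_LENGTH = 15


def segment_lengths(regions):
    """Return the number of residues in each (start, end) segment, inclusive."""
    return [end - start + 1 for start, end in regions]


def short_segments(table, min_length=MIN_PLAUSIBLE_LENGTH):
    """Map each protein to the segments shorter than min_length residues."""
    return {
        name: [(s, e) for s, e in regions if e - s + 1 < min_length]
        for name, regions in table.items()
    }


if __name__ == "__main__":
    for name, regions in TM_REGIONS.items():
        print(name, segment_lengths(regions))
    print(short_segments(TM_REGIONS))
```

With these inputs, only the first Drd3 segment (33–35, three residues) falls below the cut-off, so that entry may be worth checking against the UniProt record.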


Signal peptides

Proteins: Serum albumin P02768, LAMP-1 P11279, AQP-4 P47863
HEXA, LAMP-1 and serum albumin contain a signal peptide. HEXA has an annotated signal peptide at positions 1 to 22, LAMP-1 at positions 1 to 28 and serum albumin at positions 1 to 18.
LAMP-1 is a membrane protein that spans the membrane with a single helix. Serum albumin, the main protein of plasma, is a secreted extracellular protein. AQP-4 is a multi-pass membrane protein which forms a water-specific channel and functions in transport.


The displayed predictions were made with SignalP version 4.0.
SignalP reports three main scores for the prediction of signal peptides: C, S and Y. The S-score is the actual signal peptide prediction: high scores indicate that the corresponding amino acid is part of a signal peptide, and low scores indicate that it is part of the mature protein. The C-score is the cleavage site score, which is significantly high at the most likely cleavage site. (When a cleavage site position is referred to by a single number, the number indicates the first residue in the mature protein.) The Y-score is derived from the C-score combined with the S-score, and its maximum, Y-max, gives a better cleavage site prediction than the raw C-score alone.
Two additional scores are reported in the SignalP output: the S-mean and the D-score. The S-mean is the average of the S-score from the N-terminal amino acid to the amino acid assigned the highest Y-score (Y-max). The D-score is a weighted average of the S-mean and the Y-max scores.
For non-secretory proteins, all scores are expected to be very low.
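
How the derived scores relate to the per-residue C- and S-scores can be illustrated with a small sketch. This is not SignalP's implementation: the Y-score is taken here as the geometric mean of the C-score and the drop in the windowed S-score average, the S-mean is averaged over the residues before the Y-max position, and the window size, the weight and all function names are assumptions chosen for illustration.

```python
from math import sqrt


def y_scores(c, s, window=3):
    """Combine C with the local drop in S: Y_i = sqrt(C_i * dS_i).

    dS_i is the mean S over `window` residues before position i minus the
    mean S over `window` residues starting at i (negative drops count as 0).
    The window size is an assumed value.
    """
    y = []
    for i in range(len(s)):
        before = s[max(0, i - window):i]
        after = s[i:i + window]
        mean_before = sum(before) / len(before) if before else 0.0
        mean_after = sum(after) / len(after) if after else 0.0
        drop = max(0.0, mean_before - mean_after)
        y.append(sqrt(c[i] * drop))
    return y


def summarize(c, s, weight=0.5, window=3):
    """Return Y-max, its 1-based position, S-mean and D-score.

    `weight` is a hypothetical weighting of S-mean against Y-max.
    """
    y = y_scores(c, s, window)
    idx = max(range(len(y)), key=y.__getitem__)  # 0-based, first mature residue
    y_max = y[idx]
    # Assumption: average S over the residues preceding the Y-max position.
    s_mean = sum(s[:idx]) / idx if idx else 0.0
    d = weight * s_mean + (1.0 - weight) * y_max
    return {"y_max": y_max, "cleavage_pos": idx + 1, "s_mean": s_mean, "d": d}


if __name__ == "__main__":
    # Toy scores: four signal-peptide-like residues, then mature protein.
    s = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
    c = [0.0, 0.0, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0]
    print(summarize(c, s))
```

In the toy input, the S-score drops and the C-score peaks at the fifth residue, so that position is reported as the first residue of the mature protein. A sequence with uniformly low C- and S-scores yields low values for all derived scores, which matches the expectation for non-secretory proteins.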

<figtable id="tbl:signalp">

Sp P06865 HEXA HUMAN.png
Sp P47863 AQP4 RAT TSD.png
Sp P11279 LAMP1 HUMAN TSD.png
Sp P02768 ALBU HUMAN TSD.png
Table: Signal peptide predictions.

</figtable>

<xr id="tbl:signalp"/> displays the results of the SignalP predictions. The additional scores can be viewed here. HEXA, LAMP-1 and serum albumin are each correctly predicted to have one signal peptide at the beginning of the sequence, and AQP-4 is identified as a mature protein, that is, a protein without a signal peptide.


GO terms

GOpet

<figtable id="tab:gopetgo">

{|
! GO-Term ID !! Type !! Confidence !! GO-Term description !! Validation
|-
| GO:0003824 || Molecular function || 97% || catalytic activity || true
|-
| GO:0004563 || Molecular function || 96% || beta-N-acetylhexosaminidase activity || true
|-
| GO:0015929 || Molecular function || 96% || hexosaminidase activity || false
|-
| GO:0016787 || Molecular function || 96% || hydrolase activity || true
|-
| GO:0016798 || Molecular function || 96% || hydrolase activity acting on glycosyl bonds || true
|-
| GO:0004553 || Molecular function || 96% || hydrolase activity hydrolyzing O-glycosyl compounds || true
|-
| GO:0016799 || Molecular function || 77% || hydrolase activity hydrolyzing N-glycosyl compounds || false
|-
| GO:0046982 || Molecular function || 61% || protein heterodimerization activity || true
|}

Table TODO: GO term prediction. </figtable>
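
The Validation column of the table above can be summarized as the fraction of predicted GO terms that were confirmed. The sketch below copies the confidence and validation values from the table; the function name and the optional confidence cut-off are assumptions for illustration and are not part of GOpet.

```python
# (GO-Term ID, confidence in percent, validated) copied from the table above.
GOPET_PREDICTIONS = [
    ("GO:0003824", 97, True),
    ("GO:0004563", 96, True),
    ("GO:0015929", 96, False),
    ("GO:0016787", 96, True),
    ("GO:0016798", 96, True),
    ("GO:0004553", 96, True),
    ("GO:0016799", 77, False),
    ("GO:0046982", 61, True),
]


def validated_fraction(predictions, min_confidence=0):
    """Fraction of predictions marked true among those at or above the cut-off.

    Returns None when no prediction reaches the cut-off.
    """
    kept = [ok for _, confidence, ok in predictions if confidence >= min_confidence]
    return sum(kept) / len(kept) if kept else None


if __name__ == "__main__":
    print(validated_fraction(GOPET_PREDICTIONS))      # all eight terms
    print(validated_fraction(GOPET_PREDICTIONS, 90))  # terms with >= 90% confidence
```

Over all eight terms, six are validated (a fraction of 0.75). Restricting to terms with at least 90% confidence leaves six terms, of which five are validated.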