Difference between revisions of "Secondary Structure Prediction BCKDHA"

From Bioinformatikpedia
Revision as of 10:41, 30 May 2011

1. Secondary structure prediction

PSIPRED

Basic information

author: David Jones (University College London)
year: 1998
version: 2

References

[PSIPRED Server]
[Overview of prediction methods]
[History of the PSIPRED]

Theory

PSIPRED uses neural networks with a single hidden layer and a feed-forward architecture trained by back-propagation. For online prediction on the server it is sufficient to enter an amino acid sequence. Evaluated with a very stringent cross-validation method, PSIPRED reaches an average Q3 score of 80.7%.
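
As a rough illustration of what a Q3 score measures: it is the percentage of residues whose three-state label (H, E or C) is predicted correctly. The function name and the two example strings below are made up for illustration and are not PSIPRED output.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H/E/C) matches the reference."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not predicted:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Hypothetical 10-residue example: 8 of the 10 labels agree.
predicted = "CHHHHCCEEC"
observed = "CHHHCCCEEE"
print(q3_score(predicted, observed))
```

A reported average such as 80.7% would be this value averaged over many test proteins.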

Algorithm

The prediction is split into three steps. In the first step, a sequence profile is generated: the position-specific scoring matrix from PSI-BLAST serves as input for the neural network. In the second step, the secondary structure is predicted. In the last step, the output of the secondary structure prediction is filtered.

What is predicted

PSIPRED predicts the secondary structure

Features

There are three different options:
- Mask low complexity regions
- Mask transmembrane helices
- Mask coiled-coil regions

It is also possible to get an email with the results of PSIPRED.

Required information

PSIPRED requires the output of PSI-BLAST (Position Specific Iterated - BLAST) as input data.


Prediction

PSIPREDbild.png
start end structural element
1 1 C
2 16 H
17 17 C
18 25 H
26 77 C
78 82 E
83 98 C
99 124 H
125 136 C
137 146 H
147 152 C
153 155 E
156 159 C
160 166 H
167 170 C
171 178 H
179 212 C
213 225 H
226 230 C
231 236 E
237 242 C
243 256 H
257 259 C
260 265 E
266 276 C
277 278 H
279 282 C
283 287 H
288 296 C
297 298 E
299 300 C
301 318 H
319 323 C
324 329 E
330 347 C
348 356 H
357 360 C
361 370 H
371 375 C
377 399 H
400 404 C
405 413 H
414 417 C
418 434 H
435 445 C

legend: H=alpha helix, E=beta strand, C=coil

PSIPRED predicted 23 coil segments, 16 alpha helices and 6 beta strands.
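
A segment table of this kind can be checked with a few lines of code. The sketch below parses rows of the form "start end element", counts the element types and reports any place where a segment does not start directly after the previous one. Only the first rows of the table are used, and the function names are hypothetical.

```python
from collections import Counter


def parse_segments(text):
    """Parse lines of the form 'start end element' into (start, end, element) tuples."""
    segments = []
    for line in text.strip().splitlines():
        start, end, element = line.split()
        segments.append((int(start), int(end), element))
    return segments


def find_gaps(segments):
    """Return pairs of consecutive segments where the second does not start right after the first."""
    return [
        (prev, cur)
        for prev, cur in zip(segments, segments[1:])
        if cur[0] != prev[1] + 1
    ]


# First rows of the PSIPRED table above.
table = """
1 1 C
2 16 H
17 17 C
18 25 H
26 77 C
78 82 E
"""
segments = parse_segments(table)
counts = Counter(element for _, _, element in segments)
print(counts)
print(find_gaps(segments))  # an empty list means the segments are contiguous
```

Running the same check on the full table is a quick way to spot transcription errors in the start and end columns.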

Jpred3

Basic information

author: Cole C, Barber JD & Barton GJ (Bioinformatics and Computational Biology Research, University of Dundee)
year: 1998
version: 3

References

Jpred3 Server
About Jpred
FAQ

Theory

Jpred uses a neural network called Jnet to predict the secondary structure of a protein sequence or of a multiple alignment of protein sequences. The prediction accuracy for secondary structure lies above 81%. Additionally, Jpred predicts solvent accessibility and coiled-coil regions. It predicts whether a residue is buried or exposed using several cut-off values.

Algorithm
What is predicted

Jpred3 predicts the secondary structure from a sequence or a multiple alignment.
It also predicts the relative solvent accessibility.

Features

Jpred3 has two different modes:
- single sequence
- multiple alignment

Required information

Jpred3 needs a protein sequence or multiple alignment of protein sequences as input.

Multiple alignment: The first sequence has to be the target sequence, since the alignment is modified so that the first sequence does not contain any gaps. The alignment has to be in MSF or BLC format.

Single sequence: A multiple alignment is constructed for the sequence with PSI-BLAST (3 iterations).
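
The gap-removal step for alignment input can be pictured as dropping every alignment column in which the target (first) sequence has a gap. The sketch below shows that idea on a toy alignment. It is not Jpred's own code; the function name and the use of "-" as the gap character are assumptions.

```python
def drop_target_gap_columns(alignment, gap="-"):
    """Remove all columns in which the first (target) sequence has a gap."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Indices of the columns where the target sequence carries a residue.
    keep = [i for i, residue in enumerate(alignment[0]) if residue != gap]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Toy alignment; the first row is the target sequence.
toy_alignment = [
    "MA-VK-L",
    "MAGVKQL",
    "M-GVR-L",
]
cleaned = drop_target_gap_columns(toy_alignment)
for seq in cleaned:
    print(seq)
```

After this step every remaining column corresponds to exactly one residue of the target, so a per-column prediction can be read directly as a per-residue prediction.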


Prediction

1u5b (e-value:0) 1umd (e-value:6e-58) 1qs0 (e-value:1e-57) 3dv0 (e-value:2e-51)
EMBL-EBI EMBL-EBI EMBL-EBI EMBL-EBI
UniProt UniProt UniProt UniProt
position structural element position structural element position structural element position structural element
10–19 alpha helix
24-26 alpha helix
35-60 alpha helix 36–38 alpha helix
61-64 beta strand 44–47 alpha helix 44–69 alpha helix
48–51 alpha helix
67–69 alpha helix
74-83 alpha helix 74–99 alpha helix
83–91 alpha helix
91-93 alpha helix 89-92 beta strand
99-124 alpha helix 98-104 alpha helix 102–104 beta strand 98–100 beta strand
108-116 alpha helix 110–112 turn 106–111 alpha helix
122-125 turn 113–122 alpha helix 116–124 alpha helix
127-129 beta strand 127–130 beta strand 127–130 alpha helix
135-138 turn
138-146 alpha helix 144-147 beta strand 136–141 alpha helix
152-154 beta strand 150-162 alpha helix 146–154 alpha helix 147 – 161 alpha helix
161-166 alpha helix 160–163 turn
171-179 alpha helix 169-173 beta strand 173–175 alpha helix 168–173 beta strand
176-179 alpha helix 175–178 alpha helix
185-188 turn 181-193 alpha helix 182–185 beta strand 180–191 alpha helix
198-201 turn 196-204 beta strand 186–200 alpha helix 196–202 beta strand
207–212 beta strand 204–206 beta strand
209-211 turn 211–213 alpha helix
212-226 alpha helix 222-227 alpha helix 214–217 alpha helix 221–226 alpha helix
232-237 beta strand 232-236 beta strand 219–231 alpha helix 231–235 beta strand
240-242 alpha helix 240-255 alpha helix 235–241 beta strand 239–254 alpha helix
244-255 alpha helix 243–245 beta strand
250–253 alpha helix
254–257 turn
260-266 beta strand 261-267 beta strand 262–266 alpha helix 260–265 beta strand
268-270 beta strand
270–275 beta strand
275-277 alpha helix
280-282 beta strand 279 – 294 alpha helix
285-287 alpha helix 285–292 alpha helix
289-291 alpha helix
294-299 beta strand 296-305 alpha helix 300–305 beta strand 296–306 alpha helix
303-320 beta strand 306-308 turn 318–320 alpha helix
324-329 beta strand 312-334 alpha helix 326–329 alpha helix 312–334 alpha helix
341-345 alpha helix 335–345 alpha helix 341–346 alpha helix
348-351 beta strand
360-368 alpha helix 354-366 alpha helix 351–373 alpha helix 354–366 alpha helix
369-372 turn
376-399 alpha helix 378–380 beta strand
388–390 alpha helix
391–396 beta strand
405-408 alpha helix 399–406 alpha helix
412-415 beta strand
418-434 alpha helix
435-437 alpha helix
440-442 alpha helix


1u5b 1umd 1qs0 3dv0
e-value:0 e-value:6e-58 e-value:1e-57 e-value:2e-51
1u5bStructurePicture.png 1umdStructurePicture.png 1qs0StructurePicture.png 3dv0StructurePicture.png


DSSP

DSSPOutputColored.png

1. line: sequence
2. line: structural elements
3. line: a residue involved in symmetry contacts is labeled with a star
4. line: a solvent-accessible residue is labeled with an "A"

Letter code for the secondary structure elements:

- H (blue): alpha helix
- 3 (yellow): residue in isolated beta-bridge
- T (red): hydrogen bonded turn
- S (green): bend

File:Output.pdf

2. Prediction of disordered regions

DISOPRED

DisopredOutseq.png
Disopredplot.png


DISOPRED predicts disordered regions at the beginning and at the end of BCKDHA.

POODLE

POODLE-S (Missing residues)

S Missing Residues.png


POODLE-S (which predicts short disordered regions) with the option "Missing residues" predicted disordered regions at positions 1-56, 341-345 and 420-423. This is also shown in the plot above.

Detailed sequence with disordered region probability: File:PoodleSMissingResiduesOut.pdf

The probability ranges from 0 to 1, where 0 means no disorder and 1 means disorder.
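
Per-residue probabilities like these are usually turned into regions by applying a cut-off. The sketch below does this for a short list of invented values. The cut-off of 0.5 is an assumption for illustration and is not taken from the POODLE documentation.

```python
def disordered_regions(probabilities, threshold=0.5):
    """Return 1-based (start, end) ranges where the probability is >= threshold."""
    regions = []
    start = None
    for position, p in enumerate(probabilities, start=1):
        if p >= threshold and start is None:
            start = position  # a new region opens here
        elif p < threshold and start is not None:
            regions.append((start, position - 1))  # the region closed one residue earlier
            start = None
    if start is not None:
        regions.append((start, len(probabilities)))
    return regions


# Invented probabilities for a 10-residue sequence.
probs = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.6, 0.55, 0.4, 0.95]
print(disordered_regions(probs))
```

Raising the threshold gives fewer and shorter regions; lowering it does the opposite.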


POODLE-S (High B-Factor residues)

S BFactor.png


POODLE-S (which predicts short disordered regions) with the option "High B-Factor residues" predicted disordered regions at positions 6-9, 15-57, 93, 95-96, 340-354 and 379-402. This is also shown in the plot above.

Detailed sequence with disordered region probability: File:PoodleSFactorBOut.pdf

The probability ranges from 0 to 1, where 0 means no disorder and 1 means disorder.


POODLE-W


Regions that may be disordered, but for which POODLE-W is uncertain, are bordered by blue squares; regions predicted to be disordered are bordered by red squares.

Detailed sequence with disordered region: File:PoodleWDOSeq.pdf
0=ordered regions
5=perhaps disordered regions
9=disordered regions

IUPred

Prediction type: long disorder

Long.png


Detailed sequence with disordered region probability: File:LongSeqOut.pdf


Prediction type: short disorder

Short.png


Detailed sequence with disordered region probability: File:ShortSeqOut.pdf


Prediction type: structured regions

Structural.png


With the option "structured regions" no disordered regions were predicted.
Only the message "Unkown globular domains: 1-445" appeared.

back to Maple syrup urine disease main page


3. Prediction of transmembrane alpha-helices and signal peptides

In the following section different tools for predicting transmembrane helices and signal peptides are tested. As the BCKDHA protein isn't a transmembrane protein, additional proteins were used for the transmembrane and signal peptide analysis:

name location transmembrane protein function reference
BACR_HALSA Cell membrane yes ion transport P02945
INSL5_HUMAN extracellular region no hormone Q9Y5Q6
LAMP1_HUMAN Cell membrane, Lysosome membrane, Endosome membrane yes Presents carbohydrate ligands to selectins P11279
A4_HUMAN Cell membrane yes Protease Inhibitor P05067
RET4_HUMAN extracellular space no Transport P02753

Transmembrane topology and signal peptides are features that are likely to be conserved during evolution.

TMHMM

  • TMHMM was developed by Sonnhammer, von Heijne and Krogh in 1998 <ref> E.L. Sonnhammer, G. von Heijne and A. Krogh, A hidden Markov model for predicting transmembrane helices in protein sequences, Proc Int Conf Intell Syst Mol Biol. (1998)</ref>
  • TMHMM predicts transmembrane helices in proteins.
  • TMHMM is a membrane topology prediction method based on a hidden Markov model.

Phobius and Polyphobius

  • Phobius was developed by Käll et al <ref> "A Combined Transmembrane Topology and Signal Peptide Prediction Method", Journal of Mol. Biology,338(5):1027-1036, 2004 </ref>
  • Combined prediction of transmembrane regions and signal peptides
  • Required input information: only sequence in FASTA-Format (20 amino acids and B, Z, X are recognized)
  • As transmembrane topology and signal peptides are likely to be conserved during evolution, Polyphobius was established <ref>Käll et al., "An HMM posterior decoder for sequence feature prediction that includes homology information", Bioinformatics, 21 (Suppl 1):i251-i257, 2005</ref>, which includes information from homologous sequences to the query.
  • Required input, two options: a query sequence in FASTA format, which is then searched with BLAST against uniprot_trembl, or an uploaded alignment in FASTA format, which provides information about homologs.
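
The Phobius and Polyphobius results listed in this section are records of the form "TYPE start end description". A minimal sketch for reading such records into a list and pulling out the transmembrane helices and the signal peptide could look as follows. The function name is hypothetical, and the sample text is taken from the Phobius result for LAMP1_HUMAN.

```python
def parse_topology(text):
    """Parse 'TYPE start end [description]' records into a list of dicts."""
    records = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # skip header lines such as 'sp|P11279|LAMP1_HUMAN'
        records.append({
            "type": fields[0].upper(),
            "start": int(fields[1]),
            "end": int(fields[2]),
            "description": " ".join(fields[3:]).rstrip("."),
        })
    return records


phobius_output = """
sp|P11279|LAMP1_HUMAN
SIGNAL 1 28
REGION 1 10 N-REGION
REGION 11 22 H-REGION
REGION 23 28 C-REGION
TOPO_DOM 29 381 NON CYTOPLASMIC
TRANSMEM 382 405
"""
records = parse_topology(phobius_output)
helices = [(r["start"], r["end"]) for r in records if r["type"] == "TRANSMEM"]
signal = [(r["start"], r["end"]) for r in records if r["type"] == "SIGNAL"]
print(helices, signal)
```

With both tools parsed this way, the helix boundaries predicted by Phobius and Polyphobius can be compared position by position.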
BACR_HALSA
Phobius Polyphobius
BCKDHA Phobius BACR HALSA.png
sp|P02945|BACR_HALSA
TOPO_DOM 1 22 NON CYTOPLASMIC.
TRANSMEM 23 42
TOPO_DOM 43 53 CYTOPLASMIC.
TRANSMEM 54 76
TOPO_DOM 77 95 NON CYTOPLASMIC.
TRANSMEM 96 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 142
TOPO_DOM 143 147 NON CYTOPLASMIC.
TRANSMEM 148 169
TOPO_DOM 170 189 CYTOPLASMIC.
TRANSMEM 190 212
TOPO_DOM 213 217 NON CYTOPLASMIC.
TRANSMEM 218 237
TOPO_DOM 238 262 CYTOPLASMIC.

sp|P02945|BACR_HALSA
TOPO_DOM 1 21 NON CYTOPLASMIC.
TRANSMEM 22 43
TOPO_DOM 44 54 CYTOPLASMIC.
TRANSMEM 55 77
TOPO_DOM 78 94 NON CYTOPLASMIC.
TRANSMEM 95 114
TOPO_DOM 115 120 CYTOPLASMIC.
TRANSMEM 121 141
TOPO_DOM 142 147 NON CYTOPLASMIC.
TRANSMEM 148 166
TOPO_DOM 167 186 CYTOPLASMIC.
TRANSMEM 187 205
TOPO_DOM 206 215 NON CYTOPLASMIC.
TRANSMEM 216 237
TOPO_DOM 238 262 CYTOPLASMIC.

BCKDHA Polyphobius BACR HALSA.png


INSL5_HUMAN
Phobius Polyphobius
BCKDHA Phobius INSL5 HUMAN.png
sp|Q9Y5Q6|INSL5_HUMAN
SIGNAL 1 22
REGION 1 5 N-REGION
REGION 6 17 H-REGION
REGION 18 22 C-REGION
TOPO_DOM 23 135 NON CYTOPLASMIC

sp|Q9Y5Q6|INSL5_HUMAN
SIGNAL 1 22
REGION 1 4 N-REGION
REGION 5 16 H-REGION
REGION 17 22 C-REGION
TOPO_DOM 23 135 NON CYTOPLASMIC

BCKDHA Polyphobius INSL5 HUMAN.png
LAMP1_HUMAN
Phobius Polyphobius
BCKDHA Phobius LAMP1 HUMAN.png
sp|P11279|LAMP1_HUMAN
SIGNAL 1 28
REGION 1 10 N-REGION
REGION 11 22 H-REGION
REGION 23 28 C-REGION
TOPO_DOM 29 381 NON CYTOPLASMIC
TRANSMEM 382 405
TOPO_DOM 405 417 CYTOPLASMIC

sp|P11279|LAMP1_HUMAN
SIGNAL 1 28
REGION 1 9 N-REGION
REGION 10 22 H-REGION
REGION 23 28 C-REGION
TOPO_DOM 29 381 NON CYTOPLASMIC
TRANSMEM 382 405
TOPO_DOM 405 417 CYTOPLASMIC

BCKDHA Polyphobius LAMP1 HUMAN.png


A4_HUMAN
Phobius Polyphobius
BCKDHA Phobius A4 HUMAN.png
sp|P05067|A4_HUMAN
SIGNAL 1 17
REGION 1 1 N-REGION
REGION 2 12 H-REGION
REGION 13 17 C-REGION
TOPO_DOM 18 700 NON CYTOPLASMIC
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC

sp|P05067|A4_HUMAN
SIGNAL 1 17
REGION 1 3 N-REGION
REGION 4 12 H-REGION
REGION 13 17 C-REGION
TOPO_DOM 18 700 NON CYTOPLASMIC
TRANSMEM 701 723
TOPO_DOM 724 770 CYTOPLASMIC

BCKDHA Polyphobius A4 HUMAN.png


RET4_HUMAN
Phobius Polyphobius
BCKDHA Phobius RET4 HUMAN.png
sp|P02753|RET4_HUMAN
SIGNAL 1 18
REGION 1 2 N-REGION
REGION 3 13 H-REGION
REGION 14 18 C-REGION
TOPO_DOM 19 201 NON CYTOPLASMIC

sp|P02753|RET4_HUMAN
SIGNAL 1 18
REGION 1 3 N-REGION
REGION 4 13 H-REGION
REGION 14 18 C-REGION
TOPO_DOM 19 201 NON CYTOPLASMIC

BCKDHA Polyphobius RET4 HUMAN.png


For the BCKDHA protein, Phobius predicted a signal peptide at the beginning of the sequence with about 90% probability. The predicted signal peptide is 34 amino acids long. This is broadly consistent with UniProt, which annotates a 45 amino acid long signal peptide for transfer into the mitochondrion. The rest of the amino acid sequence is predicted to be non-cytoplasmic. No part of the protein is predicted to span a membrane. This also agrees with UniProt, according to which BCKDHA is located in the mitochondrial matrix.

BCKDHA
Phobius Polyphobius
Phobius BCKDHA.png
sp|P12694|ODBA_HUMAN (BCKDHA)
Signal 1 34
Region 1 16 N-Region
Region 17 25 H-Region
Region 26 34 C-Region
TOPO_DOM 35 445 non cytoplasmic

ODBA_HUMAN (BCKDHA)
TOPO_DOM 1 445 Non cytoplasmic

BCKDHA Polyphobius BCKDHA.png

Considering the information given on UniProt, Polyphobius performed worse than Phobius on the BCKDHA protein sequence. It predicted no signal sequence at the beginning of the protein sequence. There is a low probability for the amino acids at positions 1-45 to be a signal sequence, but overall the whole sequence is predicted to be non-cytoplasmic.

OCTOPUS and SPOCTOPUS

  • OCTOPUS was developed by Viklund and Elofsson in 2008 <ref>Håkan Viklund and Arne Elofsson, "Improving topology prediction by two-track ANN-based preference scores and an extended topological grammar", Bioinformatics (2008)</ref>
  • OCTOPUS (obtainer of correct topologies for uncharacterized sequences) uses a combination of hidden Markov models and artificial neural networks.
  • It creates a sequence profile by running a BLAST search to obtain homologous sequences. The profile is used as input for a neural network that predicts, for each residue, the probability of being located in a transmembrane (M), interface (I), close loop (L) or globular loop (G) environment, as well as the preference for being inside (i) or outside (o) of the membrane. A hidden Markov model is used to calculate the most likely protein topology.
  • Required input: Protein Sequence in FASTA-Format
  • SPOCTOPUS (Viklund et al., 2008<ref>Viklund et al., "A combined predictor of signal peptides and membrane protein topology", Bioinformatics (2008)</ref>) is an extension of OCTOPUS which also predicts signal peptides. A neural network is used to predict a signal peptide preference score. The signal peptide's location is determined by a hidden Markov model. The output contains the information retrieved by OCTOPUS as well as the probability that a residue is N-terminal of a signal peptide (n) or in a signal peptide (S).
  • Required input information: Protein sequence in FASTA-Format
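
A standard way to find the most likely state path in a hidden Markov model is the Viterbi algorithm. The sketch below applies it to a toy model with only two states, membrane (M) and loop (L), using invented per-residue preference scores in place of the neural network output. The real OCTOPUS model has more states and a topological grammar, and whether it uses exactly this decoder is not stated here; all numbers and names are assumptions.

```python
import math


def viterbi(scores, states, log_trans, log_start):
    """Most likely state path given per-residue log scores for each state.

    scores: list of dicts mapping state -> log score at that residue.
    """
    best = [{s: log_start[s] + scores[0][s] for s in states}]
    back = []
    for obs in scores[1:]:
        column, pointers = {}, {}
        for s in states:
            # Best predecessor state for ending in state s at this residue.
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            column[s] = best[-1][prev] + log_trans[prev][s] + obs[s]
            pointers[s] = prev
        best.append(column)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


states = ["M", "L"]
# Invented transition probabilities: staying in a state is far more likely than switching.
log_trans = {
    "M": {"M": math.log(0.9), "L": math.log(0.1)},
    "L": {"M": math.log(0.1), "L": math.log(0.9)},
}
log_start = {"M": math.log(0.5), "L": math.log(0.5)}
# Invented membrane preferences per residue; residue 5 is a noisy dip inside the helix.
membrane_pref = [0.1, 0.1, 0.8, 0.9, 0.4, 0.9, 0.8, 0.1, 0.1]
scores = [{"M": math.log(p), "L": math.log(1 - p)} for p in membrane_pref]
topology = viterbi(scores, states, log_trans, log_start)
print(topology)
```

The point of the example is that the decoded path keeps residue 5 inside the membrane segment even though its own score slightly favours a loop, because two extra state switches would cost more than the weak score gains.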
BACR_HALSA
OCTOPUS BCKDHA Octopus BACR HALSA small.png
SPOCTOPUS BCKDHA Spoctopus BACR HALSA small.png
INSL5_HUMAN
OCTOPUS BCKDHA Octopus INSL5 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus INSL5 HUMAN small.png
LAMP1_HUMAN
OCTOPUS BCKDHA Octopus LAMP1 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus LAMP1 HUMAN small.png
A4_HUMAN
OCTOPUS BCKDHA Octopus A4 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus A4 HUMAN small.png
RET4_HUMAN
OCTOPUS BCKDHA Octopus RET4 HUMAN small.png
SPOCTOPUS BCKDHA Spoctopus RET4 HUMAN small.png
BCKDHA
OCTOPUS BCKDHA Octopus BCKDHA small.png
SPOCTOPUS BCKDHA Spoctopus BCKDHA small.png

SignalP

  • SignalP was established by Nielsen et al. in 1997<ref>Nielsen et al., "Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites", Protein Engineering, 10:1-6, 1997</ref>
  • SignalP is neural network based. It identifies signal peptides and cleavage sites.

TargetP

  • TargetP was developed by Emanuelsson et al. in 2000 <ref> Emanuelsson et al., "Predicting subcellular localization of proteins based on their N-terminal amino acid sequence", J. Mol. Biol., 300: 1005-1016, 2000</ref>
  • TargetP predicts the subcellular location of eukaryotic proteins and, additionally, cleavage sites.
  • This method is neural network based. The prediction is based on the N-terminal presequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP).
  • Required input information: Sequence(s) in FASTA format, organism group
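
A predictor of this kind produces one score per presequence class, and in the simplest case the location is the class with the highest score. The sketch below shows that decision step with invented scores. The function name and the numbers are assumptions, and the lead of the best score over the runner-up is returned only as a rough indication of how clear the decision is.

```python
def predict_location(scores):
    """Return (label, margin): the top-scoring label and its lead over the runner-up."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_label, best_score), (_, second_score) = ranked[0], ranked[1]
    return best_label, best_score - second_score


# Invented scores for a non-plant protein (no cTP score).
example = {"mTP": 0.85, "SP": 0.05, "other": 0.10}
label, margin = predict_location(example)
print(label, round(margin, 2))
```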

The TargetP prediction results can be seen in the following table:
BCKDHA TargetP.PNG

ODBA_HUMAN (BCKDHA) is predicted to be located in the mitochondrion, which agrees with UniProt.



4. Prediction of GO terms

GOPET

GOPET results for BCKDHA:

GOid Aspect Confidence GOTerm
GO:0003824 F 97% catalytic activity
GO:0016491 F 96% oxidoreductase activity
GO:0016624 F 95% oxidoreductase activity acting on the aldehyde or oxo group of donors disulfide as acceptor
GO:0003863 F 90% 3-methyl-2-oxobutanoate dehydrogenase 2-methylpropanoyl-transferring activity
GO:0004739 F 89% pyruvate dehydrogenase acetyl-transferring activity
GO:0004738 F 78% pyruvate dehydrogenase activity
GO:0003826 F 77% alpha-ketoacid dehydrogenase activity
GO:0047101 F 75% 2-oxoisovalerate dehydrogenase acylating activity
GO:0008677 F 65% 2-dehydropantoate 2-reductase activity
GO:0019152 F 63% acetoin dehydrogenase activity
GO:0030955 F 63% potassium ion binding
GO:0016616 F 62% oxidoreductase activity acting on the CH-OH group of donors NAD or NADP as acceptor
GO:0046872 F 62% metal ion binding


GOPET results for A4_HUMAN:

GOid Aspect Confidence GOTerm
GO:0004866 F 87% endopeptidase inhibitor activity
GO:0004867 F 86% serine-type endopeptidase inhibitor activity
GO:0030568 F 83% plasmin inhibitor activity
GO:0030304 F 83% trypsin inhibitor activity
GO:0030414 F 82% peptidase inhibitor activity
GO:0005488 F 79% binding
GO:0005515 F 74% protein binding
GO:0046872 F 73% metal ion binding
GO:0003677 F 71% DNA binding
GO:0008201 F 70% heparin binding
GO:0008270 F 69% zinc ion binding
GO:0005507 F 69% copper ion binding
GO:0005506 F 67% iron ion binding


GOPET results for BACR_HALSA:

GOid Aspect Confidence GOterm
GO:0005216 F 77% ion channel activity
GO:0008020 F 75% G-protein coupled photoreceptor activity
GO:0015078 F 60% hydrogen ion transmembrane transporter activity


GOPET results for INSL5_HUMAN:

GOid Aspect Confidence GOterm
GO:0005179 F 80% hormone activity


GOPET results for LAMP1_HUMAN:

GOid Aspect Confidence GOterm
GO:0004812 F 60% aminoacyl-tRNA ligase activity
GO:0005524 F 60% ATP binding


GOPET results for RET4_HUMAN:

GOid Aspect Confidence GOterm
GO:0005488 F 90% binding
GO:0005501 F 81% retinoid binding
GO:0008289 F 80% lipid binding
GO:0019841 F 78% retinol binding
GO:0005215 F 78% transporter activity
GO:0016918 F 78% retinal binding
GO:0005319 F 69% lipid transporter activity
GO:0008035 F 60% high-density lipoprotein particle binding
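
The GOPET tables above share the layout "GOid Aspect Confidence GOterm", so they are easy to filter by confidence. The sketch below parses a few rows from the RET4_HUMAN table and keeps the terms with at least 80% confidence. The function name is hypothetical and the 80% cut-off is an arbitrary choice for illustration.

```python
def parse_gopet(text):
    """Parse 'GOid Aspect Confidence GOterm' rows into dicts; confidence becomes an int."""
    rows = []
    for line in text.strip().splitlines():
        # The GO term itself may contain spaces, so split only three times.
        go_id, aspect, confidence, term = line.split(maxsplit=3)
        rows.append({
            "id": go_id,
            "aspect": aspect,
            "confidence": int(confidence.rstrip("%")),
            "term": term,
        })
    return rows


# Rows copied from the RET4_HUMAN table above.
ret4 = """
GO:0005488 F 90% binding
GO:0005501 F 81% retinoid binding
GO:0008289 F 80% lipid binding
GO:0008035 F 60% high-density lipoprotein particle binding
"""
confident = [row["term"] for row in parse_gopet(ret4) if row["confidence"] >= 80]
print(confident)
```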

Pfam

Query Cellular Component Molecular function Biological Process
BCKDHA GO:0016624 (oxidoreductase activity, acting on the aldehyde or oxo group of donors, disulfide as acceptor) GO:0008152 (metabolic process)
A4_HUMAN GO:0016021 (integral to membrane) GO:0005488 (binding)
BACR_HALSA GO:0016020 (membrane) GO:0005216 (ion channel activity) GO:0006811 (ion transport)
INSL5_HUMAN GO:0005576 (extracellular region) GO:0005179 (hormone activity)
LAMP1_HUMAN GO:0016020 (membrane)
RET4_HUMAN GO:0005488 (binding)

References

<references />

