Difference between revisions of "Secondary Structure Prediction BCKDHA"
Revision as of 10:45, 30 May 2011
1. Secondary structure prediction
PSIPRED
Basic information
author: David Jones (University College London)
year: 1998
version: 2
References
[PSIPRED Server]
[Overview of prediction methods]
[History of the PSIPRED]
Theory
PSIPRED uses neural networks with a single hidden layer and a feed-forward back-propagation architecture. For online prediction on the server it is enough to enter an amino acid sequence. Evaluated with a very stringent cross-validation method, PSIPRED reaches an average Q3 score of 80.7%.
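The Q3 score mentioned above is the percentage of residues whose three-state label (H, E or C) is predicted correctly. A minimal sketch of that calculation follows; the function name `q3_score` and the two toy strings are assumptions made for illustration and are not taken from PSIPRED.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the three-state labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not predicted:
        raise ValueError("strings must not be empty")
    # Count residues where the predicted label equals the observed label
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Toy data, invented for illustration only (not BCKDHA results)
predicted = "CHHHHCCEEC"
observed = "CHHHCCCEEE"
print(q3_score(predicted, observed))  # 8 of 10 positions agree -> 80.0
```

In practice the observed string would come from an experimentally determined structure, for example a DSSP assignment reduced to three states.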
Algorithm
The prediction is split into three steps. In the first step, sequence profiles are generated: a position-specific scoring matrix from PSI-BLAST is used as input for the neural network. In the next step, the secondary structure is predicted. In the last step, the output of the secondary structure prediction is filtered.
What is predicted
PSIPRED predicts the secondary structure.
Features
There are three different options:
- Mask low complexity regions
- Mask transmembrane helices
- Mask coiled-coil regions
It is also possible to get an email with the results of PSIPRED.
Required information
PSIPRED requires the output of PSI-BLAST (Position Specific Iterated - BLAST) as input data.
Prediction
start | end | structural element |
---|---|---|
1 | 1 | C |
2 | 16 | H |
17 | 17 | C |
18 | 25 | H |
26 | 77 | C |
78 | 82 | E |
83 | 98 | C |
99 | 124 | H |
125 | 136 | C |
137 | 146 | H |
147 | 152 | C |
153 | 155 | E |
156 | 159 | C |
160 | 166 | H |
167 | 170 | C |
171 | 178 | H |
179 | 212 | C |
213 | 225 | H |
226 | 230 | C |
231 | 236 | E |
237 | 242 | C |
243 | 256 | H |
257 | 259 | C |
260 | 265 | E |
266 | 276 | C |
277 | 278 | H |
279 | 282 | C |
283 | 287 | H |
288 | 296 | C |
297 | 298 | E |
299 | 300 | C |
301 | 318 | H |
319 | 323 | C |
324 | 329 | E |
330 | 347 | C |
348 | 356 | H |
357 | 360 | C |
361 | 370 | H |
371 | 375 | C |
377 | 399 | H |
400 | 404 | C |
405 | 413 | H |
414 | 417 | C |
418 | 434 | H |
435 | 445 | C |
legend: H=alpha helix, E=beta strand, C=coil
PSIPRED has predicted 23 coils, 16 alpha helices and 6 beta strands.
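A segment table like the one above can be checked and summarised with a few lines of code. The sketch below uses only the first six rows of the table; the helper names `expand` and `count_segments` are hypothetical and chosen for illustration.

```python
from collections import Counter

# First six rows of the PSIPRED table above: (start, end, element)
segments = [
    (1, 1, "C"), (2, 16, "H"), (17, 17, "C"),
    (18, 25, "H"), (26, 77, "C"), (78, 82, "E"),
]


def expand(rows):
    """Turn (start, end, element) rows into one label per residue.

    Raises ValueError if the rows do not follow each other without gaps.
    """
    labels = []
    expected_start = rows[0][0]
    for start, end, element in rows:
        if start != expected_start or end < start:
            raise ValueError(f"non-contiguous segment {start}-{end}")
        labels.append(element * (end - start + 1))
        expected_start = end + 1
    return "".join(labels)


def count_segments(rows):
    """Count how many segments of each element type occur."""
    return Counter(element for _, _, element in rows)


print(len(expand(segments)))     # number of residues covered by these rows
print(count_segments(segments))  # segments per element type
```

Running the same check over all rows is a quick way to spot start or end positions that do not line up.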
Jpred3
Basic information
author: Cole C, Barber JD & Barton GJ (Bioinformatics and Computational Biology Research, University of Dundee)
year: 1998
version: 3
References
Theory
Jpred uses a neural network called Jnet to predict the secondary structure of a protein sequence or of a multiple alignment of protein sequences. The prediction accuracy for secondary structures lies above 81%. Additionally, Jpred makes predictions on solvent accessibility and coiled-coil regions. It predicts whether a residue is buried or exposed by using several cut-off values.
Algorithm
What is predicted
Jpred3 predicts secondary structure from the sequence or the multiple alignment.
It also predicts the relative solvent accessibility.
Features
Jpred3 has two different modes:
- single sequence
- multiple alignment
Required information
Jpred3 needs a protein sequence or multiple alignment of protein sequences as input.
Multiple alignment: The first sequence has to be the target sequence, since the alignment is modified so that the first sequence does not contain any gaps. The alignment has to be in MSF or BLC format.
Single sequence: A multiple alignment is constructed for the sequence with PSI-BLAST (3 iterations).
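One plausible reading of "the alignment is modified so that the first sequence does not contain any gaps" is that every column with a gap in the target sequence is dropped. The sketch below illustrates that idea only; it is an assumption about the behaviour, not Jpred's own code, and the function name `remove_target_gaps` is hypothetical.

```python
def remove_target_gaps(alignment: list[str], gap_chars: str = "-.") -> list[str]:
    """Drop every column in which the first (target) sequence has a gap."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Indices of columns where the target sequence holds a residue
    keep = [i for i, ch in enumerate(alignment[0]) if ch not in gap_chars]
    return ["".join(row[i] for i in keep) for row in alignment]


# Toy alignment, invented for illustration; the first row is the target
toy = ["MA-VK", "MAGVR", "M-GIK"]
print(remove_target_gaps(toy))  # ['MAVK', 'MAVR', 'M-IK']
```

Gaps in the other sequences are kept, because only the target sequence has to be gap-free.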
Prediction
1u5b (e-value:0) | 1umd (e-value:6e-58) | 1qs0 (e-value:1e-57) | 3dv0 (e-value:2e-51) | |||||||
---|---|---|---|---|---|---|---|---|---|---|
EMBL-EBI | EMBL-EBI | EMBL-EBI | EMBL-EBI | |||||||
UniProt | UniProt | UniProt | UniProt | |||||||
position | structural element | position | structural element | position | structural element | position | structural element | |||
10–19 | alpha helix | |||||||||
24-26 | alpha helix | |||||||||
35-60 | alpha helix | 36–38 | alpha helix | |||||||
61-64 | beta strand | 44–47 | alpha helix | 44–69 | alpha helix | |||||
48–51 | alpha helix | |||||||||
67–69 | alpha helix | |||||||||
74-83 | alpha helix | 74–99 | alpha helix | |||||||
83–91 | alpha helix | |||||||||
91-93 | alpha helix | 89-92 | beta strand | |||||||
99-124 | alpha helix | 98-104 | alpha helix | 102–104 | beta strand | 98–100 | beta strand | |||
108-116 | alpha helix | 110–112 | turn | 106–111 | alpha helix | |||||
122-125 | turn | 113–122 | alpha helix | 116–124 | alpha helix | |||||
127-129 | beta strand | 127–130 | beta strand | 127–130 | alpha helix | |||||
135-138 | turn | |||||||||
138-146 | alpha helix | 144-147 | beta strand | 136–141 | alpha helix | |||||
152-154 | beta strand | 150-162 | alpha helix | 146–154 | alpha helix | 147 – 161 | alpha helix | |||
161-166 | alpha helix | 160–163 | turn | |||||||
171-179 | alpha helix | 169-173 | beta strand | 173–175 | alpha helix | 168–173 | beta strand | |||
176-179 | alpha helix | 175–178 | alpha helix | |||||||
185-188 | turn | 181-193 | alpha helix | 182–185 | beta strand | 180–191 | alpha helix | |||
198-201 | turn | 196-204 | beta strand | 186–200 | alpha helix | 196–202 | beta strand | |||
207–212 | beta strand | 204–206 | beta strand | |||||||
209-211 | turn | 211–213 | alpha helix | |||||||
212-226 | alpha helix | 222-227 | alpha helix | 214–217 | alpha helix | 221–226 | alpha helix | |||
232-237 | beta strand | 232-236 | beta strand | 219–231 | alpha helix | 231–235 | beta strand | |||
240-242 | alpha helix | 240-255 | alpha helix | 235–241 | beta strand | 239–254 | alpha helix | |||
244-255 | alpha helix | 243–245 | beta strand | |||||||
250–253 | alpha helix | |||||||||
254–257 | turn | |||||||||
260-266 | beta strand | 261-267 | beta strand | 262–266 | alpha helix | 260–265 | beta strand | |||
268-270 | beta strand | |||||||||
270–275 | beta strand | |||||||||
275-277 | alpha helix | |||||||||
280-282 | beta strand | 279 – 294 | alpha helix | |||||||
285-287 | alpha helix | 285–292 | alpha helix | |||||||
289-291 | alpha helix | |||||||||
294-299 | beta strand | 296-305 | alpha helix | 300–305 | beta strand | 296–306 | alpha helix | |||
303-320 | beta strand | 306-308 | turn | 318–320 | alpha helix | |||||
324-329 | beta strand | 312-334 | alpha helix | 326–329 | alpha helix | 312–334 | alpha helix | |||
341-345 | alpha helix | 335–345 | alpha helix | 341–346 | alpha helix | |||||
348-351 | beta strand | |||||||||
360-368 | alpha helix | 354-366 | alpha helix | 351–373 | alpha helix | 354–366 | alpha helix | |||
369-372 | turn | |||||||||
376-399 | alpha helix | 378–380 | beta strand | |||||||
388–390 | alpha helix | |||||||||
391–396 | beta strand | |||||||||
405-408 | alpha helix | 399–406 | alpha helix | |||||||
412-415 | beta strand | |||||||||
418-434 | alpha helix | |||||||||
435-437 | alpha helix | |||||||||
440-442 | alpha helix |
1u5b | 1umd | 1qs0 | 3dv0 |
---|---|---|---|
e-value:0 | e-value:6e-58 | e-value:1e-57 | e-value:2e-51 |
DSSP
1. line: Sequence
2. line: structural elements
3. line: if a residue is involved in symmetry contacts it is labeled with a star
4. line: if a residue is solvent accessible it is labeled with an "A"
Letter code for the secondary structure elements:
- H (blue): alpha helix
- 3 (yellow): residue in isolated beta-bridge
- T (red): hydrogen bonded turn
- S (green): bend
File:Output.pdf
2. Prediction of disordered regions
DISOPRED
![]() |
DISOPRED predicts disordered regions at the beginning and at the end of the BCKDHA protein.
POODLE
POODLE-S (Missing residues)
With the option "Missing residues", POODLE-S (which predicts short disordered regions) predicted disordered regions at positions 1-56, 341-345 and 420-423. This is also shown in the plot above.
Detailed sequence with disordered region probability: File:PoodleSMissingResiduesOut.pdf
The probability ranges from 0 to 1, where 0 means that there is no disordered region and 1 that there is a disordered region.
POODLE-S (High B-Factor residues)
With the option "High B-Factor residues", POODLE-S (which predicts short disordered regions) predicted disordered regions at positions 6-9, 15-57, 93, 95-96, 340-354 and 379-402. This is also shown in the plot above.
Detailed sequence with disordered region probability: File:PoodleSFactorBOut.pdf
The probability ranges from 0 to 1, where 0 means that there is no disordered region and 1 that there is a disordered region.
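Per-residue probabilities like these can be turned into a list of disordered regions by applying a cut-off. The following is a minimal sketch; the threshold of 0.5, the function name `disordered_regions` and the toy values are assumptions made for illustration and are not taken from the POODLE output.

```python
def disordered_regions(probabilities, threshold=0.5):
    """Return 1-based, inclusive (start, end) runs with probability >= threshold."""
    regions = []
    start = None
    for position, p in enumerate(probabilities, start=1):
        if p >= threshold:
            if start is None:
                start = position  # a new disordered run begins here
        elif start is not None:
            regions.append((start, position - 1))  # the run ended one residue earlier
            start = None
    if start is not None:
        # The sequence ends inside a disordered run
        regions.append((start, len(probabilities)))
    return regions


# Toy probabilities, invented for illustration only
toy = [0.9, 0.8, 0.2, 0.1, 0.6, 0.7, 0.3, 0.95]
print(disordered_regions(toy))  # [(1, 2), (5, 6), (8, 8)]
```

Single residues such as position 93 in the list above appear as runs whose start and end are equal.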
POODLE-W
Regions that might be disordered, but for which POODLE is not certain, are bordered by blue squares; the disordered regions are bordered by red squares.
Detailed sequence with disordered region:
File:PoodleWDOSeq.pdf
0=ordered regions
5=perhaps disordered regions
9=disordered regions
IUPred
Prediction type: long disorder
Detailed sequence with disordered region probability: File:LongSeqOut.pdf
Prediction type: short disorder
Detailed sequence with disordered region probability: File:ShortSeqOut.pdf
Prediction type: structured regions
With the option "structured regions" there was no prediction of disordered regions.
Only the message "Unkown globular domains: 1-445" appeared.
back to Maple syrup urine disease main page
3. Prediction of transmembrane alpha-helices and signal peptides
In the following section different tools for predicting transmembrane helices and signal peptides are tested. As the BCKDHA protein isn't a transmembrane protein, additional proteins were used for the transmembrane and signal peptide analysis:
name | location | transmembrane protein | function | reference |
---|---|---|---|---|
BACR_HALSA | Cell membrane | yes | ion transport | P02945 |
INSL5_HUMAN | extracellular region | no | hormone | Q9Y5Q6 |
LAMP1_HUMAN | Cell membrane, Lysosome membrane, Endosome membrane | yes | Presents carbohydrate ligands to selectins | P11279 |
A4_HUMAN | Cell membrane | yes | Protease Inhibitor | P05067 |
RET4_HUMAN | extracellular space | no | Transport | P02753 |
Transmembrane topology and signal peptides are features that are likely to be conserved during evolution.
TMHMM
- TMHMM was developed by Sonnhammer, von Heijne and Krogh in 1998 <ref> E.L. Sonnhammer, G. von Heijne and A. Krogh, A hidden Markov model for predicting transmembrane helices in protein sequences, Proc Int Conf Intell Syst Mol Biol.(1998)</ref>
- TMHMM predicts transmembrane helices in proteins.
- TMHMM is a membrane topology prediction method based on a hidden Markov model.
Phobius and Polyphobius
- Phobius was developed by Käll et al. <ref> "A Combined Transmembrane Topology and Signal Peptide Prediction Method", Journal of Mol. Biology,338(5):1027-1036, 2004 </ref>
- combined prediction of transmembrane regions and signal peptides
- Required input information: only sequence in FASTA-Format (20 amino acids and B, Z, X are recognized)
- As transmembrane topology and signal peptides are likely to be conserved during evolution, Polyphobius was established <ref>Käll et al., "An HMM posterior decoder for sequence feature prediction that includes homology information", Bioinformatics, 21 (Suppl 1):i251-i257, 2005</ref>, which includes information from homologous sequences to the query.
- Required input: two options: either a query sequence in FASTA format, which is then searched with BLAST against uniprot_trembl, or an uploaded alignment in FASTA format, which provides information about homologs.
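The input requirement stated above (a sequence in FASTA format using the 20 amino acids plus B, Z and X) can be checked before submitting a query. This is a minimal sketch of such a check; the function name `read_single_fasta` is hypothetical and the code is not part of Phobius.

```python
# The 20 standard amino acids plus the ambiguity codes B, Z and X
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | set("BZX")


def read_single_fasta(text: str) -> tuple[str, str]:
    """Parse one FASTA record and return (header, sequence).

    Raises ValueError if the header is missing, if more than one record
    is present, or if the sequence contains characters outside ALLOWED.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("expected exactly one FASTA record")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    invalid = sorted(set(sequence) - ALLOWED)
    if invalid:
        raise ValueError(f"unsupported characters in sequence: {invalid}")
    return header, sequence


# Toy record, invented for illustration only
print(read_single_fasta(">toy\nMKTA\nBZX\n"))  # ('toy', 'MKTABZX')
```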
BACR_HALSA | |
---|---|
Phobius | Polyphobius |
INSL5_HUMAN | |
---|---|
Phobius | Polyphobius |
LAMP1_HUMAN | |
---|---|
Phobius | Polyphobius |
A4_HUMAN | |
---|---|
Phobius | Polyphobius |
RET4_HUMAN | |
---|---|
Phobius | Polyphobius |
For the BCKDHA protein, Phobius predicted a signal peptide at the beginning of the sequence with a probability of about 90%. The predicted signal peptide is 34 amino acids long. This is broadly consistent with the information given on UniProt, which states that BCKDHA contains a 45 amino acid long signal peptide for transfer into the mitochondrion. The rest of the sequence is predicted to be non-cytoplasmic. No part of the protein is predicted to span the membrane, which agrees with UniProt, where BCKDHA is located in the mitochondrial matrix.
BCKDHA | |
---|---|
Phobius | Polyphobius |
Considering the information given on UniProt, Polyphobius performed worse than Phobius on the BCKDHA protein sequence. It predicted no signal sequence at the beginning of the protein. There is a low probability for the amino acids at positions 1-45 to be a signal sequence, but overall the whole sequence is predicted to be non-cytoplasmic.
OCTOPUS and SPOCTOPUS
- OCTOPUS was developed by Viklund and Elofsson in 2008 <ref>Håkan Viklund and Arne Elofsson, "Improving topology prediction by two-track ANN-based preference scores and an extended topological grammar", Bioinformatics (2008)</ref>
- OCTOPUS (obtainer of correct topologies for uncharacterized sequences) uses a combination of hidden Markov models and artificial neural networks.
- It creates a sequence profile by running a BLAST search to obtain homologous sequences. The profile is used as input for a neural network that predicts the probability for each residue to be located in a transmembrane (M), interface (I), close loop (L), or globular loop (G) environment, as well as the preference to be inside (i) or outside (o) of the membrane. A hidden Markov model is then used to calculate the most likely protein topology.
- Required input: Protein Sequence in FASTA-Format
- SPOCTOPUS (Viklund et al., 2008<ref>Viklund et al., "A combined predictor of signal peptides and membrane protein topology", Bioinformatics (2008)</ref>) is an extension of OCTOPUS which also predicts signal peptides. A neural network is used to predict a signal peptide preference score. The signal peptide's location is determined by a hidden Markov model. The output contains the information retrieved by OCTOPUS as well as the probability that a residue is N-terminal of a signal peptide (n) or in a signal peptide (S).
- Required input information: Protein sequence in FASTA-Format
BACR_HALSA | |
OCTOPUS | ![]() |
SPOCTOPUS | ![]() |
INSL5_HUMAN | |
OCTOPUS | ![]() |
SPOCTOPUS | ![]() |
LAMP1_HUMAN | |
OCTOPUS | ![]() |
SPOCTOPUS | ![]() |
A4_HUMAN | |
OCTOPUS | ![]() |
SPOCTOPUS | ![]() |
RET4_HUMAN | |
OCTOPUS | ![]() |
SPOCTOPUS | ![]() |
BCKDHA | |
OCTOPUS | ![]() |
SPOCTOPUS | ![]() |
SignalP
- SignalP was established by Nielsen et al. in 1997<ref>Nielsen et al., "Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites", Protein Engineering, 10:1-6, 1997</ref>
- SignalP is neural network based. It identifies signal peptides and cleavage sites.
TargetP
- TargetP was developed by Emanuelsson et al. in 2000 <ref> Emanuelsson et al., "Predicting subcellular localization of proteins based on their N-terminal amino acid sequence", J. Mol. Biol., 300: 1005-1016, 2000</ref>
- TargetP predicts the subcellular location of eukaryotic proteins. It additionally predicts cleavage sites.
- This method is neural network based. The prediction is based on the N-terminal presequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP)
- Required input information: Sequence(s) in FASTA format, organism group
The TargetP prediction results can be seen in the following table:
ODBA_HUMAN (BCKDHA) is predicted to be located in the mitochondrion, which agrees with UniProt.
4. Prediction of GO terms
GOPET
GOPET results for BCKDHA:
GOid | Aspect | Confidence | GOTerm |
---|---|---|---|
GO:0003824 | F | 97% | catalytic activity |
GO:0016491 | F | 96% | oxidoreductase activity |
GO:0016624 | F | 95% | oxidoreductase activity acting on the aldehyde or oxo group of donors disulfide as acceptor |
GO:0003863 | F | 90% | 3-methyl-2-oxobutanoate dehydrogenase 2-methylpropanoyl-transferring activity |
GO:0004739 | F | 89% | pyruvate dehydrogenase acetyl-transferring activity |
GO:0004738 | F | 78% | pyruvate dehydrogenase activity |
GO:0003826 | F | 77% | alpha-ketoacid dehydrogenase activity |
GO:0047101 | F | 75% | 2-oxoisovalerate dehydrogenase acylating activity |
GO:0008677 | F | 65% | 2-dehydropantoate 2-reductase activity |
GO:0019152 | F | 63% | acetoin dehydrogenase activity |
GO:0030955 | F | 63% | potassium ion binding |
GO:0016616 | F | 62% | oxidoreductase activity acting on the CH-OH group of donors NAD or NADP as acceptor |
GO:0046872 | F | 62% | metal ion binding |
GOPET results for A4_HUMAN:
GOid | Aspect | Confidence | GOTerm |
---|---|---|---|
GO:0004866 | F | 87% | endopeptidase inhibitor activity |
GO:0004867 | F | 86% | serine-type endopeptidase inhibitor activity |
GO:0030568 | F | 83% | plasmin inhibitor activity |
GO:0030304 | F | 83% | trypsin inhibitor activity |
GO:0030414 | F | 82% | peptidase inhibitor activity |
GO:0005488 | F | 79% | binding |
GO:0005515 | F | 74% | protein binding |
GO:0046872 | F | 73% | metal ion binding |
GO:0003677 | F | 71% | DNA binding |
GO:0008201 | F | 70% | heparin binding |
GO:0008270 | F | 69% | zinc ion binding |
GO:0005507 | F | 69% | copper ion binding |
GO:0005506 | F | 67% | iron ion binding |
GOPET results for BACR_HALSA:
GOid | Aspect | Confidence | GOterm |
---|---|---|---|
GO:0005216 | F | 77% | ion channel activity |
GO:0008020 | F | 75% | G-protein coupled photoreceptor activity |
GO:0015078 | F | 60% | hydrogen ion transmembrane transporter activity |
GOPET results for INSL5_HUMAN:
GOid | Aspect | Confidence | GOterm |
---|---|---|---|
GO:0005179 | F | 80% | hormone activity |
GOPET results for LAMP1_HUMAN:
GOid | Aspect | Confidence | GOterm |
---|---|---|---|
GO:0004812 | F | 60% | aminoacyl-tRNA ligase activity |
GO:0005524 | F | 60% | ATP binding |
GOPET results for RET4_HUMAN:
GOid | Aspect | Confidence | GOterm |
---|---|---|---|
GO:0005488 | F | 90% | binding |
GO:0005501 | F | 81% | retinoid binding |
GO:0008289 | F | 80% | lipid binding |
GO:0019841 | F | 78% | retinol binding |
GO:0005215 | F | 78% | transporter activity |
GO:0016918 | F | 78% | retinal binding |
GO:0005319 | F | 69% | lipid transporter activity |
GO:0008035 | F | 60% | high-density lipoprotein particle binding |
Pfam
- The Pfam database is described in Finn et al. (2008) <ref>Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, Bateman A (2008). "The Pfam protein families database". Nucleic Acids Res 36 (Database issue): D281–8</ref>
Query | Cellular Component | Molecular Function | Biological Process |
---|---|---|---|
BCKDHA | | GO:0016624 (oxidoreductase activity, acting on the aldehyde or oxo group of donors, disulfide as acceptor) | GO:0008152 (metabolic process) |
A4_HUMAN | GO:0016021 (integral to membrane) | GO:0005488 (binding) | |
BACR_HALSA | GO:0016020 (membrane) | GO:0005216 (ion channel activity) | GO:0006811 (ion transport) |
INSL5_HUMAN | GO:0005576 (extracellular region) | GO:0005179 (hormone activity) | |
LAMP1_HUMAN | GO:0016020 (membrane) | | |
RET4_HUMAN | | GO:0005488 (binding) | |
ProtFun 2.2
References
<references />