Secondary Structure Prediction BCKDHA
- 1. Secondary structure prediction
- 2. Prediction of disordered regions
- 3. Prediction of transmembrane alpha-helices and signal peptides
- 4. Prediction of GO terms
- References
1. Secondary structure prediction
author: David Jones (University College London)
PSIPRED uses a neural network with a single hidden layer and a feed-forward, back-propagation architecture. For online prediction on the server, it is enough to enter an amino acid sequence. Evaluated with a very stringent cross-validation method, PSIPRED reaches an average Q3 score of 80.7%.
The prediction is split into three steps. In the first step, sequence profiles are generated; the position-specific scoring matrix from PSI-BLAST is used as input for the neural network. In the second step, the secondary structure is predicted. In the last step, the output of the secondary structure prediction is filtered.
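The Q3 score mentioned above is the percentage of residues whose three-state label (helix, strand or coil) is predicted correctly. A minimal sketch of that calculation follows; the function name and the two toy strings are assumptions made for illustration and are not PSIPRED output.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the three-state labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Hypothetical 10-residue example (H = helix, E = strand, C = coil).
# The strings differ only at the sixth position, so 9 of 10 residues agree.
predicted = "CHHHHEEECC"
observed = "CHHHHHEECC"
print(q3_score(predicted, observed))
```

Any consistent three-letter alphabet works here, as long as both strings use the same one.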
What is predicted
PSIPRED predicts the secondary structure
There are three different options:
- Mask low complexity regions
- Mask transmembrane helices
- Mask coiled-coil regions
It is also possible to get an email with the results of PSIPRED.
PSIPRED requires the output of PSI-BLAST (Position Specific Iterated - BLAST) as input data.
legend: A=alpha helix, E=beta strand, C=coil
PSIPRED predicted 23 coil segments, 16 alpha helices and 6 beta strands.
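Counts like these can be derived from a per-residue prediction string by counting contiguous runs of each state. The sketch below assumes the letters from the legend above (A, E, C) and uses a made-up toy string, not the actual BCKDHA prediction.

```python
from collections import Counter
from itertools import groupby


def count_segments(prediction: str) -> Counter:
    """Count contiguous runs of each state in a per-residue prediction string."""
    # groupby yields one (state, run) pair per uninterrupted stretch of a letter.
    return Counter(state for state, _run in groupby(prediction))


# Hypothetical toy prediction: runs are CCC, AAAAA, CC, EEEE, CC, AAA, C.
toy = "CCCAAAAACCEEEECCAAAC"
counts = count_segments(toy)
print(counts["C"], counts["A"], counts["E"])
```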
author: Cole C, Barber JD & Barton GJ (Bioinformatics and Computational Biology Research, University of Dundee)
Jpred uses a neural network called Jnet to predict the secondary structure of a protein sequence or of a multiple alignment of protein sequences. The prediction accuracy for secondary structure lies above 81%. Additionally, Jpred makes predictions of solvent accessibility and coiled-coil regions. It predicts whether a residue is buried or exposed using several cut-off values.
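To illustrate what a buried/exposed call at several cut-off values means, the sketch below labels a residue from a known relative solvent accessibility value. The cut-offs of 25%, 5% and 0%, the "at or below" rule and the function name are assumptions for illustration; Jnet itself predicts these labels from sequence information rather than from a measured value.

```python
def burial_calls(relative_accessibility: float, cutoffs=(25.0, 5.0, 0.0)) -> dict:
    """Label a residue buried ('B') or exposed ('-') at each cut-off (percent)."""
    # Assumed rule: a residue counts as buried when its relative
    # accessibility is at or below the cut-off.
    return {
        cutoff: ("B" if relative_accessibility <= cutoff else "-")
        for cutoff in cutoffs
    }


# A residue with 12% relative accessibility is buried at the 25% cut-off
# but exposed at the stricter 5% and 0% cut-offs.
print(burial_calls(12.0))
```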
What is predicted
Jpred3 predicts secondary structure from the sequence or the multiple alignment.
It also predicts the relative solvent accessibility
Jpred3 has two different modes:
- single sequence
- multiple alignment
Jpred3 needs a protein sequence or multiple alignment of protein sequences as input.
Multiple alignment: The first sequence has to be the target sequence, since the alignment is modified so that the first sequence does not contain any gaps. The alignment has to be in MSF or BLC format.
Single sequence: A multiple alignment is constructed for the sequence with PSI-BLAST (3 iterations).
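One way to picture the modification that leaves the first sequence without gaps is to drop every alignment column in which the target has a gap. The following is a sketch under that assumption, with a hypothetical function name and toy alignment; it is not Jpred's own code.

```python
def ungap_target(alignment: list[str], gap_chars: str = "-.") -> list[str]:
    """Drop every column in which the first (target) sequence has a gap."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Keep only the column indices where the target carries a residue.
    keep = [i for i, ch in enumerate(alignment[0]) if ch not in gap_chars]
    return ["".join(row[i] for i in keep) for row in alignment]


# Hypothetical three-sequence alignment; the target is the first row.
toy = ["MA-VK-L", "MAGVKEL", "M-GV-EL"]
print(ungap_target(toy))
```

Gaps in the other sequences are kept, so every row still has the length of the ungapped target.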
|1u5b (e-value:0)||1umd (e-value:6e-58)||1qs0 (e-value:1e-57)||3dv0 (e-value:2e-51)|
|position||structural element||position||structural element||position||structural element||position||structural element|
|35-60||alpha helix||36–38||alpha helix|
|61-64||beta strand||44–47||alpha helix||44–69||alpha helix|
|74-83||alpha helix||74–99||alpha helix|
|91-93||alpha helix||89-92||beta strand|
|99-124||alpha helix||98-104||alpha helix||102–104||beta strand||98–100||beta strand|
|108-116||alpha helix||110–112||turn||106–111||alpha helix|
|122-125||turn||113–122||alpha helix||116–124||alpha helix|
|127-129||beta strand||127–130||beta strand||127–130||alpha helix|
|138-146||alpha helix||144-147||beta strand||136–141||alpha helix|
|152-154||beta strand||150-162||alpha helix||146–154||alpha helix||147 – 161||alpha helix|
|171-179||alpha helix||169-173||beta strand||173–175||alpha helix||168–173||beta strand|
|176-179||alpha helix||175–178||alpha helix|
|185-188||turn||181-193||alpha helix||182–185||beta strand||180–191||alpha helix|
|198-201||turn||196-204||beta strand||186–200||alpha helix||196–202||beta strand|
|207–212||beta strand||204–206||beta strand|
|212-226||alpha helix||222-227||alpha helix||214–217||alpha helix||221–226||alpha helix|
|232-237||beta strand||232-236||beta strand||219–231||alpha helix||231–235||beta strand|
|240-242||alpha helix||240-255||alpha helix||235–241||beta strand||239–254||alpha helix|
|244-255||alpha helix||243–245||beta strand|
|260-266||beta strand||261-267||beta strand||262–266||alpha helix||260–265||beta strand|
|280-282||beta strand||279 – 294||alpha helix|
|285-287||alpha helix||285–292||alpha helix|
|294-299||beta strand||296-305||alpha helix||300–305||beta strand||296–306||alpha helix|
|303-320||beta strand||306-308||turn||318–320||alpha helix|
|324-329||beta strand||312-334||alpha helix||326–329||alpha helix||312–334||alpha helix|
|341-345||alpha helix||335–345||alpha helix||341–346||alpha helix|
|360-368||alpha helix||354-366||alpha helix||351–373||alpha helix||354–366||alpha helix|
|376-399||alpha helix||378–380||beta strand|
|405-408||alpha helix||399–406||alpha helix|
Letter code for the secondary structure elements:
- H (blue): alpha helix
- 3 (yellow): residue in isolated beta-bridge
- T (red): hydrogen bonded turn
- S (green): bend
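To compare structure-derived letter codes like these with a three-state prediction, the codes are usually collapsed to helix, strand and coil. The sketch below assumes DSSP-style one-letter codes and one common reduction (H, G, I to helix; E, B to strand; everything else to coil). Other reductions exist, and the mapping is an assumption rather than part of the analysis above.

```python
# Assumed reduction from DSSP-style codes to three states.
DSSP_TO_THREE_STATE = {
    "H": "H", "G": "H", "I": "H",  # helices
    "E": "E", "B": "E",            # extended strand and isolated beta-bridge
}


def reduce_dssp(dssp: str) -> str:
    """Collapse a DSSP-style string to H/E/C; unknown codes become coil."""
    return "".join(DSSP_TO_THREE_STATE.get(code, "C") for code in dssp)


# Turns (T), bends (S), blanks and dashes all end up as coil.
print(reduce_dssp("HHHGTTSEEB -"))
```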
2. Prediction of disordered regions
DISOPRED predicts disordered regions in BCKDHA at the beginning and at the end of the protein.
POODLE-S (Missing residues)
POODLE-S (which predicts short disordered regions) with the option "Missing residues" predicted disordered regions at positions 1-56, 341-345 and 420-423. This is also shown in the plot above.
Detailed sequence with disordered region probability: File:PoodleSMissingResiduesOut.pdf
The probability ranges from 0 to 1, where 0 means that a residue is not part of a disordered region and 1 means that it is.
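Per-residue probabilities of this kind are typically turned into regions by applying a threshold and merging consecutive residues. A minimal sketch follows, assuming a threshold of 0.5 and made-up probability values; the threshold POODLE-S actually uses is not stated here.

```python
def disordered_regions(probabilities: list[float], threshold: float = 0.5):
    """Return 1-based inclusive (start, end) ranges with probability >= threshold."""
    regions = []
    start = None
    for position, p in enumerate(probabilities, start=1):
        if p >= threshold and start is None:
            start = position                      # a new region opens
        elif p < threshold and start is not None:
            regions.append((start, position - 1)) # the region closed one residue earlier
            start = None
    if start is not None:
        regions.append((start, len(probabilities)))  # region runs to the C-terminus
    return regions


# Hypothetical probabilities for a 10-residue toy sequence.
toy = [0.9, 0.8, 0.6, 0.2, 0.1, 0.1, 0.7, 0.4, 0.55, 0.95]
print(disordered_regions(toy))
```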
POODLE-S (High B-Factor residues)
POODLE-S (which predicts short disordered regions) with the option "High B-Factor residues" predicted disordered regions at positions 6-9, 15-57, 93, 95-96, 340-354 and 379-402. This is also shown in the plot above.
Detailed sequence with disordered region probability: File:PoodleSFactorBOut.pdf
Regions that might be disordered, but for which POODLE is not certain, are bordered by blue squares; disordered regions are bordered by red squares.
Detailed sequence with disordered region:
5=perhaps disordered regions
Prediction type: long disorder
Detailed sequence with disordered region probability: File:LongSeqOut.pdf
Prediction type: short disorder
Detailed sequence with disordered region probability: File:ShortSeqOut.pdf
Prediction type: structured regions
With the option "structured regions" there was no prediction of disordered regions.
Only the message "Unkown globular domains: 1-445" appeared.
3. Prediction of transmembrane alpha-helices and signal peptides
In the following section different tools for predicting transmembrane helices and signal peptides are tested. As the BCKDHA protein isn't a transmembrane protein, additional proteins were used for the transmembrane and signal peptide analysis:
|BACR_HALSA||Cell membrane||yes||ion transport||P02945|
|LAMP1_HUMAN||Cell membrane, Lysosome membrane, Endosome membrane||yes||Presents carbohydrate ligands to selectins||P11279|
|A4_HUMAN||Cell membrane||yes||Protease Inhibitor||P05067|
Transmembrane topology and signal peptides are features that are likely to be conserved during evolution.
- TMHMM was developed by Sonnhammer, von Heijne and Krogh in 1998 <ref>E.L. Sonnhammer, G. von Heijne and A. Krogh, "A hidden Markov model for predicting transmembrane helices in protein sequences", Proc Int Conf Intell Syst Mol Biol. (1998)</ref>
- TMHMM predicts transmembrane helices in proteins.
- TMHMM is a membrane topology prediction method based on a hidden Markov model.
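To give an idea of how a hidden Markov model assigns membrane and non-membrane states, here is a toy two-state Viterbi decoder. Everything in it is an assumption made for illustration: the two states, the hydrophobic residue set and all probabilities are invented. TMHMM's real model has many more states and trained parameters.

```python
import math

HYDROPHOBIC = set("AILMFVWC")   # assumed residue class for the toy model
STATES = (".", "M")             # "." = non-membrane, "M" = membrane

# Invented parameters, chosen only to make the example readable.
START = {".": 0.9, "M": 0.1}
TRANS = {".": {".": 0.9, "M": 0.1}, "M": {".": 0.1, "M": 0.9}}
EMIT = {".": {"h": 0.3, "p": 0.7}, "M": {"h": 0.8, "p": 0.2}}


def viterbi(sequence: str) -> str:
    """Return the most probable state path for the toy model (log space)."""
    obs = ["h" if aa in HYDROPHOBIC else "p" for aa in sequence]
    if not obs:
        return ""
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    backpointers = []
    for symbol in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for ending in state s at this position.
            best_prev = max(STATES, key=lambda r: score[r] + math.log(TRANS[r][s]))
            new_score[s] = (score[best_prev] + math.log(TRANS[best_prev][s])
                            + math.log(EMIT[s][symbol]))
            pointers[s] = best_prev
        backpointers.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


# Toy sequence: polar flanks around a stretch of ten leucines.
print(viterbi("DDDDDD" + "LLLLLLLLLL" + "DDDDDD"))
```

With these invented parameters the leucine stretch is decoded as membrane and the aspartate flanks as non-membrane.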
Phobius and Polyphobius
- Phobius was developed by Käll et al. <ref>Käll et al., "A Combined Transmembrane Topology and Signal Peptide Prediction Method", Journal of Molecular Biology, 338(5):1027-1036, 2004</ref>
- Phobius provides a combined prediction of transmembrane regions and signal peptides.
- Required input information: only sequence in FASTA-Format (20 amino acids and B, Z, X are recognized)
- As transmembrane topology and signal peptides are likely to be conserved during evolution, Polyphobius was established <ref>Käll et al., "An HMM posterior decoder for sequence feature prediction that includes homology information", Bioinformatics, 21 (Suppl 1):i251-i257, 2005</ref>, which includes information from homologous sequences to the query.
- Required input: two options. Either a query sequence in FASTA format, which is then BLASTed against uniprot_trembl, or an uploaded alignment in FASTA format that provides information about homologs.
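Before submitting a sequence, it can help to check that it is valid FASTA and contains only the recognized letters (the 20 amino acids plus B, Z and X). The sketch below is a minimal local check with hypothetical function names; it does not call the Phobius server.

```python
# The 20 standard amino acids plus the ambiguity codes B, Z and X.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | set("BZX")


def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {header: sequence}; sequences are upper-cased."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[header] += line.upper()
    return records


def invalid_residues(sequence: str) -> set[str]:
    """Return the set of characters that are not recognized residue letters."""
    return set(sequence) - ALLOWED


# Hypothetical toy record, not the BCKDHA sequence.
records = parse_fasta(">toy\nMAVKL\nXBZ\n")
print(records, invalid_residues(records["toy"]))
```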
For the BCKDHA protein, Phobius predicted a signal peptide with about 90% probability at the beginning of the sequence. The predicted signal peptide is 34 amino acids long. This roughly matches the information given on UniProt, which states that BCKDHA contains a 45 amino acid long signal peptide for transfer into the mitochondrion. The rest of the sequence is predicted to be non-cytoplasmic. No part of the protein is predicted to be membrane-spanning. This is consistent with UniProt, according to which BCKDHA is located in the mitochondrial matrix.
Considering the information given on UniProt, Polyphobius performed worse than Phobius on the BCKDHA protein sequence. It predicted no signal sequence at the beginning of the protein. There is a low probability that the amino acids at positions 1-45 form a signal sequence, but overall the whole sequence is predicted to be non-cytoplasmic.
OCTOPUS and SPOCTOPUS
- OCTOPUS was developed by Viklund and Elofsson in 2008 <ref>Håkan Viklund and Arne Elofsson, "Improving topology prediction by two-track ANN-based preference scores and an extended topological grammar", Bioinformatics (2008)</ref>
- OCTOPUS (obtainer of correct topologies for uncharacterized sequences) uses a combination of hidden Markov models and artificial neural networks.
- It creates a sequence profile by running a BLAST search to obtain homologous sequences. The profile is used as input for a neural network that predicts the probability for each residue to be located in a transmembrane (M), interface (I), close loop (L), or globular loop (G) environment, as well as the preference to be inside (i) or outside (o) of the membrane. A hidden Markov model is used to calculate the most likely protein topology.
- Required input: Protein Sequence in FASTA-Format
- SPOCTOPUS (Viklund et al., 2008<ref>Viklund et al., "A combined predictor of signal peptides and membrane protein topology", Bioinformatics (2008)</ref>) is an extension of OCTOPUS which also predicts signal peptides. A neural network is used to predict a signal peptide preference score. The signal peptide's location is determined by a hidden Markov model. The output contains the information retrieved by OCTOPUS as well as the probability that a residue is N-terminal of a signal peptide (n) or in a signal peptide (S).
- Required input information: Protein sequence in FASTA-Format
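Given a per-residue topology string that uses the labels described above, the positions of transmembrane segments or of a signal peptide can be read off as contiguous runs. The sketch below assumes such a one-letter-per-residue string and uses made-up examples; the exact output layout of the servers may differ.

```python
from itertools import groupby


def labelled_segments(topology: str, label: str = "M") -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of runs of the given label."""
    segments = []
    position = 1
    for state, run in groupby(topology):
        length = len(list(run))
        if state == label:
            segments.append((position, position + length - 1))
        position += length
    return segments


# Hypothetical topology strings (i = inside, o = outside, M = membrane,
# n = N-terminal of a signal peptide, S = signal peptide).
print(labelled_segments("iiiiMMMMMooooMMMMii"))
print(labelled_segments("nnSSSSooo", label="S"))
```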
- SignalP was established by Nielsen et al. in 1997<ref>Nielsen et al., "Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites", Protein Engineering, 10:1-6, 1997</ref>
- SignalP is neural network based. It identifies signal peptides and cleavage sites.
- TargetP was developed by Emanuelsson et al. in 2000 <ref>Emanuelsson et al., "Predicting subcellular localization of proteins based on their N-terminal amino acid sequence", J. Mol. Biol., 300: 1005-1016, 2000</ref>
- TargetP predicts the subcellular location of eukaryotic proteins. It additionally predicts cleavage sites.
- This method is neural network based. The prediction is based on the N-terminal presequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP)
- Required input information: Sequence(s) in FASTA format, organism group
ODBA_HUMAN (BCKDHA) is predicted to be located in the mitochondrion, which agrees with the annotation on UniProt.
4. Prediction of GO terms
GOPET results for BCKDHA:
|GO:0016624||F||95%||oxidoreductase activity, acting on the aldehyde or oxo group of donors, disulfide as acceptor|
|GO:0003863||F||90%||3-methyl-2-oxobutanoate dehydrogenase 2-methylpropanoyl-transferring activity|
|GO:0004739||F||89%||pyruvate dehydrogenase acetyl-transferring activity|
|GO:0004738||F||78%||pyruvate dehydrogenase activity|
|GO:0003826||F||77%||alpha-ketoacid dehydrogenase activity|
|GO:0047101||F||75%||2-oxoisovalerate dehydrogenase acylating activity|
|GO:0008677||F||65%||2-dehydropantoate 2-reductase activity|
|GO:0019152||F||63%||acetoin dehydrogenase activity|
|GO:0030955||F||63%||potassium ion binding|
|GO:0016616||F||62%||oxidoreductase activity acting on the CH-OH group of donors NAD or NADP as acceptor|
|GO:0046872||F||62%||metal ion binding|
GOPET results for A4_HUMAN:
|GO:0004866||F||87%||endopeptidase inhibitor activity|
|GO:0004867||F||86%||serine-type endopeptidase inhibitor activity|
|GO:0030568||F||83%||plasmin inhibitor activity|
|GO:0030304||F||83%||trypsin inhibitor activity|
|GO:0030414||F||82%||peptidase inhibitor activity|
|GO:0046872||F||73%||metal ion binding|
|GO:0008270||F||69%||zinc ion binding|
|GO:0005507||F||69%||copper ion binding|
|GO:0005506||F||67%||iron ion binding|
GOPET results for BACR_HALSA:
|GO:0005216||F||77%||ion channel activity|
|GO:0008020||F||75%||G-protein coupled photoreceptor activity|
|GO:0015078||F||60%||hydrogen ion transmembrane transporter activity|
GOPET results for INSL5_HUMAN:
GOPET results for LAMP1_HUMAN:
|GO:0004812||F||60%||aminoacyl-tRNA ligase activity|
GOPET results for RET4_HUMAN:
|GO:0005319||F||69%||lipid transporter activity|
|GO:0008035||F||60%||high-density lipoprotein particle binding|
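The GOPET rows above follow a fixed layout (GO identifier, aspect, confidence, term name), so they can be parsed and filtered by confidence. The sketch below assumes the pipe-delimited row format shown on this page; the function name and the 80% threshold are arbitrary choices for illustration.

```python
def parse_gopet_row(row: str) -> dict:
    """Split one '|GO id||aspect||confidence||term|' row into named fields."""
    go_id, aspect, confidence, term = row.strip().strip("|").split("||")
    return {
        "go_id": go_id,
        "aspect": aspect,
        "confidence": int(confidence.rstrip("%")),
        "term": term,
    }


# Three rows copied from the BCKDHA table above.
rows = [
    "|GO:0004739||F||89%||pyruvate dehydrogenase acetyl-transferring activity|",
    "|GO:0030955||F||63%||potassium ion binding|",
    "|GO:0046872||F||62%||metal ion binding|",
]
parsed = [parse_gopet_row(r) for r in rows]

# Keep only predictions with at least 80% confidence (arbitrary threshold).
confident = [p["go_id"] for p in parsed if p["confidence"] >= 80]
print(confident)
```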
|Query||Cellular Component||Molecular function||Biological Process|
|BCKDHA||-||GO:0016624 (oxidoreductase activity, acting on the aldehyde or oxo group of donors, disulfide as acceptor)||GO:0008152 (metabolic process)|
|A4_HUMAN||GO:0016021 (integral to membrane)||GO:0005488 (binding)||-|
|BACR_HALSA||GO:0016020 (membrane)||GO:0005216 (ion channel activity)||GO:0006811 (ion transport)|
|INSL5_HUMAN||GO:0005576 (extracellular region)||GO:0005179 (hormone activity)||-|