Sequence-based predictions
1. Secondary structure prediction
PSIPRED
PSIPRED HFORMAT (PSIPRED V3.0)

Conf: 999851589999999877513567886245556456636899750389988756755687
Pred: CCCCCHHHHHHHHHHHHHHHCCCCCCCEEEEEEEEEEECCCCCCCEEEEEEEECCEEEEE
  AA: MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
              10        20        30        40        50        60

Conf: 318998225536664688990669998865311211002358577441156788603899
Pred: ECCCCCCEEECCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCCCCEEEE
  AA: YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV
              70        80        90       100       110       120

Conf: 987799319835459889765910588728988756689786135787788899999876
Pred: EEEEEEECCCEEEEEEEEEECCCEEEEECCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHH
  AA: ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR
             130       140       150       160       170       180

Conf: 310271499889888616322000378810000468999601699981450765189996
Pred: HHHCCCHHHHHHHHHHCCCCCCCCCCCCCEEEECCCCCCCEEEEEEEEEECCCCEEEEEE
  AA: AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL
             190       200       210       220       230       240

Conf: 288106667520025355899875899999965999872169986699998826885259
Pred: ECCEECCCCCCCCCCCEECCCCCEEEEEEEEECCCCCCCEEEEEECCCCCCCEEEEEECC
  AA: KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS
             250       260       270       280       290       300

Conf: 999711124320001367777622367764115889887620212359
Pred: CCCCCEEEEEEEEEEEEEEEEEEEEEEEEEECCCCCCCCCCEEECCCC
  AA: PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
             310       320       330       340
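For further analysis it is convenient to join the per-block rows of the PSIPRED output into one string per field. The following is a minimal sketch, not part of PSIPRED itself; the function name `parse_psipred_horiz` is hypothetical, and the example input is only the first block of the output above.

```python
import re

def parse_psipred_horiz(text):
    """Concatenate the Conf/Pred/AA rows of PSIPRED horizontal output.

    Hypothetical helper for illustration; it only relies on the
    'Conf:', 'Pred:' and 'AA:' labels seen in the output above.
    """
    fields = {"Conf": [], "Pred": [], "AA": []}
    for line in text.splitlines():
        # search() instead of match() so a label that follows other
        # text on the same line (e.g. the header) is still found
        m = re.search(r"\b(Conf|Pred|AA):\s*(\S+)", line)
        if m:
            fields[m.group(1)].append(m.group(2))
    return {key: "".join(parts) for key, parts in fields.items()}

example = """\
Conf: 999851589999999877513567886245556456636899750389988756755687
Pred: CCCCCHHHHHHHHHHHHHHHCCCCCCCEEEEEEEEEEECCCCCCCEEEEEEEECCEEEEE
  AA: MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
"""

result = parse_psipred_horiz(example)
# Number of residues predicted as helix (H), strand (E) and coil (C)
counts = {state: result["Pred"].count(state) for state in "HEC"}
print(counts)
```

The same joined `Pred` string can then be compared position by position with other predictors or with an observed assignment.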
Jpred3
Seq: MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDD
SS:  ------HHHHHHHHHHHHH---------EEEEEEEEE-------EEEEEEEEE--
Seq: QLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHN
SS:  EEEEEE-----EEEE----------HHHHHHHHHHHHHHHHHHHHHHHHHH----
Seq: HSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPT
SS:  -----EEEEEEEEEE------EEEEEEE-----EEEEEE----EEE-------HH
Seq: KLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSV
SS:  HHHHH--HHHHHHHHHH------HHHHHHHHHH-H-------EEEEE--------
Seq: TTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPG
SS:  -EEEEEEE------EEEEEEE----------EE----------EEEEEEEEE---
Seq: EEQRYTCQVEHPGLDQPLIVIWEPSPSGTLVIGVISGIAVFVVILFIGILFIILR
SS:  ---EEEEEEEE------EEEEE---------HHHHHHHHHHHHHHHHHHHHHHHH
Seq: KRQGSRGAMGHYVLAERE
SS:  HH----------------
Comparison with DSSP
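A prediction is commonly compared with DSSP by reducing the eight DSSP states to three and computing the fraction of matching positions (Q3). The sketch below assumes one widely used reduction (H, G, I to helix; E, B to strand; everything else to coil); other conventions exist. The function names, the `X` placeholder for residues missing from the structure, and the toy strings are assumptions for illustration and are not results for HFE.

```python
# One common 8-to-3 state reduction for DSSP codes (an assumption here;
# some studies map G, I or B differently).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}

def reduce_dssp(dssp, missing="X"):
    """Map DSSP codes to H/E/C, keeping the placeholder for missing residues."""
    return "".join(
        code if code == missing else DSSP_TO_3.get(code, "C") for code in dssp
    )

def q3(predicted, observed, missing="X"):
    """Fraction of compared positions where prediction equals observation."""
    if len(predicted) != len(observed):
        raise ValueError("strings must be aligned to the same length")
    pairs = [(p, o) for p, o in zip(predicted, observed) if o != missing]
    if not pairs:
        raise ValueError("no positions to compare")
    return sum(p == o for p, o in pairs) / len(pairs)

# Toy data: Jpred3 writes coil as '-', so it is normalised to 'C' first.
prediction = "--HHHHH-EE".replace("-", "C")
dssp_raw = "XXHHHGT EE"          # 'X' = residue not resolved in the structure
observed = reduce_dssp(dssp_raw)  # -> "XXHHHHCCEE"

score = q3(prediction, observed)
print(f"Q3 = {score:.3f}")
```

Positions without a DSSP assignment are skipped rather than counted as coil, so that unresolved termini do not inflate or deflate the score.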
2. Prediction of disordered regions
DISOPRED
AA: Target sequence
Pred: Residue disorder prediction; (.) = ordered residue, (*) = disordered residue

conf: 997600000000000000000000000000000000000000000000000000000000
pred: **..........................................................
  AA: MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
              10        20        30        40        50        60

conf: 000120011000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV
              70        80        90       100       110       120

conf: 000000000000000000000000000000000000000000000000000000000000
pred: ............................................................
  AA: ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR
             130       140       150       160       170       180

conf: 000000000000000000000002456777878777766530000000000000000000
pred: ..............................*.*...........................
  AA: AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL
             190       200       210       220       230       240

conf: 000035555545543000000000000000000000000000000000000001354667
pred: ............................................................
  AA: KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS
             250       260       270       280       290       300

conf: 777766643300000000000000047889999999999999898999
pred: ...........................*********************
  AA: PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
             310       320       330       340

DISOPRED predictions for a false positive rate threshold of: 2%
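To turn such a per-residue prediction string into a list of disordered regions, the runs of the disorder marker can be collected with their 1-based positions. This is a minimal sketch; the function name `disordered_segments` is hypothetical, and it assumes the rows of a prediction have already been joined into a single string.

```python
from itertools import groupby

def disordered_segments(pred, marker="*"):
    """Return 1-based inclusive (start, end) pairs for each run of `marker`."""
    segments = []
    position = 1
    for symbol, run in groupby(pred):
        length = len(list(run))
        if symbol == marker:
            segments.append((position, position + length - 1))
        position += length
    return segments

# Toy string in the DISOPRED style ('.' = ordered, '*' = disordered)
print(disordered_segments("**..*.***"))
# The same helper works for annotation rows that use '-' for ordered residues
print(disordered_segments("-**-"))
```

Because only the marker is inspected, the helper does not depend on which character a tool uses for ordered residues.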
POODLE
POODLE stands for Prediction Of Order and Disorder by machine LEarning.
POODLE-I
POODLE-S
POODLE-S (using missing residues) predicted 6 short disordered regions within the protein sequence (* = predicted disordered, - = ordered).
MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
-**************---------------------------------------------
YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV
-------**********---******----------------------------------
ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR
------------------------------------------------------------
AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL
---------------------***************------------------------
KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS
----*********----------------------------------------*******
PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
*--------------------------------********-------
POODLE-S (using high B-factor residues) predicted 2 short disordered regions within the protein sequence (* = predicted disordered, - = ordered).
MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
-*-***------------------------------------------------------
YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV
------------------------------------------------------------
ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR
------------------------------------------------******------
AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL
------------------------------------------------------------
KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS
------------------------------------------------------------
PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
------------------------------------------------
POODLE-L
POODLE-L predicted a disordered region from 296 to the end (* = predicted disordered, - = ordered).
MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
------------------------------------------------------------
YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV
------------------------------------------------------------
ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR
------------------------------------------------------------
AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL
------------------------------------------------------------
KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS
------------------------------------------------------******
PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
************************************************
3. Prediction of transmembrane alpha-helices and signal peptides
TMHMM
Phobius and PolyPhobius
OCTOPUS and SPOCTOPUS
SignalP
TargetP
4. Prediction of GO terms
General
HFE is annotated with the following 27 GO terms <ref>http://www.ebi.ac.uk/QuickGO/GProtein?ac=Q30201</ref>:
GOID | GO Term | Aspect |
---|---|---|
GO:0002474 | antigen processing and presentation of peptide antigen via MHC class I | Process |
GO:0005515 | protein binding | Function |
GO:0005737 | cytoplasm | Component |
GO:0005769 | early endosome | Component |
GO:0005886 | plasma membrane | Component |
GO:0005887 | integral to plasma membrane | Component |
GO:0006461 | protein complex assembly | Process |
GO:0006810 | transport | Process |
GO:0006811 | ion transport | Process |
GO:0006826 | iron ion transport | Process |
GO:0006879 | cellular iron ion homeostasis | Process |
GO:0006898 | receptor-mediated endocytosis | Process |
GO:0006955 | immune response | Process |
GO:0007565 | female pregnancy | Process |
GO:0010106 | cellular response to iron ion starvation | Process |
GO:0016020 | membrane | Component |
GO:0016021 | integral to membrane | Component |
GO:0019882 | antigen processing and presentation | Process |
GO:0031410 | cytoplasmic vesicle | Component |
GO:0042446 | hormone biosynthetic process | Process |
GO:0042612 | MHC class I protein complex | Component |
GO:0045177 | apical part of cell | Component |
GO:0045178 | basal part of cell | Component |
GO:0048471 | perinuclear region of cytoplasm | Component |
GO:0055037 | recycling endosome | Component |
GO:0055072 | iron ion homeostasis | Process |
GO:0060586 | multicellular organismal iron ion homeostasis | Process |
GOPET
GOPET predicted 2 GO terms, neither of which overlaps with the annotation above.
GOID | Aspect | Confidence | GO Term |
---|---|---|---|
GO:0004872 | Molecular Function | 91% | receptor activity |
GO:0030106 | Molecular Function | 88% | MHC class I receptor activity |
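The overlap between predicted and annotated terms can be checked as a set intersection over GO IDs. The sketch below uses the IDs from the two tables above; the variable names are chosen for illustration only.

```python
# GO IDs annotated for HFE (from the annotation table above)
annotated = {
    "GO:0002474", "GO:0005515", "GO:0005737", "GO:0005769", "GO:0005886",
    "GO:0005887", "GO:0006461", "GO:0006810", "GO:0006811", "GO:0006826",
    "GO:0006879", "GO:0006898", "GO:0006955", "GO:0007565", "GO:0010106",
    "GO:0016020", "GO:0016021", "GO:0019882", "GO:0031410", "GO:0042446",
    "GO:0042612", "GO:0045177", "GO:0045178", "GO:0048471", "GO:0055037",
    "GO:0055072", "GO:0060586",
}

# GO IDs predicted by GOPET (from the GOPET table above)
predicted = {"GO:0004872", "GO:0030106"}

overlap = annotated & predicted
print(f"{len(overlap)} of {len(predicted)} predicted terms are annotated")
```

This compares exact IDs only. It does not follow parent or child relations in the ontology, so a predicted term that is closely related to an annotated one would still count as a miss.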
Pfam
ProtFun 2.2
Functional category Prob Odds
Amino_acid_biosynthesis 0.011 0.484
Biosynthesis_of_cofactors 0.105 1.452
Cell_envelope => 0.633 10.377
Cellular_processes 0.095 1.297
Central_intermediary_metabolism 0.231 3.663
Energy_metabolism 0.059 0.659
Fatty_acid_metabolism 0.016 1.265
Purines_and_pyrimidines 0.583 2.400
Regulatory_functions 0.013 0.079
Replication_and_transcription 0.019 0.073
Translation 0.079 1.801
Transport_and_binding 0.732 1.785
Enzyme/nonenzyme Prob Odds
Enzyme 0.208 0.727
Nonenzyme => 0.792 1.110
Enzyme class Prob Odds
Oxidoreductase (EC 1.-.-.-) 0.084 0.404
Transferase (EC 2.-.-.-) 0.062 0.179
Hydrolase (EC 3.-.-.-) 0.135 0.425
Lyase (EC 4.-.-.-) 0.049 1.054
Isomerase (EC 5.-.-.-) 0.010 0.321
Ligase (EC 6.-.-.-) 0.042 0.827
Gene Ontology category Prob Odds
Signal_transducer 0.201 0.939
Receptor 0.353 2.076
Hormone 0.002 0.365
Structural_protein 0.005 0.190
Transporter 0.024 0.219
Ion_channel 0.008 0.147
Voltage-gated_ion_channel 0.002 0.085
Cation_channel 0.010 0.221
Transcription 0.036 0.283
Transcription_regulation 0.018 0.147
Stress_response 0.274 3.108
Immune_response => 0.381 4.486
Growth_factor 0.013 0.943
Metal_ion_transport 0.009 0.02
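In the output above, the `=>` marker coincides with the highest odds value in each section where it appears. Assuming that the odds column is what one wants to rank by (an observation from this output, not a documented ProtFun rule), the top entry of a section can be picked as follows. The helper names `parse_rows` and `top_by_odds` are hypothetical.

```python
def parse_rows(block):
    """Parse 'name [=>] prob odds' lines into (name, prob, odds) tuples."""
    rows = []
    for line in block.splitlines():
        parts = line.replace("=>", " ").split()
        if len(parts) < 3:
            continue
        try:
            prob, odds = float(parts[-2]), float(parts[-1])
        except ValueError:
            continue  # header line such as 'Enzyme/nonenzyme Prob Odds'
        rows.append((" ".join(parts[:-2]), prob, odds))
    return rows

def top_by_odds(rows):
    """Return the row with the largest odds value."""
    return max(rows, key=lambda row: row[2])

enzyme_block = """\
Enzyme/nonenzyme Prob Odds
Enzyme 0.208 0.727
Nonenzyme => 0.792 1.110
"""

name, prob, odds = top_by_odds(parse_rows(enzyme_block))
print(name, prob, odds)
```

Ranking by odds rather than by probability matters for the functional categories: Transport_and_binding has the highest probability (0.732), but Cell_envelope has by far the highest odds (10.377) and is the one marked in the output.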