Sequence-based predictions

Revision as of 13:41, 4 June 2011 by Landerer (POODLE)

1. Secondary structure prediction

PSIPRED

Secondary Structure predicted by PSIPRED
PSIPRED HFORMAT (PSIPRED V3.0)
Conf: 999851589999999877513567886245556456636899750389988756755687
Pred: CCCCCHHHHHHHHHHHHHHHCCCCCCCEEEEEEEEEEECCCCCCCEEEEEEEECCEEEEE
  AA: MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
              10        20        30        40        50        60

Conf: 318998225536664688990669998865311211002358577441156788603899
Pred: ECCCCCCEEECCCCCCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCCCCEEEE
  AA: YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV
              70        80        90       100       110       120

Conf: 987799319835459889765910588728988756689786135787788899999876
Pred: EEEEEEECCCEEEEEEEEEECCCEEEEECCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHH
  AA: ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR
             130       140       150       160       170       180

Conf: 310271499889888616322000378810000468999601699981450765189996
Pred: HHHCCCHHHHHHHHHHCCCCCCCCCCCCCEEEECCCCCCCEEEEEEEEEECCCCEEEEEE
  AA: AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL
             190       200       210       220       230       240

Conf: 288106667520025355899875899999965999872169986699998826885259
Pred: ECCEECCCCCCCCCCCEECCCCCEEEEEEEEECCCCCCCEEEEEECCCCCCCEEEEEECC
  AA: KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS
             250       260       270       280       290       300

Conf: 999711124320001367777622367764115889887620212359
Pred: CCCCCEEEEEEEEEEEEEEEEEEEEEEEEEECCCCCCCCCCEEECCCC
  AA: PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
             310       320       330       340
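
The three-state prediction can be summarised as the fraction of residues in each state. The following is a minimal sketch, assuming the Pred lines have been copied into a Python string; `ss_composition` is a hypothetical helper and not part of PSIPRED. Only the first Pred line (residues 1-60) is used here for brevity.

```python
from collections import Counter
from typing import Dict


def ss_composition(pred: str) -> Dict[str, float]:
    """Return the fraction of residues predicted as helix (H), strand (E) and coil (C)."""
    if not pred:
        raise ValueError("empty prediction string")
    counts = Counter(pred)
    total = len(pred)
    # States that never occur are reported with a fraction of 0.0.
    return {state: counts.get(state, 0) / total for state in "HEC"}


# First Pred line of the PSIPRED output (residues 1-60).
pred_1_60 = "CCCCCHHHHHHHHHHHHHHHCCCCCCCEEEEEEEEEEECCCCCCCEEEEEEEECCEEEEE"
print(ss_composition(pred_1_60))
```

To summarise the whole protein, concatenate all six Pred lines before calling the function.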

Jpred3

Seq: MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDD
 SS: ------HHHHHHHHHHHHH---------EEEEEEEEE-------EEEEEEEEE--

Seq: QLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHN
 SS: EEEEEE-----EEEE----------HHHHHHHHHHHHHHHHHHHHHHHHHH----

Seq: HSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPT
 SS: -----EEEEEEEEEE------EEEEEEE-----EEEEEE----EEE-------HH

Seq: KLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSV
 SS: HHHHH--HHHHHHHHHH------HHHHHHHHHH-H-------EEEEE--------

Seq: TTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPG
 SS: -EEEEEEE------EEEEEEE----------EE----------EEEEEEEEE---

Seq: EEQRYTCQVEHPGLDQPLIVIWEPSPSGTLVIGVISGIAVFVVILFIGILFIILR
 SS: ---EEEEEEEE------EEEEE---------HHHHHHHHHHHHHHHHHHHHHHHH

Seq: KRQGSRGAMGHYVLAERE
 SS: HH----------------
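
PSIPRED and Jpred3 write coil differently (`C` versus `-`), so the two strings have to be normalised before they can be compared residue by residue. The sketch below shows one way to compute the per-residue agreement; `normalise` and `agreement` are hypothetical helper names, and the short strings are toy input rather than the HFE predictions.

```python
def normalise(ss: str) -> str:
    """Map the Jpred coil symbol '-' onto the PSIPRED coil symbol 'C'."""
    return ss.replace("-", "C")


def agreement(ss_a: str, ss_b: str) -> float:
    """Fraction of positions at which two secondary structure strings agree."""
    a, b = normalise(ss_a), normalise(ss_b)
    if not a or len(a) != len(b):
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


# Toy example: the predictions differ only at the last residue.
print(agreement("CCHH", "--HE"))  # 0.75
```

The same function could be reused for a comparison against a DSSP assignment, provided the DSSP states are first reduced to the three states H, E and C.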

Comparison with DSSP

2. Prediction of disordered regions

DISOPRED

  AA: target sequence
pred: residue disorder prediction; (.) = ordered residue, (*) = disordered residue
conf:997600000000000000000000000000000000000000000000000000000000
pred:**..........................................................
  AA:MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF
             10        20        30        40        50        60
conf:000120011000000000000000000000000000000000000000000000000000
pred:............................................................
  AA:YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV
             70        80        90       100       110       120
conf:000000000000000000000000000000000000000000000000000000000000
pred:............................................................
  AA:ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR
            130       140       150       160       170       180
conf:000000000000000000000002456777878777766530000000000000000000
pred:..............................*.*...........................
  AA:AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL
            190       200       210       220       230       240
conf:000035555545543000000000000000000000000000000000000001354667
pred:............................................................
  AA:KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS
            250       260       270       280       290       300
conf:777766643300000000000000047889999999999999898999
pred:...........................*********************
  AA:PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
            310       320       330       340
DISOPRED predictions for a false positive rate threshold of: 2%
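
The pred lines can be turned into residue ranges by collecting runs of `*`. This is a minimal sketch, assuming a pred line has been copied into a string; `disordered_regions` is a hypothetical helper, not a DISOPRED function, and `offset` is the sequence position of the first character.

```python
from itertools import groupby
from typing import List, Tuple


def disordered_regions(pred: str, offset: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) ranges for every run of '*' in a pred string."""
    regions = []
    pos = offset
    for symbol, run in groupby(pred):
        length = len(list(run))
        if symbol == "*":
            regions.append((pos, pos + length - 1))
        pos += length
    return regions


# A 60-residue block with only the first two residues marked as disordered.
first_block = "**" + "." * 58
print(disordered_regions(first_block))  # [(1, 2)]
```

For the later blocks, pass the position of the first residue in the block as `offset`, or concatenate all pred lines and keep the default.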

POODLE

POODLE stands for Prediction Of Order and Disorder by machine LEarning.

POODLE-L

Distribution of disordered regions along the amino acid sequence, as predicted by POODLE-L

POODLE-L predicted a disordered region from residue 296 to the end of the sequence.

MGPRARPALLLLMLLQTAVLQGRLLRSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVF 
------------------------------------------------------------
YDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQV 
------------------------------------------------------------
ILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNR 
------------------------------------------------------------
AYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWL 
------------------------------------------------------------
KDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIWEPS 
------------------------------------------------------******
PSGTLVIGVISGIAVFVVILFIGILFIILRKRQGSRGAMGHYVLAERE
************************************************
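
The listing above alternates sequence lines and mask lines. To read off where a disordered region starts, the mask lines can be joined and searched for the first `*`. The following is a sketch under that assumption; `join_mask` and `first_disordered` are hypothetical helper names, and the input is toy data rather than the HFE listing.

```python
from typing import List, Optional


def join_mask(lines: List[str]) -> str:
    """Concatenate the mask lines, i.e. every second line of the listing."""
    return "".join(line.strip() for line in lines[1::2])


def first_disordered(mask: str) -> Optional[int]:
    """Return the 1-based position of the first '*', or None if there is none."""
    index = mask.find("*")
    return index + 1 if index >= 0 else None


# Toy listing: two sequence lines, each followed by its mask line.
listing = [
    "MGPRAR",
    "------",
    "PALLLL",
    "---***",
]
mask = join_mask(listing)
print(first_disordered(mask))  # 10
```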

3. Prediction of transmembrane alpha-helices and signal peptides

TMHMM

Phobius and PolyPhobius

OCTOPUS and SPOCTOPUS

SignalP

TargetP

4. Prediction of GO terms

General

HFE is annotated with the following 27 GO terms <ref>http://www.ebi.ac.uk/QuickGO/GProtein?ac=Q30201</ref>:

GOID        Aspect     GO Term
GO:0002474  Process    antigen processing and presentation of peptide antigen via MHC class I
GO:0005515  Function   protein binding
GO:0005737  Component  cytoplasm
GO:0005769  Component  early endosome
GO:0005886  Component  plasma membrane
GO:0005887  Component  integral to plasma membrane
GO:0006461  Process    protein complex assembly
GO:0006810  Process    transport
GO:0006811  Process    ion transport
GO:0006826  Process    iron ion transport
GO:0006879  Process    cellular iron ion homeostasis
GO:0006898  Process    receptor-mediated endocytosis
GO:0006955  Process    immune response
GO:0007565  Process    female pregnancy
GO:0010106  Process    cellular response to iron ion starvation
GO:0016020  Component  membrane
GO:0016021  Component  integral to membrane
GO:0019882  Process    antigen processing and presentation
GO:0031410  Component  cytoplasmic vesicle
GO:0042446  Process    hormone biosynthetic process
GO:0042612  Component  MHC class I protein complex
GO:0045177  Component  apical part of cell
GO:0045178  Component  basal part of cell
GO:0048471  Component  perinuclear region of cytoplasm
GO:0055037  Component  recycling endosome
GO:0055072  Process    iron ion homeostasis
GO:0060586  Process    multicellular organismal iron ion homeostasis

GOPET

GOPET predicted two GO terms, neither of which overlaps with the existing annotation.

GOID        Aspect              Confidence  GO Term
GO:0004872  Molecular Function  91%         receptor activity
GO:0030106  Molecular Function  88%         MHC class I receptor activity
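
The missing overlap can be checked as a set intersection on the GO identifiers. The sketch below uses the 27 annotated IDs and the two GOPET IDs listed on this page; the variable names are illustrative only. Note that it compares exact IDs and does not follow parent or child relations in the ontology, so a predicted term that is closely related to an annotated one would still count as a miss.

```python
# GO IDs from the QuickGO annotation of HFE listed above.
annotated = {
    "GO:0002474", "GO:0005515", "GO:0005737", "GO:0005769", "GO:0005886",
    "GO:0005887", "GO:0006461", "GO:0006810", "GO:0006811", "GO:0006826",
    "GO:0006879", "GO:0006898", "GO:0006955", "GO:0007565", "GO:0010106",
    "GO:0016020", "GO:0016021", "GO:0019882", "GO:0031410", "GO:0042446",
    "GO:0042612", "GO:0045177", "GO:0045178", "GO:0048471", "GO:0055037",
    "GO:0055072", "GO:0060586",
}

# GO IDs predicted by GOPET.
predicted = {"GO:0004872", "GO:0030106"}

# Exact-ID overlap between prediction and annotation.
overlap = annotated & predicted
print(sorted(overlap))  # []
```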

Pfam

ProtFun 2.2

 Functional category                  Prob     Odds
 Amino_acid_biosynthesis              0.011    0.484
 Biosynthesis_of_cofactors            0.105    1.452
 Cell_envelope                     => 0.633   10.377
 Cellular_processes                   0.095    1.297
 Central_intermediary_metabolism      0.231    3.663
 Energy_metabolism                    0.059    0.659
 Fatty_acid_metabolism                0.016    1.265
 Purines_and_pyrimidines              0.583    2.400
 Regulatory_functions                 0.013    0.079
 Replication_and_transcription        0.019    0.073
 Translation                          0.079    1.801
 Transport_and_binding                0.732    1.785

 Enzyme/nonenzyme                     Prob     Odds
 Enzyme                               0.208    0.727
 Nonenzyme                         => 0.792    1.110

 Enzyme class                         Prob     Odds
 Oxidoreductase (EC 1.-.-.-)          0.084    0.404
 Transferase    (EC 2.-.-.-)          0.062    0.179
 Hydrolase      (EC 3.-.-.-)          0.135    0.425
 Lyase          (EC 4.-.-.-)          0.049    1.054
 Isomerase      (EC 5.-.-.-)          0.010    0.321
 Ligase         (EC 6.-.-.-)          0.042    0.827

 Gene Ontology category               Prob     Odds
 Signal_transducer                    0.201    0.939
 Receptor                             0.353    2.076
 Hormone                              0.002    0.365
 Structural_protein                   0.005    0.190
 Transporter                          0.024    0.219
 Ion_channel                          0.008    0.147
 Voltage-gated_ion_channel            0.002    0.085
 Cation_channel                       0.010    0.221
 Transcription                        0.036    0.283
 Transcription_regulation             0.018    0.147
 Stress_response                      0.274    3.108
 Immune_response                   => 0.381    4.486
 Growth_factor                        0.013    0.943
 Metal_ion_transport                  0.009    0.02
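
In the output above, each `=>` marker sits on the row with the highest odds in its block. The following sketch parses such rows and picks the row with the highest odds; `parse_rows` is a hypothetical helper, not part of ProtFun, and the input is a four-row excerpt of the Gene Ontology category block.

```python
from typing import List, Tuple

Row = Tuple[str, float, float, bool]  # (name, prob, odds, marked with '=>')


def parse_rows(block: str) -> List[Row]:
    """Parse ProtFun-style rows: a name, an optional '=>' marker, Prob and Odds."""
    rows = []
    for line in block.strip().splitlines():
        marked = "=>" in line
        parts = line.replace("=>", " ").split()
        # The last two fields are numeric; everything before them is the name.
        name = " ".join(parts[:-2])
        rows.append((name, float(parts[-2]), float(parts[-1]), marked))
    return rows


excerpt = """
 Signal_transducer                    0.201    0.939
 Receptor                             0.353    2.076
 Stress_response                      0.274    3.108
 Immune_response                   => 0.381    4.486
"""

rows = parse_rows(excerpt)
best = max(rows, key=lambda row: row[2])  # row with the highest odds
print(best[0], best[3])  # Immune_response True
```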