Task 3: odba human Sequence-based predictions

From Bioinformatikpedia
Revision as of 15:15, 12 May 2012 by Kiening (talk | contribs) (psipred)

secondary structure

To predict secondary structure we use the following tools and compare the results:

- reprof
- psipred
- DSSP_Server

reprof

To run reprof from the command line, the following command is used:

reprof -i seq.fasta

reprof then calculates the secondary structure prediction and writes it to the output file "seq.fasta.reprof". Results (H = Helix, E = Extended/Sheet, L = Loop):

odba_human
Each block below lists three rows in this order:
Reliability (0-9, 9 = most reliable)
sec-structure
AA-sequence
9221124455554036207776653067862000247852012212357787787762666544200476501154066765467703167878778656
LLHHHHHHHHHHHHLLLLHHHHHHHLLLLLLLELLLLLLLLHLLLLLLLLLLLLLLLLHHHHHHHHLLLLLLELLLLEEEEELLLLLEELLLLLLLLLHH
MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKE

7776655320100000301100557547888740466664011100046751342012001024530573233245541430113666535300255543
HHHHHHHHHHHHHLHLHHEEELLLLLEEEEEEELLLLLLLLELLLLELLLLLEEEEEELLLLEEEELLLLHHHHHHHHHLLHLLLLLLLLLLLELLLLLL
KVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKER

2565305212002477767777653177627888842564565652123003342344178887067503105676210256402047640478887567
EEEEELLLHHHHLHHHHHHHHHHHHLLLLEEEEEEELLLLLLLLLLLLLLEEEEELLLLEEEEEELLLEEELLLLLLLELLLLEELLLLLEEEEEEEELL
HFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG

7234432321577765553267212200001212125776522000114312346677733555545324788866888888643278889998876138
LLEEEEELLLHHHHHHHHHLLLLEEEEHEEEEELLLLLLLLLLHLLLLLLLLLLLLLLLLHHHHHHHHHLLLLLLHHHHHHHHHHHHHHHHHHHHHHHLL
NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERK

898975011024431728777888989998887256688886799
LLLLLLEEELHHHHHLHHHHHHHHHHHHHHHHHHLLLLLLLLLLL
PKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
P10775
Reliability (0-9, 9 = most reliable)
sec-structure
AA-sequence

9721000003610177776776314314516778775677778877514877115421004336431001011566864102210024543110024337
LLLLELHHLLLLLHHHHHHHHHHHLLEEEELLLLLLHHHHHHHHHHHHLLLLHHHHHHHHLLLLLLLHEEEHLLLLLLLLEEEEELLLLLLLLHLLLLLL
MNLDIHCEQLSDARWTELLPLLQQYEVVRLDDCGLTEEHCKDIGSALRANPSLTELCLRTNELGDAGVHLVLQGLQSPTCKIQKLSLQNCSLTEAGCGVL

0133000013330258887661566556631578202223423334201324622688888888877652057775225666664004000153444411
HHHHLHLHHHHHHLLLLLLLLHHHHHHHHHLLLLLHLLHHHHHHHHHHLLLLLLHHHHHHHHHHHHHHHHLLLLLLHHHHHHHHHLLLLLLHHHHHHHHH
PSTLRSLPTLRELHLSDNPLGDAGLRLLCEGLLDPQCHLEKLQLEYCRLTAASCEPLASVLRATRALKELTVSNNDIGEAGARVLGQGLADSACQLETLR

1047778532245655530000101103476775603366543044581121001011000456101257888875566677776613656705677777
HLLLLLLLLLHHHHHHHHHLHLLHHHLLLLLLLLLHHHHHHHLLLLLLLHHHHLHHEEEHLLLLLHHHHHHHHHHHHHHHHHHHHHHLLLLLLHHHHHHH
LENCGLTPANCKDLCGIVASQASLRELDLGSNGLGDAGIAELCPGLLSPASRLKTLWLWECDITASGCRDLCRVLQAKETLKELSLAGNKLGDEGARLLC

7631777612221231100357767777776410311433210357677236888888842788636677614532246313688888875112001025
HHHLLLLLLHHHHHHHHLHHHHHHHHHHHHHHHLLHHHHHHHLLLLLLLLHHHHHHHHHHLLLLLEEEEEEELLLLLLLLLHHHHHHHHHHHLLHHHHLL
ESLLQPGCQLESLWVKSCSLTAACCQHVSLMLTQNKHLLELQLSSNKLGDSGIQELCQALSQPGTTLRVLCLGDCEVTNSGCSSLASLLLANRSLRELDL

65766771365554034675421654433056661778999988407888851138
LLLLLLLHHHHHHHLLLLLLLLHHHHHHHLLLLLLHHHHHHHHHHHLLLLLLLELL
SNNCVGDPGVLQLLGSLEQPGCALEQLVLYDTYWTEEVEDRLQALEGSKPGLRVIS
Q08209
Reliability (0-9, 9 = most reliable)
sec-structure
AA-sequence

9888511664455661466520588761225577627887133322010123575211247777614640222244420133525663200001133420
LLLLLLLLLLLLLLLLEEEEELLLLLLLEEEEEEELLLLLLEEEEEHHHELLLLLLLEEEEEEEEELLLLEEELLLLLLLLLLLEEEEELLLLHHHHHHE
MSEPKAIDPKLSTTDRVVKAVPFPPSHRLTAKEVFDNDGKPRVDILKAHLMKEGRLEESVALRIITEGASILRQEKNLLDIDAPVTVCGDIHGQFFDLMK

1331577766237775323055315633101345543037640577623532202211441221344665000134433220422233320340467624
EEEELLLLLLLEEEEEEEELLLLEEEEEEEHHHHHHHLLLLLEEEEEELLLLLLEEEEEEEEEEEEEEEEELHHHHHHHHHLLLLLHHHHHLLLEEEEEL
LFEVGGSPANTRYLFLGDYVDRGYFSIECVLYLWALKILYPKTLFLLRGNHECRHLTEYFTFKQECKIKYSERVYDACMDAFDCLPLAALMNQQFLCVHG

6474436343554310157887757611246217755467764332010220100255530521010002254110100100235413777501557763
LLLLLLLLHHHHHHHHLLLLLLLLLLLEEEEELLLLLLLLLLLLLLELLLLLELLEEEEELLLLEEEEHLLLLHHHHEHHHLLLLLLEEEEEELLLLLLL
GLSPEINTLDDIRKLDRFKEPPAYGPMCDILWSDPLEDFGNEKTQEHFTHNTVRGCSYFYSYPAVCEFLQHNNLLSILRAHEAQDAGYRMYRKSQTTGFP

2577760664021214631678750541456651137786315542201455324354045676554101135675567656767301776667777554
EEEEEEELLLEEEEELLLEEEEEELLLEEEEEEELLLLLLLLLLLLLEEEEEELLLLLHHHHHHHHHHHEELLLLLLLLLLLLLLLHHHHHHHHHHHHHH
SLITIFSAPNYLDVYNNKAAVLKYENNVMNIRQFNCSPHPYWLPNFMDVFTWSLPFVGEKVTEMLVNVLNICSDDELGSEEDGFDGATAAARKEVIRNKI

4421010235433046662477622277652464221277133333243232000332155875402001230153121135788877678876434554
HHHLHHEEEEEEEELLLLLEEEEELLLLLLLLLLLEELLLLEEEEEEEEEEELLHHHHLLLLLLLEEEEHHHHLLLLHHLLLLLLLLLLLLLLLLHHHHH
RAIGKMARVFSVLREESESVLTLKGLTPTGMLPSGVLSGGKQTLQSATVEAIEADEAIKGFSPQHKITSFEEAKGLDRINERMPPRRDAMPSDANLNSIN

331156787677886677889
HHHLLLLLLLLLLLLLLLLLL
KALTSETNGTDSNGSNSSNIQ
Q9X0E6
Reliability (0-9, 9 = most reliable)
sec-structure
AA-sequence

7687613771687888888888877787530020104624520132330256774264687898899987508777820100246778889988877524
LEEEEELLLLHHHHHHHHHHHHHHHHHHHHLHLHHHLLLEEELEEELLHHHHHHHLLLHHHHHHHHHHHHHLLLLLLLHHEHHHHHHHHHHHHHHHHHHL
MILVYSTFPNEEKALEIGRKLLEKRLIACFNAFEIRSGYWWKGEIVQDKEWAAIFKTTEEKEKELYEELRKLHPYETPAIFTLKVENVLTEYMNWLRESV

9
L
L
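
The reprof results above are wrapped into blocks of three rows (reliability, sec-structure, AA-sequence), so they have to be joined back into full-length strings before anything can be counted. The following is a minimal Python sketch of that step, assuming the blocks have been pasted into a string with the accession and legend lines removed. The function names join_blocks and ss_composition are hypothetical helpers, not part of reprof, and the toy input is made up for illustration.

```python
from collections import Counter


def join_blocks(text: str) -> tuple[str, str, str]:
    """Concatenate wrapped (reliability, structure, sequence) row triples."""
    # Blank lines only separate the blocks, so they are dropped here.
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    if len(rows) % 3 != 0:
        raise ValueError("expected the rows to come in groups of three")
    reliability = "".join(rows[0::3])
    structure = "".join(rows[1::3])
    sequence = "".join(rows[2::3])
    if not len(reliability) == len(structure) == len(sequence):
        raise ValueError("the three rows must have equal total length")
    return reliability, structure, sequence


def ss_composition(structure: str) -> dict[str, float]:
    """Return the fraction of residues assigned to each predicted state."""
    if not structure:
        return {}
    counts = Counter(structure)
    return {state: counts[state] / len(structure) for state in sorted(counts)}


# Toy input in the same layout (assumption: not taken from the results above).
toy = """\
9755
LHHH
MAVA

31
EL
IK
"""

reliability, structure, sequence = join_blocks(toy)
print(sequence, structure, reliability)
print(ss_composition(structure))
```

Checking that the three joined rows have the same length is a cheap way to catch a line that was lost or shifted while copying the output.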

psipred

The version from [1] was used to predict secondary structure with psipred. Results (H = Helix, E = Extended/Sheet, C = Coil):

odba_human
confidence
sec-structure
AA-sequence
915554344652010125789986408888898888999867679889999999999943
CHHHHHHHHHHHHCCHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAE

345544347897889982787589997259999999999999999999799999999999
CCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHH
FIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILY

984258734444798615999998531399982418689312241328998999998626
HHHHCCCCCCCCCCCCCHHHHHHHHHHCCCCCEEECCCCCCHHHHHCCCCHHHHHHHHCC
ESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYG

978889988877778888776122353334681566759888767099958999818886
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEEECCCC
NISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGA

670358878577674079759999579823236884113641143204567987321028
CCHHHHHHHHHHHHHHCCCEEEEEECCCEEECCCCCCCCCCCHHHHHCCCCCCCCCEECC
ASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG

209999999999999999089974998642117999999999999997788866514995
CCHHHHHHHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCHHHHHHHHHCCCC
NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHP

799999999779999999999999999999999999992999996677866421799789
HHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHCCCCHHH
ISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQL

9999999999998299899998789
HHHHHHHHHHHHHHCCCCCCCCCCC
RKQQESLARHLQTYGEHYPLDHFDK
P10775
confidence
sec-structure
AA-sequence

989828999999999999677137869971179999887999999643699967898647
CEEECCCCCCCHHHHHHHHHHHCCCCEEEECCCCCCHHHHHHHHHHHCCCCCCCEEECCC
MNLDIHCEQLSDARWTELLPLLQQYEVVRLDDCGLTEEHCKDIGSALRANPSLTELCLRT

999958999999760499984138982169999455648999723799858897979999
CCCCHHHHHHHHHHHCCCCCCCCEEEEECCCCCHHHHHHHHHHHCCCCCCCEEECCCCCC
NELGDAGVHLVLQGLQSPTCKIQKLSLQNCSLTEAGCGVLPSTLRSLPTLRELHLSDNPL

926999999884299986488981148899014899999871699979997869999968
CHHHHHHHHHHHCCCCCCCCEEEEECCCCCHHHHHHHHHHHCCCCCCCEEECCCCCCCHH
GDAGLRLLCEGLLDPQCHLEKLQLEYCRLTAASCEPLASVLRATRALKELTVSNNDIGEA

999999750299985368971389999776999999882199979796999999918999
HHHHHHHHCCCCCCCCCEEECCCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCHHHHH
GARVLGQGLADSACQLETLRLENCGLTPANCKDLCGIVASQASLRELDLGSNGLGDAGIA

998530499984148970289999665999999860499969996889999907999999
HHHHHHCCCCCCCCEEECCCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCHHHHHHHH
ELCPGLLSPASRLKTLWLWECDITASGCRDLCRVLQAKETLKELSLAGNKLGDEGARLLC

980499986458961189789888999999882499749896999999925899999851
HHHCCCCCCCCEEEECCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCCHHHHHHHHHC
ESLLQPGCQLESLWVKSCSLTAACCQHVSLMLTQNKHLLELQLSSNKLGDSGIQELCQAL

899961119980268899666999999984399979885999999938999999840699
CCCCCCEEEEECCCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCHHHHHHHHHHHCCC
SQPGTTLRVLCLGDCEVTNSGCSSLASLLLANRSLRELDLSNNCVGDPGVLQLLGSLEQP

997678830589887899999999995599831219
CCCCCEEECCCCCCCHHHHHHHHHHHHCCCCCEECC
GCALEQLVLYDTYWTEEVEDRLQALEGSKPGLRVIS
Q08209
confidence
sec-structure
AA-sequence

999988999887776433454799998899321139899989999999996317799999
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCHHH
MSEPKAIDPKLSTTDRVVKAVPFPPSHRLTAKEVFDNDGKPRVDILKAHLMKEGRLEESV

999999999997219880321498447446663057899999816999998630024543
HHHHHHHHHHHHHHCCCCEEECCCEEEECCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCC
ALRIITEGASILRQEKNLLDIDAPVTVCGDIHGQFFDLMKLFEVGGSPANTRYLFLGDYV

689971899999999834099957772157432122131554799995408989999985
CCCCCHHHHHHHHHHHHHCCCCCEEEECCCCCCCCCCCCCCHHHHHHHHCCHHHHHHHHH
DRGYFSIECVLYLWALKILYPKTLFLLRGNHECRHLTEYFTFKQECKIKYSERVYDACMD

429830331043915485046798999911110268999999999741002589999889
HCCCCHHHHHCCCCEEEEECCCCCCCCCHHHHCCCCCCCCCCCCCCCCHHCCCCCCCCCC
AFDCLPLAALMNQQFLCVHGGLSPEINTLDDIRKLDRFKEPPAYGPMCDILWSDPLEDFG

976434567889874069747689999987549803433323114013201134467999
CCCCCCCCCCCCCCCCEEECCHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCCCCCCCCC
NEKTQEHFTHNTVRGCSYFYSYPAVCEFLQHNNLLSILRAHEAQDAGYRMYRKSQTTGFP

469980378752345882379998376020786434899998898885444531367899
CEEEEECCCCCCCCCCCCEEEEEEECCCCEEEEEECCCCCCCCCCCCCCCCCCHHHHHHH
SLITIFSAPNYLDVYNNKAAVLKYENNVMNIRQFNCSPHPYWLPNFMDVFTWSLPFVGEK

999999984067865568988888820389999999997775556788889999842100
HHHHHHHHHCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
VTEMLVNVLNICSDDELGSEEDGFDGATAAARKEVIRNKIRAIGKMARVFSVLREESESV

000137999999886667723566666677776665430699998898548887401200
HHHCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHH
LTLKGLTPTGMLPSGVLSGGKQTLQSATVEAIEADEAIKGFSPQHKITSFEEAKGLDRIN

38999989998864522211113457789999999999999
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ERMPPRRDAMPSDANLNSINKALTSETNGTDSNGSNSSNIQ
Q9X0E6
confidence
sec-structure
AA-sequence
999999079999999999998635611237776532476544610112148789766711
CEEEEECCCCHHHHHHHHHHHHHCCCCCEEEEEEEEEEEEECCCEEECCEEEEEEECCCC
MILVYSTFPNEEKALEIGRKLLEKRLIACFNAFEIRSGYWWKGEIVQDKEWAAIFKTTEE

19999999986599966418998366555778899875229
CHHHHHHHHHHHCCCCCCEEEEEECCCCCHHHHHHHHHHCC
KEKELYEELRKLHPYETPAIFTLKVENVLTEYMNWLRESVL
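
Since the aim is to compare the tools, one simple measure is the fraction of residues for which reprof and psipred predict the same state. The two tools label non-regular residues differently (L in the reprof output, C in the psipred output), so the alphabets have to be mapped onto each other first. The sketch below is one possible way to do this in Python. The function name agreement is a hypothetical helper, and treating L and C as equivalent is an assumption of this sketch. The two example strings are the first ten states of the reprof and psipred predictions for the first protein on this page.

```python
def agreement(reprof_ss: str, psipred_ss: str) -> float:
    """Fraction of positions where both predictions give the same state."""
    if len(reprof_ss) != len(psipred_ss):
        raise ValueError("predictions must cover the same residues")
    if not reprof_ss:
        raise ValueError("predictions must not be empty")
    # Assumption: psipred's C (coil) is treated as reprof's L (loop).
    normalised = psipred_ss.upper().replace("C", "L")
    matches = sum(a == b for a, b in zip(reprof_ss.upper(), normalised))
    return matches / len(reprof_ss)


# First ten predicted states of the first protein, copied from this page.
reprof_prefix = "LLHHHHHHHH"
psipred_prefix = "CHHHHHHHHH"
print(f"{agreement(reprof_prefix, psipred_prefix):.2f}")
```

A per-residue agreement of this kind ignores the reliability and confidence rows. Restricting the comparison to positions where both tools report high values would be a natural refinement.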

Disorder

Transmembrane helices

Signal peptides

GO terms