Difference between revisions of "Task 3: odba human Sequence-based predictions"
Revision as of 13:24, 14 May 2012
secondary structure
To predict secondary structure we use the following tools and compare the results:
- reprof
- psipred
- DSSP_Server
reprof
To run reprof from the command line, the following command is used:
reprof -i seq.fasta
Reprof then calculates the secondary structure prediction and writes it to an output file "seq.fasta.reprof". Reprof can be run with a FASTA file or with a BLAST/HHBlits PSSM file. We try both variants, because the second one promises more accurate results. Result: (H = Helix, E = Extended/Sheet, L = Loop)
reprof with fasta
obda_human Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 9221124455554036207776653067862000247852012212357787787762666544200476501154066765467703167878778656 LLHHHHHHHHHHHHLLLLHHHHHHHLLLLLLLELLLLLLLLHLLLLLLLLLLLLLLLLHHHHHHHHLLLLLLELLLLEEEEELLLLLEELLLLLLLLLHH MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKE 7776655320100000301100557547888740466664011100046751342012001024530573233245541430113666535300255543 HHHHHHHHHHHHHLHLHHEEELLLLLEEEEEEELLLLLLLLELLLLELLLLLEEEEEELLLLEEEELLLLHHHHHHHHHLLHLLLLLLLLLLLELLLLLL KVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKER 2565305212002477767777653177627888842564565652123003342344178887067503105676210256402047640478887567 EEEEELLLHHHHLHHHHHHHHHHHHLLLLEEEEEEELLLLLLLLLLLLLLEEEEELLLLEEEEEELLLEEELLLLLLLELLLLEELLLLLEEEEEEEELL HFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG 7234432321577765553267212200001212125776522000114312346677733555545324788866888888643278889998876138 LLEEEEELLLHHHHHHHHHLLLLEEEEHEEEEELLLLLLLLLLHLLLLLLLLLLLLLLLLHHHHHHHHHLLLLLLHHHHHHHHHHHHHHHHHHHHHHHLL NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERK 898975011024431728777888989998887256688886799 LLLLLLEEELHHHHHLHHHHHHHHHHHHHHHHHHLLLLLLLLLLL PKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
P10775 Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 9721000003610177776776314314516778775677778877514877115421004336431001011566864102210024543110024337 LLLLELHHLLLLLHHHHHHHHHHHLLEEEELLLLLLHHHHHHHHHHHHLLLLHHHHHHHHLLLLLLLHEEEHLLLLLLLLEEEEELLLLLLLLHLLLLLL MNLDIHCEQLSDARWTELLPLLQQYEVVRLDDCGLTEEHCKDIGSALRANPSLTELCLRTNELGDAGVHLVLQGLQSPTCKIQKLSLQNCSLTEAGCGVL 0133000013330258887661566556631578202223423334201324622688888888877652057775225666664004000153444411 HHHHLHLHHHHHHLLLLLLLLHHHHHHHHHLLLLLHLLHHHHHHHHHHLLLLLLHHHHHHHHHHHHHHHHLLLLLLHHHHHHHHHLLLLLLHHHHHHHHH PSTLRSLPTLRELHLSDNPLGDAGLRLLCEGLLDPQCHLEKLQLEYCRLTAASCEPLASVLRATRALKELTVSNNDIGEAGARVLGQGLADSACQLETLR 1047778532245655530000101103476775603366543044581121001011000456101257888875566677776613656705677777 HLLLLLLLLLHHHHHHHHHLHLLHHHLLLLLLLLLHHHHHHHLLLLLLLHHHHLHHEEEHLLLLLHHHHHHHHHHHHHHHHHHHHHHLLLLLLHHHHHHH LENCGLTPANCKDLCGIVASQASLRELDLGSNGLGDAGIAELCPGLLSPASRLKTLWLWECDITASGCRDLCRVLQAKETLKELSLAGNKLGDEGARLLC 7631777612221231100357767777776410311433210357677236888888842788636677614532246313688888875112001025 HHHLLLLLLHHHHHHHHLHHHHHHHHHHHHHHHLLHHHHHHHLLLLLLLLHHHHHHHHHHLLLLLEEEEEEELLLLLLLLLHHHHHHHHHHHLLHHHHLL ESLLQPGCQLESLWVKSCSLTAACCQHVSLMLTQNKHLLELQLSSNKLGDSGIQELCQALSQPGTTLRVLCLGDCEVTNSGCSSLASLLLANRSLRELDL 65766771365554034675421654433056661778999988407888851138 LLLLLLLHHHHHHHLLLLLLLLHHHHHHHLLLLLLHHHHHHHHHHHLLLLLLLELL SNNCVGDPGVLQLLGSLEQPGCALEQLVLYDTYWTEEVEDRLQALEGSKPGLRVIS
Q08209 Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 9888511664455661466520588761225577627887133322010123575211247777614640222244420133525663200001133420 LLLLLLLLLLLLLLLLEEEEELLLLLLLEEEEEEELLLLLLEEEEEHHHELLLLLLLEEEEEEEEELLLLEEELLLLLLLLLLLEEEEELLLLHHHHHHE MSEPKAIDPKLSTTDRVVKAVPFPPSHRLTAKEVFDNDGKPRVDILKAHLMKEGRLEESVALRIITEGASILRQEKNLLDIDAPVTVCGDIHGQFFDLMK 1331577766237775323055315633101345543037640577623532202211441221344665000134433220422233320340467624 EEEELLLLLLLEEEEEEEELLLLEEEEEEEHHHHHHHLLLLLEEEEEELLLLLLEEEEEEEEEEEEEEEEELHHHHHHHHHLLLLLHHHHHLLLEEEEEL LFEVGGSPANTRYLFLGDYVDRGYFSIECVLYLWALKILYPKTLFLLRGNHECRHLTEYFTFKQECKIKYSERVYDACMDAFDCLPLAALMNQQFLCVHG 6474436343554310157887757611246217755467764332010220100255530521010002254110100100235413777501557763 LLLLLLLLHHHHHHHHLLLLLLLLLLLEEEEELLLLLLLLLLLLLLELLLLLELLEEEEELLLLEEEEHLLLLHHHHEHHHLLLLLLEEEEEELLLLLLL GLSPEINTLDDIRKLDRFKEPPAYGPMCDILWSDPLEDFGNEKTQEHFTHNTVRGCSYFYSYPAVCEFLQHNNLLSILRAHEAQDAGYRMYRKSQTTGFP 2577760664021214631678750541456651137786315542201455324354045676554101135675567656767301776667777554 EEEEEEELLLEEEEELLLEEEEEELLLEEEEEEELLLLLLLLLLLLLEEEEEELLLLLHHHHHHHHHHHEELLLLLLLLLLLLLLLHHHHHHHHHHHHHH SLITIFSAPNYLDVYNNKAAVLKYENNVMNIRQFNCSPHPYWLPNFMDVFTWSLPFVGEKVTEMLVNVLNICSDDELGSEEDGFDGATAAARKEVIRNKI 4421010235433046662477622277652464221277133333243232000332155875402001230153121135788877678876434554 HHHLHHEEEEEEEELLLLLEEEEELLLLLLLLLLLEELLLLEEEEEEEEEEELLHHHHLLLLLLLEEEEHHHHLLLLHHLLLLLLLLLLLLLLLLHHHHH RAIGKMARVFSVLREESESVLTLKGLTPTGMLPSGVLSGGKQTLQSATVEAIEADEAIKGFSPQHKITSFEEAKGLDRINERMPPRRDAMPSDANLNSIN 331156787677886677889 HHHLLLLLLLLLLLLLLLLLL KALTSETNGTDSNGSNSSNIQ
Q9X0E6 Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 7687613771687888888888877787530020104624520132330256774264687898899987508777820100246778889988877524 LEEEEELLLLHHHHHHHHHHHHHHHHHHHHLHLHHHLLLEEELEEELLHHHHHHHLLLHHHHHHHHHHHHHLLLLLLLHHEHHHHHHHHHHHHHHHHHHL MILVYSTFPNEEKALEIGRKLLEKRLIACFNAFEIRSGYWWKGEIVQDKEWAAIFKTTEEKEKELYEELRKLHPYETPAIFTLKVENVLTEYMNWLRESV 9 L L
reprof with HHBlits - PSSM (up20)
To retrieve PSSM files from hhblits, the tool hhblits_pssm.pl from the hhsuite is used (we used the version installed in "/opt/hhblits/hhblits/" on jobtest). It is started from the command line with the following command:
hhblits_pssm.pl --infile query.fasta --outfile query.pssm -h "/mnt/project/rost_db/data/hhblits/uniprot20_current"
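The two command lines quoted so far can be assembled for several targets in one go. The following is a minimal sketch, assuming one FASTA file per target named `<target>.fasta` (a hypothetical naming scheme). It only builds the argument lists and leaves execution to the caller. The helper names are illustrative, and the flags are exactly those quoted above.

```python
from pathlib import Path

# Database path as quoted in the hhblits_pssm.pl call above.
HHBLITS_DB = "/mnt/project/rost_db/data/hhblits/uniprot20_current"


def reprof_fasta_cmd(fasta):
    """Argument list for a reprof run on a single FASTA file."""
    return ["reprof", "-i", str(fasta)]


def hhblits_pssm_cmd(fasta, db=HHBLITS_DB):
    """Argument list for creating a PSSM with hhblits_pssm.pl."""
    fasta = Path(fasta)
    pssm = fasta.with_suffix(".pssm")  # query.fasta -> query.pssm
    return ["hhblits_pssm.pl", "--infile", str(fasta),
            "--outfile", str(pssm), "-h", db]


def commands_for(targets, workdir="."):
    """Yield (target, reprof command, PSSM command) for each target."""
    for target in targets:
        # Hypothetical file layout: <workdir>/<target>.fasta
        fasta = Path(workdir) / f"{target}.fasta"
        yield target, reprof_fasta_cmd(fasta), hhblits_pssm_cmd(fasta)
```

On a machine where both tools are installed, each list could be passed to `subprocess.run(cmd, check=True)`.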
Reprof is then run using the created PSSMs. Results:
obda_human - PSSM - UP20 HHBlits Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 9011122245644236115555543116766654453556544557765555458776453234143366656786728986778724477557888978 LLHLLLHHHHHHHLLLLLHHHHHHHHLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLEEEEELLLLLELLLLLLLLLLHH MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKE 8888998898788888898885047875763577741478899998606897795021488998773998789899885003564567752011066777 HHHHHHHHHHHHHHHHHHHHHHHLLLLLLLLLLLLLHHHHHHHHHHHLLLLLEEEELLHHHHHHHHLLLLHHHHHHHHHLLLLLLLLLLLLLEELLLLLL KVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKER 6100101256557999999988553178816889881350116558999999774499889998848703362023556840288988637982799855 LLLLLELEHHHHHHHHHHHHHHHHHLLLLLEEEEEEELLLHLLLHHHHHHHHHHHLLLLEEEEEELLLEEEEEELLLLLLLLHHHHHHHHLLLLEEEEEL HFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG 6878889899999988752489768999973032777776755657888999986047859999988886799997888889889899889888888736 LLHHHHHHHHHHHHHHHHHLLLLEEEEEEEELLLLLLLLLLLLLLLLHHHHHHHHHLLLHHHHHHHHHHHLLLLLHHHHHHHHHHHHHHHHHHHHHHHHL NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHPISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERK 674889999984266986689888988999872585477765789 LLLLHHHHHHHHLLLLLHHHHHHHHHHHHHHHHLLLLLLLLLLLL PKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK
P10775 - PSSM UP20 HHBlits Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 9565456756800079887228884688850676780015788998731898358884146568446899988752215662788841464672215778 LEEELLLLLLLHHLHHHHHHLLLLLEEEEEELLLLLLLLHHHHHHHHHLLLLLEEEEEELLLLLLLLHHHHHHHHHLLLLLEEEEEEELLLLLLLLHHHH MNLDIHCEQLSDARWTELLPLLQQYEVVRLDDCGLTEEHCKDIGSALRANPSLTELCLRTNELGDAGVHLVLQGLQSPTCKIQKLSLQNCSLTEAGCGVL 9887308884588830464683328888888862143407788542534755457889887128881688852465564557888877740054607888 HHHHHHLLLLEEEEEELLLLLLLLHHHHHHHHHHHLLLEEEEEEELLLLLLLLHHHHHHHHHLLLLLEEEEEELLLLLLLLHHHHHHHHHLHLLLEEEEE PSTLRSLPTLRELHLSDNPLGDAGLRLLCEGLLDPQCHLEKLQLEYCRLTAASCEPLASVLRATRALKELTVSNNDIGEAGARVLGQGLADSACQLETLR 5235237654577642014523537877135656742178888877401756388886436667013788898873078807887046736732088888 EELLLLLLLLHHHHHHHLLLLLLEEEEEELLLLLLLLLHHHHHHHHHLLLLLEEEEEEELLLLLHHHHHHHHHHHHHLLLLEEEEELLLLLLLLLHHHHH LENCGLTPANCKDLCGIVASQASLRELDLGSNGLGDAGIAELCPGLLSPASRLKTLWLWECDITASGCRDLCRVLQAKETLKELSLAGNKLGDEGARLLC 6630326755888851566674434568988711888168782056578645899987620337752788731675685427788988830888358783 HHHHLLLLLEEEEEEELLLLLLLLHHHHHHHHHLLLLLEEEEELLLLLLLLLHHHHHHHHLLLLLLEEEEEELLLLLLLLLHHHHHHHHHLLLLLEEEEE ESLLQPGCQLESLWVKSCSLTAACCQHVSLMLTQNKHLLELQLSSNKLGDSGIQELCQALSQPGTTLRVLCLGDCEVTNSGCSSLASLLLANRSLRELDL 16768915799999887624877067785067688677999998898767846738 LLLLLLHHHHHHHHHHHHLLLLLEEEEEEELLLLLHHHHHHHHHHHHHLLLLEEEL SNNCVGDPGVLQLLGSLEQPGCALEQLVLYDTYWTEEVEDRLQALEGSKPGLRVIS
Q08209 - PSSM - UP20 HHBlits Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 9877777751245775557766787777782220055787587899999852899898889999999999885489834700233453125510899999 LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLHHHLELLLLLLLHHHHHHHHHLLLLLLHHHHHHHHHHHHHHHHLLLLEEEEELLEEEEEELLLLHHHHHH MSEPKAIDPKLSTTDRVVKAVPFPPSHRLTAKEVFDNDGKPRVDILKAHLMKEGRLEESVALRIITEGASILRQEKNLLDIDAPVTVCGDIHGQFFDLMK 9874046554058998856078976799999999864215760899860665688887742352656642488899989999871264064687267614 HHHHLLLLLLLEEEEEEEELLLLLLHHHHHHHHHHHHHHLLLLEEEEELLLLHHHHHHHHLLLHHHHHHHHHHHHHHHHHHHHHLHHHEEELLLEEEEEL LFEVGGSPANTRYLFLGDYVDRGYFSIECVLYLWALKILYPKTLFLLRGNHECRHLTEYFTFKQECKIKYSERVYDACMDAFDCLPLAALMNQQFLCVHG 4468877624101025657787543410023305421124444555301256567542128789999988769817898434133301220477656630 LLLLLLLLHHHLHHLLLLLLLLLLLLLHEEEELLLLLLLLLLLLLLLEELLLLLLLLLELLHHHHHHHHHHLLLLEEEEELLLLLLLEEELLLLLLLLLL GLSPEINTLDDIRKLDRFKEPPAYGPMCDILWSDPLEDFGNEKTQEHFTHNTVRGCSYFYSYPAVCEFLQHNNLLSILRAHEAQDAGYRMYRKSQTTGFP 6999984654465668357999981884238986037887888976653121332023457888998531456765455553223355057788999999 EEEEEEELLLLLLLLLLEEEEEEEELLLLEEEEEELLLLLLLLLLLLLLLLEELLLHHHHHHHHHHHHHHLLLLLLLLLLLLLLLLLLHHHHHHHHHHHH SLITIFSAPNYLDVYNNKAAVLKYENNVMNIRQFNCSPHPYWLPNFMDVFTWSLPFVGEKVTEMLVNVLNICSDDELGSEEDGFDGATAAARKEVIRNKI 9999998788876551124787312477874554102476332344445542100135665766532426788752502146888766555656657666 HHHHHHHHHHHHHHHHLLLHHHHLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLHLLLLLLLLLLLLLLLHHHHHHLLLHLLLLLLLLLLLLLLLLLLLLL RAIGKMARVFSVLREESESVLTLKGLTPTGMLPSGVLSGGKQTLQSATVEAIEADEAIKGFSPQHKITSFEEAKGLDRINERMPPRRDAMPSDANLNSIN 554655567766555544559 LLLLLLLLLLLLLLLLLLLLL KALTSETNGTDSNGSNSSNIQ
Q9X0E6 - PSSM - UP20 HHBlits Reliability( 0-9 (most reliable) ) sec-structure AA-sequence 9999996599688889999999956312676518237898878224320588999844887899899999729988980788864026878999999843 LEEEEEELLLHHHHHHHHHHHHHHLLEEEEEELLLEEEEEELLELLLLLEEEEEEEELHHHHHHHHHHHHHLLLLLLLLEEEEELLLLLHHHHHHHHHHL MILVYSTFPNEEKALEIGRKLLEKRLIACFNAFEIRSGYWWKGEIVQDKEWAAIFKTTEEKEKELYEELRKLHPYETPAIFTLKVENVLTEYMNWLRESV 9 L L
psipred
The psipred version from [1] was used to predict the secondary structure. Results:
obda_human confidence sec-structure AA-sequence 915554344652010125789986408888898888999867679889999999999943 CHHHHHHHHHHHHCCHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAE 345544347897889982787589997259999999999999999999799999999999 CCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHH FIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILY 984258734444798615999998531399982418689312241328998999998626 HHHHCCCCCCCCCCCCCHHHHHHHHHHCCCCCEEECCCCCCHHHHHCCCCHHHHHHHHCC ESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYG 978889988877778888776122353334681566759888767099958999818886 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCEEEEEECCCC NISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGA 670358878577674079759999579823236884113641143204567987321028 CCHHHHHHHHHHHHHHCCCEEEEEECCCEEECCCCCCCCCCCHHHHHCCCCCCCCCEECC ASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG 209999999999999999089974998642117999999999999997788866514995 CCHHHHHHHHHHHHHHHHHCCCCEEEEEECCCCCCCCCCCCCCCCCCHHHHHHHHHCCCC NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHP 799999999779999999999999999999999999992999996677866421799789 HHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHCCCCHHH ISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQL 9999999999998299899998789 HHHHHHHHHHHHHHCCCCCCCCCCC RKQQESLARHLQTYGEHYPLDHFDK
P10775 confidence sec-structure AA-sequence 989828999999999999677137869971179999887999999643699967898647 CEEECCCCCCCHHHHHHHHHHHCCCCEEEECCCCCCHHHHHHHHHHHCCCCCCCEEECCC MNLDIHCEQLSDARWTELLPLLQQYEVVRLDDCGLTEEHCKDIGSALRANPSLTELCLRT 999958999999760499984138982169999455648999723799858897979999 CCCCHHHHHHHHHHHCCCCCCCCEEEEECCCCCHHHHHHHHHHHCCCCCCCEEECCCCCC NELGDAGVHLVLQGLQSPTCKIQKLSLQNCSLTEAGCGVLPSTLRSLPTLRELHLSDNPL 926999999884299986488981148899014899999871699979997869999968 CHHHHHHHHHHHCCCCCCCCEEEEECCCCCHHHHHHHHHHHCCCCCCCEEECCCCCCCHH GDAGLRLLCEGLLDPQCHLEKLQLEYCRLTAASCEPLASVLRATRALKELTVSNNDIGEA 999999750299985368971389999776999999882199979796999999918999 HHHHHHHHCCCCCCCCCEEECCCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCHHHHH GARVLGQGLADSACQLETLRLENCGLTPANCKDLCGIVASQASLRELDLGSNGLGDAGIA 998530499984148970289999665999999860499969996889999907999999 HHHHHHCCCCCCCCEEECCCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCHHHHHHHH ELCPGLLSPASRLKTLWLWECDITASGCRDLCRVLQAKETLKELSLAGNKLGDEGARLLC 980499986458961189789888999999882499749896999999925899999851 HHHCCCCCCCCEEEECCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCCHHHHHHHHHC ESLLQPGCQLESLWVKSCSLTAACCQHVSLMLTQNKHLLELQLSSNKLGDSGIQELCQAL 899961119980268899666999999984399979885999999938999999840699 CCCCCCEEEEECCCCCCCHHHHHHHHHHHHCCCCCCEEECCCCCCCHHHHHHHHHHHCCC SQPGTTLRVLCLGDCEVTNSGCSSLASLLLANRSLRELDLSNNCVGDPGVLQLLGSLEQP 997678830589887899999999995599831219 CCCCCEEECCCCCCCHHHHHHHHHHHHCCCCCEECC GCALEQLVLYDTYWTEEVEDRLQALEGSKPGLRVIS
Q08209 confidence sec-structure AA-sequence 999988999887776433454799998899321139899989999999996317799999 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCHHH MSEPKAIDPKLSTTDRVVKAVPFPPSHRLTAKEVFDNDGKPRVDILKAHLMKEGRLEESV 999999999997219880321498447446663057899999816999998630024543 HHHHHHHHHHHHHHCCCCEEECCCEEEECCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCC ALRIITEGASILRQEKNLLDIDAPVTVCGDIHGQFFDLMKLFEVGGSPANTRYLFLGDYV 689971899999999834099957772157432122131554799995408989999985 CCCCCHHHHHHHHHHHHHCCCCCEEEECCCCCCCCCCCCCCHHHHHHHHCCHHHHHHHHH DRGYFSIECVLYLWALKILYPKTLFLLRGNHECRHLTEYFTFKQECKIKYSERVYDACMD 429830331043915485046798999911110268999999999741002589999889 HCCCCHHHHHCCCCEEEEECCCCCCCCCHHHHCCCCCCCCCCCCCCCCHHCCCCCCCCCC AFDCLPLAALMNQQFLCVHGGLSPEINTLDDIRKLDRFKEPPAYGPMCDILWSDPLEDFG 976434567889874069747689999987549803433323114013201134467999 CCCCCCCCCCCCCCCCEEECCHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCCCCCCCCC NEKTQEHFTHNTVRGCSYFYSYPAVCEFLQHNNLLSILRAHEAQDAGYRMYRKSQTTGFP 469980378752345882379998376020786434899998898885444531367899 CEEEEECCCCCCCCCCCCEEEEEEECCCCEEEEEECCCCCCCCCCCCCCCCCCHHHHHHH SLITIFSAPNYLDVYNNKAAVLKYENNVMNIRQFNCSPHPYWLPNFMDVFTWSLPFVGEK 999999984067865568988888820389999999997775556788889999842100 HHHHHHHHHCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH VTEMLVNVLNICSDDELGSEEDGFDGATAAARKEVIRNKIRAIGKMARVFSVLREESESV 000137999999886667723566666677776665430699998898548887401200 HHHCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHH LTLKGLTPTGMLPSGVLSGGKQTLQSATVEAIEADEAIKGFSPQHKITSFEEAKGLDRIN 38999989998864522211113457789999999999999 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC ERMPPRRDAMPSDANLNSINKALTSETNGTDSNGSNSSNIQ
Q9X0E6 confidence sec-structure AA-sequence 999999079999999999998635611237776532476544610112148789766711 CEEEEECCCCHHHHHHHHHHHHHCCCCCEEEEEEEEEEEEECCCEEECCEEEEEEECCCC MILVYSTFPNEEKALEIGRKLLEKRLIACFNAFEIRSGYWWKGEIVQDKEWAAIFKTTEE 19999999986599966418998366555778899875229 CHHHHHHHHHHHCCCCCCEEEEEECCCCCHHHHHHHHHHCC KEKELYEELRKLHPYETPAIFTLKVENVLTEYMNWLRESVL
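The reprof and psipred listings above are printed in blocks of three rows (score, secondary structure, amino acid sequence), apparently 100 residues per block for reprof and 60 for psipred. To work with them programmatically, the blocks have to be joined again. A minimal sketch, assuming the header has been removed and the remaining rows have been split on whitespace into a flat list of strings; `join_blocks` is an illustrative helper, not part of either tool.

```python
def join_blocks(rows):
    """Join (score, structure, sequence) row triplets into three full strings."""
    if len(rows) % 3 != 0:
        raise ValueError("expected rows in groups of three")
    score, structure, sequence = [], [], []
    for i in range(0, len(rows), 3):
        s, ss, aa = rows[i:i + 3]
        # All three rows of one block must cover the same residues.
        if not (len(s) == len(ss) == len(aa)):
            raise ValueError(f"row lengths differ in block {i // 3 + 1}")
        score.append(s)
        structure.append(ss)
        sequence.append(aa)
    return "".join(score), "".join(structure), "".join(sequence)


# Toy input with two blocks (3 residues, then 1 residue).
scores, states, residues = join_blocks("915 CHH MAV 9 C K".split())
```

The length check catches rows that were truncated or merged when the listing was copied.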
DSSP_Server
To use DSSP_Server, we first had to determine which PDB IDs are associated with the UniProt IDs P12694, P10775, Q9X0E6 and Q08209:
UniProt ID | PDB IDs |
P12694 | 1DTW, 1OLS, 1OLU, 1OLX, 1U5B, 1V11, 1V16, 1V1M, 1V1R, 1WCI, 1X7W, 1X7Y, 1X7Z, 1X80, 2BEU, 2BEV, 2BEW, 2BFB, 2BFC, 2BFD, 2BFE, 2BFF, 2J9F |
P10775 | 1DFJ, 2BNH |
Q08209 | 1AUI, 1M63, 1MF8, 2JOG, 2JZI, 2P6B, 2R28, 2W73, 3LL8 |
Q9X0E6 | 1KR4, 1O5J, 1VHF |
DSSP_Server is then run for each UniProt ID, using the associated PDB ID with the best resolution or the greatest span over the protein. Results:
P12694 - 2BFD - Position 46-445 sec-structure ( H = alpha helix B = residue in isolated beta-bridge E = extended strand, participates in beta ladder G = 3-helix (3/10 helix) I = 5 helix (pi helix) T = hydrogen bonded turn S = bend ) aa-sequence S TT SS SS S EE SB TTS BS GGG HHHHHHHHHHHHHHHHHHHHHHHHHHTTSSS TT HHHHHHHHHTS AKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILYESQRQGRISFYMTNYGEEGTHVGSAAALD TTSEEE S HHHHHHTT HHHHHHHHHT TT TTTT S SS BTTTTB SSTTTHHHHHHHHHHHHHHHT EEEEEETTGGGSHHHHHH NTDLVFGQAREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAG HHHHHHTT EEEEEEE SEETTEEGGGT SSSTTGGGTGGGT EEEEEETT HHHHHHHHHHHHHHHHHHT EEEEEE HHHHHHHHH FNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDGNDVFAVYNATKEARRRAVAENQPFLIEAMTYRIG!ST!DHPISRLRHYL TTTT HHHHHHHHHHHHHHHHHHHHHHHHS B GGGGSTTSSSS HHHHHHHHHHHHHHHHHGGGS GGGB S EEEE HHHHHHHHH LSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK!AHF!EYGQTQKMNLFQSVTSAL HHHHHH TT EEEETTTTTT TTSTTTTHHHHH TTTEEE S HHHHHHHHHHHHHTT EEEE SSGGG GGGHHHHHTTGGGHHHHTTTSS TTEE DNSLAKDPTAVIFGEDVAFGGVFRCTVGLRDKYGKDRVFNTPLCEQGIVGFGIGIAVTGATAIAEIQFADYIFPAFDQIVNEAAKYRYRSGDLFNCGSLT EEEEES SS GGGSS HHHHHTSTT EEE SSHHHHHHHHHHHHHSSS EEEEEEGGGTTS EEEESS SS EEEE SSEEEEE TTHH IRSPWGCVGHGALYHSQSPEAFFAHCPGIKVVIPRSPFQAKGLLLSCIEDKNPCIFFEPKILYRAAAEEVPIEPYNIPLSQAEVIQEGSDVTLVAWGTQV HHHHHHHHHHHHHH EEEEE EEES HHHHHHHHHHHS EEEEEEEESTT HHHHHHHHHHHHHGGG SS EEEEE SS STTHHHHS HHH HVIREVASMAKEKLGVSCEVIDLRTIIPWDVDTICKSVIKTGRLLISHEAPLTGGFASEISSTVQEECFLNLEAPISRVCGYDTPFPHIFEPFYIPDKWK HHHHHHHHHT CYDALRKMINY
P10775 - 2BNH - position 1-456 sec-structure ( H = alpha helix B = residue in isolated beta-bridge E = extended strand, participates in beta ladder G = 3-helix (3/10 helix) I = 5 helix (pi helix) T = hydrogen bonded turn S = bend ) aa-sequence S B EES HHHHHHHHHHHTT SEEEEET HHHHHHHHHHHTT TT EEE S HHHHHHHHHHHHSSTT EEE TTS GGGGGS AMNLDIHCEQLSDARWTELLPLLQQYEVVRLDDCGLTEEHCKDIGSALRANPSLTELCLRTNELGDAGVHLVLQGLQSPTCKIQKLSLQNCSLTEAGCGV HHHHHHH TT EEE S HHHHHHHHHHHHHSTT EEE TT BHHHHHHHHHHHHH S EEE TTSB HHHHHHHHHHHHHT S EE LPSTLRSLPTLRELHLSDNPLGDAGLRLLCEGLLDPQCHLEKLQLEYCRLTAASCEPLASVLRATRALKELTVSNNDIGEAGARVLGQGLADSACQLETL E TTS HHHHHHHHHHHHH TT EEE SS HHHHHHHHHHHHT TT EEE TTS HHHHHHHHHHHHH SS EEE TTS HHHHHHHH RLENCGLTPANCKDLCGIVASQASLRELDLGSNGLGDAGIAELCPGLLSPASRLKTLWLWECDITASGCRDLCRVLQAKETLKELSLAGNKLGDEGARLL HHHHTSTT EEE TTS BGGGHHHHHHHHHH SS EEE SSB HHHHHHHHHHHTTSSS EEE TTS HHHHHHHHHHHHH S EEE CESLLQPGCQLESLWVKSCSLTAACCQHVSLMLTQNKHLLELQLSSNKLGDSGIQELCQALSQPGTTLRVLCLGDCEVTNSGCSSLASLLLANRSLRELD TTSS HHHHHHHHHHHTSSS EEE TT HHHHHHHHHHHHH SS EEE LSNNCVGDPGVLQLLGSLEQPGCALEQLVLYDTYWTEEVEDRLQALEGSKPGLRVIS
Q08209 - 1AUI - position 1-521 sec-structure ( H = alpha helix B = residue in isolated beta-bridge E = extended strand, participates in beta ladder G = 3-helix (3/10 helix) I = 5 helix (pi helix) T = hydrogen bonded turn S = bend ) aa-sequence S SSTTS B HHHHB TTS B HHHHHHHHHTT B HHHHHHHHHHHHHHHHTS SEEEE SSEEEE TT HHHHHHHHHHH TTT ATDRVVKAVPFPPSHRLTAKEVFDNDGKPRVDILKAHLMKEGRLEESVALRIITEGASILRQEKNLLDIDAPVTVCGDIHGQFFDLMKLFEVGGSPANTR EEE S SSSSS HHHHHHHHHHHHHHSTTTEEE TTSSHHHHHHSSHHHHHHHHS HHHHHHHHHHHTTS EEEETTTEEEESS TT SHHHH YLFLGDYVDRGYFSIECVLYLWALKILYPKTLFLLRGNHECRHLTEYFTFKQECKIKYSERVYDACMDAFDCLPLAALMNQQFLCVHGGLSPEINTLDDI HHS SSS SSSHHHHHHH EE TTTTS SS EEE TTTTSSEEE HHHHHHHHHHTT SEEEE S TTSEEE B TTTSSBSEEEE SSGG RKLDRFKEPPAYGPMCDILWSDPLEDFGNEKTQEHFTHNTVRGCSYFYSYPAVCEFLQHNNLLSILRAHEAQDAGYRMYRKSQTTGFPSLITIFSAPNYL GTS EEEEEEETTEEEEEEE GGG HHHHHHHHHHHHHHHHHHHHHTT HHHHHHHHGGGGS S HHHHHHHH DVYNNKAAVLKYENNVMNIRQFNCSPHPYWLPNFMDVFTWSLPFVGEKVTEMLVNVLNICS!SFEEAKGLDRINERMPPR!SYPLEMCSHFDADEIKRLG HHHHHH TT SEE HHHHTTSHHHHT TTHHHHHHHH TT SSSEEHHHHHHHHGGG TT HHHHHHHHHHHH TT SSEE HHHHHHHHHHHHTTSS KRFKKLDLDNSGSLSVEEFMSLPELQQNPLVQRVIDIFDTDGNGEVDFKEFIEGVSQFSVKGDKEQKLRFAFRIYDMDKDGYISNGELFQVLKMMVGNNL HHHHHHHHHHHHHHH TTSSSSEEHHHHHHHHGGG GGGG KDTQLQQIVDKTIINADKDGDGRISFEEFCAVVGGLDIHKKMVVDV
Q9X0E6 - 1KR4 - position 1-101 sec-structure ( H = alpha helix B = residue in isolated beta-bridge E = extended strand, participates in beta ladder G = 3-helix (3/10 helix) I = 5 helix (pi helix) T = hydrogen bonded turn S = bend ) aa-sequence S EE EEEEEEEESSHHHHHHHHHHHHHTTS SEEEEEEEEEEEEETTEEEEEEEEEEEEEEEGGGHHHHHHHHHHH SSSS EEEE EEHHH AALYFXGHXILVYSTFPNEEKALEIGRKLLEKRLIACFNAFEIRSGYWWKGEIVQDKEWAAIFKTTEEKEKELYEELRKLHPYETPAIFTLKVENILTEY HHHHHHHTS XNWLRESVLGS
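DSSP assigns eight states (plus blank), while reprof and psipred predict three, so the DSSP string has to be reduced before it can be compared with the predictions. The page does not state which reduction was used. The sketch below assumes a widely used one (H, G, I to helix; E, B to strand; everything else to loop) and also maps psipred's C to reprof's L, so that all three sources share one alphabet.

```python
# Assumed 8-to-3 state reduction; any state not listed here becomes loop (L).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(states):
    """Reduce a DSSP state string (blank = no assignment) to H/E/L."""
    return "".join(DSSP_TO_3.get(s, "L") for s in states)


def normalise_prediction(pred):
    """Map psipred's coil symbol C to L; reprof output passes through unchanged."""
    return pred.replace("C", "L")
```

For example, `reduce_dssp("HGI EBTS")` yields `"HHHLEELL"`. Note that the reduction needs the original DSSP string with its blanks intact, since a blank stands for a residue without an assigned state.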
Comparison
psipred | Q9X0E6 | 90 / 104 | 86.53% |
This table shows the result of our comparison. With a single FASTA file as input, psipred performs considerably better than reprof on all targets. With an HHBlits PSSM as query, reprof outperforms psipred in 3 of 4 cases; in the fourth case the two perform equally.
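A comparison of this kind is commonly summarised as the fraction of residues whose predicted state matches the reference (often called Q3). The page does not spell out how its percentages were computed, so the following is only a sketch under that assumption. Both strings must already be aligned residue by residue and use the same three-letter alphabet.

```python
def q3(predicted, reference):
    """Return (matches, compared, percent) for two aligned 3-state strings."""
    if len(predicted) != len(reference):
        raise ValueError("strings must have the same length")
    if not reference:
        raise ValueError("nothing to compare")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches, len(reference), 100.0 * matches / len(reference)


# Toy example: 6 of 8 positions agree.
matches, compared, percent = q3("HHHLLEEL", "HHLLLEEE")
```

Because the DSSP entries cover only part of each sequence (for example positions 46-445 for 2BFD), the prediction has to be cut to the same range before calling `q3`.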