Canavan Task 3 - Sequence-based predictions

From Bioinformatikpedia
Revision as of 15:00, 15 May 2012 by Vorbergs (talk | contribs) (other methods)
Oh, I would sing of mackerel skies,
And why the sea is wet,
Of jelly-fish and conger-eels,
And things that I forget. 

(taken from "The Cumberbunce" by Paul West)

Protocol

Commands, source code and other methodological details are kept in the protocol.


Secondary Structure Prediction

Information on Proteins

//TODO: pics
Identifier | P10775 | Q08209 | Q9X0E6
Protein | Ribonuclease inhibitor | Serine/threonine phosphatase (alt. name: Calcineurin) | Divalent-cation tolerance protein
Organism | Sus scrofa (pig) | Homo sapiens | Thermotoga maritima
Sequence length | 456 | 521 | 101
Subcellular location | Cytoplasm | Nucleus | Cytoplasm
PDB Identifier | 2BNH | 1AUI | 1O5J
Structure | CD 3PBL.jpg | CD 1ORQ.jpg | CD 2D57.jpg

Prediction of disordered regions

Transmembrane Helix Prediction

Information on Proteins

Identifier | P35462 | Q9YDF8 | P47863
Protein | D(3) dopamine receptor | Voltage-gated potassium channel | Aquaporin-4
Organism | Homo sapiens (Human) | Aeropyrum pernix | Rattus norvegicus (Rat)
Sequence length | 400 | 295 | 323
Subcellular location | Cell membrane; Multi-pass membrane protein | Cell membrane; Multi-pass membrane protein | Membrane; Multi-pass membrane protein
PDB Identifier | 3PBL | 1ORQ | 2D57
Structure | CD 3PBL.jpg | CD 1ORQ.jpg | CD 2D57.jpg

TM prediction for our protein yielded the expected result: only cytoplasmic residues were predicted.


P35462

<figure id="CD_tm_3pbl">

<xr nolink id="CD_tm_3pbl"/>

</figure>

Polyphobius found 7 TMH for P35462. This is in accordance with the annotation for transmembrane regions in UniProt. There is only one structure listed for P35462: 3pbl. The annotation for TMH for this structure found in PDBTM and OPM also agrees with the Polyphobius result.

In <xr id="CD_tm_3pbl"/> the length distribution for the predicted and annotated TMH is depicted. One can see that PDBTM in general finds shorter TMH, whereas Polyphobius and OPM find longer helices.

<figtable id="table_2D57">



AA Position of the predicted/annotated TMH for different methods/sources
UniProt Polyphobius PDBTM OPM
33-55 30-55 35-52 34-52
66-88 66-88 68-84 67-91
105-126 105-126 109-123 101-126
150-170 150-170 152-166 150-170
188-212 188-212 191-206 187-209
330-351 329-352 334-347 330-351
367-388 367-386 368-382 363-386

</figtable>
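The length comparison can be reproduced directly from the positions in the table above. The following is a minimal Python sketch; the dictionary `tmh_positions` and the helper `segment_lengths` are names introduced here for illustration, and the values are copied from the table for P35462.

```python
from statistics import mean

# TMH positions for P35462, copied from the table above as (start, end), both inclusive.
tmh_positions = {
    "UniProt": [(33, 55), (66, 88), (105, 126), (150, 170), (188, 212), (330, 351), (367, 388)],
    "Polyphobius": [(30, 55), (66, 88), (105, 126), (150, 170), (188, 212), (329, 352), (367, 386)],
    "PDBTM": [(35, 52), (68, 84), (109, 123), (152, 166), (191, 206), (334, 347), (368, 382)],
    "OPM": [(34, 52), (67, 91), (101, 126), (150, 170), (187, 209), (330, 351), (363, 386)],
}


def segment_lengths(segments):
    """Return the length of each (start, end) segment, counting both end residues."""
    return [end - start + 1 for start, end in segments]


# Mean helix length per method/source.
mean_lengths = {
    source: mean(segment_lengths(segments))
    for source, segments in tmh_positions.items()
}

for source, value in mean_lengths.items():
    print(f"{source:12s} mean TMH length: {value:.1f} AA")
```

With these values PDBTM gives the shortest helices on average (about 15.7 AA), while the other three sources lie between roughly 22.6 and 23.0 AA.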

P47863

<figure id="CD_tm_2D57">

<xr nolink id="CD_tm_2D57"/>

</figure>

Polyphobius found 6 TMH for P47863. This is in accordance with the annotation for transmembrane regions in UniProt. There are three structures listed for P47863:

  • 2D57 X-ray 3.20 Å
  • 2ZZ9 X-ray 2.80 Å
  • 3IYZ electron microscopy 10.00 Å

Since 2ZZ9 is a mutant, we decided to use 2D57 for comparison with the Polyphobius output. Interestingly, OPM lists 8 TMH for P47863, whereas PDBTM agrees with the UniProt annotation and the Polyphobius output. The two additional TMH found by OPM are rather short (10 AA each) and might therefore be neglected. In <xr id="table_2D57"/> the AA positions for the predicted and annotated TMH are given.

<figtable id="table_2D57">



AA Position of the predicted/annotated TMH for different methods/sources
UniProt Polyphobius PDBTM OPM
37-57 34-58 39-55 34-56
65-85 70-91 72-89 70-88
95-106(loop) 98-107
116-136 115-136 116-133 112-136
156-176 156-177 158-177 156-178
185-205 188-208 188-205 189-203
209-222(loop) 214-223
232-252 231-252 231-248 231-252

</figtable>


For the remaining TMH, all three methods find TMH of about the same length. In <xr id="CD_tm_2D57"/> the length distribution for the predicted and annotated TMH is depicted.

Q9YDF8

<figure id="CD_tm_Q9YDF8">

<xr nolink id="CD_tm_Q9YDF8"/>

</figure>

For Q9YDF8, Polyphobius did not find any homologues in the BLAST search. Therefore, no homology information could be used for the TMH prediction. UniProt annotates 6 TM regions and 2 intramembrane regions for this protein and lists four structures:

  • 1ORQ X-ray 3.20 Å 31-253
  • 1ORS X-ray 1.90 Å 33-160
  • 2A0L X-ray 3.90 Å 20-259
  • 2KYH NMR - 19-160

Since only residues 33-160 have been crystallized in 1ORS, we decided to use 1ORQ for comparison with the Polyphobius output. The TMH prediction by Polyphobius in general coincides with the UniProt annotation. However, OPM and PDBTM list very diverse results: there is a consensus only on TMH 5 and 7. In <xr id="CD_tm_Q9YDF8"/> the length distribution for the TMH prediction is presented.

When comparing the annotation of OPM for the two structures 1orq and 1ors, one can find tremendous differences:

  • 1ors: C - Tilt: 19° - Segments: 1(25-46), 2(55-78), 3(86-97), 4(100-107), 5(117-148)
  • 1orq: C - Tilt: 31° - Segments: 1(153-172), 2(183-195), 3(207-225)

Yet, if one considers the sequence shift of 13 AA between the 1orq PDB sequence and the Q9YDF8 UniProt sequence (see <xr id="seq_shift"/>), both annotations together represent the TMH identified by Polyphobius. <figure id="seq_shift"> 1orq seq shift.png</figure>

Polyphobius finds 7 TMH, which is consistent with the UniProt annotation:

UniProt | Polyphobius | PDBTM | OPM (= 1ors & 1orq, +13 AA)
39-63 | 42-60 | 34-65 (1ors: 40-63) | 38-59
68-92 | 68-88 | 70-93 (1ors: 68-88) | 68-91
- | - | - | 99-110
109-125 | 108-129 | 1ors: 101-120 | 113-120
129-145 | 137-157 | 1ors: 131-155 | 130-161
160-184 | 163-184 | 164-184 | 166-185
196-208 (intramembrane) | 196-213 | - | 196-208
222-253 | 224-244 | 222-249 | 220-238
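The offset can be checked by applying it to the OPM segments of 1ors and 1orq listed above. This is a minimal Python sketch, assuming a constant shift of +13 AA between PDB and UniProt numbering over the whole chain; `OFFSET`, `opm_segments` and `shift_segments` are names introduced here for illustration.

```python
# Assumed constant offset between 1orq/1ors PDB numbering and Q9YDF8 UniProt numbering.
OFFSET = 13

# OPM transmembrane segments in PDB numbering, copied from the list above.
opm_segments = {
    "1ors": [(25, 46), (55, 78), (86, 97), (100, 107), (117, 148)],
    "1orq": [(153, 172), (183, 195), (207, 225)],
}


def shift_segments(segments, offset):
    """Move every (start, end) segment by the same residue offset."""
    return [(start + offset, end + offset) for start, end in segments]


shifted = {
    pdb_id: shift_segments(segments, OFFSET)
    for pdb_id, segments in opm_segments.items()
}

for pdb_id, segments in shifted.items():
    print(pdb_id, ", ".join(f"{start}-{end}" for start, end in segments))
```

The shifted segments (38-59, 68-91, 99-110, 113-120, 130-161 from 1ors and 166-185, 196-208, 220-238 from 1orq) match the OPM column of the table.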

Other methods

We checked the results from other TMH prediction methods.

  • TMHMM
    • P35462 7 TMH found: 32-54, 67-89, 104-126, 150-172, 192-214, 331-353, 368-390
    • Q9YDF8 6 TMH found: 39-61, 68-87, 107-129, 162-184, 199-218, 225-244
    • P47863 6 TMH found: 33-55, 70-92, 112-134, 154-176, 189-211, 231-253
    • P45381 none found
  • DAS
    • P35462 6 TMH found (with 2.2 cutoff): 85-101, 117-139, 155-171, 202-219, 241-261, 381-398
    • Q9YDF8 7 TMH found (with 2.2 cutoff): 135-148, 160-174, 200-209, 216-234, 256-271, 291-295, 314-333
    • P47863 6 TMH found (with 2.2 cutoff): 87-99, 115-128, 169-182, 205-219, 235-247, 281-291
    • P45381 none found
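One simple way to compare such predictions is to count how many helices of one method overlap a helix of another. The sketch below does this for P35462, using the TMHMM and DAS results above and the Polyphobius positions from the P35462 table. The overlap criterion (at least one shared residue) and the helper names `overlaps` and `count_matched` are assumptions made for illustration, not part of the original analysis.

```python
# Predicted TMH for P35462 as (start, end), both inclusive, copied from this page.
polyphobius = [(30, 55), (66, 88), (105, 126), (150, 170), (188, 212), (329, 352), (367, 386)]
tmhmm = [(32, 54), (67, 89), (104, 126), (150, 172), (192, 214), (331, 353), (368, 390)]
das = [(85, 101), (117, 139), (155, 171), (202, 219), (241, 261), (381, 398)]


def overlaps(a, b):
    """True if the two inclusive segments share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def count_matched(predicted, reference):
    """Count predicted segments that overlap at least one reference segment."""
    return sum(any(overlaps(p, r) for r in reference) for p in predicted)


tmhmm_matched = count_matched(tmhmm, polyphobius)
das_matched = count_matched(das, polyphobius)

print(f"TMHMM: {tmhmm_matched} of {len(tmhmm)} helices overlap a Polyphobius helix")
print(f"DAS:   {das_matched} of {len(das)} helices overlap a Polyphobius helix")
```

Under this criterion all 7 TMHMM helices and 5 of the 6 DAS helices overlap a Polyphobius helix.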

Signal Peptide Prediction

Information on Proteins

Identifier | P02768 | P11279 | P47863
Protein | Serum albumin | Lysosome-associated membrane glycoprotein 1 | Aquaporin-4
Organism | Homo sapiens (Human) | Homo sapiens (Human) | Rattus norvegicus (Rat)
Sequence length | 609 | 417 | 323
Subcellular location | Secreted | Lysosome membrane, Single-pass type I membrane protein | Membrane; Multi-pass membrane protein
PDB Identifier | 1E7I | - | 2D57
Structure | CD 1E7I.jpg | - | CD 2D57.jpg


GO terms and Pfam

Pfam

AstE_AspA family: Succinylglutamate desuccinylase / Aspartoacylase family