Difference between revisions of "Canavan Task 3 - Sequence-based predictions"

From Bioinformatikpedia

Revision as of 13:04, 17 May 2012

Oh, I would sing of mackerel skies,
And why the sea is wet,
Of jelly-fish and conger-eels,
And things that I forget. 

(taken from "The Cumberbunce" by Paul West)

Protocol

Commands, source code and other methodological issues are kept in the protocol.


Secondary Structure Prediction

Information on Proteins

//TODO: pics
{| class="wikitable"
! Identifier !! P10775 !! Q08209 !! Q9X0E6
|-
! Protein
| Ribonuclease inhibitor || Serine/threonine phosphatase (alt. name: Calcineurin) || Divalent-cation tolerance protein
|-
! Organism
| Sus scrofa (pig) || Homo sapiens || Thermotoga maritima
|-
! Sequence length
| 456 || 521 || 101
|-
! Subcellular location
| Cytoplasm || Nucleus || Cytoplasm
|-
! PDB Identifier
| 2BNH || 1AUI || 1O5J
|-
! Structure
| CD 3PBL.jpg || CD 1ORQ.jpg || CD 2D57.jpg
|}

Prediction of disordered regions

Transmembrane Helix Prediction

We analyzed the prediction of transmembrane helices (TMH) for the proteins listed in <xr id="table_TMH_info"/> and for our protein Aspartoacylase. In addition to Polyphobius, we also examined the results of other TMH predictors, namely TMHMM, DAS and PHDhtm.

Information on Proteins

<figtable id="table_TMH_info"> <xr nolink id="table_TMH_info"/> Information on the proteins used for the evaluation of different TMH prediction methods.

{| class="wikitable"
! Identifier !! P35462 !! Q9YDF8 !! P47863
|-
! Protein
| D(3) dopamine receptor || Voltage-gated potassium channel || Aquaporin-4
|-
! Organism
| Homo sapiens (Human) || Aeropyrum pernix || Rattus norvegicus (Rat)
|-
! Sequence length
| 400 || 295 || 323
|-
! Subcellular location
| Cell membrane; Multi-pass membrane protein || Cell membrane; Multi-pass membrane protein || Membrane; Multi-pass membrane protein
|-
! PDB Identifier
| 3PBL || 1ORQ || 2D57
|-
! Structure
| CD 3PBL.jpg || CD 1ORQ.jpg || CD 2D57.jpg
|}

</figtable>

Aspartoacylase

For our protein, the TMH prediction yielded the expected result: all residues were predicted to be cytoplasmic.


P35462

For P35462 there is only one structure listed in UniProt: 3pbl. For this structure, OPM and PDBTM list 7 TMH, which is the same number of TMH found in the UniProt annotation for P35462. There is only a slight difference in the localization of the TMH: the helix boundaries in these three references usually differ by about 1-4 amino acid residues.

Except for DAS, all prediction methods yield essentially the same TMH. Furthermore, Polyphobius, TMHMM and PHDhtm predict the TMH with about the same small deviation that exists between the annotations of UniProt, PDBTM and OPM.
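The size of these boundary differences can be checked directly from the residue ranges given in <xr id="table_3pbl"/>. The following is only a minimal sketch: the function names are hypothetical, and it assumes that the helices of two sources can be paired simply by their order in the table.

```python
# Helix boundaries for P35462 (1-based, inclusive), copied from table_3pbl.
uniprot = [(33, 55), (66, 88), (105, 126), (150, 170), (188, 212), (330, 351), (367, 388)]
pdbtm = [(35, 52), (68, 84), (109, 123), (152, 166), (191, 206), (334, 347), (368, 382)]
opm = [(34, 52), (67, 91), (101, 126), (150, 170), (187, 209), (330, 351), (363, 386)]


def boundary_deviations(reference, other):
    """Absolute (start, end) differences for helices paired by order (assumption)."""
    return [
        (abs(r_start - o_start), abs(r_end - o_end))
        for (r_start, r_end), (o_start, o_end) in zip(reference, other)
    ]


def max_deviation(reference, other):
    """Largest start or end difference over all paired helices."""
    return max(max(pair) for pair in boundary_deviations(reference, other))


for name, segments in (("PDBTM", pdbtm), ("OPM", opm)):
    print(name, boundary_deviations(uniprot, segments), max_deviation(uniprot, segments))
```

The same two functions can be applied to the predicted helices of Polyphobius, TMHMM or PHDhtm to compare their deviation from UniProt with the deviation between the reference sources.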


<xr id="table_3pbl"/> lists the exact localization of the TMH for the reference sources UniProt, PDBTM and OPM as well as for the prediction methods Polyphobius, TMHMM, DAS and PHDhtm. <xr id="CD_tm_3pbl"/> depicts the length distribution of the predicted and annotated TMH. One can see that PDBTM in general finds shorter TMH, whereas Polyphobius and OPM find longer helices. Furthermore, the location of the TMH within the sequence is visualized in <xr id="vis_3pbl"/>.


<figtable id="table_3pbl" >

<xr nolink id="table_3pbl"/> AA Position of the predicted/annotated TMH for different methods/sources
{| class="wikitable"
! UniProt !! PDBTM !! OPM !! Polyphobius !! TMHMM !! DAS (cutoff 2.2) !! PHDhtm
|-
| 33-55 || 35-52 || 34-52 || 30-55 || 32-54 || - || 31-55
|-
| 66-88 || 68-84 || 67-91 || 66-88 || 67-89 || 85-101 || 65-90
|-
| 105-126 || 109-123 || 101-126 || 105-126 || 104-126 || 117-139 || 101-130
|-
| 150-170 || 152-166 || 150-170 || 150-170 || 150-172 || 155-171 || 151-170
|-
| 188-212 || 191-206 || 187-209 || 188-212 || 192-214 || 202-219 || 188-213
|-
| 330-351 || 334-347 || 330-351 || 329-352 || 331-353 || 241-261 || 331-353
|-
| 367-388 || 368-382 || 363-386 || 367-386 || 368-390 || 381-398 || 362-387
|}

</figtable>


<figure id="CD_tm_3pbl">
<xr nolink id="CD_tm_3pbl"/>
</figure>
<figure id="vis_3pbl">
<xr nolink id="vis_3pbl"/>
</figure>
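The length comparison for P35462 can be reproduced from the residue ranges in <xr id="table_3pbl"/>. This is a sketch rather than the script used for the figure: the variable and function names are hypothetical, helix length is taken as end minus start plus one, and DAS is left out because it has no entry for the first helix.

```python
# TMH boundaries for P35462 (1-based, inclusive), copied from table_3pbl.
tmh = {
    "UniProt": [(33, 55), (66, 88), (105, 126), (150, 170), (188, 212), (330, 351), (367, 388)],
    "PDBTM": [(35, 52), (68, 84), (109, 123), (152, 166), (191, 206), (334, 347), (368, 382)],
    "OPM": [(34, 52), (67, 91), (101, 126), (150, 170), (187, 209), (330, 351), (363, 386)],
    "Polyphobius": [(30, 55), (66, 88), (105, 126), (150, 170), (188, 212), (329, 352), (367, 386)],
    "TMHMM": [(32, 54), (67, 89), (104, 126), (150, 172), (192, 214), (331, 353), (368, 390)],
    "PHDhtm": [(31, 55), (65, 90), (101, 130), (151, 170), (188, 213), (331, 353), (362, 387)],
}


def helix_lengths(segments):
    """Length of each helix, counting both boundary residues."""
    return [end - start + 1 for start, end in segments]


mean_length = {
    name: sum(helix_lengths(segments)) / len(segments)
    for name, segments in tmh.items()
}

# Print the sources from shortest to longest mean helix length.
for name, value in sorted(mean_length.items(), key=lambda item: item[1]):
    print(f"{name:12s} {value:5.1f}")
```

Sorting by mean length makes it easy to see which source tends to report the shortest helices.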

P47863

In UniProt there are several structures listed for P47863:

  • 2D57 X-ray 3.20 A
  • 2ZZ9 X-ray 2.80 A
  • 3IYZ electron microscopy 10.00 A

Since 2ZZ9 is a mutant, we decided to use 2D57 as a reference structure with OPM and PDBTM.

Interestingly, OPM lists 8 TMH for P47863, whereas PDBTM agrees with the UniProt annotation and lists 6 TMH. Yet, the two additional helices in OPM are rather short (<10 AA) and correspond to two loop segments in the PDBTM annotation.

Just as there is disagreement between the reference sources, the different prediction methods yield deviating results. Polyphobius and TMHMM predict 6 helices, which correspond to the 6 helices listed in UniProt, PDBTM and OPM. PHDhtm finds only 5 helices, of which helix 2 is about 60 amino acid residues long and spans helices 2 and 3 found by the other methods. This long helix also incorporates the loop region annotated in PDBTM and the additional helix listed in OPM. PHDhtm therefore merged these 3 structural elements into one helical region. DAS yields very divergent results, as already observed for P35462.
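Such merged helices can be detected automatically by checking which predicted segments overlap more than one annotated helix. The sketch below uses hypothetical function names. The UniProt ranges are the first column of <xr id="table_2D57"/>; the PHDhtm ranges are an assumption, read as the last entry of each table row that contains a PHDhtm value.

```python
# UniProt TMH annotation for P47863 (1-based, inclusive).
uniprot = [(37, 57), (65, 85), (116, 136), (156, 176), (185, 205), (232, 252)]
# PHDhtm prediction; column assignment is an assumption (see text above).
phdhtm = [(34, 56), (70, 137), (156, 176), (190, 210), (224, 250)]


def overlaps(a, b):
    """True if two inclusive residue ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def merged_helices(predicted, reference):
    """Map each predicted segment that covers more than one reference helix
    to the 1-based indices of the reference helices it overlaps."""
    result = {}
    for segment in predicted:
        hits = [i + 1 for i, ref in enumerate(reference) if overlaps(segment, ref)]
        if len(hits) > 1:
            result[segment] = hits
    return result


print(merged_helices(phdhtm, uniprot))
```

Under these assumptions, only the long PHDhtm segment 70-137 is reported, together with UniProt helices 2 and 3.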


<xr id="table_2D57"/> lists the exact localization of the TMH for the reference sources UniProt, PDBTM and OPM as well as for the prediction methods Polyphobius, TMHMM, DAS and PHDhtm. <xr id="CD_tm_2D57"/> depicts the length distribution of the TMH predicted with Polyphobius and of the annotated TMH. One can see that PDBTM in general finds shorter helices, whereas OPM and Polyphobius find longer ones. Furthermore, the location of the TMH within the sequence is visualized in <xr id="vis_2D57"/>.


<figtable id="table_2D57" >

<xr nolink id="table_2D57"/> AA Position of the predicted/annotated TMH for different methods/sources
UniProt PDBTM OPM Polyphobius TMHMM DAS(cutoff 2.2) PHDhtm
37-57 39-55 34-56 34-58 33-55 34-56
65-85 72-89 70-88 70-91 70-92 70-137
95-106(loop) 98-107 87-99 70-137(cont)
116-136 116-133 112-136 115-136 112-134 115-128 70-137(cont)
156-176 158-177 156-178 156-177 154-176 169-182 156-176
185-205 188-205 189-203 188-208 189-211 190-210
209-222(loop) 214-223 205-219
232-252 231-248 231-252 231-252 231-253 235-247 224-250
281-291

</figtable>


<figure id="CD_tm_2D57">
<xr nolink id="CD_tm_2D57"/>
</figure>
<figure id="vis_2D57">
<xr nolink id="vis_2D57"/>
</figure>

Q9YDF8

For Q9YDF8, Polyphobius did not find any homologues with the BLAST search. Therefore, no homology information could be used for the TMH prediction. UniProt annotates 6 TM regions and 2 intramembrane regions for this protein and lists four structures:

  • 1ORQ X-ray 3.20 A 31-253
  • 1ORS X-ray 1.90 A 33-160
  • 2A0L X-ray 3.90 A 20-259
  • 2KYH NMR - 19-160

Since only residues 33-160 were crystallized in 1ORS, we decided to use 1ORQ for comparison with the Polyphobius output. The TMH prediction by Polyphobius in general coincides with the UniProt annotation. However, OPM and PDBTM list very divergent results: there is a consensus only on TMH 5 and 7. In <xr id="CD_tm_Q9YDF8"/> the length distribution for the TMH prediction is presented.

When comparing the annotation of OPM for the two structures 1orq and 1ors, one can find tremendous differences:

  • 1ors: C - Tilt: 19° - Segments: 1(25-46), 2(55-78), 3(86-97), 4(100-107), 5(117-148)
  • 1orq: C - Tilt: 31° - Segments: 1(153-172), 2(183-195), 3(207-225)

Yet, if one takes into account the sequence shift of 13 AA between the 1orq PDB sequence and the Q9YDF8 UniProt sequence (see <xr id="seq_shift"/>), both annotations together represent the TMH identified with Phobius. <figure id="seq_shift"> 1orq seq shift.png</figure>
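The renumbering can be sketched as follows. The names are hypothetical, and the sketch assumes that the same offset of 13 residues applies to the OPM segments of both 1ors and 1orq, as the combined OPM column in <xr id="table_1orq"/> suggests.

```python
# Assumed offset between PDB numbering and Q9YDF8 UniProt numbering.
OFFSET = 13

# OPM segments in PDB numbering (1-based, inclusive), as listed above.
opm_pdb = {
    "1ors": [(25, 46), (55, 78), (86, 97), (100, 107), (117, 148)],
    "1orq": [(153, 172), (183, 195), (207, 225)],
}


def to_uniprot(segments, offset=OFFSET):
    """Shift PDB-numbered segments into UniProt numbering."""
    return [(start + offset, end + offset) for start, end in segments]


# Combine both structures into one list sorted by position in the sequence.
combined = sorted(to_uniprot(opm_pdb["1ors"]) + to_uniprot(opm_pdb["1orq"]))

for start, end in combined:
    print(f"{start}-{end}")
```

The shifted segments can then be compared directly with the UniProt annotation and the predictions, which all use UniProt numbering.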

Polyphobius finds 7 TMH, which agrees with the UniProt annotation:


<figtable id="table_1orq" >

<xr nolink id="table_1orq"/> AA Position of the predicted/annotated TMH for different methods/sources
UniProt PDBTM OPM(=1ors&1orq) Polyphobius TMHMM PHDhtm
39 – 63 34-65(1ORS:40-63) 38-59 39-61 42-60 42-64
68 – 92 70-93(1ORS:68-88) 68-91 68-88 68-87 69-88
99-110 42-60
109 – 125 (1ORS:101-120) 113-120 108-129 107-129 107-149
129 – 145 (1ORS:131-155) 130-161 137-157 107-149(cont)
160 – 184 164-184 166-185 163-184 162-184 162-181
196 – 208(intramembrane) 196-208 196-213 199-218 197-212
222 – 253 222-249 220-238 224-244 225-244 220-247

</figtable>

<figure id="CD_tm_Q9YDF8">
<xr nolink id="CD_tm_Q9YDF8"/>
</figure>
<figure id="vis_1orq">
<xr nolink id="vis_1orq"/>
</figure>

other methods

Comparing the outputs of TMHMM with the annotations from UniProt, PDBTM and OPM, one finds that the prediction is very accurate and comparable with the Polyphobius prediction. In contrast, DAS gives completely different results. It finds a comparable number of TMH, but at different locations within the protein. DAS does not find TMH at the N-terminal end of the protein (confusion with signal peptides?) and instead places TMH at the C-terminal end.

signal peptides

We checked for possible confusion between TMH and signal peptides with SignalP 4.0:

SignalP 4.0 prediction for P35462: no signal peptide was found
SignalP 4.0 prediction for Q9YDF8: no signal peptide was found
SignalP 4.0 prediction for P47863: no signal peptide was found

Signal Peptide Prediction

Information on Proteins

{| class="wikitable"
! Identifier !! P02768 !! P11279 !! P47863
|-
! Protein
| Serum albumin || Lysosome-associated membrane glycoprotein 1 || Aquaporin-4
|-
! Organism
| Homo sapiens (Human) || Homo sapiens (Human) || Rattus norvegicus (Rat)
|-
! Sequence length
| 609 || 417 || 323
|-
! Subcellular location
| Secreted || Lysosome membrane, Single-pass type I membrane protein || Membrane; Multi-pass membrane protein
|-
! PDB Identifier
| 1E7I || - || 2D57
|-
! Structure
| CD 1E7I.jpg || || CD 2D57.jpg
|}


GO terms and Pfam

Pfam

AstE_AspA family: Succinylglutamate desuccinylase / Aspartoacylase family