Canavan Task 3 - Sequence-based predictions
Oh, I would sing of mackerel skies, And why the sea is wet, Of jelly-fish and conger-eels, And things that I forget. (taken from "The Cumberbunce" by Paul West)
Protocol
Commands, source code and other methodological issues are kept in the protocol.
Secondary Structure Prediction
Information on Proteins
//TODO: pics
Prediction of disordered regions
Transmembrane Helix Prediction
We analyzed the prediction of transmembrane helices (TMH) for the proteins listed in <xr id="table_TMH_info"/> and for our protein Aspartoacylase. In addition to Polyphobius, we also examined the results of other TMH predictors, namely TMHMM, DAS and PHDthm.
Information on Proteins
<figtable id="table_TMH_info">
</figtable>
Aspartoacylase
TMH prediction for our protein yielded the expected result: only cytoplasmic residues and no TMH.
P35462
<figure id="CD_tm_3pbl">
</figure>
Polyphobius found 7 TMH for P35462. This is in accordance with the annotation for transmembrane regions in UniProt. There is only one structure listed for P35462: 3pbl. The annotation for TMH for this structure found in PDBTM and OPM also agrees with the Polyphobius result.
In <xr id="CD_tm_3pbl"/> the length distribution for the predicted and annotated TMH is depicted. One can see that PDBTM in general finds shorter TMH, whereas Polyphobius and OPM find longer helices.
<figtable id="table_2D57">
UniProt | PDBTM | OPM | Polyphobius | TMHMM | DAS(2.2 cutoff) | PHDthm |
33-55 | 35-52 | 34-52 | 30-55 | 32-54 | - | 31-55 |
66-88 | 68-84 | 67-91 | 66-88 | 67-89 | 85-101 | 65-90 |
105-126 | 109-123 | 101-126 | 105-126 | 104-126 | 117-139 | 101-130 |
150-170 | 152-166 | 150-170 | 150-170 | 150-172 | 155-171 | 151-170 |
188-212 | 191-206 | 187-209 | 188-212 | 192-214 | 202-219 | 188-213 |
330-351 | 334-347 | 330-351 | 329-352 | 331-353 | 241-261 | 331-353 |
367-388 | 368-382 | 363-386 | 367-386 | 368-390 | 381-398 | 362-387 |
</figtable>
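The length comparison can be reproduced directly from the table. The following is a minimal sketch, not the procedure used to produce the figure: it copies the boundaries of four columns (TMHMM, DAS and PHDthm are left out for brevity), treats every range as inclusive, and reports the length of each helix and the mean per source. The names `segments` and `helix_lengths` are our own.

```python
from statistics import mean

# Helix boundaries for P35462, copied from the table above
# (residue numbers, both boundaries counted as part of the helix).
segments = {
    "UniProt": [(33, 55), (66, 88), (105, 126), (150, 170),
                (188, 212), (330, 351), (367, 388)],
    "PDBTM": [(35, 52), (68, 84), (109, 123), (152, 166),
              (191, 206), (334, 347), (368, 382)],
    "OPM": [(34, 52), (67, 91), (101, 126), (150, 170),
            (187, 209), (330, 351), (363, 386)],
    "Polyphobius": [(30, 55), (66, 88), (105, 126), (150, 170),
                    (188, 212), (329, 352), (367, 386)],
}


def helix_lengths(helices):
    """Return the length of each helix, counting both boundary residues."""
    return [end - start + 1 for start, end in helices]


lengths = {source: helix_lengths(helices) for source, helices in segments.items()}

for source, values in lengths.items():
    print(f"{source:12s} lengths={values} mean={mean(values):.1f}")
```

Counting both boundary residues is an assumption about how the ranges are meant. With exclusive ends, every length would be one residue shorter, but the ranking of the sources would not change.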
P47863
<figure id="CD_tm_2D57">
</figure>
Polyphobius found 6 TMH for P47863. This is in accordance with the annotation for transmembrane regions in UniProt. There are three structures listed for P47863:
- 2D57 X-ray 3.20 A
- 2ZZ9 X-ray 2.80 A
- 3IYZ electron microscopy 10.00 A
Since 2ZZ9 is a mutant, we decided to use 2D57 for comparison with the Polyphobius output. Interestingly, OPM lists 8 TMH for P47863, whereas PDBTM agrees with the UniProt annotation and the Polyphobius output. The two additional TMH found by OPM are rather short (<10 AA) and might therefore be neglected. In <xr id="table_2D57"/> the AA positions for the predicted and annotated TMH are given.
<figtable id="table_2D57">
UniProt | Polyphobius | PDBTM | OPM |
37-57 | 34-58 | 39-55 | 34-56 |
65-85 | 70-91 | 72-89 | 70-88 |
95-106(loop) | | | 98-107 |
116-136 | 115-136 | 116-133 | 112-136 |
156-176 | 156-177 | 158-177 | 156-178 |
185-205 | 188-208 | 188-205 | 189-203 |
209-222(loop) | | | 214-223 |
232-252 | 231-252 | 231-248 | 231-252 |
</figtable>
For the remaining TMH, all three methods find TMH of about the same length.
In <xr id="CD_tm_2D57"/> the length distribution for the predicted and annotated TMH is depicted.
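The positions in the table for P47863 can also be compared residue by residue. The sketch below is an illustration under our own assumptions, not part of the original analysis: it expands the six helices of Polyphobius, UniProt and PDBTM into sets of residue positions (inclusive ranges, loop rows left out) and computes the Jaccard index, i.e. shared residues divided by residues covered by either source. The function names `residues` and `jaccard` are hypothetical helpers.

```python
def residues(helices):
    """Expand inclusive (start, end) ranges into a set of residue positions."""
    return {pos for start, end in helices for pos in range(start, end + 1)}


def jaccard(a, b):
    """Shared residues divided by residues present in either set."""
    return len(a & b) / len(a | b)


# Helix boundaries for P47863, copied from the table (loop rows omitted).
polyphobius = [(34, 58), (70, 91), (115, 136), (156, 177), (188, 208), (231, 252)]
references = {
    "UniProt": [(37, 57), (65, 85), (116, 136), (156, 176), (185, 205), (232, 252)],
    "PDBTM": [(39, 55), (72, 89), (116, 133), (158, 177), (188, 205), (231, 248)],
}

predicted = residues(polyphobius)
agreement = {
    name: jaccard(predicted, residues(helices))
    for name, helices in references.items()
}

for name, score in agreement.items():
    print(f"Polyphobius vs {name}: Jaccard = {score:.2f}")
```

A value of 1 would mean identical helix boundaries. The index penalizes both missing and additional residues, so a source with consistently shorter helices scores below 1 even when every helix is found.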
Q9YDF8
<figure id="CD_tm_Q9YDF8">
</figure>
For Q9YDF8, Polyphobius did not find any homologues with the BLAST search. Therefore, no homology information could be used for the TMH prediction.
UniProt annotates 6 TM regions and 2 intramembrane regions for this protein and lists four structures:
- 1ORQ X-ray 3.20 A 31-253
- 1ORS X-ray 1.90 A 33-160
- 2A0L X-ray 3.90 A 20-259
- 2KYH NMR - 19-160
Since only residues 33-160 have been crystallized in 1ORS, we decided to use 1ORQ for comparison with the Polyphobius output. The TMH prediction by Polyphobius in general coincides with the UniProt annotation. However, OPM and PDBTM list very diverse results. There is consensus only on TMH 5 and 7. In <xr id="CD_tm_Q9YDF8"/> the length distribution for the TMH prediction is presented.
When comparing the annotation of OPM for the two structures 1orq and 1ors, one can find tremendous differences:
- 1ors: C - Tilt: 19° - Segments: 1(25-46), 2(55-78), 3(86-97), 4(100-107), 5(117-148)
- 1orq: C - Tilt: 31° - Segments: 1(153-172), 2(183-195), 3(207-225)
Yet, if one considers the sequence shift of 13 AA between the 1orq PDB sequence and the Q9YDF8 UniProt sequence (see <xr id="seq_shift"/>), both annotations together represent the TMH identified by Polyphobius.
<figure id="seq_shift">
</figure>
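The renumbering can be written out explicitly. This is a minimal sketch that assumes the shift is a constant +13 from the structure numbering to the UniProt numbering and that it applies to the OPM segments of both 1ors and 1orq. The names `OFFSET` and `to_uniprot` are our own.

```python
# Assumed constant shift from PDB residue numbering to UniProt numbering.
OFFSET = 13

# OPM segments in PDB numbering, copied from the two list entries above.
opm_pdb = {
    "1ors": [(25, 46), (55, 78), (86, 97), (100, 107), (117, 148)],
    "1orq": [(153, 172), (183, 195), (207, 225)],
}


def to_uniprot(helices, offset=OFFSET):
    """Shift every (start, end) range by a constant offset."""
    return [(start + offset, end + offset) for start, end in helices]


opm_uniprot = {pdb_id: to_uniprot(helices) for pdb_id, helices in opm_pdb.items()}

# Merge the segments of both structures into one list ordered by position.
combined = sorted(seg for helices in opm_uniprot.values() for seg in helices)

for start, end in combined:
    print(f"{start}-{end}")
```

Under this assumption the first 1ors segment, 25-46, becomes 38-59, and the last 1orq segment, 207-225, becomes 220-238.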
Polyphobius finds 7 TMH, which correlates with the UniProt annotation:
UniProt | Polyphobius | PDBTM | OPM(=1ors&1orq) |
39-63 | 42-60 | 34-65(1ors:40-63) | 38-59 |
68-92 | 68-88 | 70-93(1ors:68-88) | 68-91 |
99-110 | |||
109-125 | 108-129 | 1ors:101-120 | 113-120 |
129-145 | 137-157 | 1ors:131-155 | 130-161 |
160-184 | 163-184 | 164-184 | 166-185 |
196-208(intramembrane) | 196-213 | 196-208 | |
222-253 | 224-244 | 222-249 | 220-238 |
other methods
We checked the results from other TMH prediction methods.
- TMHMM
  - P35462 7 TMH found:
  - Q9YDF8 6 TMH found: 39-61, 68-87, 107-129, 162-184, 199-218, 225-244
  - P47863 6 TMH found: 33-55, 70-92, 112-134, 154-176, 189-211, 231-253
  - P45381 none found
- DAS
  - P35462 6 TMH found (with 2.2 cutoff): 85-101, 117-139, 155-171, 202-219, 241-261, 381-398
  - Q9YDF8 7 TMH found (with 2.2 cutoff): 135-148, 160-174, 200-209, 216-234, 256-271, 291-295, 314-333
  - P47863 6 TMH found (with 2.2 cutoff): 87-99, 115-128, 169-182, 205-219, 235-247, 281-291
  - P45381 none found
Comparing the outputs of TMHMM with the annotations from UniProt, PDBTM and OPM, one finds that the prediction is very accurate and comparable to the Polyphobius prediction. In contrast, DAS gives completely different results. It finds a comparable number of TMH, but at different locations within the protein. DAS does not find TMH at the N-terminal end of the protein (confusion with signal peptides?) and instead places TMH toward the C-terminal end.
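One way to make this comparison less dependent on eyeballing the ranges is to count how many annotated helices are recovered by each predictor. The sketch below does this for P47863, using the UniProt ranges from the table in the P47863 section and the TMHMM and DAS ranges listed above. The criterion is our own assumption, not a standard taken from the tools: an annotated helix counts as recovered if some predicted helix shares at least `min_overlap` residues with it, with a hypothetical default of 10.

```python
def overlap(a, b):
    """Number of residues shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def matched(annotated, predicted, min_overlap=10):
    """Count annotated helices that share at least min_overlap residues
    with at least one predicted helix."""
    return sum(
        any(overlap(a, p) >= min_overlap for p in predicted)
        for a in annotated
    )


# P47863: UniProt annotation (loop rows omitted) and the two predictions.
uniprot = [(37, 57), (65, 85), (116, 136), (156, 176), (185, 205), (232, 252)]
predictions = {
    "TMHMM": [(33, 55), (70, 92), (112, 134), (154, 176), (189, 211), (231, 253)],
    "DAS": [(87, 99), (115, 128), (169, 182), (205, 219), (235, 247), (281, 291)],
}

recovered = {name: matched(uniprot, helices) for name, helices in predictions.items()}

for name, count in recovered.items():
    print(f"{name}: {count} of {len(uniprot)} annotated helices recovered")
```

The threshold is arbitrary. Lowering it makes partial overlaps count as hits, so it is worth reporting the value together with the counts.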
signal peptides
Checking for possible confusion between TMH and signal peptides with SignalP 4.0
Signal Peptide Prediction
Information on Proteins
GO terms and Pfam
Pfam
AstE_AspA family: Succinylglutamate desuccinylase / Aspartoacylase family