Sequence-Based Predictions Hemochromatosis
Qc terxepssrw evi jypp sj aiewipw. Mrjsvq xli Uyiir, ws xlex wli qmklx wlss xliq eaec. Livi ai ks 'vsyrh xli qypfivvc fywl. Ks qsroic KS!
Aqw vjkpm K co etcba, dwv vjga ycpv aqw vq vjkpm vjcv. K mpqy ugetgvu. Mggr vjg rcpvcnqqpu. Cnycau mggr vjg rcpvcnqqpu.
Don't google it... but a hint: Caesar would solve it ;)
Short Task Description
Detailed description: Sequence-Based Predictions
- TODO: Task description
- TODO: Table numbers (once all tables are finished)
A protocol with a description of the data acquisition and other scripts used for this task is available here.
IUPred was employed to find disordered regions within HFE (Q30201), RNH1 (P10775), PPP3CA (Q08209), and cutA (Q9X0E6). The results are shown in <xr id="iupred"/>. DisProt was used to validate the predictions.
As shown in the upper left figure (<xr id="iupred"/>), IUPred predicts two short, possibly disordered regions for Q30201 (around residues 250 and 285). DisProt has no entry for Q30201 that would support this, and a sequence search (PsiBlast) against DisProt did not yield any significant hits.
For P10775 no disordered regions are predicted (upper right figure in <xr id="iupred"/>). There is also no entry in DisProt. A PsiBlast search results in one significant hit (DP00554), but the alignment does not include the hit's disordered region (31-50).
DisProt does have an entry for Q08209 (DP00092). A PsiBlast search also returns an additional significant hit (DP00365), but the alignment does not contain the disordered region (19-147), so it can be discarded. A comparison between the DisProt map (<xr id="map92"/>) and the IUPred prediction (lower left figure in <xr id="iupred"/>) shows that the prediction is largely correct, although IUPred places a small ordered region at the end of the protein, which should be disordered. Residues 374-486 are known to undergo a disorder-to-order transition, which might explain IUPred's ambiguous prediction for this section.
Neither IUPred (lower right figure in <xr id="iupred"/>) nor DisProt suggest any disordered regions for Q9X0E6.
IUPred seems to be quite accurate for fully ordered proteins (P10775, Q9X0E6, and, apart from the small peaks, Q30201), but it seems to struggle with disordered regions that undergo a disorder-to-order transition.
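The step from a per-residue IUPred profile to the "disordered regions" discussed above can be sketched as follows. This is only a minimal illustration, not the script used for this task: the function name `disordered_regions`, the `min_len` parameter, and the example scores are hypothetical, and the cut-off of 0.5 is the threshold commonly used to read IUPred scores.

```python
from typing import List, Tuple


def disordered_regions(
    scores: List[float], threshold: float = 0.5, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (start, end) ranges whose scores exceed the threshold."""
    regions: List[Tuple[int, int]] = []
    start = None  # start position of the region currently being extended
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos
        elif start is not None:
            # The region ended at the previous residue.
            if pos - start >= min_len:
                regions.append((start, pos - 1))
            start = None
    # Close a region that runs to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_len:
        regions.append((start, len(scores)))
    return regions


# Made-up scores for illustration only (not taken from any of the four proteins).
example_scores = [0.1, 0.6, 0.7, 0.2, 0.9]
example_regions = disordered_regions(example_scores)
```

Raising `min_len` suppresses isolated peaks, such as the short regions predicted for Q30201, at the cost of possibly missing genuinely short disordered segments.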
Transmembrane helices were predicted with PolyPhobius for HFE (Q30201), DRD3 (P35462), Aquaporin-4 (P47863), and KvAP (Q9YDF8). The results were compared to OPM, PDBTM, and UniProt. The PDB IDs for OPM and PDBTM were chosen based on the following criteria:
- wildtype over mutant
- higher coverage
- better resolution
UniProt -> PDB mapping:
- P35462 -> 3PBL
- P47863 -> 2D57
- Q9YDF8 -> 1ORQ/1ORS
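The selection criteria above could be applied as a simple sort, sketched below. Treating the three criteria as a strict priority order is an assumption, as are the class `PdbCandidate`, its field names, and the placeholder entries; none of the values are real PDB data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PdbCandidate:
    pdb_id: str
    is_mutant: bool    # True if the structure carries mutations
    coverage: float    # fraction of the UniProt sequence covered (0..1)
    resolution: float  # in Angstrom; lower is better


def rank_candidates(candidates: List[PdbCandidate]) -> List[PdbCandidate]:
    """Sort candidates: wildtype first, then higher coverage, then better resolution."""
    return sorted(
        candidates,
        key=lambda c: (c.is_mutant, -c.coverage, c.resolution),
    )


# Placeholder entries for illustration only.
candidates = [
    PdbCandidate("X1", is_mutant=True, coverage=0.9, resolution=1.5),
    PdbCandidate("X2", is_mutant=False, coverage=0.8, resolution=3.0),
    PdbCandidate("X3", is_mutant=False, coverage=0.8, resolution=2.0),
]
best = rank_candidates(candidates)[0]
```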
PolyPhobius predicts only one transmembrane helix for Q30201 (see <xr id="tmh_q30201"/>). Neither OPM nor PDBTM has an entry for any of its PDB IDs, but UniProt lists a TMH that matches the predicted one almost exactly (shifted by one residue).
For P35462, all methods list 7 transmembrane helices (<xr id="tmh_p35462"/>), and their positions are consistent across all methods.
|P35462 (3PBL)||TMH 1||TMH 2||TMH 3||TMH 4||TMH 5||TMH 6||TMH 7|
PolyPhobius, UniProt, and PDBTM list 6 TMHs for P47863, whereas OPM lists two additional TMHs (see <xr id="tmh_p47863"/>). PDBTM labels these two regions as "Membrane Loop", which might explain the false entries in OPM.
|P47863 (2D57)||TMH 1||TMH 2||TMH 3||TMH 4||TMH 5||TMH 6||TMH 7||TMH 8|
Q9YDF8 seems to be the hardest protein to predict TMHs for (cf. <xr id="tmh_q9ydf8"/>). PolyPhobius predicts one more TMH than UniProt lists; for OPM and PDBTM, two PDB entries are needed to cover all TMHs (including the "false" ones). The residue numbering of both PDB entries was adjusted for a shift of 13 residues.
PolyPhobius predicted a region (TMH7) that UniProt labels as "Intramembrane - Pore-Forming" as a (false) TMH. OPM also includes this region, plus an additional one that UniProt labels as "Intramembrane - Helical". PDBTM lists TMH7 as "Membrane Loop".
|Q9YDF8 (1ORQ/1ORS)||TMH 1||TMH 2||TMH 3||TMH 4||TMH 5||TMH 6||TMH 7||TMH 8|
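The numbering adjustment and the helix-by-helix comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the sign of the offset (the text gives only its size, 13 residues), the minimum overlap of 5 residues, and all function names and example ranges are hypothetical.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # 1-based, inclusive (start, end)


def shift_ranges(ranges: List[Range], offset: int) -> List[Range]:
    """Move every range by a fixed residue offset (e.g. PDB -> UniProt numbering)."""
    return [(start + offset, end + offset) for start, end in ranges]


def overlap(a: Range, b: Range) -> int:
    """Number of residues shared by two inclusive ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def match_helices(
    predicted: List[Range], reference: List[Range], min_overlap: int = 5
) -> List[Tuple[Range, Optional[Range]]]:
    """Pair each predicted helix with its best-overlapping reference helix, if any."""
    pairs: List[Tuple[Range, Optional[Range]]] = []
    for helix in predicted:
        best = max(reference, key=lambda ref: overlap(helix, ref), default=None)
        if best is not None and overlap(helix, best) >= min_overlap:
            pairs.append((helix, best))
        else:
            pairs.append((helix, None))  # no counterpart: candidate false positive
    return pairs


# Made-up ranges for illustration only.
pdb_helices = shift_ranges([(10, 30)], offset=13)
pairs = match_helices([(23, 43), (100, 120)], [(25, 45)])
```

Predicted helices paired with `None` are candidates for false positives, and the start and end differences within each matched pair give the per-helix shift.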
TODO: mean shifts, false/true positives, length (probably finished by sunday)
TODO: score description
SignalP (Webserver 4.0) predictions were made for HFE (Q30201), Aquaporin-4 (P47863), Lysosome-associated membrane glycoprotein 1 (P11279), and Serum albumin (P02768) in order to find signal peptides within these sequences. The results are shown in <xr id="signalp"/> and were compared to the corresponding entries in UniProt.
According to UniProt, all four predictions are exactly correct:
- Q30201: signal peptide 1-22
- P47863: no signal peptide
- P11279: signal peptide 1-28
- P02768: signal peptide 1-18
For these four proteins, SignalP's predictions were fully correct, which makes it a strong candidate for signal peptide prediction.
For the last part of this task, we used GOPET and ProtFun to predict GO terms for the HFE protein (Q30201). We also searched for Pfam families. The results were then compared to UniProt and QuickGO.
GOPET predicts only two GO terms for our protein (see <xr id="gopet"/>), and even these are somewhat redundant (both describe receptor activity). The results are at least plausible, since HFE shows a receptor-like activity: it binds to the transferrin receptor (TFR).
|GO:0004872||F (Molecular Function Ontology)||91%||receptor activity|
|GO:0030106||F (Molecular Function Ontology)||88%||MHC class I receptor activity|
The results of the ProtFun prediction are shown in <xr id="protfun"/>. To keep the table short, predictions with both a probability below 0.1 and odds below 1.0 are not shown. ProtFun predicts "cell envelope" as the functional category, which is correct since the HFE-TFR complex is located in the membrane. "Transport and binding" also has a high probability, which is consistent with HFE's role in iron transport within the body. HFE is categorized as "Nonenzyme" and no enzyme class was predicted. It is further predicted to be involved in "Immune response", as it is a protein of the major histocompatibility complex (MHC) class I.
|Biosynthesis of cofactors||0.105||1.452|
|Central intermediary metabolism||0.231||3.663|
|Fatty acid metabolism||0.016||1.265|
|Purines and pyrimidines||0.583||2.400|
|Transport and binding||0.732||1.785|
|Gene Ontology category|
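The filter applied to the ProtFun table can be sketched as follows. It assumes that a row is hidden only when both its probability and its odds fall below the cut-offs, which matches the fact that "Fatty acid metabolism" (probability 0.016, odds 1.265) is still listed. The names `Prediction` and `visible` are hypothetical, and the last row is made up to show a filtered entry.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    category: str
    prob: float
    odds: float


def visible(
    predictions: List[Prediction], min_prob: float = 0.1, min_odds: float = 1.0
) -> List[Prediction]:
    """Keep a row unless BOTH its probability and its odds are below the cut-offs."""
    return [
        p for p in predictions
        if not (p.prob < min_prob and p.odds < min_odds)
    ]


rows = [
    Prediction("Biosynthesis of cofactors", 0.105, 1.452),
    Prediction("Central intermediary metabolism", 0.231, 3.663),
    Prediction("Fatty acid metabolism", 0.016, 1.265),
    Prediction("Purines and pyrimidines", 0.583, 2.400),
    Prediction("Transport and binding", 0.732, 1.785),
    Prediction("Made-up category", 0.050, 0.500),  # hypothetical row, not a ProtFun result
]
shown = visible(rows)
```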
Pfam lists two significant results for Q30201:
- MHC_I - Class I Histocompatibility antigen, domains alpha 1 and 2 (E-value 3.5e-43)
- C1-set - Immunoglobulin C1-set domain (E-value 2.8e-18)
MHC class I proteins are strongly involved in immune responses. UniProt also lists HFE in the MHC class I family, and its structure (three extracellular domains, transmembrane region, cytoplasmic tail) fits this classification. C1-set domains are associated with MHC class I proteins, and HFE indeed contains such a domain (residues 207-298).
Compared to QuickGO, which lists 27 unique GO terms for Q30201, GOPET predicts only two, neither of which is included in QuickGO's list. These two also seem to fit the HFE-TFR complex better than HFE alone, but at least the MHC class I term shows specificity to HFE.
ProtFun's prediction seems more accurate, as it successfully identifies HFE's location within the membrane and lists "Transport and binding" as a good second result. "Immune response" is also in accordance with QuickGO's terms.
Pfam's two predicted families were both true positives, and Pfam was more informative than the other two methods.
Overall, none of them identified HFE's role in iron transport.