Sequence-Based Predictions Hemochromatosis

From Bioinformatikpedia
Revision as of 15:37, 19 May 2012 by Bernhoferm (Disorder)


Qc terxepssrw evi jypp sj aiewipw. Mrjsvq xli Uyiir, ws xlex wli qmklx wlss xliq eaec. Livi ai ks 'vsyrh xli qypfivvc fywl. Ks qsroic KS!

Aqw vjkpm K co etcba, dwv vjga ycpv aqw vq vjkpm vjcv. K mpqy ugetgvu. Mggr vjg rcpvcnqqpu. Cnycau mggr vjg rcpvcnqqpu.

Don't google it... but a hint: Caesar would solve it ;)


Short Task Description

Detailed description: Sequence-Based Predictions


  • TODO: Task description
  • TODO: Table numbers (once all tables are finished)

Protocol


Secondary Structure


Disorder

<figtable id="iupred">

Q30201
P10775
Q08209
Q9X0E6
Table XXX: IUPred predictions for Q30201, P10775, Q08209, and Q9X0E6. The figures show the disorder probability predicted for each amino acid residue (green line) and the 50% threshold (red line).

</figtable>

IUPred was used to predict disordered regions in HFE (Q30201), RNH1 (P10775), PPP3CA (Q08209), and cutA (Q9X0E6). The results are shown in <xr id="iupred"/>. The predictions were validated against DisProt.
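The step from per-residue disorder scores to "disordered regions" can be sketched in a few lines. This is only an illustration, assuming the scores are available as a plain list of floats and that a residue counts as disordered when its score exceeds the 0.5 threshold shown in the figures. The function name `disordered_regions` and the toy scores are made up for this example and are not IUPred output.

```python
def disordered_regions(scores, threshold=0.5, min_length=1):
    """Return 1-based inclusive (start, end) runs where score > threshold."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos  # a new run above the threshold begins here
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Toy scores (hypothetical, NOT taken from any of the four proteins).
toy_scores = [0.1, 0.2, 0.6, 0.7, 0.4, 0.3, 0.8, 0.9, 0.95]
toy_regions = disordered_regions(toy_scores)
print(toy_regions)
```

Raising `min_length` would suppress very short runs, which may be useful when only longer disordered segments are of interest.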


As shown in the upper left figure (<xr id="iupred"/>), Q30201 has two small regions (around residues 250 and 285) that might be disordered. DisProt has no entry for Q30201 that would support this, and a sequence search (PSI-BLAST) against DisProt did not yield any significant hits.

For P10775, no disordered regions are predicted (upper right figure in <xr id="iupred"/>), and DisProt has no entry for it either. A PSI-BLAST search yields one significant hit (DP00554), but the alignment does not cover the hit's disordered region (31-50).

DisProt does have an entry for Q08209 (DP00092). A PSI-BLAST search also yields an additional significant hit (DP00365), but the alignment again does not cover that hit's disordered region (19-147), so the hit can be discarded. A comparison between the DisProt Map (REF) and the IUPred prediction (lower left figure in <xr id="iupred"/>) shows that the prediction is largely correct, although IUPred places a small ordered region at the end of the protein, which should be disordered.
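Whether a PSI-BLAST alignment covers a hit's disordered region comes down to an interval overlap test. The sketch below assumes that both the aligned segment on the hit and the disordered region are given as 1-based inclusive ranges. The disordered region (31-50) is taken from the text, whereas the aligned ranges are hypothetical values chosen only to show both outcomes.

```python
def ranges_overlap(a, b):
    """True if two inclusive (start, end) ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


disordered_region = (31, 50)   # disordered region of DP00554 (from the text)
aligned_outside = (60, 180)    # hypothetical aligned segment on the hit
aligned_inside = (45, 180)     # hypothetical aligned segment on the hit

print(ranges_overlap(disordered_region, aligned_outside))
print(ranges_overlap(disordered_region, aligned_inside))
```

A hit whose aligned segment does not overlap its disordered region gives no evidence about disorder in the query, which is why such hits are discarded above.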


Transmembrane Helices

<figtable id="tmh_q30201">

Q30201 TMH 1
PolyPhobius 306-329
UniProt 307-330
OPM no entry
PDBTM no entry
Table XXX: TMH predictions and annotations for Q30201. There were no entries for either of the two PDB IDs (1A6Z, 1DE4) in OPM or PDBTM.

</figtable>


<figtable id="tmh_p35462">

P35462 (3PBL) TMH 1 TMH 2 TMH 3 TMH 4 TMH 5 TMH 6 TMH 7
PolyPhobius 30-55 66-88 105-126 150-170 188-212 329-352 367-386
UniProt 33-55 66-88 105-126 150-170 188-212 330-351 367-388
OPM 34-52 67-91 101-126 150-170 187-209 330-351 363-386
PDBTM 35-52 68-84 109-123 152-166 191-206 334-347 368-382
Table XXX: TMH predictions and annotations for P35462 (PDB ID: 3PBL).

</figtable>


<figtable id="tmh_p47863">

P47863 (2D57) TMH 1 TMH 2 TMH 3 TMH 4 TMH 5 TMH 6 TMH 7 TMH 8
PolyPhobius 34-58 70-91 - 115-136 156-177 188-208 - 231-252
UniProt 37-57 65-85 - 116-136 156-176 185-205 - 232-252
OPM 34-56 70-88 98-107 112-136 156-178 189-203 214-223 231-252
PDBTM 39-55 72-89 95-106* 116-133 158-177 188-205 209-222* 231-248
Table XXX: TMH predictions and annotations for P47863 (PDB ID: 2D57). TMH3 and TMH7 (marked with *) are listed as "Membrane Loop" in PDBTM. A dash (-) marks a helix that is not reported by the respective source.

</figtable>


<figtable id="tmh_q9ydf8">

Q9YDF8 (1ORQ/1ORS) TMH 1 TMH 2 TMH 3 TMH 4 TMH 5 TMH 6 TMH 7 TMH 8
PolyPhobius 42-60 68-88 108-129 137-157 163-184 196-213 224-244
UniProt 39-63 68-92 97-105* 109-125 129-145 160-184 196-208* 222-253
OPM (1ORS) 38-59 68-91 99-110 113-120 130-161
OPM (1ORQ) 166-185 196-208 220-238
PDBTM (1ORS) 40-63 68-88 101-120 131-155
PDBTM (1ORQ) 34-65 70-93 164-184 197-213* 222-249
Table XXX: TMH predictions and annotations for Q9YDF8 (PDB IDs: 1ORQ, 1ORS). Residue positions are adjusted for the 13-residue offset of the PDB sequence. TMH3 is annotated as "Intramembrane, Helical" in UniProt, TMH7 as "Intramembrane, Pore-Forming". TMH7 is additionally marked as "Membrane Loop" in PDBTM.

</figtable>
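The agreement between a prediction and an annotation can be quantified per helix. The following sketch uses the PolyPhobius and PDBTM rows of the P35462 table above and counts how many annotated residues each predicted helix covers. The helper names `parse_ranges` and `overlap` are made up for this example, and pairing helices by column position is an assumption that only works when both rows list the same number of helices.

```python
def parse_ranges(row):
    """Turn '30-55 66-88' into [(30, 55), (66, 88)]."""
    return [tuple(int(x) for x in item.split("-")) for item in row.split()]


def overlap(a, b):
    """Number of residues shared by two inclusive ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


# Rows copied from the P35462 table.
polyphobius = parse_ranges("30-55 66-88 105-126 150-170 188-212 329-352 367-386")
pdbtm = parse_ranges("35-52 68-84 109-123 152-166 191-206 334-347 368-382")

coverage = []
for i, (pred, ref) in enumerate(zip(polyphobius, pdbtm), start=1):
    shared = overlap(pred, ref)
    ref_len = ref[1] - ref[0] + 1
    coverage.append((shared, ref_len))
    print(f"TMH {i}: {shared}/{ref_len} PDBTM residues covered by PolyPhobius")
```

For P35462, every PDBTM helix lies completely inside the corresponding PolyPhobius segment, so the prediction is slightly longer than the annotation at both ends of each helix but never misses annotated residues.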


Signal Peptides

<figtable id="signalp">

Q30201
P47863
P11279
P02768
Table XXX: SignalP predictions for Q30201, P47863, P11279, and P02768. Each figure shows the C-score, S-score, and Y-score per residue position for the corresponding protein.

</figtable>

TODO: score description

SignalP 4.0 (web server) was used to predict signal peptides in HFE (Q30201), Aquaporin-4 (P47863), Lysosome-associated membrane glycoprotein 1 (P11279), and Serum albumin (P02768). The results are shown in <xr id="signalp"/> and were compared to the corresponding entries in UniProt.

All four predictions agree with the UniProt annotations:

  • Q30201: signal peptide 1-22
  • P47863: no signal peptide
  • P11279: signal peptide 1-28
  • P02768: signal peptide 1-18


On this small test set of four proteins, SignalP made no errors, which makes it a promising tool for signal peptide prediction.


GO Terms

For the last part of this task, we used GOPET and ProtFun to predict GO terms for the HFE protein (Q30201). We also searched for Pfam families. The results were then compared to UniProt and QuickGO.


GOPET

GOPET predicts only two GO terms for our protein (see <xr id="gopet"/>), and even these are somewhat redundant, as both describe receptor activity. The predictions are at least plausible, since HFE shows a kind of receptor activity by binding to the transferrin receptor (TFR).


<figtable id="gopet">

GO ID Aspect Confidence GO term
GO:0004872 F (Molecular Function Ontology) 91% receptor activity
GO:0030106 F (Molecular Function Ontology) 88% MHC class I receptor activity
Table XXX: GO term prediction with GOPET for Q30201.

</figtable>


ProtFun

The results of the ProtFun prediction are shown in <xr id="protfun"/>. To keep the table short, predictions with both a probability below 0.1 and odds below 1.0 are not shown. ProtFun predicts "Cell envelope" as the functional category. This is plausible, as the HFE-TFR complex is located in the membrane. "Transport and binding" also has a high probability, which is consistent with HFE's role in iron transport within the body. HFE is categorized as "Nonenzyme" and no enzyme class is predicted. It is further predicted to be involved in "Immune response", which fits because HFE is a protein of the major histocompatibility complex (MHC) class I.


<figtable id="protfun">

Functional category Probability Odds
Biosynthesis of cofactors 0.105 1.452
Cell envelope* 0.633* 10.377*
Cellular processes 0.095 1.297
Central intermediary metabolism 0.231 3.663
Fatty acid metabolism 0.016 1.265
Purines and pyrimidines 0.583 2.400
Translation 0.079 1.801
Transport and binding 0.732 1.785
Enzyme/nonenzyme
Enzyme 0.208 0.727
Nonenzyme* 0.792* 1.110*
Enzyme class
Hydrolase 0.135 0.425
Lyase 0.049 1.054
Gene Ontology category
Signal transducer 0.201 0.939
Receptor 0.353 2.076
Stress response 0.274 3.108
Immune response* 0.381* 4.486*
Table XXX: GO term prediction with ProtFun for Q30201. Entries marked with an asterisk (*) were deemed "true" by ProtFun. Results with both a probability below 0.1 and odds below 1.0 are not shown.

</figtable>
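The row filter used for the ProtFun table can be sketched as follows. The sketch assumes that a row is hidden only when both criteria fail, which is what the table suggests: it contains rows with a probability below 0.1 but odds of at least 1.0, and rows with odds below 1.0 but a probability of at least 0.1. The first three rows are taken from the table, while the last row is a made-up example of an entry that would be hidden.

```python
rows = [
    # (category, probability, odds)
    ("Cell envelope", 0.633, 10.377),
    ("Cellular processes", 0.095, 1.297),  # probability < 0.1, odds >= 1.0
    ("Enzyme", 0.208, 0.727),              # odds < 1.0, probability >= 0.1
    ("Made-up category", 0.050, 0.800),    # hypothetical row, not ProtFun output
]


def keep(probability, odds, min_probability=0.1, min_odds=1.0):
    """Keep a row unless BOTH values fall below their cutoffs."""
    return probability >= min_probability or odds >= min_odds


shown = [name for name, probability, odds in rows if keep(probability, odds)]
print(shown)
```

Only the hypothetical row is dropped, because it is the only one that falls below both cutoffs.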


Pfam

Pfam lists two significant results for Q30201:

  • MHC_I - Class I Histocompatibility antigen, domains alpha 1 and 2 (E-value 3.5e-43)
  • C1-set - Immunoglobulin C1-set domain (E-value 2.8e-18)

MHC class I proteins are strongly involved in immune responses. UniProt also lists HFE in the MHC class I family, and its structure (three extracellular domains, transmembrane region, cytoplasmic tail) fits this classification. C1-set domains are associated with MHC class I proteins, and HFE indeed contains such a domain (residues 207-298).


Comparison

Compared to QuickGO, which lists 27 unique GO terms for Q30201, GOPET predicts only two, neither of which is included in QuickGO's list. These two terms also seem to fit the HFE-TFR complex better than HFE alone, but at least the MHC class I term is specific to HFE.

ProtFun's prediction seems more accurate, as it correctly identifies HFE's location in the membrane and lists "Transport and binding" as a good second result. "Immune response" is also in accordance with QuickGO's terms.

Pfam's two predicted families were both true positives, and Pfam was more informative than the other two methods.

Overall, none of the methods specifically identified HFE's role in iron transport.