Lab journal
Multiple sequence alignment
The HFE protein consists of three domains: alpha 1, alpha 2, and an immunoglobulin C1-set domain. We therefore found two different Pfam families that both match our protein:
Class I Histocompatibility antigen, domains alpha 1 and 2 (http://pfam.sanger.ac.uk/family/PF00129) and Immunoglobulin C1-set domain (http://pfam.sanger.ac.uk/family/PF07654#tabview=tab0).
We checked how much of our protein's sequence is covered by the two families:
Immunoglobulin C1-set domain:
# GS HFE_HUMAN/211-294 DR PDB; 1A6Z A; 189-272;
MHC class I, domains alpha 1 and 2:
# GS HFE_HUMAN/26-202 DR PDB; 1A6Z A; 4-180;
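The coverage check can be made explicit with a small sketch that parses the two annotation lines above and counts the covered residues. The function names are our own, and the full sequence length is deliberately left as a parameter because it is not stated in this journal.

```python
import re

# Annotation lines copied verbatim from the two Pfam alignments above.
GS_LINES = [
    "# GS HFE_HUMAN/211-294 DR PDB; 1A6Z A; 189-272;",
    "# GS HFE_HUMAN/26-202 DR PDB; 1A6Z A; 4-180;",
]


def parse_range(gs_line):
    """Return (start, end) from the first 'NAME/start-end' token in the line."""
    match = re.search(r"/(\d+)-(\d+)", gs_line)
    if match is None:
        raise ValueError(f"no residue range found in: {gs_line!r}")
    return int(match.group(1)), int(match.group(2))


def covered_residues(ranges):
    """Return the set of residue positions covered by at least one range."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))  # ranges are inclusive
    return covered


def coverage_fraction(ranges, sequence_length):
    """Fraction of the sequence covered; sequence_length must be supplied."""
    return len(covered_residues(ranges)) / sequence_length


ranges = [parse_range(line) for line in GS_LINES]
covered = covered_residues(ranges)
print(len(covered))  # 84 + 177 = 261 residues, the two ranges do not overlap
```

Calling coverage_fraction(ranges, n) with the full length n of HFE_HUMAN (to be taken from the UniProt entry) gives the covered fraction. The same lines also show that the PDB numbering of 1A6Z chain A is shifted by 22 relative to the sequence numbering (211 vs. 189, 26 vs. 4).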
We downloaded the alignments in FASTA format: on the Pfam page we first set the alignment format to FASTA and then downloaded the .txt file. The alignments were then converted with a2m2aln:
/usr/share/freecontact/a2m2aln -q '^HFE_HUMAN/(\d+)' --quiet < imm.txt > imm.aln
/usr/share/freecontact/a2m2aln -q '^HFE_HUMAN/(\d+)' --quiet < mhc.txt > mhc.aln
/usr/share/freecontact/a2m2aln -q '^RASH_HUMAN/(\d+)' --quiet < ras.txt > ras.aln
We then ran freecontact with standard parameters and evfold as the output format, using the following commands:
freecontact -o evfold < imm.aln > imm_contacts.out
freecontact -o evfold < mhc.aln > mhc_contacts.out
freecontact -o evfold < ras.aln > ras_contacts.out
The contact predictions were then analysed using the following pipeline:
- Remove contacts whose residues are <= 5 positions apart in sequence
- Order descending by CN score and plot the score distribution
- Select high-scoring pairs
- Load the reference PDB and determine the minimum atom distance between all residue pairs
- Determine pairs that have a minimum distance below 5 Å
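The pipeline steps can be sketched in plain Python as follows. This is a minimal sketch, not the script we actually used. It assumes that each line of the evfold output has six whitespace-separated columns with the residue numbers in columns 1 and 3 and the score in the last column; this layout should be checked against the real files. All function names, the n_top default, and the offset parameter are our own choices for illustration. The score plot is left out.

```python
import math


def parse_evfold(text):
    """Parse contact lines into (i, j, score) tuples.

    ASSUMPTION: six whitespace-separated columns per line, residue numbers
    in columns 1 and 3, score in column 6. Shorter lines are skipped.
    """
    contacts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue
        contacts.append((int(fields[0]), int(fields[2]), float(fields[5])))
    return contacts


def top_contacts(contacts, max_short_range=5, n_top=10):
    """Drop pairs with |i - j| <= max_short_range, sort by score, keep n_top."""
    kept = [c for c in contacts if abs(c[0] - c[1]) > max_short_range]
    kept.sort(key=lambda c: c[2], reverse=True)  # highest score first
    return kept[:n_top]


def parse_pdb_atoms(pdb_text, chain="A"):
    """Collect atom coordinates per residue number from ATOM records.

    Uses the fixed PDB columns; insertion codes and alternate locations
    are ignored in this sketch.
    """
    residues = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) >= 54 and line[21] == chain:
            resnum = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            residues.setdefault(resnum, []).append(xyz)
    return residues


def min_atom_distance(atoms_a, atoms_b):
    """Smallest distance between any atom of one residue and any of the other."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def evaluate(contacts, residues, offset=0, cutoff=5.0):
    """Return (i, j, score, distance, is_contact) for pairs found in the PDB.

    offset is subtracted from the predicted residue numbers to obtain PDB
    numbering. Pairs with a residue missing from the structure are skipped.
    """
    results = []
    for i, j, score in contacts:
        atoms_i = residues.get(i - offset)
        atoms_j = residues.get(j - offset)
        if atoms_i is None or atoms_j is None:
            continue
        distance = min_atom_distance(atoms_i, atoms_j)
        results.append((i, j, score, distance, distance < cutoff))
    return results
```

For the HFE families one would read mhc_contacts.out or imm_contacts.out into parse_evfold, read the 1A6Z coordinate file into parse_pdb_atoms, and call evaluate with offset=22, since the annotation lines map sequence positions 26-202 to PDB positions 4-180. Whether the predicted residue numbers are in sequence numbering depends on how a2m2aln applied the start offset, so this should be verified on a few known pairs first.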
[Figures: Ras hotspots; MHC I hotspots; Ig C1 hotspots]