Difference between revisions of "Lab Journal - Task 6 (PAH)"
From Bioinformatikpedia
Revision as of 11:55, 15 June 2013
Multiple Sequence Alignment
The multiple alignments are downloaded from the PFAM server and converted into a freecontact-readable format using a2m2aln.
- Protein RASH:
/usr/share/freecontact/a2m2aln -q '^RASH_HUMAN/(\d+)' --quiet < PF00071_full.txt > PF00071.aln
- For our protein PAH, we have two domains. As the Biopterin-domain is said to be causing PKU if damaged, we used the PFAM alignment of this domain:
/usr/share/freecontact/a2m2aln -q '^PH4H_HUMAN/(\d+)' --quiet < PF00351_full.txt > PF00351.aln
Calculate and analyze correlated mutations
extract_pairs.pl extracts all residue pairs with a distance > 5.
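The extract_pairs.pl script itself is not shown here, so the following Python sketch only illustrates what such a filter might look like. It assumes that "distance" means sequence separation |i - j|, and that each evfold line has six whitespace-separated columns (i, residue i, j, residue j, a score column, and the CN score in column 6, as implied by `sort -k 6`). The function name and the example rows are hypothetical.

```python
def extract_pairs(lines, min_separation=5):
    """Keep evfold rows whose residue indices are more than min_separation apart.

    Assumed column layout (not confirmed by the journal):
    i  aa_i  j  aa_j  score  CN
    """
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        i, j = int(fields[0]), int(fields[2])
        if abs(i - j) > min_separation:
            kept.append(line.rstrip("\n"))
    return kept


# Made-up rows, for illustration only.
example = [
    "1 M 2 T 0 0.90",
    "1 M 7 K 0 0.40",
    "3 E 20 L 0 1.20",
]
for row in extract_pairs(example):
    print(row)
```

With a real file, the rows could be read with `open("PF00071.evfold")` and the result written to `PF00071_extract.evfold`, which is the input expected by the sort commands.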
freecontact -o evfold < 'PF00071.aln' > PF00071.evfold
sort -k 6 -g -r PF00071.evfold >sort_PF00071.txt
sort -k 6 -g -r PF00071_extract.evfold >sort_PF00071_extract.txt
The reference structure for Ras is PDB entry 121p.
For our protein PAH:
freecontact -o evfold < 'PF00351.aln' > PF00351.evfold
sort -k 6 -g -r PF00351.evfold >sort_PF00351.txt
sort -k 6 -g -r PF00351_extract.evfold >sort_PF00351_extract.txt
CN_dist.R plots single and multiple histograms of the CN-score distribution. It also calculates the top L-Score (L = protein length) for each residue i that occurs among the top L scoring pairs:
top L-Score(i) = (sum of CN scores for residue i)/mean(CN-Scores of top L)
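CN_dist.R is not reproduced in this journal, so the following Python sketch shows only one possible reading of the formula above. It assumes that the "top L" are the L pairs with the highest CN score, and that the sum for residue i runs over those top L pairs that contain i. The function name and the example values are hypothetical.

```python
from collections import defaultdict


def top_l_scores(pairs, length):
    """Compute a per-residue top L-Score.

    pairs:  iterable of (i, j, cn) tuples
    length: L, the protein (or alignment) length
    Returns a dict {residue index: score}.
    """
    # Take the L pairs with the highest CN score.
    top = sorted(pairs, key=lambda p: p[2], reverse=True)[:length]
    if not top:
        return {}
    mean_cn = sum(cn for _, _, cn in top) / len(top)

    # Sum the CN scores of every top-L pair a residue takes part in.
    sums = defaultdict(float)
    for i, j, cn in top:
        sums[i] += cn
        sums[j] += cn
    return {res: total / mean_cn for res, total in sums.items()}


# Made-up values, for illustration only.
pairs = [(1, 10, 2.0), (1, 20, 1.0), (5, 30, 0.5), (7, 40, 0.1)]
scores = top_l_scores(pairs, length=2)
print(scores)
```

Under this reading, a residue that appears in several high-scoring pairs gets a score above 1, while residues outside the top L pairs get no score at all.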
Calculate structural model
The Pfam alignment of Ras has a length of 160; we therefore use the following numbers of contacts: 64, 104, and 160.
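The three contact numbers correspond to 0.4, 0.65, and 1.0 times the alignment length of 160. These fractions are inferred from the numbers given and are not stated in the journal. A small Python sketch of the arithmetic, with a hypothetical function name:

```python
def contact_counts(length, fractions=(0.4, 0.65, 1.0)):
    """Return the number of top-scoring contacts for each fraction of the length.

    The default fractions are an assumption inferred from 64, 104, 160 at L = 160.
    """
    return [round(length * f) for f in fractions]


counts = contact_counts(160)
print(counts)
```

The same function could be applied to the PAH alignment once its length is known.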