Sequence Search and Multiple Sequence Alignment (PKU)
Contents
Short Task Description
Perform database searches using different search tools with the PAH protein as query
Create and evaluate multiple sequence alignments
Reference Sequence of PAH
>sp|P00439|PH4H_HUMAN Phenylalanine-4-hydroxylase OS=Homo sapiens GN=PAH PE=1 SV=1 MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDV NLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPW FPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYM EEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF RLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFA QFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSE KPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKEKVRNFAATIPRPFSVRYDPYTQR IEVLDNTQQLKILADSINSEIGILCSALQKI
Database Searches
Blast
time blast2 -p blastp -d /mnt/project/pracstrucfunc12/data/big/big -i Dropbox/Phenylketonuria/Task1/PAH.fasta -o results_blast2_standard
real 1m47.401s user 1m25.290s sys 0m18.280s
time blast2 -p blastp -d /mnt/project/pracstrucfunc12/data/big/big -i Dropbox/Phenylketonuria/Task1/PAH.fasta -o results_blast2_e-10 -e 0.0000000001 -v 2000
real 1m35.454s user 1m21.700s sys 0m3.100s
HHBlits
time hhblits -i Dropbox/Phenylketonuria/Task1/PAH.fasta -d /mnt/project/pracstrucfunc12/data/hhblits/uniprot20_current -o results_hhblits_standard
real 6m10.059s user 3m15.640s sys 0m40.220s
hhblits -i Dropbox/Phenylketonuria/Task1/PAH.fasta -d /mnt/project/pracstrucfunc12/data/hhblits/uniprot20_current -o results_hhblits_n4_e-7 -n 4 -e 0.0000001 -o results_hhblits_n4_e-7
HHSearch
time hhsearch -i Dropbox/Phenylketonuria/Task1/PAH.fasta -d /mnt/project/pracstrucfunc12/data/hhblits/uniprot20_current_hhm_db
real 13m27.782s user 13m18.120s sys 0m8.480s
PDB
To get structural information not directly contained in big_80, we performed an additional search in PDB at NCBI, with parameters: scoring matrix=PAM70, gap open = 10, gap extend = 1, composition based statistics, max. 1000 target sequences.