Reference Sequence BCKDHA

From Bioinformatikpedia
Revision as of 22:55, 20 May 2011 by Reisinger (Conservation and Gaps)

Sequence

  • Uniprot:

>sp|P12694|ODBA_HUMAN 2-oxoisovalerate dehydrogenase subunit alpha, mitochondrial OS=Homo sapiens GN=BCKDHA PE=1 SV=2

MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAE FIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILY ESQRQGRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYG NISDLGKGRQMPVHYGCKERHFVTISSPLATQIPQAVGAAYAAKRANANRVVICYFGEGA ASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIAARGPGYGIMSIRVDG NDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHP ISRLRHYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQL RKQQESLARHLQTYGEHYPLDHFDK

Sequence info: [1] The UniProt sequence is 445 aa long because it includes the transit peptide at positions 1-45.
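As a quick consistency check, removing the 45-residue transit peptide from the UniProt entry should leave a sequence that starts like the PDB chain 1U5B:A. The following is a minimal Python sketch that uses only the first 60 residues of the UniProt sequence; the function and variable names are illustrative and not part of the original workflow.

```python
# First 60 residues of the UniProt entry P12694, copied from the sequence above.
uniprot_start = "MAVAIAAARVWRLNRGLSQAALLLLRQPGARGLARSHPPRQQQQFSSLDDKPQFPGASAE"

# The transit peptide covers positions 1-45 (1-based, inclusive).
TRANSIT_PEPTIDE_END = 45


def strip_transit_peptide(sequence, end=TRANSIT_PEPTIDE_END):
    """Return the sequence without its first `end` residues."""
    return sequence[end:]


mature_start = strip_transit_peptide(uniprot_start)
print(mature_start)               # SSLDDKPQFPGASAE
print(445 - TRANSIT_PEPTIDE_END)  # 400 residues remain after the transit peptide
```

The printed fragment matches the first residues of the PDB sequence (SSLDDKPQFPGASAE...).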

  • PDB:

>1U5B:A|PDBID|CHAIN|SEQUENCE

SSLDDKPQFPGASAEFIDKLEFIQPNVISGIPIYRVMDRQGQIINPSEDPHLPKEKVLKLYKSMTLLNTMDRILYESQRQ GRISFYMTNYGEEGTHVGSAAALDNTDLVFGQYREAGVLMYRDYPLELFMAQCYGNISDLGKGRQMPVHYGCKERHFVTI SSPLATQIPQAVGAAYAAKRANANRVVICYFGEGAASEGDAHAGFNFAATLECPIIFFCRNNGYAISTPTSEQYRGDGIA ARGPGYGIMSIRVDGNDVFAVYNATKEARRRAVAENQPFLIEAMTYRIGHHSTSDDSSAYRSVDEVNYWDKQDHPISRLR HYLLSQGWWDEEQEKAWRKQSRRKVMEAFEQAERKPKPNPNLLFSDVYQEMPAQLRKQQESLARHLQTYGEHYPLDHFDK

Sequence info: [2]


Sequence Alignments

Sequence searches

  • FASTA

../bin/fasta36 sequence.fasta database > FastaOutput.txt

  • BLAST

blastall -p blastp -d database -i sequence.fasta > BlastOutput.txt

  • PSIBLAST

blastpgp -i sequence.fasta -j iterations -h evalueCutoff -d database > PsiblastOutput.txt

  • HHSearch

hhsearch -i query -d database -o output.txt


database = /data/blast/nr/nr
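The PSI-BLAST hits listed further down were obtained with 3 or 5 iterations and E-value cutoffs of 0.005 or 10E-6. The sketch below shows one possible way to generate the four corresponding blastpgp command lines in Python. It only builds the command strings and does not run them; the output file naming scheme is a hypothetical choice, and the cutoff values are kept exactly as written on this page.

```python
from itertools import product

DATABASE = "/data/blast/nr/nr"
ITERATIONS = (3, 5)
# Cutoffs kept as strings, exactly as they are written on this page.
EVALUE_CUTOFFS = ("0.005", "10E-6")


def psiblast_command(query, iterations, cutoff, database=DATABASE):
    """Build one blastpgp command line for the given parameters."""
    # Hypothetical naming scheme so that the four runs do not overwrite each other.
    output = f"PsiblastOutput_j{iterations}_h{cutoff}.txt"
    return (
        f"blastpgp -i {query} -j {iterations} -h {cutoff} "
        f"-d {database} > {output}"
    )


commands = [
    psiblast_command("sequence.fasta", j, h)
    for j, h in product(ITERATIONS, EVALUE_CUTOFFS)
]

for command in commands:
    print(command)
```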

Sequences chosen for the multiple alignment:

Sequence identifier           Identity  Source
99-90% sequence identity
56967006|pdb|1X7Z             99%       PSI-BLAST, 3 iterations, E-value cutoff 0.005
7546384|pdb|1DTW              95%       BLAST
34810149|pdb|1OLU             99%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
13277798|gb|AAH03787.1        95%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
148727347|ref|NP_001092034.1  95%       BLAST
89-60% sequence identity
196011048|ref|XP_002115388.1  66%       PSI-BLAST, 3 iterations, E-value cutoff 0.005
149543950|ref|XP_001517857.1  67%       BLAST
47227873|emb|CAG09036.1       82.5%     FASTA
47196273|emb|CAF88112.1       81%       PSI-BLAST, 5 iterations, E-value cutoff 0.005
12964598|dbj|BAB32665.1       88%       PSI-BLAST, 5 iterations, E-value cutoff 10E-6
59-40% sequence identity
193290664|gb|ACF17640.1       47%       BLAST
215431443|ref|ZP_03429362.1   40%       FASTA
225557347|gb|EEH05633.1       51%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
58267618|ref|XP_570965.1      50%       PSI-BLAST, 5 iterations, E-value cutoff 0.005
162449842|ref|YP_001612209.1  41%       PSI-BLAST, 5 iterations, E-value cutoff 10E-6
39-20% sequence identity
56966700|pdb|1W85             31%       PSI-BLAST, 3 iterations, E-value cutoff 0.005
5822330|pdb|1QS0              38.1%     FASTA
13516864|dbj|BAB40585.1       33%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
284166853|ref|YP_003405132.1  35%       PSI-BLAST, 5 iterations, E-value cutoff 0.005
76800932|ref|YP_325940.1      34%       PSI-BLAST, 5 iterations, E-value cutoff 10E-6

Sequences for the multiple sequence alignment were downloaded from NCBI. To retrieve another sequence in FASTA format, change the sequence ID in the link: http://www.ncbi.nlm.nih.gov/protein/76800932?report=fasta
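The link pattern can be generated for every chosen sequence. The Python sketch below assumes that the numeric first field of each identifier (for example 76800932 in 76800932|ref|YP_325940.1) is the ID used in the link, which matches the example URL; the helper names are illustrative. It only builds the URLs and does not download anything.

```python
def numeric_id(identifier):
    """Return the numeric first field of an identifier such as '76800932|ref|YP_325940.1'."""
    return identifier.split("|")[0]


def fasta_url(identifier):
    """Build the NCBI protein link that returns the entry in FASTA format."""
    return f"http://www.ncbi.nlm.nih.gov/protein/{numeric_id(identifier)}?report=fasta"


# Two identifiers taken from the list of chosen sequences.
for identifier in ("76800932|ref|YP_325940.1", "56967006|pdb|1X7Z"):
    print(fasta_url(identifier))
```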

Multiple Alignments

clustalw sequences.fasta

t_coffee -seq sequences.fasta

t_coffee -seq sequences.fasta -mode expresso

muscle -in sequences.fasta -out output.aln

Download COBALT, then run it from its directory:

./cobalt -i sequences.fasta -norps T > output.aln

Conservation and Gaps

Alignment method  Gaps  Avg. gap length
ClustalW          12    3.75
T-Coffee          25    4.56
T-Coffee (3D)     56    4.75
Cobalt            19    3.26
Muscle            17    4.76
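The page does not state how the gaps were counted. One plausible reading, shown here as a Python sketch under that assumption, treats each uninterrupted run of '-' characters in one aligned row as a single gap and averages the run lengths. The function name and the toy input are illustrative only.

```python
import re


def gap_statistics(aligned_row, gap_char="-"):
    """Return (number of gaps, average gap length) for one aligned sequence.

    Assumption: a gap is a maximal run of consecutive gap characters.
    """
    runs = re.findall(re.escape(gap_char) + "+", aligned_row)
    if not runs:
        return 0, 0.0
    total_length = sum(len(run) for run in runs)
    return len(runs), total_length / len(runs)


# Toy aligned row with two gaps of length 2 and 4.
print(gap_statistics("MA--VAI----AR"))  # (2, 3.0)
```

Applied to the reference sequence row of each alignment, a function like this would yield one pair of values per alignment method.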
