Difference between revisions of "Sequence Alignments"

From Bioinformatikpedia

Revision as of 23:32, 20 May 2011

Sequence Alignments

Sequence searches

  • FASTA

../bin/fasta36 sequence.fasta database > FastaOutput.txt

  • BLAST

blastall -p blastp -d database -i sequence.fasta > BlastOutput.txt

  • PSI-BLAST

blastpgp -i sequence.fasta -j iterations -h evalueCutoff -d database > PsiblastOutput.txt

  • HHsearch

hhsearch -i query -d database -o output.txt


Here, database = /data/blast/nr/nr (the placeholder used in the commands above).

Sequences chosen for the multiple alignment:

Sequence identifier           Identity  Source
99-90% sequence identity
56967006|pdb|1X7Z             99%       PSI-BLAST, 3 iterations, E-value cutoff 0.005
7546384|pdb|1DTW              95%       BLAST
34810149|pdb|1OLU             99%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
13277798|gb|AAH03787.1        95%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
148727347|ref|NP_001092034.1  95%       BLAST
89-60% sequence identity
196011048|ref|XP_002115388.1  66%       PSI-BLAST, 3 iterations, E-value cutoff 0.005
149543950|ref|XP_001517857.1  67%       BLAST
47227873|emb|CAG09036.1       82.5%     FASTA
47196273|emb|CAF88112.1       81%       PSI-BLAST, 5 iterations, E-value cutoff 0.005
12964598|dbj|BAB32665.1       88%       PSI-BLAST, 5 iterations, E-value cutoff 10E-6
59-40% sequence identity
193290664|gb|ACF17640.1       47%       BLAST
215431443|ref|ZP_03429362.1   40%       FASTA
225557347|gb|EEH05633.1       51%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
58267618|ref|XP_570965.1      50%       PSI-BLAST, 5 iterations, E-value cutoff 0.005
162449842|ref|YP_001612209.1  41%       PSI-BLAST, 5 iterations, E-value cutoff 10E-6
39-20% sequence identity
56966700|pdb|1W85             31%       PSI-BLAST, 3 iterations, E-value cutoff 0.005
5822330|pdb|1QS0              38.1%     FASTA
13516864|dbj|BAB40585.1       33%       PSI-BLAST, 3 iterations, E-value cutoff 10E-6
284166853|ref|YP_003405132.1  35%       PSI-BLAST, 5 iterations, E-value cutoff 0.005
76800932|ref|YP_325940.1      34%       PSI-BLAST, 5 iterations, E-value cutoff 10E-6

Sequences for the multiple sequence alignment were downloaded from NCBI. To retrieve another sequence in FASTA format, change the sequence ID in the link: http://www.ncbi.nlm.nih.gov/protein/76800932?report=fasta
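
The link pattern can be applied to every entry of the table of chosen sequences. The following is only a sketch: it assumes that the number before the first "|" of an identifier is the sequence ID that goes into the link, and the function names gi_from_identifier and fasta_url are made up for this example. It only builds the links and does not download anything.

```python
def gi_from_identifier(identifier: str) -> str:
    """Return the leading numeric ID of an entry such as '56967006|pdb|1X7Z'."""
    gi = identifier.split("|", 1)[0].strip()
    if not gi.isdigit():
        raise ValueError(f"no numeric sequence ID in {identifier!r}")
    return gi


def fasta_url(identifier: str) -> str:
    """Build the NCBI link that shows the sequence in FASTA format."""
    return (
        "http://www.ncbi.nlm.nih.gov/protein/"
        f"{gi_from_identifier(identifier)}?report=fasta"
    )


if __name__ == "__main__":
    # Two identifiers taken from the table of chosen sequences.
    for entry in ("56967006|pdb|1X7Z", "76800932|ref|YP_325940.1"):
        print(fasta_url(entry))
```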


Multiple Alignments

clustalw sequences.fasta

t_coffee -seq sequences.fasta

t_coffee -seq sequences.fasta -mode expresso

muscle -in sequences.fasta -out output.aln

Cobalt has to be downloaded first and is then run from its own directory:

./cobalt -i sequences.fasta -norps T > output.aln
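
As written, the Muscle and the Cobalt command both write to output.aln, so the second run would overwrite the first. A minimal sketch of how the five runs could be scripted with separate output files is given below. The file names muscle.aln and cobalt.aln and the function names are assumptions made for this example, and all programs are assumed to be callable exactly as in the commands above. ClustalW and T-Coffee choose their own default output names, which may collide as well, so it is worth checking them before running everything in one directory.

```python
import subprocess


def alignment_commands(fasta: str, cobalt_bin: str = "./cobalt") -> dict:
    """Map a label to (argv, stdout_file); stdout_file is None if nothing is redirected."""
    return {
        "clustalw": (["clustalw", fasta], None),
        "t_coffee": (["t_coffee", "-seq", fasta], None),
        "t_coffee_expresso": (["t_coffee", "-seq", fasta, "-mode", "expresso"], None),
        # Hypothetical file names, chosen so that the two runs do not
        # overwrite each other.
        "muscle": (["muscle", "-in", fasta, "-out", "muscle.aln"], None),
        "cobalt": ([cobalt_bin, "-i", fasta, "-norps", "T"], "cobalt.aln"),
    }


def run_all(fasta: str) -> None:
    """Run every alignment program in turn and stop at the first failure."""
    for label, (argv, stdout_file) in alignment_commands(fasta).items():
        if stdout_file is None:
            subprocess.run(argv, check=True)
        else:
            # Same effect as the shell redirection "> file".
            with open(stdout_file, "w") as handle:
                subprocess.run(argv, stdout=handle, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_all("sequences.fasta") to execute them.
    for label, (argv, stdout_file) in alignment_commands("sequences.fasta").items():
        redirect = f" > {stdout_file}" if stdout_file else ""
        print(f"{label}: {' '.join(argv)}{redirect}")
```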


Conservation and Gaps

Alignment method  Gaps  Avg. gap length  100% cons  >90% cons  >80% cons  >70% cons  >60% cons  >50% cons
ClustalW          12    3.75             24         136        53         60         71         91
T-Coffee          25    4.56             24         136        53         60         71         91
T-Coffee (3D)     56    4.75             21         326        83         92         67         71
Cobalt            19    3.26             24         128        68         56         101        72
Muscle            17    4.76             26         193        26         38         43         10
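
The page does not state how these numbers were obtained. One way to compute such statistics is sketched below, under the following assumptions: a gap is a maximal run of "-" in one aligned sequence, the conservation of a column is the fraction of all sequences that carry its most common residue, and the alignment is already available as a list of equally long strings. The toy alignment and all function names are invented for the example.

```python
from collections import Counter


def gap_runs(aligned_seq: str) -> list[int]:
    """Lengths of all maximal runs of '-' in one aligned sequence."""
    runs, current = [], 0
    for ch in aligned_seq:
        if ch == "-":
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def column_conservation(alignment: list[str]) -> list[float]:
    """Per column: share of sequences with the most common residue (gaps are not residues)."""
    n = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / n)
    return scores


def count_at_least(scores: list[float], threshold: float) -> int:
    """Number of columns whose conservation reaches the threshold."""
    return sum(1 for s in scores if s >= threshold)


if __name__ == "__main__":
    toy = ["MKT-AYL", "MKTGAYL", "MRT--YL", "MKT-AFL"]  # invented alignment
    runs = gap_runs(toy[0])  # gaps in the first (reference) sequence
    print("gaps:", len(runs), "avg length:", sum(runs) / len(runs))
    scores = column_conservation(toy)
    print("100% conserved columns:", count_at_least(scores, 1.0))
```

Whether the percentage classes in the table are cumulative or exclusive is not stated, so the thresholds would have to be adapted accordingly.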


Gaps in secondary structure

ClustalW

ClustalW gaps and structure [1]
Gap position  Gap length  Secondary structure
109-110       4           Helix
142-143       1           Helix
235-236       1           Beta strand
276-277       11          Helix
294-295       1           Beta strand
394-395       5           Helix

T-Coffee

T-Coffee gaps and structure [2]
Gap position  Gap length  Secondary structure
141-142       1           Helix
232-233       1           Beta strand
275-276       11          Helix
310-311       1           Helix
369-370       5           Turn
395-396       18          Helix
398-399       5           Helix

T-Coffee (3D)

T-Coffee (3D) gaps and structure [3]
Gap position  Gap length  Secondary structure
101-102       1           Helix
108-109       4           Helix
115-116       1           Helix
116-117       1           Helix
141-142       1           Helix
153-154       1           Beta strand
163-164       1           Helix
177-178       3           Helix
234-235       1           Beta strand
263-264       4           Beta strand
265-266       1           Beta strand
276-277       2           Helix
308-309       8           Helix
309-310       5           Helix
314-315       6           Helix
362-363       6           Helix
371-372       4           Turn
376-377       1           Helix
380-381       1           Helix
382-383       7           Helix
383-384       3           Helix
384-385       2           Helix
387-388       2           Helix
394-395       5           Helix

Cobalt

Cobalt gaps and structure [4]
Gap position  Gap length  Secondary structure
108-109       4           Helix
141-142       1           Helix
177-178       3           Helix
276-277       11          Helix
294-295       1           Beta strand
305-306       1           Helix
311-312       2           Helix
387-388       1           Helix
388-389       1           Helix
395-396       13          Helix

Muscle

Muscle gaps and structure [5]
Gap position  Gap length  Secondary structure
109-110       4           Helix
141-142       1           Helix
177-178       3           Helix
276-277       11          Helix
294-295       1           Beta strand
305-306       1           Helix
394-395       5           Helix
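
The gap tables above pair each gap position with the secondary structure element of the reference sequence at that position. A lookup of this kind could be sketched as follows. The sketch assumes that the secondary structure annotation is available as a list of (start, end, type) residue ranges and that a gap position "a-b" means the gap lies between residues a and b. The ranges in the example are invented and are not the real annotation; the function names are made up as well.

```python
def structure_at(position: int, elements: list[tuple[int, int, str]]) -> str:
    """Return the type of the element covering a residue, or 'unannotated'."""
    for start, end, kind in elements:
        if start <= position <= end:
            return kind
    return "unannotated"


def annotate_gap(gap: str, elements: list[tuple[int, int, str]]) -> str:
    """Describe a gap such as '109-110' by the structure of its two flanking residues."""
    left, right = (int(part) for part in gap.split("-"))
    kinds = {structure_at(left, elements), structure_at(right, elements)}
    # One name if both flanking residues lie in the same kind of element,
    # otherwise both names joined by '/'.
    return "/".join(sorted(kinds))


if __name__ == "__main__":
    # Invented ranges, for illustration only.
    elements = [(100, 120, "Helix"), (230, 240, "Beta strand")]
    for gap in ("109-110", "235-236", "120-121"):
        print(gap, annotate_gap(gap, elements))
```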



References

Secondary structure information

