Difference between revisions of "Sequence and multiple alignments"

From Bioinformatikpedia

Revision as of 18:48, 23 May 2011

Sequence Alignments

Database search

Overlap.png

We searched a non-redundant (NR) database with several tools: BLAST, FASTA and PSI-BLAST. Because the database was not available in HMM format, we decided not to include the HHpred web results, since they would not be comparable with the other searches.

We ran BLAST and FASTA with the standard parameters:
blast -p blastp -i 1a6z.fasta -d /data/nr/nr -o blastp_1a6z_on_NR.txt (runtime 5:34)
../../Desktop/fasta-36.3.4/bin/fasta36 -q 1a6z.fasta /data/nr/nr >fasta_1a5z_on_nr.txt (runtime 10:44)
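Runtimes like the ones quoted above can be recorded with a small wrapper. The following is only a sketch: the helper name `timed` is hypothetical, and the minutes:seconds output format is an assumption, since the page does not state the unit of its runtimes.

```shell
# Hypothetical helper: run any command and report its wall-clock time on stderr,
# so that redirecting stdout to a result file still works.
# Assumption: runtimes are reported as minutes:seconds.
timed() {
    start=$(date +%s)                  # seconds since the epoch before the run
    "$@"                               # run the command exactly as given
    status=$?                          # remember the command's exit status
    elapsed=$(( $(date +%s) - start ))
    printf 'runtime %d:%02d\n' "$((elapsed / 60))" "$((elapsed % 60))" >&2
    return "$status"
}

# Example call (not executed here):
# timed blastpgp -i 1a6z.fasta -d /data/nr/nr -j 3 > psiblast_on_NR_itera3.txt
```

Because the runtime line goes to stderr, it does not end up inside the search output file.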

PSI-BLAST was run with different parameter settings:
blastpgp -i 1a6z.fasta -d /data/nr/nr (runtime 21:40)
blastpgp -i 1a6z.fasta -d /data/nr/nr -j 6 -e 10E-6 > psiblast_on_NR_itera3_e-val_10E-6.txt (runtime 23:49)
blastpgp -i 1a6z.fasta -d /data/nr/nr -j 3 -e 10E-6 > psiblast_on_NR_itera6_e-val_10E-6.txt (runtime 15:08)
blastpgp -i 1a6z.fasta -d /data/nr/nr -j 3 > psiblast_on_NR_itera3.txt (runtime 9:41)
blastpgp -i 1a6z.fasta -d /data/nr/nr -j 6 > psiblast_on_NR_itera6.txt (runtime 23:30)
We also compared the runtimes of the different tools. In our case, neither the E-value cutoff nor the number of iterations changed the PSI-BLAST results.
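One way to check whether two runs return the same hits is to compare their identifier lists. The sketch below assumes that each run's hits have already been reduced to a file with one identifier per line. That extraction step is not shown on this page, and the helper name `hit_overlap` and the file names in the example are hypothetical.

```shell
# Hypothetical helper: count identifiers shared by two hit lists and those unique to each.
# Assumption: each input file holds one hit identifier per line.
hit_overlap() {
    a=$(mktemp)
    b=$(mktemp)
    sort -u "$1" > "$a"                # comm requires sorted, de-duplicated input
    sort -u "$2" > "$b"
    # comm -12: lines in both files, -23: only in the first, -13: only in the second
    printf 'shared=%d only_first=%d only_second=%d\n' \
        "$(comm -12 "$a" "$b" | wc -l)" \
        "$(comm -23 "$a" "$b" | wc -l)" \
        "$(comm -13 "$a" "$b" | wc -l)"
    rm -f "$a" "$b"
}

# Example call (hypothetical file names, not executed here):
# hit_overlap psiblast_itera3_ids.txt psiblast_itera6_ids.txt
```

If both "only" counts are zero, the two runs found the same set of hits, although their ranking or scores may still differ.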

Multiple Alignments

Used Sequences

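{|
! Sequence identifier !! Sequence identity !! Source !! Protein function
|-
!colspan="4"| 99-90% Sequence Identity
|-
| AAG29575.1 || 91% || BLAST || xxx
|-
| 1A6Z || 90% || BLAST || xxx
|-
| XP_002816556.1 || 97.4% || FASTA || xxx
|-
| BAG62562.1 || % || FASTA || xxx
|-
| AAH74721.1 || % || FASTA || xxx
|-
!colspan="4"| 89-60% Sequence Identity
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
!colspan="4"| 59-40% Sequence Identity
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
| xxx || % || xxx || xxx
|-
!colspan="4"| 39-20% Sequence Identity
|-
| AAA39567.1 || 33% || FASTA || H-2D cell surface glycoprotein
|-
| NP_001029503.1 || 34% || BLAST || zinc-alpha-2-glycoprotein precursor
|-
| CAB56609.1 || 37% || BLAST || human leucocyte antigen A
|-
| CAF18417.1 || 36% || BLAST || MHC class I antigen
|-
| ACX42646.1 || 35.7% || FASTA || MHC class I antigen
|}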


Sequences with <20% or >99% sequence identity were ignored, and 5 samples were picked at random from each of the remaining ranges, so 20 sequences were available for the multiple alignments. Unfortunately, no sequences with a known 3D structure were found in the 99-90% range, so only sequences without a known structure were used there.
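The selection described above could be sketched as follows. This is not the procedure that was actually run. It assumes a tab-separated input file with one hit per line (identifier, then percent identity), a format that is hypothetical here, and it relies on `shuf` from GNU coreutils. How fractional values between two ranges (for example 89.5%) are treated is also an assumption: they are placed in the lower range, and anything above 99% is dropped.

```shell
# Hypothetical helper: pick up to n random hits from each identity range.
# Input: tab-separated lines "identifier<TAB>percent identity" (assumed format).
# Output: "range<TAB>identifier<TAB>identity", ranges in descending order.
pick_per_range() {
    input=$1
    n=${2:-5}                          # default: 5 samples per range
    labelled=$(mktemp)
    # Label every hit with its range; hits below 20% or above 99% are skipped.
    awk -F '\t' '{
        p = $2 + 0                     # "91%" is read as 91
        if      (p >= 90 && p <= 99) r = "99-90"
        else if (p >= 60 && p <  90) r = "89-60"
        else if (p >= 40 && p <  60) r = "59-40"
        else if (p >= 20 && p <  40) r = "39-20"
        else next
        print r "\t" $1 "\t" $2
    }' "$input" > "$labelled"
    # Shuffle each range separately and keep at most n hits from it.
    for r in 99-90 89-60 59-40 39-20; do
        awk -F '\t' -v r="$r" '$1 == r' "$labelled" | shuf -n "$n"
    done
    rm -f "$labelled"
}

# Example call (hypothetical file name, not executed here):
# pick_per_range hits_with_identity.tsv 5
```

A range with fewer than n hits simply contributes all of its hits, so the total can fall below 20.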