BLAST Visualization

From Protein Prediction 2 Winter Semester 2014

The aim of this project is to build a BioJS component to visualise BLAST results in the context of protein information. The visualisation will display alignments in context (showing how much of the total protein is part of the alignment) alongside protein features. There should also be a way to visualise the alignment at the amino-acid level.

Current visualization

The current BLAST visualization focuses on the interactive table view. The query sequence is shown at the top, followed by a graphical representation of the hits found. Below that come a table of the hits' sequence identifiers with their scoring information, and the alignments of the query sequence with the hits. The alignments are color coded from black to red, as indicated by the color key at the top. This is not an optimal solution. Red means a good match, gray an intermediate match and blue a bad match.
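
The color coding described above can be sketched as a simple binning of the bit score. This is a minimal sketch, not this project's code: the function name `scoreColor` is hypothetical, and the thresholds and color names are an assumption taken from the classic NCBI color key (<40, 40-50, 50-80, 80-200, >=200).

```typescript
// Hypothetical helper: map an alignment bit score to a color bin.
// Thresholds are assumed from the classic NCBI color key.
function scoreColor(bitScore: number): string {
  if (bitScore >= 200) return "red";
  if (bitScore >= 80) return "magenta";
  if (bitScore >= 50) return "green";
  if (bitScore >= 40) return "blue";
  return "black";
}
```

A different palette (for example the red, gray and blue scheme) would only change the returned names and the number of bins.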

Blast Visualization

Description

Blast


In the figure above, the query sequence (black) is displayed over the subject (red). The parts of both sequences that constitute the alignment are positioned together. The scales of both the query and the subject are shown, each covering the full length of its sequence (numbered scales at the top and bottom of the mockup). Protein features are displayed underneath the subject.
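
One possible way to compute this layout is to shift whichever sequence starts its alignment earlier, so that the aligned regions begin at the same horizontal position. The sketch below assumes 1-based HSP coordinates; the names `Hsp`, `BarLayout` and `layoutBars` are hypothetical and not part of the component.

```typescript
// Hypothetical types for illustration; coordinates are 1-based and inclusive.
interface Hsp { queryFrom: number; queryTo: number; hitFrom: number; hitTo: number; }
interface BarLayout {
  queryX: number; queryWidth: number;     // query bar, in pixels
  subjectX: number; subjectWidth: number; // subject bar, in pixels
  alignX: number; alignWidth: number;     // aligned region, measured on the query
}

function layoutBars(queryLen: number, subjectLen: number, hsp: Hsp, widthPx: number): BarLayout {
  // Positive shift: the alignment starts later in the query, so move the subject right.
  const shift = hsp.queryFrom - hsp.hitFrom;
  const queryOffset = Math.max(0, -shift);
  const subjectOffset = Math.max(0, shift);
  // Both full-length bars must fit into the available width.
  const total = Math.max(queryOffset + queryLen, subjectOffset + subjectLen);
  const pxPerResidue = widthPx / total;
  return {
    queryX: queryOffset * pxPerResidue,
    queryWidth: queryLen * pxPerResidue,
    subjectX: subjectOffset * pxPerResidue,
    subjectWidth: subjectLen * pxPerResidue,
    alignX: (queryOffset + hsp.queryFrom - 1) * pxPerResidue,
    // With gaps the subject span can differ; only the start is guaranteed to line up.
    alignWidth: (hsp.queryTo - hsp.queryFrom + 1) * pxPerResidue,
  };
}
```

The ratio of the aligned span to the full sequence length gives the "how much of the protein is covered" figure directly.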

Status

First part

  • Studied the BLAST algorithm
  • Read several articles to work out how to make a simple and useful BLAST visualization; finding an optimal solution is challenging
  • Implemented the table view visualization

Next Part

  • Extend the current table view visualization into an interactive one
  • Complete the feature view
FeatureView.jpg
  • Complete the AA (amino acid) view
AAView.jpg
  • Combine the last two steps to get the final visualization
  • Zooming in on the table view content leads to the feature view; zooming in further leads to the AA view
  • Keep refining the visualization concept to arrive at more reliable ideas
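
The zoom-driven switch between the three views could be decided from the current scale alone. The following is a minimal sketch under assumed thresholds: `viewForZoom` and the cut-off values are hypothetical and would need tuning against the real rendering.

```typescript
type ViewLevel = "table" | "feature" | "aa";

// Hypothetical thresholds, expressed in pixels available per residue.
const AA_MIN_PX = 8;        // enough room to draw one letter per residue
const FEATURE_MIN_PX = 0.5; // enough room to tell features apart

function viewForZoom(pxPerResidue: number): ViewLevel {
  if (pxPerResidue >= AA_MIN_PX) return "aa";
  if (pxPerResidue >= FEATURE_MIN_PX) return "feature";
  return "table";
}
```

Tying the decision to pixels per residue rather than to a raw zoom factor keeps the behavior the same for short and long proteins.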

Features

  • Easily identify which organism the subject is from
  • Easily decide which matches are the most relevant (this could be based on score, e-value, identity or features)
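
Both features can be sketched on top of a plain list of hits. The `Hit` shape and the function names below are hypothetical. The organism lookup assumes the hit definition line follows either the UniProt convention (`OS=...`) or the NCBI convention (organism in square brackets at the end); ranking by features is left out.

```typescript
// Hypothetical hit record for illustration.
interface Hit {
  id: string;
  definition: string;
  bitScore: number;
  evalue: number;
  identity: number; // number of identical positions
  alignLen: number; // alignment length
}

// Guess the organism from a definition line; returns null when no pattern matches.
function organismOf(definition: string): string | null {
  const uniprot = /\bOS=(.+?)(?=\s+[A-Z]{2}=|$)/.exec(definition);
  const uniprotName = uniprot?.[1];
  if (uniprotName) return uniprotName.trim();
  const ncbi = /\[([^\[\]]+)\]\s*$/.exec(definition);
  const ncbiName = ncbi?.[1];
  return ncbiName ? ncbiName : null;
}

type SortKey = "bitScore" | "evalue" | "identity";

// Return a sorted copy with the most relevant hit first.
function rankHits(hits: Hit[], key: SortKey): Hit[] {
  const fractionIdentical = (h: Hit): number => h.identity / h.alignLen;
  return hits.slice().sort((a, b) => {
    if (key === "evalue") return a.evalue - b.evalue;       // lower is better
    if (key === "bitScore") return b.bitScore - a.bitScore; // higher is better
    return fractionIdentical(b) - fractionIdentical(a);     // higher is better
  });
}
```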

Technical details

  • The visualisation uses the biojs-io-blast parser to read from a BLAST XML file (NCBI and EBI formats are supported).
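
To illustrate which values the visualisation needs from such a file, the sketch below pulls a few fields out of one `<Hsp>` element of NCBI BLAST XML. It deliberately does not use biojs-io-blast, whose API is not shown here: `tagText`, `parseHsp` and `HspSummary` are hypothetical helpers, the regular-expression lookup is a simplification that only works for flat elements, and the EBI format uses different element names.

```typescript
// Text content of the first <tag>...</tag> in xml, or null if absent.
// Assumes `tag` is a plain element name such as "Hsp_evalue".
function tagText(xml: string, tag: string): string | null {
  const m = new RegExp("<" + tag + ">([^<]*)</" + tag + ">").exec(xml);
  const text = m?.[1];
  return text === undefined ? null : text;
}

interface HspSummary {
  bitScore: number; evalue: number;
  queryFrom: number; queryTo: number;
  hitFrom: number; hitTo: number;
}

// Read the numeric fields of one HSP; missing elements become NaN.
function parseHsp(hspXml: string): HspSummary {
  const num = (tag: string): number => {
    const t = tagText(hspXml, tag);
    return t === null ? NaN : Number(t);
  };
  return {
    bitScore: num("Hsp_bit-score"),
    evalue: num("Hsp_evalue"),
    queryFrom: num("Hsp_query-from"),
    queryTo: num("Hsp_query-to"),
    hitFrom: num("Hsp_hit-from"),
    hitTo: num("Hsp_hit-to"),
  };
}

// Made-up example values, only to show the element names.
const sampleHsp = `<Hsp>
  <Hsp_bit-score>152.5</Hsp_bit-score>
  <Hsp_evalue>3e-45</Hsp_evalue>
  <Hsp_query-from>11</Hsp_query-from>
  <Hsp_query-to>60</Hsp_query-to>
  <Hsp_hit-from>51</Hsp_hit-from>
  <Hsp_hit-to>100</Hsp_hit-to>
</Hsp>`;
const summary = parseHsp(sampleHsp);
```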

Libraries and Standards

  • 2D View: D3, Backbone
  • 3D View: Three.js

Mockup

Sketch.png



The idea is to use a circle (or a rectangle, or any other shape) to display the hits. We would probably put the query in the center and place all hits on the border of the circle, connecting each hit with its matching segment of the query. We may therefore have to split the central query sequence into multiple segments.
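
The geometry of this idea can be sketched as follows: hits are spread evenly around a circle, and each is linked to the midpoint of its matching segment on a horizontal query bar through the center. All names (`Segment`, `Link`, `circularLayout`) are hypothetical, and even spacing is only one of several possible placements.

```typescript
interface Point { x: number; y: number; }
interface Segment { queryFrom: number; queryTo: number; } // matching part of the query
interface Link { hit: Point; query: Point; }              // one connecting line

// Coordinates are relative to the circle center; y grows downwards as on screen.
function circularLayout(segments: Segment[], queryLen: number, radius: number, barWidth: number): Link[] {
  const n = segments.length;
  return segments.map((seg, i) => {
    const angle = (2 * Math.PI * i) / n - Math.PI / 2; // first hit at the top
    const mid = (seg.queryFrom + seg.queryTo) / 2;     // midpoint of the matching segment
    return {
      hit: { x: radius * Math.cos(angle), y: radius * Math.sin(angle) },
      query: { x: (mid / queryLen - 0.5) * barWidth, y: 0 },
    };
  });
}
```

Ordering the hits by the position of their query segment before calling the function would reduce the number of crossing lines.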

Experiment

As an experiment we will try to render it in 3D with WebGL and Three.js, and maybe use the score as height.
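
Using the score as height comes down to normalising the scores against the best hit. A minimal sketch, independent of Three.js, with the hypothetical name `barHeights`:

```typescript
// Scale scores linearly so that the best hit gets maxHeight; negative scores map to 0.
function barHeights(scores: number[], maxHeight: number): number[] {
  const top = Math.max(...scores, 0);
  if (top <= 0) return scores.map(() => 0);
  return scores.map((s) => (Math.max(s, 0) / top) * maxHeight);
}
```

Each resulting value could then serve as the height of the box that represents a hit in the 3D scene.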

Roadmap

  • 11.12: Alpha prototype (plain old NCBI-like table + ordering + filtering)
  • 13.12: Deadline for the BLAST requirements
  • 20.12: Alpha prototype with the final requirements
  • 4.1: Beta prototype with the final requirements

People

  • Homa Rasouli
  • Sebastian Wilzbach