Dot-Bracket Notation

From Protein Prediction 2 Winter Semester 2014
Revision as of 14:50, 3 January 2015 by Ppwikiuser (talk | contribs)

The aim of the project is to visualize RNA secondary structures. RNAs are chains of ribonucleotides which form complex two-dimensional structures through the formation of hydrogen bonds between cytosine and guanine, between adenine and uracil and between guanine and uracil.

Core Functionality

The main task of the visualization is the following:

The program takes an RNA secondary structure in Dot-Bracket (Vienna) Notation as input. This input consists of two strings: the first is the RNA sequence, and the second is a sequence of dots, round brackets, and square brackets with the same length as the RNA sequence. For more information, please see Dot-Bracket Notation.
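
To make the input format concrete, the following is a minimal sketch of how such a pair of strings could be turned into a list of base pairs. It is not taken from the project's source code. The function name `parseDotBracket`, the 0-based indexing, and the use of one stack per bracket type (so that square brackets may cross round ones) are assumptions made for illustration.

```javascript
// Illustrative sketch (hypothetical helper, not the project's actual parser).
// Returns base pairs as [i, j] with 0-based positions and i < j.
function parseDotBracket(sequence, structure) {
  if (sequence.length !== structure.length) {
    throw new Error("Sequence and structure must have the same length");
  }
  const openerFor = { ")": "(", "]": "[" };
  // One stack per bracket type, so "[" ... "]" may cross "(" ... ")".
  const stacks = { "(": [], "[": [] };
  const pairs = [];

  for (let i = 0; i < structure.length; i++) {
    const symbol = structure[i];
    if (symbol === ".") {
      continue; // unpaired nucleotide
    }
    if (symbol === "(" || symbol === "[") {
      stacks[symbol].push(i);
    } else if (symbol === ")" || symbol === "]") {
      const stack = stacks[openerFor[symbol]];
      if (stack.length === 0) {
        throw new Error(`Unmatched '${symbol}' at position ${i}`);
      }
      pairs.push([stack.pop(), i]);
    } else {
      throw new Error(`Unexpected symbol '${symbol}' at position ${i}`);
    }
  }

  for (const opener of Object.keys(stacks)) {
    if (stacks[opener].length > 0) {
      throw new Error(`Unmatched '${opener}' at position ${stacks[opener][0]}`);
    }
  }
  // Sort by opening position for a stable, readable order.
  return pairs.sort((a, b) => a[0] - b[0]);
}

const hairpinPairs = parseDotBracket("GGGAAAUCCC", "(((....)))");
console.log(hairpinPairs); // [ [ 0, 9 ], [ 1, 8 ], [ 2, 7 ] ]
```

Each dot yields no pair, and each closing bracket is matched with the most recent unmatched opening bracket of the same type.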

From this input, a graph-like visualization of the RNA's secondary structure is created. In this representation, the nodes represent the ribonucleotides and the edges represent the hydrogen bonds that connect them. The graph is connected and undirected. The visualization is done using Cytoscape.js with the Preset Layout. The coordinates for each nucleotide are computed as in RnaViz and the radial layout of VARNA.
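
As a rough sketch of what this step could look like, the snippet below builds a Cytoscape.js-style element list with explicit node positions, which is what the Preset Layout expects. Several things here are assumptions rather than the project's actual code: the helper name `buildElements`, the element ids, the extra backbone edges between neighbouring nucleotides (added so the graph is connected), and above all the simple circular placement, which only stands in for the RnaViz/VARNA coordinate computation.

```javascript
// Illustrative sketch (hypothetical helper). The circular placement is a
// placeholder assumption, NOT the RnaViz or VARNA layout algorithm.
function buildElements(sequence, pairs, radius = 100) {
  const elements = [];
  const n = sequence.length;

  // One node per nucleotide, with a fixed position for the preset layout.
  for (let i = 0; i < n; i++) {
    const angle = (2 * Math.PI * i) / n;
    elements.push({
      group: "nodes",
      data: { id: `n${i}`, label: sequence[i], index: i + 1 },
      position: { x: radius * Math.cos(angle), y: radius * Math.sin(angle) },
    });
  }

  // Assumed backbone edges between consecutive nucleotides.
  for (let i = 0; i + 1 < n; i++) {
    elements.push({
      group: "edges",
      data: { id: `b${i}`, source: `n${i}`, target: `n${i + 1}`, type: "backbone" },
    });
  }

  // One edge per hydrogen-bonded base pair.
  for (const [i, j] of pairs) {
    elements.push({
      group: "edges",
      data: { id: `h${i}_${j}`, source: `n${i}`, target: `n${j}`, type: "hbond" },
    });
  }
  return elements;
}

const elements = buildElements("GGGAAAUCCC", [[0, 9], [1, 8], [2, 7]]);

// With Cytoscape.js loaded in a page, these options (plus a `container`
// DOM element) would be passed to cytoscape(...).
const options = { elements, layout: { name: "preset" } };
console.log(options.elements.length); // 22 (10 nodes, 9 backbone edges, 3 bonds)
```

Because the Preset Layout simply uses the positions stored on the nodes, swapping in a different coordinate computation only requires changing how `position` is filled.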

Illustration of the desired core functionality of the secondary structure visualization.
Current state of the art.

The figures on the right display the original aim of the visualization as well as the current state of the visualization as produced by our JavaScript component. The figure shows only the visualization itself; the actual JavaScript component also contains a number of options that can be used to customize and edit the visualization. A link to the complete, working program can be found in the Progress section.

Adding Additional Functionality to the Visualization

In addition to the core visualization functionality, we implemented a number of options to change and customize the visualization. Our JavaScript component supports the following:

  • Display nucleotide index on mouse-over
  • Drag nodes and zoom in and out to enable improved examination of the structure
  • Make the visualization editable: It is possible to add new hydrogen bonds. The Dot-Bracket notation of the structure is updated automatically to reflect the resulting changes.
  • Custom color coding of the nucleotides
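
The editing option can be illustrated with a small sketch of how a newly added hydrogen bond might be written back into the Dot-Bracket string. This is not the component's actual implementation. The function names, the 0-based positions, and the rule of preferring round brackets and falling back to square brackets when the new pair would cross an existing round-bracket pair are all assumptions. The input string is assumed to be well formed.

```javascript
// Illustrative sketch (hypothetical helpers, assumed bracket-selection rule).

// Collect existing pairs separately for round and square brackets.
function pairsByType(structure) {
  const closerFor = { "(": ")", "[": "]" };
  const result = { "(": [], "[": [] };
  for (const opener of Object.keys(closerFor)) {
    const stack = [];
    for (let k = 0; k < structure.length; k++) {
      if (structure[k] === opener) {
        stack.push(k);
      } else if (structure[k] === closerFor[opener]) {
        result[opener].push([stack.pop(), k]);
      }
    }
  }
  return result;
}

// Two pairs cross if exactly one end of one lies strictly inside the other.
function crosses([a, b], [c, d]) {
  return (a < c && c < b && b < d) || (c < a && a < d && d < b);
}

// Return a new Dot-Bracket string with a bond between positions i and j.
function addBond(structure, i, j) {
  if (i > j) {
    [i, j] = [j, i];
  }
  if (i === j || structure[i] !== "." || structure[j] !== ".") {
    throw new Error("Both positions must be distinct and currently unpaired");
  }
  const existing = pairsByType(structure);
  for (const [opener, closer] of [["(", ")"], ["[", "]"]]) {
    const wouldCross = existing[opener].some((pair) => crosses(pair, [i, j]));
    if (!wouldCross) {
      const symbols = structure.split("");
      symbols[i] = opener;
      symbols[j] = closer;
      return symbols.join("");
    }
  }
  throw new Error("Bond cannot be written with round and square brackets only");
}

console.log(addBond("((....))..", 2, 5)); // "(((..))).."
console.log(addBond("((....))..", 3, 9)); // "((.[..)).]"
```

In the second call the new bond crosses the existing round-bracket pairs, so it is written with square brackets instead.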

Things that remain to be done

Dot-Bracket Notation is not the only popular way to represent RNA secondary structure. Other commonly used formats, which are not yet supported, are:

  • PBseq format
  • Connect (.ct) format

Other things that could still be added:

  • More edge types, e.g. red edges that mark pairing violations
  • Better support of the IUPAC code through improved color coding
  • Changing nucleotides through JavaScript events
  • Better support of missing sequence information, also through colors
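
As a sketch of how the first item could be approached, pairing violations can be detected by checking every base pair against the allowed combinations (C-G, A-U and G-U). The helper names below are hypothetical, and the snippet only identifies the offending pairs; how they would be styled (for example as red edges) is left open.

```javascript
// Illustrative sketch (hypothetical helpers, not part of the current component).

// Allowed pairings: C-G, A-U and the G-U wobble pair, in either order.
const ALLOWED_PAIRS = new Set(["CG", "GC", "AU", "UA", "GU", "UG"]);

function canPair(first, second) {
  return ALLOWED_PAIRS.has((first + second).toUpperCase());
}

// Return the base pairs [i, j] (0-based) whose nucleotides cannot pair.
function findPairingViolations(sequence, pairs) {
  return pairs.filter(([i, j]) => !canPair(sequence[i], sequence[j]));
}

const violations = findPairingViolations("GGAAAAUCAC", [[0, 9], [1, 8], [2, 7]]);
console.log(violations); // [ [ 1, 8 ], [ 2, 7 ] ]  (G-A and A-C)
```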

Progress

Task | Implemented
Core Functionality | Yes
Dragging nodes and zooming | Yes
Display additional info on mouse-over | Yes
Edit graph and export changes | Yes, partially
PBseq format compatibility | No
Connect (.ct) format compatibility | No
Change nucleotide colors | Yes
Change nucleotide type | No
Additional color coding for IUPAC/pairing violations/missing sequence | No

Current preliminary version of the implementation: RNA Secondary Structure Visualization

Source Code

People

Additional Links