Difference between revisions of "Project ideas"

From Protein Prediction 2 Winter Semester 2014

Revision as of 13:50, 12 November 2014

Venn Diagram Viewer

Venn diagrams are a very popular way to display list comparisons. [Jvenn] is an interactive Venn diagram viewer written in JavaScript. The objective of this project is to adapt the Jvenn code base to make it compatible with BioJS 2.0.
Literature: jvenn: an interactive Venn diagram viewer
Mentors: PP2_CS_2014 mentors
Students: 2

Jvenn example

<--

Protein Viewer

Visualization of PDB files - 3D structures of protein sequences

Similar projects:

Mentors: Björn Grüning (Galaxy) gruening. (at) .informatik.uni-freiburg.de
Students: 4-5

PDB structure

-->

Gene Cluster Viewer

The viewer should show the conserved gene order in prokaryotic genomes. The data will be derived from GenBank.

Source: Example for visualization
Mentors: Björn Grüning (Galaxy) gruening. (at) .informatik.uni-freiburg.de
Students: 2

Gene cluster

Dot-Bracket Notation 1

RNA secondary structure is often defined using Dot-Bracket Notation (DBN). Valid structures in DBN format are well-parenthesized words consisting of dots '.', opening '(' and closing ')' parentheses. Dotted positions are unpaired, whereas matching parenthesized positions represent base-pairing nucleotides. As the number of interacting nucleotides is always even (every paired nucleotide must have a partner), the brackets must be balanced. Source: [Wikipedia: http://ultrastudio.org/en/Dot-Bracket_Notation]
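
The balancing rule above can be checked with a stack. The following is a minimal sketch in Python, not part of any existing BioJS component; the function name `parse_dot_bracket` is an assumption made for illustration. It returns the base pairs as 0-based position tuples and rejects unbalanced input.

```python
from typing import List, Tuple


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return the base pairs (i, j), 0-based, encoded in a dot-bracket string.

    Raises ValueError if the string contains characters other than
    '.', '(' and ')' or if the parentheses are not balanced.
    """
    open_positions = []  # positions of '(' still waiting for a partner
    pairs = []
    for pos, char in enumerate(structure):
        if char == "(":
            open_positions.append(pos)
        elif char == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((open_positions.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)
```

For example, `parse_dot_bracket("((..))")` pairs position 0 with 5 and position 1 with 4. A viewer could draw one arc or bond per returned pair.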

Sources:

Mentors: Björn Grüning (Galaxy) gruening. (at) .informatik.uni-freiburg.de
Students: 2

RNA

Dot-Bracket Notation 2

This project deals with a slightly different representation of the Dot-Bracket Notation.

Sources:

Mentors: Björn Grüning (Galaxy) gruening. (at) .informatik.uni-freiburg.de
Students: 2

RNA

Pedigree Chart Visualization

A pedigree chart is a simple, easy-to-read diagram showing the occurrence and appearance (phenotypes) of a particular gene in an organism and its ancestors. Pedigrees use a standardized set of symbols:

  • squares: males
  • circles: females
  • diamonds: the sex of the person is unknown
  • filled-in (darker) symbol: someone with the phenotype in question
  • shaded or half-filled symbol: heterozygotes
  • horizontal and vertical lines: connect parents to their offspring
  • ....
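
One possible way to turn the symbol conventions listed above into data a renderer can use is sketched below in Python. The `Individual` class, its field names and the `symbol_for` function are hypothetical and only illustrate the mapping; they are not taken from an existing pedigree library.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Individual:
    """Hypothetical record for one person in a pedigree."""
    name: str
    sex: str = "unknown"      # "male", "female" or "unknown"
    affected: bool = False    # shows the phenotype in question
    carrier: bool = False     # heterozygote


# Shape conventions from the list above.
SHAPES = {"male": "square", "female": "circle", "unknown": "diamond"}


def symbol_for(person: Individual) -> Tuple[str, str]:
    """Return (shape, fill) for one individual."""
    shape = SHAPES.get(person.sex, "diamond")
    if person.affected:
        fill = "filled"
    elif person.carrier:
        fill = "half-filled"
    else:
        fill = "empty"
    return shape, fill
```

Parent-offspring lines would then be drawn from a separate list of (parent, child) relations, which this sketch leaves out.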

Literature:

Mentors: PP2_CS_2014 mentors
Students: 2

Pedigree chart

Sub-cellular localization in a cell

Archaea, Bacteria and Eukaryota form the three domains of life. Eukaryotic cells contain a nucleus and other membrane-bound organelles. The cells of archaea and bacteria, in contrast, consist of a single compartment surrounded by the plasma membrane (Gram-negative bacteria have an additional outer membrane). The objective of this project is to visualize biological cells and highlight user-selected sub-cellular compartments so that they stand out from the unselected ones. Similar idea: The Compartments database
Mentors: PP2_CS_2014 mentors, Manuel Corpas (TGAC) mc. (at) .manuelcorpas.com
Students: 2

Pedigree chart

Force directed network (spring algorithm), Graph Viewer

The objective of this project is to visualize a network (including large networks of more than 2000 nodes) so that the distance of a node from the rest of the network is determined by the number of nodes it is connected to: the more neighbors a node has, the larger its distance from the network. The component must allow zooming in and out, selection by number of neighbors, coloring by various thresholds and other graph-related features.
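
The neighbor counts that drive both the layout and the selection feature can be derived from a plain edge list. The sketch below is written in Python for illustration only. The function names are assumptions, and the linear mapping in `target_distance` (with its `base` and `step` parameters) is a hypothetical choice; the project description does not prescribe how distance should grow with the number of neighbors.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def node_degrees(edges: Iterable[Tuple[Hashable, Hashable]]) -> Dict[Hashable, int]:
    """Count distinct neighbors per node in an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


def select_by_degree(degrees: Dict[Hashable, int], min_neighbors: int) -> List[Hashable]:
    """Return the nodes with at least `min_neighbors` neighbors, sorted."""
    return sorted(node for node, degree in degrees.items() if degree >= min_neighbors)


def target_distance(degree: int, base: float = 1.0, step: float = 0.5) -> float:
    """Hypothetical linear mapping: more neighbors means a larger distance."""
    return base + step * degree
```

In a spring layout, `target_distance` could serve as the rest length or radial target for a node, while `select_by_degree` backs the "selection by number of neighbors" control.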

Relevant sources:

Mentors: PP2_CS_2014 mentors, Yana Bromberg (Rutgers University), Björn Grüning (Galaxy) gruening. (at) .informatik.uni-freiburg.de

Students: 3-4

Graph

HSSP curve

The HSSP curve at a threshold of interest (HSSP value=0 is default) must be visualized in a 2D graph. Additionally, alignments of protein sequences, provided by the user, must be plotted on the graph.
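
As a starting point, the curve can be computed as a function of alignment length. The Python sketch below assumes the formulation commonly attributed to Rost (1999), in which the threshold of percentage sequence identity is `n + 480 * L^(-0.32 * (1 + exp(-L/1000)))` for alignment lengths between 12 and 450, `n + 100` for shorter and `n + 19.5` for longer alignments. The constants and function names are stated here as assumptions and should be checked against the project literature.

```python
import math


def hssp_threshold(length: int, n: float = 0.0) -> float:
    """Percentage identity on the HSSP curve for a given alignment length.

    `n` is the HSSP value that shifts the curve up or down (0 is the default).
    Assumes the piecewise formulation described in the lead-in.
    """
    if length <= 0:
        raise ValueError("alignment length must be positive")
    if length <= 11:
        base = 100.0
    elif length <= 450:
        base = 480.0 * length ** (-0.32 * (1.0 + math.exp(-length / 1000.0)))
    else:
        base = 19.5
    return n + base


def hssp_distance(identity_percent: float, length: int) -> float:
    """Distance of one alignment from the default curve (n = 0)."""
    return identity_percent - hssp_threshold(length, 0.0)
```

To draw the graph, a component could evaluate `hssp_threshold` over a range of lengths for the curve and plot each user-provided alignment as a point at (length, percentage identity); a positive `hssp_distance` places the point above the curve.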

Literature:

Mentors: PP2_CS_2014 mentors
Students: 2

HSSP curve

2D Chemical Components Visualizer

The goal is to automatically create 2D diagrams of chemical complexes with known 3D structure according to chemical drawing conventions.

Similar projects:

Mentors: Julian Heinrich (CSIRO) julian.heinrich. (at) .csiro.au, Björn Grüning (Galaxy) gruening. (at) .informatik.uni-freiburg.de
Students: 2-3

Poseview


BLAST visualization

BLAST finds regions of local similarity between sequences. It allows searching for genes, proteins and genome segments in databases such as UniProt or GenBank without requiring an exact overlap or match in the database (in fact, PSI-BLAST can find orthologs with less than 30% sequence similarity). It is the best-known algorithm in bioinformatics, with more than 10^5 citations. The aim of this project is to develop an interactive visualization of BLAST results, a component that could eventually be used by UniProt.

Mentors: Xavier Watkins x.watkins (at) ebi (dot) ac (dot) uk, Sebastian Wilzbach seb (at) wilzbach (dot) me
Students: 2

Kablammo.png
BLAST result overview.png
BLAST outputbox.png

There is already a BioJS parser for the BLAST XML output.
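
Once hits have been parsed, an overview graphic typically needs to place them in rows so that overlapping hits do not cover each other. The Python sketch below shows one simple greedy way to do this. It assumes each hit has already been reduced to a `(start, end)` pair of inclusive query coordinates; the function name `pack_hits` is hypothetical and not part of the existing BioJS parser.

```python
from typing import List, Sequence, Tuple


def pack_hits(hits: Sequence[Tuple[int, int]]) -> List[int]:
    """Assign each hit a row index so that hits in the same row do not overlap.

    `hits` holds (start, end) query coordinates, inclusive.
    The returned list gives the row of each hit in input order.
    """
    order = sorted(range(len(hits)), key=lambda i: hits[i])
    row_ends: List[int] = []      # last occupied query position in each row
    rows = [0] * len(hits)
    for i in order:
        start, end = hits[i]
        for row, last_end in enumerate(row_ends):
            if start > last_end:  # fits behind the last hit of this row
                row_ends[row] = end
                rows[i] = row
                break
        else:                     # no existing row has room: open a new one
            row_ends.append(end)
            rows[i] = len(row_ends) - 1
    return rows
```

The row index can be multiplied by a fixed track height to get the vertical position of each hit in the drawing.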


Genome Browser

@Miguel, Manny: can you please add a description here?

We would like to have an integration of several views: Genome view, that includes all chromosomes, chromosome view (just one chromosome) and zoom view.

Journal.pone.0026345.g001.png

In this view, different features are displayed for several people. Each person is a track. Clicking on a feature opens a pop-up window with more information.

Relevant sources:

Mentors: Miguel Pignatelli (EMBL-EBI) <emepyc@gmail.com>, Manuel Corpas (TGAC) mc@manuelcorpas.com
Students: 3-4

BigWig and BigBed File Viewers

The idea came from Saket, but Ricardo might already be working on it. We have contacted both by email/Skype and are awaiting a reply.

Visualization of iAnn events

The iAnn calendar is one of the most used tools to annotate and curate scientific announcements. The idea of this project is to visualize iAnn announcements in the following ways:

  • as an interactive map
  • a table
  • and e.g. a pie chart or histograms showing statistics by various keywords (dates, country, field, etc.)

Relevant sources:

Mentor: Manuel Corpas (TGAC) mc. (at) .manuelcorpas.com
Students: 2-3

Poseview

Visualization of events on the GOBLET platform

Similar idea as for iAnn events -> visualization of events based on keywords

Sources:

Mentor: Manuel Corpas (TGAC) mc. (at) .manuelcorpas.com
Students: 2-3


Sources:

Mentor: Manuel Corpas (TGAC) mc. (at) .manuelcorpas.com
Students: 2

F1.large.jpg

Parser for GenBank format and visualization of annotations

GenBank is a standard format for exchanging annotated sequences. Any bioinformatics library should be able to parse annotated sequences in GenBank format or generate a GenBank file from an annotated sequence. The GenBank format is well documented: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

GenBank parsers from Bio-projects such as BioJava and BioPerl could be used as a starting point. A parser that can highlight or extract annotated features will be very useful to people developing web apps for sequence visualization.

To get an idea of what is expected from a project like this, take a look at this sequence: http://www.ncbi.nlm.nih.gov/nuccore/Z26331.1 and see what happens when you click on annotated features such as CDS or TATA_signal.
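
To give a feeling for the parsing side, the Python sketch below extracts feature keys and simple locations from the FEATURES table of a GenBank flat file. It is a deliberately reduced illustration, not a full parser: it assumes feature keys are indented by five spaces, skips qualifier lines, and only resolves locations of the form `a..b` and `complement(a..b)`. The function name and the `SAMPLE_RECORD` text are made up for this example.

```python
import re
from typing import Dict, List

# A feature line: exactly five spaces, the feature key, then the location.
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S+)\s*$")
# Simple locations only, e.g. "1..300", "<1..>300" or "complement(10..120)".
SIMPLE_LOCATION = re.compile(r"^(complement\()?<?(\d+)\.\.>?(\d+)\)?$")

# Hypothetical record fragment used only for illustration.
SAMPLE_RECORD = """\
LOCUS       EXAMPLE
FEATURES             Location/Qualifiers
     source          1..300
                     /organism="Example organism"
     CDS             complement(10..120)
                     /product="hypothetical protein"
     TATA_signal     130..136
ORIGIN
"""


def parse_features(record_text: str) -> List[Dict[str, object]]:
    """Return one dict per feature with key, raw location, start, end and strand."""
    features = []
    in_features = False
    for line in record_text.splitlines():
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if not in_features:
            continue
        if line and not line.startswith(" "):
            break  # next top-level section, e.g. ORIGIN
        match = FEATURE_LINE.match(line)
        if not match:
            continue  # qualifier or continuation line
        key, location = match.groups()
        feature = {"key": key, "location": location,
                   "start": None, "end": None, "strand": None}
        simple = SIMPLE_LOCATION.match(location)
        if simple:  # compound locations such as join(...) stay unresolved
            feature["start"] = int(simple.group(2))
            feature["end"] = int(simple.group(3))
            feature["strand"] = "-" if simple.group(1) else "+"
        features.append(feature)
    return features
```

Calling `parse_features(SAMPLE_RECORD)` yields three features (source, CDS and TATA_signal) whose start and end coordinates could be used to highlight the corresponding stretch of sequence when a feature is clicked.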

Mentors: Khalil El Mazouari khalil.elmazouari@gmail.com
Students: 2

Genbank.png

Graphical Model Editor

@Juanmi, can you please add a description here? Thanks :)

A Splice Junction Viewer

Screen Shot 2014-11-11 at 20.54.14.png

BAM files store next-generation sequencing read alignments in a compressed binary format. As part of the BioJS Google Summer of Code, we developed a BAMviewer whose objective is to visualise these files in raw format, as seen above.

[1]

We would now like to take the information contained in BAM files and develop a transcriptome assembly viewer. BAM files may contain only those parts of the DNA that are transcribed to RNA; this is what we call the transcriptome, as opposed to the genome. When the reads in a BAM file come from transcribed parts of the DNA, they can be assembled like a puzzle. Such a transcriptome assembly is a crucial tool for understanding how genes are organised internally and for revealing biologically meaningful features that have been related to disease.
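
In spliced alignments, the gap a read skips over an intron is recorded in the CIGAR string as an `N` operation, so candidate splice junctions can be collected directly from the alignment records. The Python sketch below assumes the BAM records have already been decoded into `(pos, cigar)` pairs, with `pos` being the 1-based leftmost reference position as in SAM; the function names are illustrative and unrelated to the existing BAMviewer code.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference.
REFERENCE_CONSUMING = set("MDN=X")


def junctions_from_cigar(pos: int, cigar: str) -> List[Tuple[int, int]]:
    """Return (first, last) skipped reference positions for each N operation."""
    junctions = []
    ref = pos
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "N":
            junctions.append((ref, ref + length - 1))
        if op in REFERENCE_CONSUMING:
            ref += length
    return junctions


def count_junctions(reads: Iterable[Tuple[int, str]]) -> Counter:
    """Count how many reads support each junction."""
    support = Counter()
    for pos, cigar in reads:
        support.update(junctions_from_cigar(pos, cigar))
    return support
```

For a read at position 100 with CIGAR `50M200N50M`, the skipped region spans reference positions 150 to 349. A viewer could draw one arc per junction and scale its thickness by the read count from `count_junctions`.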

Screen Shot 2014-11-11 at 21.05.35.png