Sub Cellular Localization

From Protein Prediction 2 Winter Semester 2014
Revision as of 12:13, 6 January 2015 by Ppwikiuser

Sub Cellular Localization in Cell

Objective

"To visualize biological cells and highlight user-selected sub-cellular compartments so that they stand out from the unselected ones"

GUI mockups

Case 1

One protein, one localization:


Figure 1 : Protein's score in cell compartment (One protein)

Case 2

Multiple proteins, multiple localizations:

Figure 2 : Number of Proteins in cell compartment (more than one protein)
Figure 3 : Protein's score in cell compartment (more than one protein)

Requirements

  • User experience
    • Interactivity
    • Easy identification of the number of proteins in each cell compartment
    • Good visualization of data
  • Functionality
    • Parsing the user's input file (".txt" or ".csv")
    • Mapping the file content to the visualization
  • Features
    • Good color scheme to highlight :
      • Number of proteins in a compartment
      • Score of each protein (confidence)

Application design

Data

  • Remarks about input format
    • The text file must contain proteins of only one cell type
    • Text file format :
      • The input data file should be either a ".txt" or a ".csv" file.
      • The file's first line should contain the cell type (eukaryota, archaea or bacteria).
      • The file's second line should contain the score range (e.g. 0-100). Note: the minimum score should be zero.
      • The file's third line should contain the column descriptions (Protein Id, Score, Localization).
      • The user's file can have more than 3 columns, but additional columns are ignored.


Figure 4 : Text File format
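
The input format described above can be illustrated with a small parser. This is only a minimal sketch in Python, not the project's own code: the comma delimiter, the "min-max" spelling of the score line, the function name parse_localization_file, and the sample protein IDs and localization names are all assumptions made for the example.

```python
import csv
from dataclasses import dataclass

# Cell types listed in the format description.
CELL_TYPES = {"eukaryota", "archaea", "bacteria"}


@dataclass
class Prediction:
    protein_id: str
    score: float
    localization: str


def parse_localization_file(text, delimiter=","):
    """Parse the three header lines and the data rows of an input file.

    Hypothetical helper: the delimiter and the "min-max" score line are assumptions.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 4:
        raise ValueError("expected cell type, score range, column line and data rows")

    # Line 1: cell type.
    cell_type = lines[0].strip().lower()
    if cell_type not in CELL_TYPES:
        raise ValueError(f"unknown cell type: {cell_type!r}")

    # Line 2: score range such as "0-100"; the minimum should be zero.
    parts = lines[1].split("-", 1)
    if len(parts) != 2:
        raise ValueError("score line should look like '0-100'")
    low, high = float(parts[0]), float(parts[1])
    if low != 0:
        raise ValueError("minimum score should be zero")

    # Line 3 holds the column descriptions and is skipped; data starts at line 4.
    predictions = []
    for row in csv.reader(lines[3:], delimiter=delimiter):
        if len(row) < 3:
            raise ValueError(f"row needs at least 3 columns: {row}")
        # Only the first three columns are used; extra columns are ignored.
        protein_id, score_text, localization = (cell.strip() for cell in row[:3])
        score = float(score_text)
        if not low <= score <= high:
            raise ValueError(f"score {score} of {protein_id} is out of range")
        predictions.append(Prediction(protein_id, score, localization))
    return cell_type, (low, high), predictions


# Made-up sample data, for illustration only.
sample = """eukaryota
0-100
Protein Id,Score,Localization
P12345,87,nucleus
Q67890,42,cytoplasm,extra column
"""
cell_type, score_range, predictions = parse_localization_file(sample)
```

Rejecting a malformed file early, before anything is drawn, keeps the visualization code free of input checks.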

Roadmap

Figure 5 : Roadmap

Project progress

Week 1 (status: completed)
  • Researching the biology literature to identify proteins belonging to different cells (Archaea / Bacteria / Eukaryota)
  • Identifying existing tools or libraries that can help in visualizing the input data.
  • Identifying the importance of protein localization in cell compartments.
  • Obtaining images of Archaea, Bacteria and Eukaryota.
  • Creating a repository on GitHub.
  • Defining the input file format.
Week 2 (status: completed)
  • Coding the file upload.
  • Parsing the input file.
    • Counting the number of proteins in each cell compartment.
    • Storing each protein's score.
  • Writing functions for identifying the "x" and "y" positions of cell compartments.
  • Highlighting cell compartments (static).
Week 3 (status: completed)
  • Visualization by:
    • Grouping proteins based on localization in the cell compartment.
    • Highlighting cell compartments (dynamic).
    • Updating cell compartments based on a particular protein's score/confidence (dynamic).
Week 4
  • Development of a deployable component for reuse by BioJS.
Week 5
  • Final presentation of the project.


Implementation

What we did

  • First, all the cell compartments were identified by the paths that were marked using the GIMP image editor.
  • Using the information from the user's input file, the number of proteins present in each compartment was determined.
  • Each compartment was highlighted using a localization color scale, obtained by converting the number of proteins in each localization into a percentage and matching that percentage to a color.
  • A table was added to display all the proteins present in each cell compartment and their scores.
  • On 'mouseover' of a compartment, a tooltip was displayed showing the proteins present in that cell compartment and the score (confidence) of each protein.
  • The proteins displayed in the tooltip were made clickable so that the cell image could be updated to reflect the scores of the protein in all the cell compartments.
  • Finally, the cell compartments were highlighted using a score color scale, which was obtained by mapping the score of a protein in each cell compartment to a color.
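
The two color scales described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the project's implementation: the percentage is taken as each compartment's share of all protein-to-compartment assignments, the color is a linear blend between two arbitrarily chosen RGB endpoints, and all function names and sample rows are hypothetical.

```python
from collections import Counter


def count_by_compartment(rows):
    """Count distinct proteins per localization from (protein_id, score, localization) rows."""
    seen = set()
    counts = Counter()
    for protein_id, _score, localization in rows:
        if (protein_id, localization) not in seen:
            seen.add((protein_id, localization))
            counts[localization] += 1
    return counts


def to_percentages(counts):
    """Convert counts to percentages of the total (assumed definition of the percentage)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


def percent_to_color(percent, low=(255, 255, 204), high=(189, 0, 38)):
    """Blend linearly between two RGB colors; the endpoint colors are arbitrary choices."""
    t = max(0.0, min(1.0, percent / 100.0))
    r, g, b = (round(lo + t * (hi - lo)) for lo, hi in zip(low, high))
    return f"#{r:02x}{g:02x}{b:02x}"


def score_colors(rows, protein_id, max_score=100.0):
    """Score color scale: color each compartment by one protein's score there."""
    return {
        localization: percent_to_color(100.0 * score / max_score)
        for pid, score, localization in rows
        if pid == protein_id
    }


# Made-up rows: (protein id, score, localization).
rows = [
    ("P1", 100, "nucleus"),
    ("P2", 60, "nucleus"),
    ("P3", 30, "nucleus"),
    ("P1", 0, "cytoplasm"),
]
counts = count_by_compartment(rows)
percentages = to_percentages(counts)
localization_colors = {name: percent_to_color(p) for name, p in percentages.items()}
selected_protein_colors = score_colors(rows, "P1")
```

Here localization_colors corresponds to the localization color scale, and selected_protein_colors to the score color scale applied after a protein is clicked in the tooltip. Both reuse the same percent_to_color function, so the two views stay visually consistent.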

Screenshots

  • Parsing input file
Figure 6 : File size validation
Figure 7 : Cell type validation
Figure 8 : Validation of use cases 1 & 2
  • Highlighting cell compartments according to number of proteins and score
Figure 9 : Algorithm for highlighting cell compartments
  • Showing the popup for input files with one protein and with more than one protein
Figure 10 : Algorithm for showing the popup
  • Output
Figure 11 : Home page
Figure 12 : Loading of input file.
Figure 13 : 1) Highlighting of eukaryota cell's compartments by number of proteins.
Figure 14 : 2) Highlighting of eukaryota cell's compartments by number of proteins.
Figure 15 : Highlighting of eukaryota cell's compartments by protein score.


Figure 16 : 1) Highlighting of archaea cell's compartments by number of proteins.
Figure 17 : 2) Highlighting of archaea cell's compartments by number of proteins.
Figure 18 : Highlighting of archaea cell's compartments by protein score.


Figure 19 : 1) Highlighting of bacteria cell's compartments by number of proteins.
Figure 20 : 2) Highlighting of bacteria cell's compartments by number of proteins.
Figure 21 : Highlighting of bacteria cell's compartments by protein score.


Figure 22 : About page which contains information about the project.

GitHub Link

The link to the GitHub account is : GitHub Sub-cellular localization in cell