Sub Cellular Localization

From Protein Prediction 2 Winter Semester 2014

Latest revision as of 11:12, 29 January 2015

Objective

"To visualize biological cells and highlight user-selected sub-cellular compartments so that they stand out from the unselected ones"

GUI mockups

Case 1

One protein, one localization:


Figure 1 : Protein's score in cell compartment (One protein)

Case 2

Multiple proteins, multiple localizations:

Figure 2 : Number of Proteins in cell compartment (more than one protein)
Figure 3 : Protein's score in cell compartment (more than one protein)

Requirements

  • User experience
    • Interactivity
    • Easy identification of the number of proteins in cell compartment
    • Good visualization of data
  • Functionality
    • Parsing the user input “txt” file
    • Mapping the file content into visualization
  • Features
    • Good color scheme to highlight :
      • Number of proteins in a compartment
      • Score of each protein (confidence)

Application design

Data

  • Remarks about input format
    • Each text file must contain proteins of only one cell type.
    • Text file format:
      • The input data file should be either a ".txt" or a ".csv" file.
      • The first line should contain the cell type (eukaryota, archaea, or bacteria).
      • The second line should contain the score range (e.g., 0-100). Note: the minimum score should be zero.
      • The third line should contain the column descriptions (Protein Id, Score, Localization).
      • The file can have more than 3 columns, but additional columns are ignored.


Figure 4 : Text File format
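
The input format described above can be sketched as a small parser. This is only an illustration in Python, not the project's own code: the comma delimiter, the function name parse_localization_file, the ProteinRecord type, and the sample protein ids are all assumptions made for the example.

```python
import csv
from dataclasses import dataclass

# Cell types named in the format description. Both spellings of archaea
# are accepted here, which is an assumption of this sketch.
ALLOWED_CELL_TYPES = {"eukaryota", "archaea", "archea", "bacteria"}


@dataclass
class ProteinRecord:
    protein_id: str
    score: float
    localization: str


def parse_localization_file(text, delimiter=","):
    """Parse the three header lines and the protein rows of an input file.

    The delimiter is an assumption; the format description does not name one.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 3:
        raise ValueError("expected cell type, score range and column lines")

    # Line 1: cell type.
    cell_type = lines[0].strip().lower()
    if cell_type not in ALLOWED_CELL_TYPES:
        raise ValueError(f"unknown cell type: {cell_type!r}")

    # Line 2: score range such as "0-100"; the minimum must be zero.
    low_text, _, high_text = lines[1].strip().partition("-")
    low, high = float(low_text), float(high_text)
    if low != 0:
        raise ValueError("minimum score must be zero")

    # Line 3 holds the column descriptions; data rows follow it.
    records = []
    for row in csv.reader(lines[3:], delimiter=delimiter):
        # Only the first three columns are used; extra columns are ignored.
        protein_id, score, localization = (cell.strip() for cell in row[:3])
        records.append(ProteinRecord(protein_id, float(score), localization))
    return cell_type, (low, high), records


# Hypothetical sample file content.
sample = """eukaryota
0-100
Protein Id,Score,Localization
P12345,87,nucleus,extra column
Q67890,42,cytoplasm
"""
cell_type, score_range, records = parse_localization_file(sample)
print(cell_type, score_range, len(records))
```

Reading the header lines before the rows lets the parser reject a file with an unknown cell type or a non-zero minimum score before any protein is processed.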

Roadmap

Figure 5 : Roadmap
Project progress
Week number | Completed tasks | Status
Week 1
  • Researching biology literature to identify proteins belonging to the different cell types (Archaea / Bacteria / Eukaryota).
  • Identifying existing tools or libraries that can help in visualizing the input data.
  • Identifying the importance of protein localization in the cell's compartments.
  • Obtaining images of Archaea, Bacteria and Eukaryota.
  • Creating the repository on GitHub.
  • Defining the input file format.
Status: completed
Week 2
  • Coding the file upload.
  • Parsing the input file.
    • Counting the number of proteins in the cell's compartments.
    • Storing each protein's score.
  • Writing functions for identifying the "x" and "y" positions of the cell's compartments.
  • Highlighting the cell's compartments (static).
Status: completed
Week 3
  • Visualization by:
    • Grouping proteins based on localization in the cell compartment.
    • Highlighting the cell's compartments (dynamic).
    • Updating the cell's compartments based on a particular protein's score/confidence (dynamic).
Status: completed
Week 4
  • Bug fixing.
Status: completed
Week 5
  • Final presentation of the project.
Status: completed
Week 6
  • Development of a deployable component for reuse by BioJS.
Status: completed

Implementation

What we did

  • First, all the cell compartments were identified by paths marked using the GIMP image editor.
  • The images of the different cell compartments were saved in SVG file format.
  • Using the information from the user's input file, the number of proteins present in each compartment was determined.
  • Each compartment was highlighted using a localization color scale: the number of proteins in each localization was converted into a percentage of the total number of proteins, and that percentage was matched to a color.
  • A table was added to display all the proteins present in each cell compartment together with their scores.
  • On 'mouseover' of a compartment, a tooltip was displayed showing the proteins present in that compartment and the score (confidence) of each protein.
  • The proteins displayed in the tooltip were made clickable so that the cell image could be updated to reflect the scores of the selected protein in all the cell compartments.
  • Finally, the cell compartments were highlighted using a score color scale, obtained by mapping the score of a protein in each cell compartment to a color.
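
The two color scales described in the list above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the project's implementation: the linear interpolation between a light and a dark color, the specific RGB values, the function names, and the sample records are all hypothetical.

```python
from collections import Counter


def localization_percentages(localizations):
    """Share of proteins per compartment, as a percentage of all proteins."""
    counts = Counter(localizations)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def value_to_color(value, maximum=100.0,
                   light=(255, 245, 235), dark=(127, 39, 4)):
    """Map a value in [0, maximum] to a '#rrggbb' color.

    Linear interpolation between two RGB colors is an assumption of this
    sketch; the original page does not say how colors were chosen.
    """
    t = max(0.0, min(1.0, value / maximum))
    channels = (round(a + (b - a) * t) for a, b in zip(light, dark))
    return "#{:02x}{:02x}{:02x}".format(*channels)


def localization_colors(localizations):
    """Localization color scale: percentage of proteins -> color."""
    return {name: value_to_color(pct)
            for name, pct in localization_percentages(localizations).items()}


def protein_score_colors(records, protein_id, max_score=100.0):
    """Score color scale: one protein's score per compartment -> color."""
    return {loc: value_to_color(score, max_score)
            for pid, score, loc in records if pid == protein_id}


# Hypothetical records: (protein id, score, localization).
records = [
    ("P1", 80, "nucleus"),
    ("P1", 20, "cytoplasm"),
    ("P2", 55, "nucleus"),
    ("P3", 90, "nucleus"),
]
print(localization_colors([loc for _, _, loc in records]))
print(protein_score_colors(records, "P1"))
```

Both scales share one value-to-color function, so the count view and the score view differ only in which number is passed in: a percentage of the protein total, or a single protein's score.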

Screenshots

  • Marking cell compartments using the GIMP image editor
Figure 6 : Marked cell compartments of cell type Bacteria
  • Parsing the input file
Figure 7 : Validation of file size
Figure 8 : Validation of cell type
Figure 9 : Validation of use cases 1 & 2
  • Highlighting the cell's compartments according to number of proteins and score
Figure 10 : Algorithm for highlighting the cell's compartments
  • Showing the popup for one protein and for more than one protein in the input file
Figure 11 : Algorithm for showing the popup
  • Output
Figure 12 : Home page
Figure 13 : Loading of the input file.
Figure 14 : 1) Highlighting of the eukaryota cell's compartments by percentage of the number of proteins.
Figure 15 : 2) Highlighting of the eukaryota cell's compartments by percentage of the number of proteins.
Figure 16 : Highlighting of the eukaryota cell's compartments by protein score.


Figure 17 : 1) Highlighting of the bacteria cell's compartments by percentage of the number of proteins.
Figure 18 : 2) Highlighting of the bacteria cell's compartments by percentage of the number of proteins.
Figure 19 : Highlighting of the bacteria cell's compartments by protein score.


Figure 20 : 1) Highlighting of the archaea cell's compartments by percentage of the number of proteins.
Figure 21 : 2) Highlighting of the archaea cell's compartments by percentage of the number of proteins.
Figure 22 : Highlighting of the archaea cell's compartments by protein score.


Figure 23 : About page, which contains information about the project.

GitHub Link

The project page on GitHub: GitHub Sub-cellular localization in cell (http://3biogirls.github.io/Sub-cellular-localization-in-cell/index.html)

The link to the presentation: File:Sub-cellular-localization-in-cell.pdf

Related components

The Compartments database (http://compartments.jensenlab.org/Search)