Difference between revisions of "Sub Cellular Localization"
From Protein Prediction 2 Winter Semester 2014
Ppwikiuser (talk | contribs) (→Roadmap) |
Ppwikiuser (talk | contribs) (→Roadmap) |
||
(58 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
− | Sub Cellular Localization in Cell |
||
− | |||
= Objective = |
= Objective = |
||
Line 7: | Line 5: | ||
= GUI mockups = |
= GUI mockups = |
||
+ | ==Case 1== |
||
− | [[File:Mock1.png|thumb|center|700px|720px|Figure 1 : Number of Proteins in cell compartment]] |
||
+ | |||
+ | One protein, one localization: |
||
+ | |||
+ | |||
+ | [[File:Mockup1_1.png|thumb|center|700px|700px|Figure 1 : Protein's score in cell compartment (One protein)]] |
||
+ | |||
+ | ==Case 2== |
||
+ | |||
+ | Multiple proteins, multiple localizations: |
||
+ | |||
+ | [[File:Mockup1_2.png|thumb|center|700px|720px|Figure 2 : Number of Proteins in cell compartment (more than one protein)]] |
||
+ | |||
+ | [[File:Mockup1_3.png|thumb|center|700px|700px|Figure 3 : Protein's score in cell compartment (more than one protein)]] |
||
+ | ==Requirements== |
||
− | [[File:Mock2.png|thumb|center|700px|700px|Figure 2 : Protein's score in cell compartment]] |
||
*User experience |
*User experience |
||
Line 28: | Line 39: | ||
**Correct identification of localization |
**Correct identification of localization |
||
*Libraries |
*Libraries |
||
+ | **D3 Data-Driven Documents - http://d3js.org/ |
||
− | **D3 - Zoomdata |
||
+ | ***https://github.com/mbostock/d3/blob/master/d3.min.js |
||
− | ***https://live.zoomdata.com/zoomdata/visualization#51db7ad4e4b04caf9ab346db-51db7ad4e4b04caf9ab346d5 |
||
+ | *Tools |
||
+ | **GIMP Image editor |
||
= Data = |
= Data = |
||
*Remarks about input format |
*Remarks about input format |
||
− | ** |
+ | **Text file must contain proteins of only one type of cell |
− | **Text file format: |
+ | **Text file format : |
+ | ***Input data file should be either a ".txt" or ".csv" file. |
||
− | ***Protein id |
||
+ | ***File’s first line should contain cell type (i.e: eukaryota, archea, bacteria). |
||
− | ***Score (certainty of existence of particular protein within a cell compartment) |
||
+ | ***File's second line should contain Score (i.e: 0-100). Note: Minimum score should be zero. |
||
− | ***Localization (component) |
||
+ | ***File's third line should contain columns description (i.e: Protein Id, Score, Localization). |
||
− | ***Cell type |
||
+ | ***User’s file can have more than 3 columns but additional columns will not be executed. |
||
− | [[File:Format.png|thumb|center|700px|720px|Figure 3 : Text File format]] |
||
+ | [[File:inputfile.png|thumb|center|700px|720px|Figure 4 : Text File format]] |
||
− | = Roadmap = |
||
+ | = Roadmap = |
||
− | [[File:Roadmap.png|thumb|center|700px|720px|Figure 4 : Roadmap]] |
||
+ | [[File:Roadmap.png|thumb|center|700px|720px|Figure 5 : Roadmap]] |
||
{| border="1" class="wikitable" |
{| border="1" class="wikitable" |
||
Line 53: | Line 67: | ||
! Completed tasks |
! Completed tasks |
||
! Status |
! Status |
||
− | |||
|- |
|- |
||
! Week 1 |
! Week 1 |
||
− | | |
+ | | |
* Researching Biology literature in order to identify proteins belonging to different cells (Archaea / Bacteria / Eukaryota) |
* Researching Biology literature in order to identify proteins belonging to different cells (Archaea / Bacteria / Eukaryota) |
||
* Identify existing tools or libraries which can help us in visualizing input data. |
* Identify existing tools or libraries which can help us in visualizing input data. |
||
Line 64: | Line 77: | ||
* Creation of repository in github. |
* Creation of repository in github. |
||
* Definition of input file format. |
* Definition of input file format. |
||
+ | |||
− | | {{check mark}} |
||
+ | |[[File:check.jpg|center|50px|50px]] |
||
|- |
|- |
||
! Week 2 |
! Week 2 |
||
Line 73: | Line 87: | ||
** Storing protein's score. |
** Storing protein's score. |
||
* Writing functions for identifiying "x" and "y" positions of cell's compartments. |
* Writing functions for identifiying "x" and "y" positions of cell's compartments. |
||
− | * Highlighting cell's compartments |
+ | * Highlighting cell's compartments (static). |
+ | |[[File:check.jpg|center|50px|50px]] |
||
− | | |
||
|- |
|- |
||
Line 81: | Line 95: | ||
* Visualization by: |
* Visualization by: |
||
** Grouping proteins based on localization in the cell component. |
** Grouping proteins based on localization in the cell component. |
||
− | ** |
+ | ** Highlighting cell's compartments (dynamic). |
+ | ** Updating cell’s compartments based on particular protein’s score/Confidence (dynamic). |
||
− | | |
||
+ | |[[File:check.jpg|center|50px|50px]] |
||
|- |
|- |
||
! Week 4 |
! Week 4 |
||
| |
| |
||
+ | * Bug fixing |
||
− | * Development of deployable component for reuse by BioJS. |
||
+ | |[[File:check.jpg|center|50px|50px]] |
||
− | | |
||
|- |
|- |
||
Line 94: | Line 109: | ||
| |
| |
||
* Final presentation of project. |
* Final presentation of project. |
||
+ | |[[File:check.jpg|center|50px|50px]] |
||
+ | |- |
||
+ | |||
+ | ! Week 6 |
||
| |
| |
||
+ | * Development of deployable component for reuse by BioJS. |
||
− | |||
+ | |[[File:check.jpg|center|50px|50px]] |
||
+ | |- |
||
|} |
|} |
||
+ | |||
+ | = Implementation = |
||
+ | |||
+ | ==What we did== |
||
+ | |||
+ | * First, all the cell compartments were identified by the paths that were marked using the GIMP image editor. |
||
+ | * The images of the different cell compartments were saved in svg file format. |
||
+ | * Using the information from the user's input file, the number of proteins present in each compartment were determined. |
||
+ | * Each compartment was highlighted using a localization color scale, which was obtained by converting the number of proteins present in each localization into a percentage of the total number of proteins and matched to a color. |
||
+ | * A table to display all the proteins present in each cell compartment and their scores. |
||
+ | * Upon 'mouseover' over a compartment, a tooltip was displayed, which shows the proteins present in that cell compartment, and the score (confidence) of the protein. |
||
+ | * The proteins displayed in the tooltip were made clickable so that the cell image could be updated to reflect the scores of the protein in all the cell compartments. |
||
+ | * Finally, the cell compartments were highlighted using a score color scale, which was obtained by mapping the score of a protein in each cell compartment to a color. |
||
+ | |||
+ | ==Screenshots== |
||
+ | |||
+ | *Marking cell compartments using GIMP image editor |
||
+ | |||
+ | [[File:Gimp_imagepath.png|thumb|center|500px|500px|Figure 6 : Marked cell compartments of cell type Bacteria]] |
||
+ | |||
+ | *Parsing input file |
||
+ | |||
+ | [[File:Algorithm1.png|thumb|center|500px|500px|Figure 7 : Validation File size]] |
||
+ | |||
+ | [[File:Algorithm2.png|thumb|center|500px|500px|Figure 8 : Validation Cell type]] |
||
+ | |||
+ | [[File:Algorithm3.png|thumb|center|500px|500px|Figure 9 : Validation Use cases 1 & 2]] |
||
+ | |||
+ | *Highlight cell's compartments following number of proteins and score |
||
+ | |||
+ | [[File:highlightLogic.png|thumb|center|500px|500px|Figure 10 : Highlight cell's compartments algorithm]] |
||
+ | |||
+ | *Showing popup, in case one protein and more than one protein in input file |
||
+ | |||
+ | [[File:ShowPopupLogic.png|thumb|center|500px|500px|Figure 11 : Showing popup's algorithm]] |
||
+ | |||
+ | *Output |
||
+ | |||
+ | [[File:Loc0.png|thumb|center|500px|500px|Figure 12 : Home page]] |
||
+ | |||
+ | [[File:Loc1.png|thumb|center|500px|500px|Figure 13 : Loading of input file.]] |
||
+ | |||
+ | [[File:Loc2.png|thumb|center|500px|500px|Figure 14 : 1) Highlighting of eukaryota cell's compartments by percentage of proteins number.]] |
||
+ | |||
+ | [[File:Loc3.png|thumb|center|500px|500px|Figure 15 : 2) Highlighting of eukaryota cell's compartments by percentage of proteins number.]] |
||
+ | |||
+ | [[File:Loc4.png|thumb|center|500px|500px|Figure 16 : Highlighting of eukaryota cell's compartments by protein score.]] |
||
+ | |||
+ | |||
+ | [[File:Loc5.png|thumb|center|500px|500px|Figure 17 : 1) Highlighting of bacteria cell's compartments by percentage of proteins number.]] |
||
+ | |||
+ | [[File:Loc6.png|thumb|center|500px|500px|Figure 18 : 2) Highlighting of bacteria cell's compartments by percentage of proteins number.]] |
||
+ | |||
+ | [[File:Loc7.png|thumb|center|500px|500px|Figure 19 : Highlighting of bacteria cell's compartments by protein score.]] |
||
+ | |||
+ | |||
+ | [[File:Loc8.png|thumb|center|500px|500px|Figure 20 : 1) Highlighting of archaea cell's compartments by percentage of proteins number.]] |
||
+ | |||
+ | [[File:Loc9.png|thumb|center|500px|500px|Figure 21 : 2) Highlighting of archaea cell's compartments by percentage of proteins number.]] |
||
+ | |||
+ | [[File:Loc10.png|thumb|center|500px|500px|Figure 22 : Highlighting of archaea cell's compartments by protein score.]] |
||
+ | |||
+ | |||
+ | [[File:Loc11.png|thumb|center|500px|500px|Figure 23 : About page which contains information about the project.]] |
||
+ | |||
+ | = GitHub Link = |
||
+ | |||
+ | The link to the GitHub account is : [http://3biogirls.github.io/Sub-cellular-localization-in-cell/index.html GitHub Sub-cellular localization in cell] |
||
+ | |||
+ | The link to presetation is : [[:File:Sub-cellular-localization-in-cell.pdf]] |
||
+ | |||
+ | =Related components= |
||
+ | |||
+ | The Compartments database [http://compartments.jensenlab.org/Search The Compartments database] |
Latest revision as of 11:12, 29 January 2015
Objective
"To visualize biological cells and highlight by a user selected sub-cellular compartments in a way that they stand out from the un-selected ones"
GUI mockups
Case 1
One protein, one localization:
Case 2
Multiple proteins, multiple localizations:
Requirements
- User experience
  - Interactivity
  - Easy identification of the number of proteins in cell compartment
  - Good visualization of data
- Functionality
  - Parsing the user input “txt” file
  - Mapping the file content into visualization
- Features
  - Good color scheme to highlight:
    - Number of proteins in a compartment
    - Score of each protein (confidence)
Application design
- Expected technical difficulties
  - Wrong file input format
  - Correct identification of localization
- Libraries
  - D3 Data-Driven Documents - http://d3js.org/
- Tools
  - GIMP Image editor
Data
- Remarks about input format
  - Text file must contain proteins of only one type of cell
  - Text file format:
    - The input data file should be either a ".txt" or a ".csv" file.
    - The file's first line should contain the cell type (e.g. eukaryota, archaea, bacteria).
    - The file's second line should contain the score range (e.g. 0-100). Note: the minimum score should be zero.
    - The file's third line should contain the column descriptions (e.g. Protein Id, Score, Localization).
    - The user's file can have more than 3 columns, but the additional columns are ignored.
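The format rules above can be sketched as a small parser. This is only an illustration, not the project's actual code: the function name `parseLocalizationFile`, the comma delimiter, and the error messages are assumptions, since the page does not specify them.

```javascript
// Hypothetical parser for the input format described above.
// Assumption: fields are comma-separated; the page does not name the delimiter.
const CELL_TYPES = ["eukaryota", "archaea", "bacteria"];

function parseLocalizationFile(text) {
  // Drop empty lines so a trailing newline does not produce an empty row.
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length < 4) {
    throw new Error("Expected cell type, score range, header and at least one protein row");
  }

  // Line 1: cell type.
  const cellType = lines[0].toLowerCase();
  if (!CELL_TYPES.includes(cellType)) {
    throw new Error("Unknown cell type: " + lines[0]);
  }

  // Line 2: score range such as "0-100"; the minimum must be zero.
  const range = /^(\d+)\s*-\s*(\d+)$/.exec(lines[1]);
  if (!range) {
    throw new Error("Score range must look like 0-100");
  }
  const minScore = Number(range[1]);
  const maxScore = Number(range[2]);
  if (minScore !== 0) {
    throw new Error("Minimum score should be zero");
  }

  // Line 3 is the column description and is skipped.
  // Remaining lines: Protein Id, Score, Localization; extra columns are ignored.
  const proteins = lines.slice(3).map((line, index) => {
    const [id, score, localization] = line.split(",").map((field) => field.trim());
    const value = Number(score);
    if (!id || !localization || score === "" || Number.isNaN(value)) {
      throw new Error("Malformed protein row " + (index + 1) + ": " + line);
    }
    if (value < minScore || value > maxScore) {
      throw new Error("Score out of range in row " + (index + 1) + ": " + line);
    }
    return { id, score: value, localization };
  });

  return { cellType, minScore, maxScore, proteins };
}
```

Rejecting malformed rows early addresses the "wrong file input format" difficulty listed under Application design; a real component might instead collect the errors and show them to the user.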
Roadmap
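Week number | Completed tasks | Status |
---|---|---|
Week 1 | Researching biology literature in order to identify proteins belonging to different cells (Archaea / Bacteria / Eukaryota); identifying existing tools or libraries which can help in visualizing input data; …; creation of repository on GitHub; definition of input file format | ✓ |
Week 2 | …; storing protein's score; writing functions for identifying "x" and "y" positions of cell's compartments; highlighting cell's compartments (static) | ✓ |
Week 3 | Visualization by: grouping proteins based on localization in the cell component; highlighting cell's compartments (dynamic); updating cell's compartments based on a particular protein's score/confidence (dynamic) | ✓ |
Week 4 | Bug fixing | ✓ |
Week 5 | Final presentation of project | ✓ |
Week 6 | Development of deployable component for reuse by BioJS | ✓ |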
Implementation
What we did
- First, all the cell compartments were identified by paths that were marked using the GIMP image editor.
- The images of the different cell compartments were saved in SVG format.
- Using the information from the user's input file, the number of proteins present in each compartment was determined.
- Each compartment was highlighted using a localization color scale: the number of proteins present in each localization was converted into a percentage of the total number of proteins, and that percentage was matched to a color.
- A table was added to display all the proteins present in each cell compartment and their scores.
- On mouseover of a compartment, a tooltip is displayed showing the proteins present in that cell compartment and the score (confidence) of each protein.
- The proteins displayed in the tooltip were made clickable so that the cell image could be updated to reflect the scores of the selected protein in all the cell compartments.
- Finally, the cell compartments were highlighted using a score color scale, which was obtained by mapping the score of a protein in each cell compartment to a color.
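The two color scales described above can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not the project's implementation: proteins are assumed to be objects with `id`, `score` and `localization` fields, the function names are hypothetical, and the white-to-red ramp stands in for whatever palette the project actually uses (the real component relies on D3 for rendering).

```javascript
// Localization scale, step 1: count proteins per compartment and convert
// each count into a percentage of the total number of proteins.
function localizationPercentages(proteins) {
  const counts = {};
  for (const protein of proteins) {
    counts[protein.localization] = (counts[protein.localization] || 0) + 1;
  }
  const percentages = {};
  for (const [localization, count] of Object.entries(counts)) {
    percentages[localization] = (100 * count) / proteins.length;
  }
  return percentages;
}

// Map a value in [0, max] to a color between white (0) and red (max).
// The palette is an assumption made for this sketch.
function scaleColor(value, max) {
  const t = Math.min(Math.max(value / max, 0), 1);
  const channel = Math.round(255 * (1 - t));
  return `rgb(255, ${channel}, ${channel})`;
}

// Localization scale, step 2: one color per compartment, from its percentage.
function localizationColors(proteins) {
  const colors = {};
  const percentages = localizationPercentages(proteins);
  for (const [localization, percentage] of Object.entries(percentages)) {
    colors[localization] = scaleColor(percentage, 100);
  }
  return colors;
}

// Score scale: for one selected protein, color every compartment it occurs in
// by that protein's score there.
function scoreColors(proteins, proteinId, maxScore) {
  const colors = {};
  for (const protein of proteins) {
    if (protein.id === proteinId) {
      colors[protein.localization] = scaleColor(protein.score, maxScore);
    }
  }
  return colors;
}
```

In a browser, each returned color could then be applied as the fill of the SVG path for the matching compartment; clicking a protein in the tooltip would switch from `localizationColors` to `scoreColors` for that protein.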
Screenshots
- Marking cell compartments using GIMP image editor
- Parsing input file
- Highlighting the cell's compartments by number of proteins and by score
- Showing the popup for input files with one protein and with more than one protein
- Output
GitHub Link
The link to the GitHub project page is: GitHub Sub-cellular localization in cell (http://3biogirls.github.io/Sub-cellular-localization-in-cell/index.html)
The link to the presentation is: File:Sub-cellular-localization-in-cell.pdf
Related components
The Compartments database (http://compartments.jensenlab.org/Search)