HSSP curve
Introduction to HSSP curve
HSSP is a derived database merging structural (3-D) and sequence (1-D) information. For each protein of known 3-D structure from the Protein Data Bank (PDB), the database has a multiple sequence alignment of all available homologues and a sequence profile characteristic of the family. The list of homologues is the result of a database search in SwissProt using a position-weighted dynamic programming method for sequence profile alignment (MaxHom). The database is updated frequently. The listed homologues are very likely to have the same 3-D structure as the PDB protein to which they have been aligned. As a result, the database is not only a database of aligned sequence families, but also a database of implied secondary and tertiary structures covering 29% of all SwissProt-stored sequences. The HSSP curve is the length-dependent sequence-identity threshold associated with this database: an alignment whose percentage identity lies above the curve for its length is taken to imply structural similarity.
Existing visualisations
There is an existing implementation that can be found on this page.
The program accepts either a set of sequences in FASTA format or a list of identifiers from one of the following protein databases: SWISS-PROT, PDB or TrEMBL. Alternatively, one of the following alignment-file formats is accepted to bypass the first step of the algorithm: BLAST, PSIBLAST, pair, markx0, markx1, markx2, markx3, markx10 or srspair.
It runs a greedy algorithm that uses the HSSP-value of each alignment to establish sequence similarity.
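To make the HSSP-value concrete, here is a minimal sketch of how it could be computed for one alignment. It assumes the commonly cited length-dependent threshold from Rost (1999), which is not stated on this page; the constants should be checked against the references before relying on them. The function names `hsspThreshold`, `hsspValue` and `sampleCurve` are illustrative and not part of any existing tool.

```javascript
// Assumption: threshold constants follow the commonly cited Rost (1999)
// formulation of the HSSP curve; verify against the references.

// Identity threshold (in percent) for an alignment of `length` residues.
function hsspThreshold(length) {
  if (length <= 11) return 100;   // very short alignments: full identity required
  if (length > 450) return 19.5;  // the curve is flat for long alignments
  const exponent = -0.32 * (1 + Math.exp(-length / 1000));
  return 480 * Math.pow(length, exponent);
}

// HSSP-value: signed distance of an alignment from the curve.
// Positive values lie above the curve, negative values below it.
function hsspValue(percentIdentity, length) {
  return percentIdentity - hsspThreshold(length);
}

// Sample the curve at regular alignment lengths, e.g. as input for a line chart.
function sampleCurve(maxLength, step = 1) {
  const points = [];
  for (let length = 1; length <= maxLength; length += step) {
    points.push({ length, identity: hsspThreshold(length) });
  }
  return points;
}
```

The array returned by `sampleCurve` holds plain `{ length, identity }` objects, so it could be handed to any charting library without further conversion.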
Tool's Objective
Visualize the HSSP curve and allow the user to dynamically filter or categorize the data shown on the graph for better insights.
Core Functionalities
Task | Implemented |
---|---|
Import BLAST results | No |
Parse BLAST results | No |
Visualize HSSP curve | No |
Ability to filter based on a threshold | No |
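The threshold filter listed in the table could look roughly like the sketch below. It assumes each hit is a plain object that already carries a precomputed `hsspValue` field; that field name, the hit shape and both function names are hypothetical and only serve the illustration.

```javascript
// Hypothetical hit shape: { id: string, hsspValue: number }.

// Keep only the hits whose HSSP-value is at or above `threshold`.
function filterByThreshold(hits, threshold) {
  return hits.filter((hit) => hit.hsspValue >= threshold);
}

// Split hits into two groups relative to `threshold`
// (a threshold of 0 corresponds to the curve itself).
function categorize(hits, threshold = 0) {
  const groups = { above: [], below: [] };
  for (const hit of hits) {
    const group = hit.hsspValue >= threshold ? groups.above : groups.below;
    group.push(hit);
  }
  return groups;
}
```

Both functions return new collections and leave the input untouched, which makes it easy to re-run them whenever the user moves a threshold control.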
Roadmap
- Understand the HSSP curve and the calculations needed to visualize it
- Gather input (BLAST results) with which we can work on visualizing
- Parse BLAST results input
- Calculate and visualize the HSSP curve
- Implement dynamic filtering of the curve
- Get feedback from biologists about possible improvements for better insights
- Work on changes/new features based on the feedback
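As a starting point for the parsing step in the roadmap, the sketch below reads BLAST results in the tab-separated format that BLAST+ writes with `-outfmt 6` and its default twelve columns. This page does not say which BLAST output format will be used, so the format choice is an assumption, and `parseBlastTabular` is an illustrative name.

```javascript
// Assumption: input is BLAST+ tabular output (-outfmt 6, default columns).
const BLAST_COLUMNS = [
  "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
  "qstart", "qend", "sstart", "send", "evalue", "bitscore",
];
// Columns that hold identifiers and must stay strings.
const TEXT_COLUMNS = new Set(["qseqid", "sseqid"]);

// Turn the raw text of a results file into an array of hit objects.
function parseBlastTabular(text) {
  return text
    .split(/\r?\n/)
    // Skip blank lines and the "#" comment lines that -outfmt 7 adds.
    .filter((line) => line.trim() !== "" && !line.startsWith("#"))
    .map((line) => {
      const fields = line.split("\t");
      const hit = {};
      BLAST_COLUMNS.forEach((name, i) => {
        hit[name] = TEXT_COLUMNS.has(name) ? fields[i] : Number(fields[i]);
      });
      return hit;
    });
}
```

Each parsed hit exposes `pident` (percentage identity) and `length` (alignment length), the two quantities an HSSP-value is computed from. Whether the gap-inclusive `length` column is the right length to use is an open question to settle while working through the calculations.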
Libraries we plan to use
For the first releases we plan to use:
- jQuery
- D3
Later on, we will react to changes and adjust this list as needed.
People
References
http://en.wikipedia.org/wiki/Homology-derived_Secondary_Structure_of_Proteins
http://www.ncbi.nlm.nih.gov/pubmed/?term=UniqueProt%3A+creating+representative+protein+sequence+sets