Difference between revisions of "Resource data"

From Bioinformatikpedia

Revision as of 11:07, 2 May 2012

Disease info

  • HGMD - "Human Gene Mutation Database", collects sequence and mutation information, links to other resources, e.g. OMIM
  • OMIM - "Online Mendelian Inheritance in Man" maintained at NCBI, has many links to related data (gene, protein, literature, etc.)
  • KEGG Disease - The KEGG DISEASE database is a collection of disease entries capturing knowledge on genetic and environmental perturbations of the KEGG pathways (see Pathway section).

Protein sequence info

  • UniProt - central sequence repository (mainly used in Europe) with data from Swiss-Prot and TrEMBL (and many links to related data)
  • NCBI Protein - central protein sequence search interface of NCBI, includes data from GenBank, RefSeq and TPA, as well as records from Swiss-Prot, PIR, PRF, and PDB

Local data

  • big: all sequences from Swiss-Prot, TrEMBL and PDB (path: /mnt/project/pracstrucfunc12/data/big/big*)
  • big_80: cluster representatives of big, clustered at 80% sequence identity using CD-HIT (path: /mnt/project/pracstrucfunc12/data/big/big_80*)
  • UniProt for HHblits (path: /mnt/project/pracstrucfunc12/data/hhblits/uniprot20)
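
To illustrate what "cluster representatives at 80% sequence identity" means, here is a minimal Python sketch of greedy clustering on a handful of sequences. It rests on several assumptions: that the sequence files are in FASTA format (the page does not say so), that a simple difflib-based identity is an acceptable stand-in for a real alignment, and that the function names (read_fasta, identity, greedy_cluster) are hypothetical helpers invented for this example. CD-HIT itself also processes sequences from longest to shortest, but it uses a short-word filter and proper alignments, so this toy version only shows the idea and will not reproduce the contents of big_80.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def identity(a: str, b: str) -> float:
    """Toy identity: matched residues divided by the shorter length.

    Assumption: difflib matching blocks stand in for a real alignment.
    """
    if not a or not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(
    records: Iterable[Tuple[str, str]], threshold: float = 0.8
) -> Dict[str, List[str]]:
    """Map each representative header to the headers in its cluster."""
    representatives: List[Tuple[str, str]] = []
    clusters: Dict[str, List[str]] = {}
    # Longest sequences first, so each representative is the longest member.
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        for rep_header, rep_seq in representatives:
            if identity(seq, rep_seq) >= threshold:
                clusters[rep_header].append(header)
                break
        else:
            # No representative is similar enough: start a new cluster.
            representatives.append((header, seq))
            clusters[header] = [header]
    return clusters
```

To try it, pass an open file handle (or any iterable of lines) to read_fasta and feed the resulting records to greedy_cluster; the keys of the returned dictionary are the cluster representatives. The pairwise comparison is quadratic in the worst case, so this is only practical for small test sets, not for the full big database.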

Structure info

  • PDB - "Protein Data Bank", central repository for protein structures
  • PDBsum - structural annotations of PDB structures (analysis of interactions between chains and with small molecule ligands)
  • SRS 3D - quick access to related structures (based on HSSP): search for the UniProt identifier of your protein to get a list of related structures, then click on a structure to view the sequence aligned to that structure

Pathway info

  • KEGG - "Kyoto Encyclopedia of Genes and Genomes", collects pathway maps (metabolic pathways and others)
  • MetaCyc - database of nonredundant, experimentally elucidated metabolic pathways (from literature), contains pathways involved in both primary and secondary metabolism, as well as associated compounds, enzymes, and genes.