Pathway Databases

Pathguide: the pathway resource list

"Pathguide contains information about 231 biological pathway resources. Click on a link to go to the resource home page or 'Details' for a description page. Databases that are free and those supporting BioPAX, CellML, PSI-MI or SBML standards are respectively indicated."



"Observe how genes interact in dynamic graphical models. Our online maps depict molecular relationships from areas of active research. In an "open source" approach, this community-fed forum constantly integrates emerging proteomic information from the scientific community. It also catalogs and summarizes important resources providing information for over 120,000 genes from multiple species. Find both classical pathways as well as current suggestions for new pathways."


BioSilico - An Integrated Metabolic Database System

"BioSilico is a web-based database system that facilitates the search and analysis of metabolic pathways. Heterogeneous metabolic databases including LIGAND, ENZYME, EcoCyc and MetaCyc are integrated in a systematic way, thereby allowing users to efficiently retrieve the relevant information on enzymes, biochemical compounds and reactions. In addition, it provides well-designed view pages for more detailed summary information. BioSilico is developed as an extensible system with a robust systematic architecture."


Human Cancer Protein Interaction Network (HCPIN) Database


"The Human Cancer Protein Interaction Network (HCPIN) is a web-accessible database. It is designed for use by cancer biologists interested in assessing 3D protein structural information in the context of the protein interaction network." 


KEGG: Kyoto Encyclopedia of Genes and Genomes

"KEGG is a database of biological systems, consisting of genetic building blocks of genes and proteins (KEGG GENES), chemical building blocks of both endogenous and exogenous substances (KEGG LIGAND), molecular wiring diagrams of interaction and reaction networks (KEGG PATHWAY), and hierarchies and relationships of various biological objects (KEGG BRITE). KEGG provides a reference knowledge base for linking genomes to biological systems and also to environments by the processes of PATHWAY mapping and BRITE mapping."

The KEGG-to-SBML converter


LitInspector - Literature and Pathway mining made easy!

"LitInspector performs large scale text mining on more than 18 million PubMed entries. LitInspector's sophisticated gene recognition and intuitive color coding increase the readability of abstracts and lets you analyze signal transduction pathways, diseases and tissue associations in a snap."

Reactome - a curated knowledgebase of biological pathways

"The Reactome project is a collaboration among Cold Spring Harbor Laboratory, The European Bioinformatics Institute, and The Gene Ontology Consortium to develop a curated resource of core pathways and reactions in human biology. The information in this database is authored by biological researchers with expertise in their fields, maintained by the Reactome editorial staff, and cross-referenced with the sequence databases at NCBI, Ensembl and UniProt, the UCSC Genome Browser, HapMap, KEGG (Gene and Compound), ChEBI, PubMed and GO. In addition to curated human events, inferred orthologous events in 22 non-human species including mouse, rat, chicken, puffer fish, worm, fly, yeast, two plants and E. coli are also available. A description of Reactome has been published in Genome Biology."


STRING - Proteins and their Interactions (EMBL)

"STRING is a database of known and predicted protein-protein interactions.
The interactions include direct (physical) and indirect (functional) associations; they are derived from four sources:

  • Genomic Context 
  • High-throughput Experiments 
  • (Conserved) Coexpression
  • Previous Knowledge

STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable. The database currently contains 1,513,782 proteins in 373 species."
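As a rough illustration of what "quantitatively integrates" can mean, the sketch below combines per-source confidence scores under an independence assumption (an interaction is supported unless every source fails to support it). This is an illustrative assumption, not STRING's documented scoring scheme; the function name, channel names and example values are hypothetical.

```python
from math import prod


def combine_scores(channel_scores):
    """Combine per-channel confidence scores in [0, 1].

    Assumes the channels are independent: the combined score is the
    probability that at least one channel supports the interaction.
    This is a simplification chosen for illustration only.
    """
    for name, score in channel_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")
    return 1.0 - prod(1.0 - s for s in channel_scores.values())


# Hypothetical scores for one protein pair, one per evidence source.
example = {
    "genomic_context": 0.2,
    "experiments": 0.5,
    "coexpression": 0.1,
    "previous_knowledge": 0.0,
}
combined = combine_scores(example)
```

A source with score 0 leaves the combined value unchanged, and any single source with score 1 forces the combined value to 1, which is the behaviour one would expect from this kind of integration.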


UniHI: Unified Human Interactome (MDC Berlin) 

"Unified Human Interactome is a comprehensive database of computationally and experimentally derived human protein interaction networks. The database aims to integrate diverse maps, offering researchers a flexible and direct entry gate into the human interactome. In its first version, it contains more than 178,000 distinct interactions between over 18,500 unique human proteins. 
Manuscripts describing UniHI database features and its architecture have been published in NAR Database Issue 2007 and Journal of Integrative Bioinformatics 2007."



Kinetics Databases 

Brenda - The Comprehensive Enzyme Information System

"BRENDA is maintained and developed at the Institute of Biochemistry at the University of Cologne. Data on enzyme function are extracted directly from the primary literature by scientists holding a degree in Biology or Chemistry. Formal and consistency checks are done by computer programs; each data set on a classified enzyme is checked manually by at least one biologist and one chemist." 

Alternative Access: SwissProt.


Kinetic Data of Biomolecular Interactions

"Kinetic Data of Bio-molecular Interaction (KDBI) is a collection of experimentally determined kinetic data of protein-protein, protein-RNA, protein-DNA, protein-ligand, RNA-ligand, DNA-ligand binding or reaction events described in the literature.

Currently, KDBI contains 20,803 records (about 2.5 times the 8,273 records in 2003) of 11,916 distinctive bio-molecular binding and 13,793 interaction events, which involve 2,934 proteins/protein complexes, 870 nucleic acids and 6,713 small molecules.

A new user-friendly search engine has been developed for more accurate and efficient queries, and a graphical interface has been designed for better representation of the data. Moreover, a new function for kinetic protein-protein interaction maps has been constructed for better data visualization and further quantitative study of biological pathways." 


TP-Search (University of Tokyo; see paper from Ozawa et al.) [free registration required]

"TP-Search is a comprehensive database on drug transporters, which are thought to play an important role in the pharmacokinetics of drugs. All the information is extracted from a large number of published papers. With this database, users can obtain various kinds of basic information on drug transporters."


Model Databases & Tools 

"BioNetGen (Los Alamos National Laboratory) is a tool for automatically generating mathematical models of biological systems from user-specified rules for biomolecular interactions. Sample Models are available." 


CellML Model Repository

The models in the repository are in the process of being curated.

The CellML-to-SBML converter.


E-Cell Project is an international research project aiming to model and reconstruct biological phenomena in silico, and developing necessary theoretical supports, technologies and software platforms to allow precise whole cell simulation. 


JWS Online is a "Systems Biology tool for simulation of kinetic models from a curated model database."


BioModels Database - "A Database of Annotated Published Models"



SABIO-RK (System for the Analysis of Biochemical Pathways - Reaction Kinetics) is a web-based application based on the SABIO relational database that contains information about biochemical reactions, their kinetic equations with their parameters, and the experimental conditions under which these parameters were measured. It aims to support modellers in setting up models of biochemical networks, but it is also useful for experimentalists or researchers with an interest in biochemical reactions and their kinetics. Information about reactions and their kinetics can be exported in SBML (Systems Biology Markup Language) format. 
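To give a feel for what working with such an SBML export might look like, here is a minimal sketch that reads kinetic parameters per reaction using only the Python standard library. The embedded document is a hand-written, stripped-down SBML-like fragment (hypothetical ids and values, no math element), not actual SABIO-RK output, and the helper name kinetic_parameters is an assumption made for this example. For real models, a dedicated SBML library would be the safer choice.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 2 Version 4; other levels/versions use other URIs.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

# Hand-written toy fragment for illustration only (hypothetical content).
EXAMPLE_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfReactions>
      <reaction id="R1" reversible="false">
        <kineticLaw>
          <listOfParameters>
            <parameter id="Km" value="0.5"/>
            <parameter id="Vmax" value="2.0"/>
          </listOfParameters>
        </kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def kinetic_parameters(sbml_text):
    """Return {reaction id: {parameter id: value}} for local kinetic-law parameters."""
    root = ET.fromstring(sbml_text)
    result = {}
    for reaction in root.iterfind(".//sbml:reaction", SBML_NS):
        params = {
            p.get("id"): float(p.get("value"))
            for p in reaction.iterfind(".//sbml:parameter", SBML_NS)
        }
        result[reaction.get("id")] = params
    return result


params = kinetic_parameters(EXAMPLE_SBML)
```

Here params maps each reaction id to its locally defined parameters, which is usually the first thing a modeller wants to inspect before wiring the values into a simulation.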


SBML Layout Viewer

"This site takes an SBML file and renders it according to the SBML Layout Extension, or the JDesigner annotations. Experimental support for CellDesigner model annotations (Annotation Version 2.5) has been added." 


Virtual Cell Published Models

"The National Resource for Cell Analysis and Modeling has developed the Virtual Cell Modeling and Simulation Framework version 4.0. This unique software platform has been designed to model cell biological processes. This new technology associates biochemical and electrophysiological data describing individual reactions with experimental microscopic image data describing their subcellular locations. Cell physiological events can then be simulated within the empirically derived geometries, thus facilitating the direct comparison of model predictions with experiment. The Virtual Cell consists of a biological and mathematical framework. Scientists can create biological models from which the software will generate the mathematical code needed to run simulations. Mathematicians may opt to use the math framework, based on the Virtual Cell Math Language, for creating their own mathematical descriptions. The simulations are run over the internet on 84 servers with 256 GHz total CPU power and 119 GB total RAM. Currently the storage capacity is 11.7 Tb."


Drug Databases

Chemie.DE Lexikon (chemistry encyclopedia)

"The Chemie.DE encyclopedia offers articles on 37,451 keywords from chemistry, pharmacy and materials science as well as related scientific disciplines." 


DrugBank (University of Alberta)

"The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains nearly 4300 drug entries including >1,000 FDA-approved small molecule drugs, 113 FDA-approved biotech (protein/peptide) drugs, 62 nutraceuticals and >3,000 experimental drugs. Additionally, more than 6,000 protein (i.e. drug target) sequences are linked to these drug entries. Each DrugCard entry contains more than 80 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data."


" is the most popular, comprehensive and up-to-date source of drug information online. Providing free, accurate and independent advice on more than 24,000 prescription drugs, over-the-counter medicines and natural products." 


"PharmGKB curates information that establishes knowledge about the relationships among drugs, diseases and genes, including their variations and gene products. Our mission is to catalyze pharmacogenomics research."


Gene, Protein, Metabolite and Expression Databases 

Affymetrix Netaffx [free registration required]

"The NetAffx™ Analysis Center enables researchers to correlate their GeneChip® array results with array design and annotation information. This resource provides you with unprecedented access to array content information, including probe sequences and gene annotations. You can quickly search for genes and/or SNPs, compare and refine results, and export data into Excel-friendly formats."


cDNA Libraries from Tissues  (Genes in Tissues - Gene Expression Tools)


Human Metabolome Database

"The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. It is intended to be used for applications in metabolomics, clinical chemistry, biomarker discovery and general education. The database is designed to contain or link three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data. The database (version 2.0) contains over 6500 metabolite entries including both water-soluble and lipid soluble metabolites as well as metabolites that would be regarded as either abundant (> 1 uM) or relatively rare (< 1 nM). Additionally, approximately 1500 protein (and DNA) sequences are linked to these metabolite entries. Each MetaboCard entry contains more than 100 data fields with 2/3 of the information being devoted to chemical/clinical data and the other 1/3 devoted to enzymatic or biochemical data. Many data fields are hyperlinked to other databases (KEGG, PubChem, MetaCyc, ChEBI, PDB, Swiss-Prot, and GenBank) and a variety of structure and pathway viewing applets. The HMDB database supports extensive text, sequence, chemical structure and relational query searches. Two additional databases, DrugBank and FooDB are also part of the HMDB suite of databases. DrugBank contains equivalent information on ~1500 drugs while FooDB contains equivalent information on ~2000 food components and food additives. 

HMDB is supported by David Wishart, Departments of Computing Science & Biological Sciences, University of Alberta." 


Human Protein Atlas

"The human protein atlas shows expression and localization of proteins in a large variety of normal human tissues, cancer cells and cell lines with the aid of immunohistochemistry (IHC) images."


iHOP - Information Hyperlinked over Proteins 

A network of concurring genes and proteins extends through the scientific literature touching on phenotypes, pathologies and gene function. 
iHOP provides this network as a natural way of accessing millions of PubMed abstracts. By using genes and proteins as hyperlinks between sentences and abstracts, the information in PubMed can be converted into one navigable resource, bringing all advantages of the internet to scientific literature research.


NCBI Home Page

Resource Guide

"Established in 1988 as a national resource for molecular biology information, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information - all for the better understanding of molecular processes affecting human health and disease."

The NCBI Handbook



"SOURCE is a unification tool which dynamically collects and compiles data from many scientific databases, and thereby attempts to encapsulate the genetics and molecular biology of genes from the genomes of Homo sapiens, Mus musculus, Rattus norvegicus into easy to navigate GeneReports.
The mission of SOURCE is to provide a unique scientific resource that pools publicly available data commonly sought after for any clone, GenBank accession number, or gene. SOURCE is specifically designed to facilitate the analysis of large sets of data that biologists can now produce using genome-scale experimental approaches."


SRS@EBI (Integrated database retrieval system)


SwissProt Protein knowledgebase 

"UniProtKB/Swiss-Prot; a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases.

UniProtKB/TrEMBL; a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot."

