MicrobeDB.JP will integrate diverse, distributed microbial data sources into a quarry of information for microbiologists and beyond

Hideaki SUGAWARA
National Institute of Genetics, Japan
hsugawar@nig.ac.jp

 
Abstract:
The existence of microbes was postulated as early as the 6th century BCE, and they were first observed by Anton van Leeuwenhoek with his single-lens microscope in 1676 (1). Since then, microbiology has flourished and branched out along taxonomic, functional and applied lines. Over the centuries, a vast amount of diverse microbial data on morphology, physiology, biochemical properties and genetic properties has accumulated in each branch.
Since the first complete genome sequence of a microbe was published in 1995 (2), the number of microbial genome sequences has rocketed, particularly in the 21st century. As of May 15th, 2012, complete genomes from 3173 projects are registered in GOLD (3), along with 1970 samples from 334 metagenomics studies. We are becoming overwhelmed by the range and volume of sequences. As a result, MiGAP (Microbial Genome Annotation Pipeline) (4), which annotates sequences automatically, has become popular.
MiGAP uses a couple of prediction programs and several reference databases, and its annotations serve as a starting point for more detailed study. To fully understand microbes on the basis of their sequences, we need to mobilize all types of microbial data together with their metadata, such as environmental data. MicrobeDB.JP will virtually assemble and interconnect data from a number of heterogeneous data sources in a three-dimensional space whose axes are genes, species and environment. Users of MicrobeDB.JP will be able to quarry information from a very large body of data that has until now been hopelessly segregated.

References

  1. http://en.wikipedia.org/wiki/Microbiology
  2. Fleischmann, RD et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496-512 (1995)
  3. GOLD (Genomes Online Database)
  4. http://www.migap.org/