Systems Biology Markup Language (SBML)

The Systems Biology Markup Language (SBML) is a machine-readable language, based on XML, for representing models of biochemical reaction networks. SBML can represent metabolic networks, cell-signaling pathways, regulatory networks, and other kinds of systems studied in systems biology.

History

In 2000, with funding from the Japan Science and Technology Corporation (JST), Hiroaki Kitano and John C. Doyle assembled a small team of researchers to develop better software infrastructure for computational modeling in systems biology. Hamid Bolouri led the development team, which consisted of Andrew Finney, Herbert Sauro, and Michael Hucka. Their initial work focused on a system to allow a subset of existing simulation software packages to communicate. This subset consisted of DBSolve, E-Cell, Gepasi, Jarnac, StochSim and The Virtual Cell.

The groups developing these tools met on April 28-29, 2000 at the ERATO Software Platforms for Molecular Biology workshop, held at the California Institute of Technology. It became clear during the workshop that a common model representation format was needed to exchange models between software tools, and the attendees decided the format should be encoded in XML. The Caltech ERATO team developed a proposal for a format and circulated the draft definition to the meeting attendees in August 2000. This draft underwent extensive discussion over mailing lists and during the Second Workshop on Software Platforms for Systems Biology, held in Tokyo, Japan, in November 2000 as a satellite workshop of the ICSB 2000 conference. After further revisions and discussions, the Caltech team issued a specification for SBML Level 1, Version 1 in March 2001.

SBML Level 2 was conceived at the 5th Workshop on Software Platforms for Systems Biology, held in July 2002 at the University of Hertfordshire, UK. By this time, far more people were involved than the original group of SBML collaborators, and the continued evolution of SBML became a larger community effort, with many new tools having been enhanced to support SBML. The workshop participants in 2002 collectively decided to revise the form of SBML in Level 2. The first draft of the Level 2 Version 1 specification was released in August 2002, and the feature set was finalized in May 2003 at the 7th Workshop on Software Platforms for Systems Biology in Ft. Lauderdale, Florida.

The next iteration of SBML took two years, in part because software developers requested time to absorb and understand the larger and more complex SBML Level 2. The inevitable discovery of limitations and errors led to the development of SBML Level 2 Version 2, issued in September 2006. By this time, the team of SBML Editors (who reconcile proposals for changes and write a coherent final specification document) had changed and now consisted of Andrew Finney, Michael Hucka and Nicolas Le Novère.

2007 saw the addition of two more SBML Editors (Sarah Keating and Stefan Hoops) and the development of SBML Level 2 Version 3 after countless contributions by and discussions with the SBML community.

The language

Purposes

SBML has three main purposes:

  • enabling the use of multiple software tools without rewriting models for each tool;
  • enabling models to be shared and published in a form other researchers can use even in a different software environment;
  • ensuring the survival of models beyond the lifetime of the software used to create them.

SBML is not an attempt to define a universal language for quantitative models. SBML's purpose is to serve as a lingua franca—an exchange format used by different present-day software tools to communicate the essential aspects of a computational model.

Main capabilities

SBML can encode models consisting of biochemical entities (species) linked by reactions to form biochemical networks. An important principle is that models are decomposed into explicitly-labeled constituent elements, the set of which resembles a verbose rendition of chemical reaction equations; the representation deliberately does not cast the model directly into a set of differential equations or other specific interpretation of the model. This explicit, modeling-framework-agnostic decomposition makes it easier for a software tool to interpret the model and translate the SBML form into whatever internal form the tool actually uses.
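As a rough illustration of this decomposition, the sketch below parses a hypothetical toy model (one compartment, two species, one reaction) with Python's standard XML library and prints each reaction in chemical-equation form. The model content and the function name reaction_equations are assumptions made for the example, and stoichiometry is ignored to keep it short.

```python
import xml.etree.ElementTree as ET

# XML namespace used by SBML Level 2 Version 3 documents.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level2/version3"}

# Hypothetical toy model: species A is converted into species B.
TOY_MODEL = """<sbml xmlns="http://www.sbml.org/sbml/level2/version3" level="2" version="3">
  <model id="toy">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialAmount="10"/>
      <species id="B" compartment="cell" initialAmount="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false">
        <listOfReactants>
          <speciesReference species="A"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="B"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def reaction_equations(sbml_text):
    """Return {reaction id: 'reactants -> products'}; stoichiometry is ignored."""
    root = ET.fromstring(sbml_text)
    equations = {}
    for reaction in root.iterfind(".//sbml:reaction", SBML_NS):
        sides = []
        for list_name in ("listOfReactants", "listOfProducts"):
            refs = reaction.iterfind(
                f"sbml:{list_name}/sbml:speciesReference", SBML_NS
            )
            sides.append(" + ".join(ref.get("species") for ref in refs))
        # A reaction is treated as reversible unless it says otherwise.
        reversible = reaction.get("reversible", "true") == "true"
        arrow = "<->" if reversible else "->"
        equations[reaction.get("id")] = f"{sides[0]} {arrow} {sides[1]}"
    return equations


print(reaction_equations(TOY_MODEL))  # {'conversion': 'A -> B'}
```

Nothing in the document says how the reaction should be simulated; that choice is left to the tool that reads it.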

A software package can read an SBML model description and translate it into its own internal format for model analysis. For example, a package might provide the ability to simulate the model by constructing differential equations representing the network and then perform numerical time integration on the equations to explore the model's dynamic behavior. Or, alternatively, a package might construct a discrete stochastic representation of the model and use a Monte Carlo simulation method such as the Gillespie algorithm.
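The two interpretations can be sketched for a single reaction A -> B with mass-action kinetics. The rate constant k, the initial amount, the forward-Euler integrator and both function names are assumptions chosen for brevity; real tools generally use more robust integrators and handle many reactions at once.

```python
import math
import random


def simulate_ode(a0, k, t_end, dt=0.001):
    """Deterministic view: forward-Euler integration of dA/dt = -k*A, dB/dt = +k*A."""
    a, b = float(a0), 0.0
    for _ in range(int(round(t_end / dt))):
        flux = k * a  # mass-action rate of A -> B
        a -= flux * dt
        b += flux * dt
    return a, b


def simulate_gillespie(a0, k, t_end, rng):
    """Stochastic view: Gillespie direct method for the single reaction A -> B."""
    a, b, t = int(a0), 0, 0.0
    while a > 0:
        propensity = k * a
        t += rng.expovariate(propensity)  # waiting time until the next firing
        if t > t_end:
            break
        # With only one reaction there is no reaction-selection step.
        a -= 1
        b += 1
    return a, b


ode_a, ode_b = simulate_ode(a0=100, k=0.5, t_end=2.0)
ssa_a, ssa_b = simulate_gillespie(a0=100, k=0.5, t_end=2.0, rng=random.Random(42))
print(f"ODE:       A={ode_a:.2f}, B={ode_b:.2f}")
print(f"Gillespie: A={ssa_a}, B={ssa_b}")
print(f"Exact:     A={100 * math.exp(-0.5 * 2.0):.2f}")
```

Both functions start from the same reaction description; only the interpretation differs, which is the point of keeping the model format independent of any one modeling framework.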

SBML allows models of arbitrary complexity to be represented. Each type of component in a model is described using a specific type of data structure that organizes the relevant information. The data structures determine how the resulting model is encoded in XML.

Levels and versions

SBML is defined in levels: upward-compatible specifications that add features and expressive power. Software tools that do not need or cannot support the complexity of higher levels can go on using lower levels; tools that can read higher levels are assured of also being able to interpret models defined in the lower levels. Thus new levels do not supersede old levels. However, each level can have multiple versions; new versions of a level supersede old versions of that same level.
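A minimal sketch of how a tool might apply these rules: it reads the level and version attributes from the sbml root element and compares them with what the tool supports. The helper names and the inline one-line document are hypothetical, used only for illustration.

```python
import xml.etree.ElementTree as ET


def level_and_version(sbml_text):
    """Read the level and version attributes from the <sbml> root element."""
    root = ET.fromstring(sbml_text)
    return int(root.get("level")), int(root.get("version"))


def tool_can_read(tool_max_level, model_level):
    """A tool that reads a given level is assumed to read all lower levels too."""
    return model_level <= tool_max_level


def is_superseded(version, latest_version_of_level):
    """Within one level, a newer version supersedes the older ones."""
    return version < latest_version_of_level


# Hypothetical, otherwise empty Level 2 Version 3 document.
DOC = (
    '<sbml xmlns="http://www.sbml.org/sbml/level2/version3" '
    'level="2" version="3"><model id="m"/></sbml>'
)

level, version = level_and_version(DOC)
print(level, version)                    # 2 3
print(tool_can_read(1, level))           # False: a Level 1 tool cannot read Level 2
print(tool_can_read(2, 1))               # True: a Level 2 tool can read Level 1
```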

As of October 2007, two levels are defined: Level 1 and Level 2.

Open-source software infrastructure such as libSBML allows developers to support both Levels 1 and 2 in their software with a minimum amount of effort.

SBML Level 2 Version 3 Release 2 is the latest definition of SBML and was released September 26, 2007.

Structure

A model definition in Level 2 Version 3 consists of lists of one or more of the following components:

  • Function definition: A named mathematical function that may be used throughout the rest of a model.
  • Unit definition: A named definition of a new unit of measure, or a redefinition of an existing SBML default unit. Named units can be used in the expression of quantities in a model.
  • Compartment type: A type of location where reacting entities such as chemical substances may be located.
  • Species type: A type of entity that can participate in reactions. Examples of species types include ions such as Ca²⁺, molecules such as glucose or ATP, binding sites on a protein, and more.
  • Compartment: A well-stirred container of a particular type and finite size where species may be located. A model may contain multiple compartments of the same compartment type. Every species in a model must be located in a compartment.
  • Species: A pool of entities of the same species type located in a specific compartment.
  • Parameter: A quantity with a symbolic name. In SBML, the term parameter is used in a generic sense to refer to named quantities regardless of whether they are constants or variables in a model. SBML Level 2 Version 2 provides the ability to define parameters that are global to a model as well as parameters that are local to a single reaction.
  • Initial assignment: A mathematical expression used to determine the initial conditions of a model. This type of structure can only be used to define how the value of a variable can be calculated from other values and variables at the start of simulated time.
  • Rule: A mathematical expression used in combination with the differential equations constructed based on the set of reactions in a model. It can be used to define how a variable's value can be calculated from other variables, or used to define the rate of change of a variable. The set of rules in a model can be used with the reaction rate equations to determine the behavior of the model with respect to time. The set of rules constrains the model for the entire duration of simulated time.
  • Constraint: A mathematical expression that defines a constraint on the values of model variables. The constraint applies at all instants of simulated time. The set of constraints in a model should not be used to determine the behavior of the model with respect to time.
  • Reaction: A statement describing some transformation, transport or binding process that can change the amount of one or more species. For example, a reaction may describe how certain entities (reactants) are transformed into certain other entities (products). Reactions have associated kinetic rate expressions describing how quickly they take place.
  • Event: A statement describing an instantaneous, discontinuous change in a set of variables of any type (species concentration, compartment size or parameter value) when a triggering condition is satisfied.
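In the XML encoding, each of these component kinds is held in its own container element inside the model element (listOfSpecies, listOfReactions and so on). The sketch below counts the entries in each container for a hypothetical example document; the example content and the function name component_counts are assumptions, and the count simply takes every child element of a container.

```python
import xml.etree.ElementTree as ET

SBML_L2V3_NS = "http://www.sbml.org/sbml/level2/version3"

# Component kinds in the order listed above, with their container elements.
COMPONENT_LISTS = [
    ("function definitions", "listOfFunctionDefinitions"),
    ("unit definitions", "listOfUnitDefinitions"),
    ("compartment types", "listOfCompartmentTypes"),
    ("species types", "listOfSpeciesTypes"),
    ("compartments", "listOfCompartments"),
    ("species", "listOfSpecies"),
    ("parameters", "listOfParameters"),
    ("initial assignments", "listOfInitialAssignments"),
    ("rules", "listOfRules"),
    ("constraints", "listOfConstraints"),
    ("reactions", "listOfReactions"),
    ("events", "listOfEvents"),
]

# Hypothetical example document that uses only three of the twelve lists.
EXAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level2/version3" level="2" version="3">
  <model id="example">
    <listOfCompartments>
      <compartment id="cytosol" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glucose" compartment="cytosol" initialAmount="5"/>
      <species id="ATP" compartment="cytosol" initialAmount="2"/>
    </listOfSpecies>
    <listOfParameters>
      <parameter id="k1" value="0.1"/>
    </listOfParameters>
  </model>
</sbml>"""


def component_counts(sbml_text):
    """Count the entries in each component list of the model (0 if the list is absent)."""
    model = ET.fromstring(sbml_text).find(f"{{{SBML_L2V3_NS}}}model")
    counts = {}
    for label, tag in COMPONENT_LISTS:
        container = model.find(f"{{{SBML_L2V3_NS}}}{tag}")
        counts[label] = 0 if container is None else len(container)
    return counts


for label, count in component_counts(EXAMPLE).items():
    print(f"{label}: {count}")
```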

Community

As of December 2007, more than 120 software systems advertise support for SBML. A current list can be found at sbml.org.

SBML has been and continues to be developed by the community of people making software platforms for systems biology, through active email discussion lists and biannual workshops, often held in conjunction with other biology conferences, especially the International Conference on Systems Biology.

Tools such as an on-line model validator and open-source libraries for incorporating SBML into C, C++, Mathematica, and MATLAB are developed partly by the SBML Team and partly by the broader SBML community.

SBML has an official IETF MIME media type, defined in RFC 3823 ("MIME Media Type for the Systems Biology Markup Language (SBML)").
