OBO Foundry

From Wikipedia, the free encyclopedia
OBO Foundry
Focus: Improvement of biomedical ontologies
Members: 27
Key people: Suzanna Lewis, Barry Smith, Michael Ashburner
Website: obofoundry.org

The Open Biological and Biomedical Ontologies (OBO) Foundry is a group of people who build and maintain ontologies related to the life sciences.[1] The OBO Foundry establishes a set of principles for ontology development for creating a suite of interoperable reference ontologies in the biomedical domain. Currently, there are more than a hundred ontologies that follow the OBO Foundry principles.

The OBO Foundry effort makes it easier to integrate biomedical results and carry out analysis in bioinformatics. It does so by offering a structured reference for the terms of different research fields and their interconnections (for example, a phenotype in a mouse model and its related phenotype in zebrafish).[2]

Introduction

The Foundry initiative aims at improving the integration of data in the life sciences. One approach to integration is the annotation of data from different sources using controlled vocabularies. Ideally, such controlled vocabularies take the form of ontologies, which support logical reasoning over the data annotated using the terms in the vocabulary.

The formalization of concepts in the biomedical domain is best known through the work of the Gene Ontology Consortium, a part of the OBO Foundry. That work has led to a set of proposed principles of good practice in ontology development, which are now being put into practice within the framework of the Open Biomedical Ontologies consortium through its OBO Foundry initiative. OBO ontologies form part of the resources of the National Center for Biomedical Ontology (NCBO), where they are a central component of the NCBO's BioPortal.

Open Biological and Biomedical Ontologies

The Open Biological and Biomedical Ontologies (OBO; formerly Open Biomedical Ontologies) is an effort to create ontologies (controlled vocabularies) for use across biological and medical domains. A subset of the original OBO ontologies started the OBO Foundry, which has led the OBO efforts since 2007.[1]

The creation of OBO in 2001 was largely inspired by the efforts of the Gene Ontology project.[3] OBO forms part of the resources of the U.S. National Center for Biomedical Ontology (NCBO) and is a central element of the NCBO's BioPortal. It is an initiative led by the OBO Foundry.

Rules for participation

The OBO Foundry is open to participation by any interested individual. Ontologies that intend to be officially part of the OBO Foundry have to adhere to the OBO principles and pass a series of reviews by the members, in which "the Foundry coordinators serve as analogs of journal editors".[1] Some ontologies follow OBO principles but are not officially part of OBO, such as eagle-i's Reagent Application Ontology[4] and the Animals in Context Ontology.[5]

Integrating OntoClean's theory of rigidity into OBO has been proposed as a step toward standardizing candidate ontologies. This integration would make it easier to develop software that checks candidates automatically.[6]

Tools

The OBO Foundry community is also dedicated to developing tools to facilitate creating and maintaining ontologies. Most ontology developers in OBO use the Protégé ontology editor and the Web Ontology Language (OWL) for building ontologies. To facilitate command line management of ontologies in a Protégé- and OWL-compatible format, the OBO Foundry has developed the tool ROBOT (ROBOT is an OBO Tool). ROBOT aggregates functions for routine tasks in ontology development, is open source, and can be used either via the command line or as a library for any language on the Java Virtual Machine.[7]

Another tool related to the OBO effort is OBO-Edit,[8] an ontology editor and reasoner funded by the Gene Ontology Consortium. There are also plugins for OBO-Edit that facilitate the development of ontologies, such as the semi-automatic ontology generator DOG4DAG.[9]

The OBO file format

The OBO file format is a biology-oriented language for building ontologies. It is based on the principles of Web Ontology Language (OWL).

As a community effort, standard common mappings have been created for lossless roundtrip transformations between the Open Biomedical Ontologies (OBO) format and OWL.[10][11] This work includes a methodical examination of each OBO construct and a layer cake for OBO, similar to the Semantic Web stack.[12]
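To give a sense of what the format looks like, the following minimal sketch reads term stanzas from a small OBO-style text. The sample ontology and its EX prefix are invented for illustration, and the parser is deliberately simplified: it assumes one tag-value pair per line and does not handle escaping, trailing modifiers, or stanza types other than [Term].

```python
from collections import defaultdict

# Hypothetical two-term ontology; the EX prefix and labels are made up.
SAMPLE_OBO = '''\
format-version: 1.2
ontology: example

[Term]
id: EX:0000001
name: example cell
def: "A cell used only for illustration." [EX:curator]
is_a: EX:0000002 ! example parent

[Term]
id: EX:0000002
name: example parent
'''


def parse_obo_terms(text):
    """Return {term id: {tag: [values]}} for every [Term] stanza in text."""
    terms = {}
    current = None  # tag-value pairs of the stanza being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza starts; only [Term] stanzas are collected here.
            current = defaultdict(list) if line == "[Term]" else None
            continue
        if current is None or ": " not in line:
            continue  # header line, blank line, or a stanza we skip
        tag, value = line.split(": ", 1)
        # Drop a trailing "! comment", as written after is_a targets.
        value = value.split(" ! ", 1)[0].strip()
        current[tag].append(value)
        if tag == "id":
            terms[value] = current
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
parents_of_first = terms["EX:0000001"]["is_a"]
```

Each stanza becomes a mapping from tag to a list of values, because tags such as is_a may appear more than once in a term.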

OBO Foundry Ontologies

The initial set of OBO Foundry ontologies was composed of mature ontologies (such as the Gene Ontology, GO, and the Foundational Model of Anatomy, FMAO), of mergers of previously existing ontologies (for example, the Cell Ontology,[13] CL, formed from different dedicated ontologies[14][15] and related parts of GO and FMAO), and of new ontologies developed according to its principles.[16]

The original set of ontologies also included the Zebrafish Anatomical Ontology[17] (a part of the Zebrafish Information Network), the ChEBI ontology, the Disease Ontology, the Plant Ontology, the Sequence Ontology, the Ontology for Biomedical Investigations and the Protein Ontology.[16]

The number of ontologies in OBO has grown to the order of hundreds, and they are gathered in the list of OBO Foundry ontologies.

OBO Foundry and Wikidata

A number of OBO Foundry ontologies have also been integrated into the Wikidata knowledge graph.[18][19] This has linked OBO structured ontologies to data from other, non-OBO databases. For example, the integration of the Human Disease Ontology[20] into Wikidata has enabled links to the descriptions of cell lines from the resource Cellosaurus.[21] One of the goals of integrating OBO Foundry ontologies into Wikidata has been to lower the barriers for non-ontologists to contribute to and use ontologies. Wikidata is arguably easier to understand and use than traditional ontology models, which require a high degree of specific expertise.[22]

Principles

Summary of OBO Foundry Principles[23] for development of an OBO-compatible life sciences ontology:

Openness

The ontologies are openly available and have to be released either under the CC-BY 3.0 license or into the public domain (CC0).[24] The openness of the ontologies has enabled, for example, the import of terms from the Gene Ontology (one of the ontologies that follow OBO principles) into the Wikidata project.[25]

Common format

The ontologies have to be available in a common formal language. In practice, this means that ontologies that are part of the OBO Foundry need to describe items using the formats OWL/OWL2 or OBO, with an RDF/XML syntax, to maximize interoperability.[26]

Orthogonality

Figure: Mapping from OBO IDs to OBO Uniform Resource Identifiers (URIs), unique for each item.[10]

Terms should be unique in the OBO space, meaning that each item has a unique ontology prefix (such as CHEBI, GO, PRO) and a local numeric identifier within the ontology.[27] Numeric IDs were chosen to make the resources easier to maintain and evolve.[28] To participate in the OBO Foundry, ontologies have to be orthogonal and the concepts they model must be unique within OBO, so that each concept has a single Uniform Resource Identifier (URI). New ontologies must therefore reuse work done in other efforts.[28]

Despite the ideal of unique terms and interoperability, this is difficult to enforce in practice, and term duplication occurs. Furthermore, some ontologies do not reuse terms, or reuse them inappropriately.[29]
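The relationship between a prefixed identifier and its URI can be sketched in a few lines. The example below assumes the common OBO Library convention in which an identifier such as GO:0008150 corresponds to a URI under http://purl.obolibrary.org/obo/ with the colon replaced by an underscore. The function names are hypothetical, and the pattern is a simplification that accepts only numeric local identifiers.

```python
import re

# Base of OBO Library permanent URLs (assumed convention for this sketch).
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"

# Prefix, colon, numeric local identifier, e.g. "GO:0008150".
_OBO_ID = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):(\d+)$")


def obo_id_to_uri(obo_id):
    """Turn a prefixed identifier such as 'GO:0008150' into its URI."""
    match = _OBO_ID.match(obo_id)
    if match is None:
        raise ValueError(f"not a prefixed numeric OBO id: {obo_id!r}")
    prefix, local = match.groups()
    return f"{OBO_PURL_BASE}{prefix}_{local}"


def uri_to_obo_id(uri):
    """Inverse of obo_id_to_uri for URIs under OBO_PURL_BASE."""
    if not uri.startswith(OBO_PURL_BASE):
        raise ValueError(f"not an OBO Library URI: {uri!r}")
    # Split on the last underscore so prefixes containing "_" survive.
    prefix, sep, local = uri[len(OBO_PURL_BASE):].rpartition("_")
    if not sep or not prefix or not local.isdigit():
        raise ValueError(f"unexpected URI shape: {uri!r}")
    return f"{prefix}:{local}"


uri = obo_id_to_uri("GO:0008150")
round_trip = uri_to_obo_id(uri)
```

Identifiers are handled as strings throughout so that leading zeros in the local part are preserved.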

Versioning

Ontologies evolve over time, refining concepts and descriptions according to advances in the knowledge of their specific domains.[30] To ensure that new versions can be released while tools that use older versions of the ontologies still function, OBO enforces a versioning system in which each ontology version receives a unique identifier, in the form of either a date or a version number, together with metadata tags.[31]
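As a rough illustration of the two identifier styles, the sketch below classifies a version string as date-based or number-based. The function name and the accepted patterns (ISO 8601 calendar dates and dot-separated integers) are assumptions made for this example, not rules quoted from the OBO Foundry.

```python
import re
from datetime import date


def classify_version(version):
    """Return 'date', 'number', or None for an ontology version string."""
    try:
        # Accepts ISO 8601 calendar dates such as "2020-02-06".
        date.fromisoformat(version)
        return "date"
    except ValueError:
        pass
    # Dot-separated integers such as "1.4.2" count as a numbering scheme.
    if re.fullmatch(r"\d+(\.\d+)*", version):
        return "number"
    return None


kinds = [classify_version(v) for v in ("2020-02-06", "1.4.2", "latest")]
```

A check like this could run before a release is published, so that a version label such as "latest", which identifies no particular release, is rejected early.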

Scope

Each ontology should have a clearly specified scope (the domain it intends to cover).[32]

Have textual definitions

Each item in an ontology should have a human-readable textual definition. Besides its alphanumeric identifier, each item should be described in natural language by logical statements that follow Aristotelian logic, in a way that is unique within the ontology.[33]

Standardized relations and the Relation Ontology (RO)

The ontologies should use relations between items from the Relation Ontology (RO). This ensures that different ontologies can be integrated seamlessly, which is especially important for logical inference.[34]

The Relation Ontology (RO) is an ontology designed to represent the relationships between different biomedical concepts.[35] It rigorously defines relations such as "part_of", "located_in" and "preceded_by", which are reused by many OBO Foundry ontologies.
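A small example shows why shared, rigorously defined relations matter for inference. The sketch below treats "part_of" as a transitive relation and derives the statements that follow from a handful of asserted ones. The anatomical statements are toy data chosen for illustration, plain labels stand in for real term identifiers, and the function is a naive closure computation rather than what an ontology reasoner actually does.

```python
def transitive_closure(pairs):
    """Return all (part, whole) pairs implied by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loops.
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Toy part_of assertions (illustrative only, not taken from an ontology).
asserted = {
    ("nucleus", "cell"),
    ("cell", "tissue"),
    ("tissue", "organ"),
}

# Statements that were never asserted but follow from transitivity.
inferred = transitive_closure(asserted) - asserted
```

If two ontologies used differently defined "part_of" relations, chaining statements across them in this way would not be justified.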

Documentation

OBO ontologies need to be thoroughly documented. Frequently this is done via a GitHub repository for each specific ontology (see List of OBO Foundry ontologies).[36]

Plurality of users

The ontologies should be useful for multiple different people, and ontology developers should document the evidence of use. This criterion is important for the review process. Examples of use include linking to terms by other ontologies, use in semantic web projects, use in annotations or other research applications.[37]

Openness to collaborations

The ontologies should be developed in a way that allows collaborations with other OBO Foundry members.[38]

Locus of authority

The ontologies should have one person responsible for the ontology who mediates interaction with the community.[39]

Naming conventions

Naming conventions for OBO ontologies aim at making primary labels unambiguous and unique inside the ontology (and preferably inside OBO). Labels and synonyms should be written in English, avoiding the use of underscores and camel case.[40] OBO lacks a mechanism for multilingual support, in contrast to Wikidata, which allows labels in different languages. The naming system in OBO is based on a series of surveys aimed at cataloguing the naming conventions of current ontologies and at discovering issues related to these conventions.[41]
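The two formatting rules mentioned above, no underscores and no camel case, lend themselves to a simple automated check. The following sketch is a heuristic written for this example, with a hypothetical function name. It flags any lowercase letter directly followed by an uppercase one, so it would also flag legitimate labels such as "mRNA" and cannot replace human review.

```python
import re


def label_problems(label):
    """Return a list of naming issues found in a primary label."""
    problems = []
    if "_" in label:
        problems.append("contains underscore")
    # A lowercase letter directly followed by an uppercase one,
    # as in "cellNucleus", is taken as a sign of camel case.
    if re.search(r"[a-z][A-Z]", label):
        problems.append("looks like camel case")
    return problems


report = {
    label: label_problems(label)
    for label in ("cell nucleus", "cell_nucleus", "cellNucleus")
}
```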

Maintenance

The ontologies should be updated to reflect changes in scientific consensus. The OBO Foundry defines scientific consensus as "multiple publications by independent labs over a year come to the same conclusion, and there is no or limited (<10%) dissenting opinions published in the same time frame."[42]

References

  1. ^ a b c Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, et al. (November 2007). "The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration". Nature Biotechnology. 25 (11): 1251–5. doi:10.1038/nbt1346. PMC 2814061. PMID 17989687.
  2. ^ Mungall, Christopher J; Gkoutos, Georgios V; Smith, Cynthia L; Haendel, Melissa A; Lewis, Suzanna E; Ashburner, Michael (2010). "Integrating phenotype ontologies across multiple species". Genome Biology. 11 (1): R2. doi:10.1186/gb-2010-11-1-r2. ISSN 1465-6906. PMC 2847714. PMID 20064205.
  3. ^ Shimoyama, Mary; Dwinell, Melinda; Jacob, Howard (2009-08-05). "Multiple Ontologies for Integrating Complex Phenotype Datasets". Nature Precedings. doi:10.1038/npre.2009.3554. ISSN 1756-0357.
  4. ^ Brush MH, Vasilevsky N, Torniai C, Johnson T, Shaffer C, Haendel M (2011). "Developing a reagent application ontology within the OBO foundry framework". CEUR Workshop Proceedings. 833: 234–236.
  5. ^ Santamaria SL (2012). Development the Animals in Context ontology (PDF). Proceedings of the International Conference on Biomedical Ontology. Graz.
  6. ^ Seyed, Patrice, and Stuart C. Shapiro. (2011). Applying Rigidity to Standardizing OBO Foundry Candidate Ontologies (PDF). Proceedings of the International Conference on Biomedical Ontology (CEUR 993).
  7. ^ Jackson RC, Balhoff JP, Douglass E, Harris NL, Mungall CJ, Overton JA (July 2019). "ROBOT: A Tool for Automating Ontology Workflows". BMC Bioinformatics. 20 (1): 407. doi:10.1186/s12859-019-3002-3. PMC 6664714. PMID 31357927.
  8. ^ Day-Richter J, Harris MA, Haendel M, Lewis S (August 2007). "OBO-Edit--an ontology editor for biologists". Bioinformatics. 23 (16): 2198–200. doi:10.1093/bioinformatics/btm112. PMID 17545183.
  9. ^ Wächter T, Schroeder M (June 2010). "Semi-automated ontology generation within OBO-Edit". Bioinformatics. 26 (12): i88-96. doi:10.1093/bioinformatics/btq188. PMC 2881373. PMID 20529942.
  10. ^ a b Tirmizi, Syed; Aitken, Stuart; Moreira, Dilvan A; Mungall, Chris; Sequeda, Juan; Shah, Nigam H; Miranker, Daniel P (2011). "Mapping between the OBO and OWL ontology languages". Journal of Biomedical Semantics. 2 (Suppl 1): S3. doi:10.1186/2041-1480-2-s1-s3. ISSN 2041-1480. PMC 3105495. PMID 21388572.
  11. ^ Golbreich, Christine; Horridge, Matthew; Horrocks, Ian; Motik, Boris; Shearer, Rob (2007), "OBO and OWL: Leveraging Semantic Web Technologies for the Life Sciences", The Semantic Web, Lecture Notes in Computer Science, vol. 4825, Springer Berlin Heidelberg, pp. 169–182, Bibcode:2007LNCS.4825..169G, doi:10.1007/978-3-540-76298-0_13, ISBN 978-3-540-76297-3
  12. ^ Antezana, E.; Egana, M.; De Baets, B.; Kuiper, M.; Mironov, V. (2008). "ONTO-PERL: An API for supporting the development and analysis of bio-ontologies". Bioinformatics. 24 (6): 885–887. doi:10.1093/bioinformatics/btn042. PMID 18245124.
  13. ^ Diehl, Alexander D.; Meehan, Terrence F.; Bradford, Yvonne M.; Brush, Matthew H.; Dahdul, Wasila M.; Dougall, David S.; He, Yongqun; Osumi-Sutherland, David; Ruttenberg, Alan; Sarntivijai, Sirarat; Van Slyke, Ceri E. (2016-07-04). "The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability". Journal of Biomedical Semantics. 7 (1): 44. doi:10.1186/s13326-016-0088-7. ISSN 2041-1480. PMC 4932724. PMID 27377652.
  14. ^ Bard, Jonathan; Rhee, Seung Y.; Ashburner, Michael (2005-01-14). "An ontology for cell types". Genome Biology. 6 (2): R21. doi:10.1186/gb-2005-6-2-r21. ISSN 1474-760X. PMC 551541. PMID 15693950.
  15. ^ Kelso, J. (2003-05-12). "eVOC: A Controlled Vocabulary for Unifying Gene Expression Data". Genome Research. 13 (6): 1222–1230. doi:10.1101/gr.985203. ISSN 1088-9051. PMC 403650. PMID 12799354.
  16. ^ a b Smith, Barry; Ashburner, Michael; Rosse, Cornelius; Bard, Jonathan; Bug, William; Ceusters, Werner; Goldberg, Louis J; Eilbeck, Karen; Ireland, Amelia; Mungall, Christopher J; Leontis, Neocles (November 2007). "The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration". Nature Biotechnology. 25 (11): 1251–1255. doi:10.1038/nbt1346. ISSN 1087-0156. PMC 2814061. PMID 17989687.
  17. ^ Van Slyke, Ceri E.; Bradford, Yvonne M.; Westerfield, Monte; Haendel, Melissa A. (2014-02-25). "The zebrafish anatomy and stage ontologies: representing the anatomy and development of Danio rerio". Journal of Biomedical Semantics. 5 (1): 12. doi:10.1186/2041-1480-5-12. ISSN 2041-1480. PMC 3944782. PMID 24568621.
  18. ^ Waagmeester, Andra; Stupp, Gregory; Burgstaller-Muehlbacher, Sebastian; Good, Benjamin M; Griffith, Malachi; Griffith, Obi L; Hanspers, Kristina; Hermjakob, Henning; Hudson, Toby S; Hybiske, Kevin; Keating, Sarah M (2020-03-17). Rodgers, Peter; Mungall, Chris (eds.). "Wikidata as a knowledge graph for the life sciences". eLife. 9: e52614. doi:10.7554/eLife.52614. ISSN 2050-084X. PMC 7077981. PMID 32180547.
  19. ^ Turki, Houcemeddine; Shafee, Thomas; Hadj Taieb, Mohamed Ali; Ben Aouicha, Mohamed; Vrandečić, Denny; Das, Diptanshu; Hamdi, Helmi (2019-11-01). "Wikidata: A large-scale collaborative ontological medical database". Journal of Biomedical Informatics. 99: 103292. doi:10.1016/j.jbi.2019.103292. ISSN 1532-0464. PMID 31557529.
  20. ^ Schriml, Lynn M.; Mitraka, Elvira; Munro, James; Tauber, Becky; Schor, Mike; Nickle, Lance; Felix, Victor; Jeng, Linda; Bearer, Cynthia; Lichenstein, Richard; Bisordi, Katharine (2019-01-08). "Human Disease Ontology 2018 update: classification, content and workflow expansion". Nucleic Acids Research. 47 (D1): D955–D962. doi:10.1093/nar/gky1032. ISSN 0305-1048. PMC 6323977. PMID 30407550.
  21. ^ "HeLa". www.wikidata.org. Retrieved 2020-05-04.
  22. ^ Jacobsen, Annika; Waagmeester, Andra; Kaliyaperumal, Rajaram; Stupp, Gregory S.; M. Schriml, Lynn; Thompson, Mark; I. Su, Andrew; Roos, Marco (2018-12-04). "Wikidata as an intuitive resource towards semantic data modeling in data FAIRification". Figshare. doi:10.6084/m9.figshare.7415282.v2.
  23. ^ "Overview". obofoundry.org. Retrieved 2020-02-06.
  24. ^ "Open (principle 1)". obofoundry.org. Retrieved 2020-02-06.
  25. ^ Burgstaller-Muehlbacher S, Waagmeester A, Mitraka E, Turner J, Putman T, Leong J, et al. (2016-01-01). "Wikidata as a semantic framework for the Gene Wiki initiative". Database. 2016: baw015. doi:10.1093/database/baw015. PMC 4795929. PMID 26989148.
  26. ^ "Common Format (principle 2)". obofoundry.org. Retrieved 2020-02-06.
  27. ^ "URI/Identifier Space (principle 3)". obofoundry.org. Retrieved 2020-02-06.
  28. ^ a b Courtot M, Mungall C, Brinkman RR, Ruttenberg A (2010). Building the OBO Foundry-One Policy at a Time. CEURS Proceedings: International Conference on Biomedical Ontologies.
  29. ^ Ghazvinian A, Noy NF, Musen MA (May 2011). "How orthogonal are the OBO Foundry ontologies?". Journal of Biomedical Semantics. 2 (Suppl 2): S2. doi:10.1186/2041-1480-2-s2-s2. PMC 3102891. PMID 21624157.
  30. ^ Groß, Anika; Pruski, Cédric; Rahm, Erhard (2016). "Evolution of biomedical ontologies and mappings: Overview of recent approaches". Computational and Structural Biotechnology Journal. 14: 333–340. doi:10.1016/j.csbj.2016.08.002. ISSN 2001-0370. PMC 5018063. PMID 27642503.
  31. ^ "Versioning (principle 4)". obofoundry.org. Retrieved 2020-02-06.
  32. ^ "Scope (principle 5)". obofoundry.org. Retrieved 2020-02-06.
  33. ^ "Textual Definitions (principle 6)". obofoundry.org. Retrieved 2020-02-06.
  34. ^ "Relations (principle 7)". obofoundry.org. Retrieved 2020-02-06.
  35. ^ Smith, Barry; Ceusters, Werner; Klagges, Bert; Köhler, Jacob; Kumar, Anand; Lomax, Jane; Mungall, Chris; Neuhaus, Fabian; Rector, Alan L; Rosse, Cornelius (2005). "Relations in biomedical ontologies". Genome Biology. 6 (5): R46. doi:10.1186/gb-2005-6-5-r46. PMC 1175958. PMID 15892874.
  36. ^ "Documentation (principle 8)". obofoundry.org. Retrieved 2020-02-06.
  37. ^ "Documented Plurality of Users (principle 9)". obofoundry.org. Retrieved 2020-02-06.
  38. ^ "Commitment To Collaboration (principle 10)". obofoundry.org. Retrieved 2020-02-06.
  39. ^ "Locus of Authority (principle 11)". obofoundry.org. Retrieved 2020-02-06.
  40. ^ "Naming Conventions (principle 12)". obofoundry.org. Retrieved 2020-02-06.
  41. ^ Schober, Daniel; Smith, Barry; Lewis, Suzanna E; Kusnierczyk, Waclaw; Lomax, Jane; Mungall, Chris; Taylor, Chris F; Rocca-Serra, Philippe; Sansone, Susanna-Assunta (2009). "Survey-based naming conventions for use in OBO Foundry ontology development". BMC Bioinformatics. 10 (1): 125. doi:10.1186/1471-2105-10-125. ISSN 1471-2105. PMC 2684543. PMID 19397794.
  42. ^ "Maintenance (principle 16)". obofoundry.org. Retrieved 2020-02-06.