Critical Assessment of Information Extraction in Biology - data sets are available from Resources/Corpora and require registration.

BioCreative V

Track 4 - BEL Task [2015-01-23]

Track 4: Extraction of causal network information in Biological Expression Language (BEL)

The most detailed and up-to-date description of Track 4 can be found on the OpenBEL wiki at http://tinyurl.com/beltask.
The rest of this page contains only preliminary information.

Overview

Biological networks with a structured syntax are a powerful way of representing biological information and knowledge. Well-known examples of methods to formally represent biological networks are the Systems Biology Markup Language (SBML, Hucka et al., 2003) and the Biological Expression Language (BEL, www.openbel.org). Both are designed not only to represent biological events but also to support downstream computational applications. In particular, BEL is gaining ground as the de facto standard for systems biology applications, because it combines the power of a formalized representation language with a relatively simple syntax that allows a trained domain expert to interpret BEL statements easily.

As part of an ongoing product assessment program, the sbvIMPROVER initiative is supporting the manual curation and expansion of biological networks related to human lung disease [1-6]. To verify these networks, sbvIMPROVER organized a large-scale crowdsourcing effort called the Network Verification Challenge (NVC) [7]. This initiative aims to provide a measure of quality control for systems-based research, supporting the verification of methods and concepts in this domain. The NVC supports community-based verification and extension of biological relationships based on peer-reviewed literature evidence. At present, 50 biological networks have been curated, resulting in a total of more than 180,000 relationships, all available in BEL format, with supporting evidence in the form of a sentence or section and a PubMed identifier.

Using the data provided and validated through the sbvIMPROVER NVC, we invite members of the academic text mining community and providers of text mining solutions to develop and test novel approaches, with the aim of evaluating text mining for relation extraction and automated construction of network elements. The goal is to assess the utility of such tools either for automated annotation and network expansion or as supporting tools for assisted curation.

The challenge is organized into two tasks that evaluate complementary aspects of the problem:

  1. given selected textual evidence, construct the corresponding network fragment
  2. given a network fragment, detect all available textual evidence

BEL documentation

Extensive documentation about the syntax and usage of the BEL language will be provided to participants. An introductory explanation can be found at the OpenBEL website and at the BEL wiki.

Task 1

Short description: Given textual evidence for a BEL statement, generate the corresponding BEL statement.

Training data: A significant number of relationships systematically selected from the curated networks, with their evidence and the full BEL statement.

Test data: A smaller number of relationships from the same dataset. We provide only the evidence and the participants have to generate the BEL statement.

Evaluation: Fully automated. We will use precision, recall and F-score (P/R/F), comparing the BEL statements generated by the participants with the corresponding human-generated BEL statements. Several ranking metrics will be provided.
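As a rough illustration of P/R/F scoring, the following minimal Python sketch compares a set of predicted BEL statements with a gold set. It assumes exact string matching after whitespace normalization, which is a simplification and not necessarily how the official evaluation matches statements. The function names and the example statements (with placeholder gene names) are hypothetical.

```python
def normalize(statement: str) -> str:
    """Collapse runs of whitespace so trivially different spellings compare equal."""
    return " ".join(statement.split())


def precision_recall_f1(predicted, gold):
    """Return (precision, recall, F1) for two collections of BEL statements.

    Assumption: a prediction counts as correct only if it matches a gold
    statement exactly after whitespace normalization.
    """
    pred = {normalize(s) for s in predicted}
    ref = {normalize(s) for s in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical statements with placeholder gene names, for illustration only.
gold = [
    "p(HGNC:GENE_A) increases p(HGNC:GENE_B)",
    "p(HGNC:GENE_C) decreases p(HGNC:GENE_D)",
]
predicted = [
    "p(HGNC:GENE_A)  increases p(HGNC:GENE_B)",  # matches after normalization
    "p(HGNC:GENE_C) increases p(HGNC:GENE_D)",   # wrong relationship
]
p, r, f = precision_recall_f1(predicted, gold)
```

Here one of the two predictions matches one of the two gold statements, so precision, recall and F1 all come out at 0.5.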

Task 2

Short description: Given a BEL statement, provide at most 10 additional evidence sentences.

Training data: Same data as for Task 1.

Test data: BEL statements WITHOUT evidence. The participants have to provide at most 10 sentences (ranked by confidence) with different PMIDs that offer evidence for each BEL statement.
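The constraints stated above (at most 10 sentences, ranked by confidence, each from a different PMID) could be enforced on a system's candidate list roughly as follows. This is only a sketch: the `Evidence` record, its field names and `prepare_submission` are hypothetical and do not reflect any official submission format.

```python
from typing import NamedTuple


class Evidence(NamedTuple):
    pmid: str          # PubMed identifier of the source article
    sentence: str      # candidate evidence sentence
    confidence: float  # system confidence, higher means more confident


MAX_SENTENCES = 10  # limit stated in the task description


def prepare_submission(candidates):
    """Rank by confidence, keep one sentence per PMID, keep at most 10."""
    ranked = sorted(candidates, key=lambda e: e.confidence, reverse=True)
    seen_pmids = set()
    selected = []
    for item in ranked:
        if item.pmid in seen_pmids:
            continue  # lower-ranked sentence from an article already used
        seen_pmids.add(item.pmid)
        selected.append(item)
        if len(selected) == MAX_SENTENCES:
            break
    return selected
```

Keeping only the highest-confidence sentence per article is one possible way to satisfy the "different PMIDs" requirement; a system could equally choose among same-article sentences by another criterion.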

Evaluation: Manual. A team of experts will analyze all evidence statements provided by the participants and classify them as correct or incorrect. We will then score each participant's contribution using a ranking metric, such as TAP-k.
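To give a feel for how a ranked list of judged sentences can be turned into a single score, the sketch below computes plain average precision over the ranked judgements. This is a simpler, related ranking metric and not TAP-k itself, which additionally involves a score threshold. The function name is hypothetical, and the normalization by the number of correct sentences returned (rather than all correct sentences that exist) is an assumption made because the latter is not known here.

```python
def average_precision(judgements):
    """Average precision for one BEL statement.

    judgements: list of booleans in rank order (best first),
    True meaning the expert judged the sentence as correct.
    """
    hits = 0
    precision_sum = 0.0
    for rank, correct in enumerate(judgements, start=1):
        if correct:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: normalize by the number of correct sentences returned.
    return precision_sum / hits if hits else 0.0


# Hypothetical judgements: ranks 1 and 3 correct, rank 2 incorrect.
score = average_precision([True, False, True])
```

With these judgements the precision is 1/1 at rank 1 and 2/3 at rank 3, so the score is their mean, 5/6. Correct sentences placed higher in the ranking raise the score, which is the property any such ranking metric rewards.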

Dates

Please note that the dates are indicative only and subject to change.
Release sample data: February 15
Release training data: February 28
Release test data: June 14
Submission of results deadline: June 16
Delivery of evaluation results: July 10
Submission of the papers: July 20
Provide feedback on the papers: August 1
Camera-ready: August 15
Workshop: September 9-11

Task organizing committee:

Dr. Fabio Rinaldi (OntoGene, Switzerland)
Dr. Juliane Fluck (Fraunhofer Institute, Germany)
Dr. Sam Ansari (sbvIMPROVER, Switzerland)
Dr. Julia Hoeng (sbvIMPROVER, Switzerland)
Prof. Dr. Martin Hofmann-Apitius (OpenBEL Consortium, Germany)

References:

1. De Leon, H., Boue, S., Schlage, W.K., Boukharov, N., Westra, J.W., Gebel, S., VanHooser, A., Talikka, M., Fields, R.B., Veljkovic, E. et al. (2014) A vascular biology network model focused on inflammatory processes to investigate atherogenesis and plaque instability. Journal of Translational Medicine, 12, 185.

2. Demir, E. et al. (2010) The BioPAX community standard for pathway data sharing. Nature Biotechnology, 28, 935-942.

3. Gebel, S., Lichtner, R.B., Frushour, B., Schlage, W.K., Hoang, V., Talikka, M., Hengstermann, A., Mathis, C., Veljkovic, E., Peck, M. et al. (2013) Construction of a Computable Network Model for DNA Damage, Autophagy, Cell Death, and Senescence. Bioinformatics and Biology Insights, 7, 97-117.

4. Park, J.S., Schlage, W.K., Frushour, B.P., Talikka, M., Toedter, G., Gebel, S., Deehan, R., Veljkovic, E., Westra, J.W., Peck, M.J., Boue, S., Kogel, U., Gonzalez-Suarez, I., Hengstermann, A., Peitsch, M.C., Hoeng, J. (2013) Construction of a Computable Network Model of Tissue Repair and Angiogenesis in the Lung. Journal of Clinical Toxicology, S12, 002.

5. Schlage, W.K., Westra, J.W., Gebel, S., Catlett, N.L., Mathis, C., Frushour, B.P., Hengstermann, A., Van Hooser, A., Poussin, C., Wong, B. et al. (2011) A computable cellular stress network model for non-diseased pulmonary and cardiovascular tissue. BMC systems biology, 5, 168.

6. Westra, J.W., Schlage, W.K., Frushour, B.P., Gebel, S., Catlett, N.L., Han, W., Eddy, S.F., Hengstermann, A., Matthews, A.L., Mathis, C. et al. (2011) Construction of a Computable Cell Proliferation Network Focused on Non-Diseased Lung Cells. BMC systems biology, 5, 105.

7. Westra, J.W., Schlage, W.K., Hengstermann, A., Gebel, S., Mathis, C., Thomson, T., Wong, B., Hoang, V., Veljkovic, E., Peck, M. et al. (2013) A Modular Cell-Type Focused Inflammatory Process Network Model for Non-Diseased Pulmonary Tissue. Bioinformatics and Biology Insights, 7, 167-192.

8. Manuscript accepted for the proceedings of the Pacific Symposium on Biocomputing, Hawaii, 2015. http://psb.stanford.edu/psb-online/proceedings/psb15/binder.pdf