Critical Assessment of Information Extraction in Biology - data sets are available from Resources/Corpora and require registration.

BC Workshop '12

Track II - Workflow [2011-09-09]

We are calling for curation teams to produce a document describing their curation process, from the selection of articles for curation (as journal articles or abstracts) through to the resulting database entries.

To help with this effort, we have put together two resources:

  • A sample workflow description and workflow diagram, provided by the Comparative Toxicogenomics Database group -- see Example of CTD workflow;
  • An outline suggesting some of the issues that would be useful to text mining developers who are seeking to produce algorithms and tools to assist the curation process (see below).

    OUTLINE for Description of Biocuration Workflow

    1. Introduction
      a. Overall philosophy and what information is captured.
      b. What use is being made of this information, or is envisioned for it?
      c. The current workflow of the operation.
      d. How are the links to biomedical literature captured?
    2. Encoding methods
      a. How is the information captured to make it machine readable?
      b. What entities are involved and how are they entered in the database?
      c. What relationships are involved and how are they symbolized?
      d. What standardized or controlled vocabularies are used?
      e. Give examples of a variety of data elements and how they appear in the database.
    3. Information access
      a. When a curator runs into a problem or a difficult case, what kind of information is needed to solve it?
      b. What kind of internet searching is used most often in difficult cases? A dictionary? Wikipedia? Another database?
    4. Use of text mining tools
      a. What text mining tools do you currently employ in your workflow, and what problems do these algorithms solve for you?
      b. What problems do you have that are not currently solved, but which you think could be amenable to a text mining solution (i.e., steps where text mining could overcome current bottlenecks in the existing pipeline)?

    Your description of your workflow and any available annotated corpora will be valuable for considering workflows and challenge tasks for the future BioCreative IV challenge.

    Document format
    The document should be at most 6 pages, including figures; Word or .rtf files are preferred.

    Important Dates

    Item                                Deadline           Submit via  Comment
    Team Registration                   November 15, 2011  web         register here
    Submission of Biocuration Workflow  December 31, 2011  email       to Subject:BioCreative-2012 Track II
