BioCreative - Track III-Interactive TM

Critical Assessment of Information Extraction in Biology - data sets are available from Resources/Corpora and require registration.

BC Workshop '12

Track III-Interactive TM [2011-09-19]

Track Coordinators

Cecilia Arighi
Martin Krallinger
Kevin Cohen
John Wilbur
Ben Carterette

Invitation to TM teams

Invitation to Biocurators

Participating Systems

Biocurator's Tasks for pre-workshop evaluation

Biocurator's Checklist

Important dates

Downloads

1- Invitation to Text Mining Teams

We invite text mining teams to submit system descriptions that highlight a text mining/NLP system with respect to a specific biocuration task. The description should be biocurator-centric: it should give clear examples of the system's input (such as PMID, gene, keyword) and output (list of relevant articles, compound recognition, PPI, etc.), and describe the context in which the system can be used effectively (e.g. the task is only applicable to articles about a given taxon group). The track is open to systems that process abstracts and/or full-length articles.

The description should be no longer than 6 pages including figures, and Word or RTF formats are preferred.

All submissions will be evaluated by the BioCreative Organizing Committee according to the following criteria:

Relevance and Impact: Is the system currently being used in a biocuration task/workflow?
Adaptability: Is it robust and adaptable to applications for other related biocuration tasks (i.e., can be utilized by multiple databases/resources)?
Interactivity: Does it provide an interactive web interface that biocurators can test?
Performance: Can the system be benchmarked and provide precision and recall for the task?

For system evaluation, the participating teams should:

Define a curation task according to the system capabilities
Provide a set of annotated examples as a practice test for curators prior to the evaluation (30-50 examples). The idea is to provide some examples of correct system output for the task, so curators learn what to expect from the system.
Suggest biocurator(s) who could annotate a literature corpus as the gold standard for evaluation before the workshop (approximately 50 documents)
Benchmark the system and submit appropriate metrics (precision/recall/F-measure/MAP) for the given task(s) by March 1, 2012. Note that here we request your own benchmarking, which may have already been published, to make sure that the systems have gone through some formal evaluation.
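As an illustration of the metrics requested above, the following minimal Python sketch computes set-based precision, recall and balanced F-measure, plus mean average precision (MAP) for ranked output. The function names and the use of PMID strings as item identifiers are assumptions made for this example only; teams should report whatever formulation matches their own task.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall and balanced F-measure (F1)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of a single ranked list against a relevant set."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP over several (ranked list, relevant set) pairs, e.g. one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical example data: identifiers are placeholders, not real results.
gold = {"PMID:1", "PMID:2", "PMID:5"}
predicted = {"PMID:1", "PMID:2", "PMID:3", "PMID:4"}
ranked = ["PMID:1", "PMID:4", "PMID:2", "PMID:3"]

p, r, f = precision_recall_f1(predicted, gold)
ap = average_precision(ranked, gold)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f} AP={ap:.3f}")
```

Precision, recall and F-measure suit extraction or classification output, while MAP suits systems that return ranked document lists, such as triage or retrieval tools.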

The list of selected teams will be posted with accompanying description on or about January 20, 2012. The BioCreative Organizing Committee along with the teams will identify and recruit biocurators that will participate in the evaluation of the system.

The evaluation will compare time-on-task for manual versus system-assisted curation, as well as precision and recall of the uncurated and curated sets against the gold standard (generated by the suggested biocurator, who is blind to the systems).
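A rough sketch of how such a comparison could be tabulated is shown below. It is not the organizers' evaluation procedure: the CurationRun record, its field names and the sample values are all hypothetical. Each condition carries its own gold standard because the manual and system-assisted document sets differ.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class CurationRun:
    """Results for one condition (hypothetical record layout)."""
    annotations: Set[str]   # annotations recorded by the curator
    gold: Set[str]          # gold-standard annotations for the same documents
    minutes_on_task: float  # total curation time for the set
    n_documents: int        # number of documents in the set


def summarize(run: CurationRun) -> Dict[str, float]:
    """Precision, recall and minutes per document for one condition."""
    true_positives = len(run.annotations & run.gold)
    precision = true_positives / len(run.annotations) if run.annotations else 0.0
    recall = true_positives / len(run.gold) if run.gold else 0.0
    per_doc = run.minutes_on_task / run.n_documents if run.n_documents else 0.0
    return {"precision": precision, "recall": recall, "minutes_per_document": per_doc}


def compare(manual: CurationRun, assisted: CurationRun) -> Dict[str, object]:
    """Side-by-side summary plus the manual/assisted time ratio (above 1 means faster with the system)."""
    m, a = summarize(manual), summarize(assisted)
    speedup = (m["minutes_per_document"] / a["minutes_per_document"]
               if a["minutes_per_document"] else float("inf"))
    return {"manual": m, "assisted": a, "speedup": speedup}


# Placeholder values for illustration only; not real evaluation data.
manual = CurationRun(
    annotations={"d1:a", "d2:b", "d3:c"},
    gold={"d1:a", "d2:b", "d3:c", "d4:d"},
    minutes_on_task=125.0,
    n_documents=25,
)
assisted = CurationRun(
    annotations={"d26:a", "d27:b", "d28:c", "d29:d", "d30:e"},
    gold={"d26:a", "d27:b", "d28:c", "d29:d"},
    minutes_on_task=75.0,
    n_documents=25,
)

report = compare(manual, assisted)
print(report)
```

Normalizing time by the number of documents keeps the two conditions comparable even if the manual and system-assisted sets are not exactly the same size.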

To assist teams with this activity, a sample document describing a system and a proposed task can be found at the end of this page under Downloads.

At the workshop

The selected systems will be presented on the second day of the workshop. A demo session, where the users (biocurators) will have the opportunity to use the systems, will follow.

Finally, the results and observations from the system evaluation will be presented.

Dissemination

Accepted descriptions will be published in the workshop proceedings. Documents should be submitted via EasyChair. Deadline: March 15.

Submit

Specifications

Systems should be web based and compatible with Mozilla Firefox 4.0 or higher.

2- Invitation to Biocurators

We invite biocurators to participate in a user study on the text mining system of their choice prior to and at the workshop. The user study will involve (i) manual and text-mining-assisted curation of the literature corpus by the biocurators; (ii) recording of the user interactions with the system (with logs of all queries and clicks); and (iii) a post-study survey.

3- Participating systems

System	Description	Article	Coordinator
TextPresso	Curation of subcellular localization using Gene Ontology cellular component	Full-text	Cecilia Arighi
PCS	Curation of Entity-Quality terms from phylogenetic literature using ontologies	N/A	Cecilia Arighi
Tagtog	Protein/gene mention recognition via an interactive learning and annotation framework	Abstract	John Wilbur
PubTator	Document triage (relevant documents for curation) and bioconcept annotation (gene, disease, chemicals)	Abstract	Kevin Cohen
PPIinterfinder	Mining of protein-protein interactions for human proteins (abstracts and full-length articles): document classification and extraction of interacting proteins and keywords.	Abstract	Martin Krallinger
eFIP	Mining protein interactions of phosphorylated proteins from the literature. Document classification and information extraction of phosphorylated proteins, protein binding partners and impact keywords	Abstract	Martin Krallinger
Acetylation	Document retrieval and ranking based on relevance on protein acetylation	Abstract	Cecilia Arighi
T-HOD	Document triage for disease-related genes (relevant documents for curation) and bioconcept annotation (gene, disease and relation)	Abstract	Cecilia Arighi

More details about the systems can be found here.

4- Biocurator's Tasks for Pre-workshop Evaluation

Prior to the workshop each biocurator will need to perform the following tasks:

Install curator logger to track time-on-task and curator web-based activities

Biocurators assigned to test a system should install the client-side web-browser add-on curatorlogger.xpi (download and instructions are available under Downloads at the end of this page), which tracks time and user activity during testing. When the browser opens, users will be informed of the nature of the data being collected and asked whether they want to opt out of data collection. The collected data will be sent automatically to one of the organizers (Ben Carterette) when a session is complete.

Get Trained: use the examples provided by the teams to familiarize yourself with the assigned system. Also make sure you obtain the curation guidelines for the particular task.

Perform Evaluation: curate a set of documents manually (approx. 25) and a set of documents using the selected system (approx. 25).

Manual task: the user will be given the list of documents in the PubMed environment for manual processing, and results should be stored in a spreadsheet in the format provided by the team.

Using the system: the curator will validate the output provided by the system and store the information using the system or a spreadsheet (output to be determined by each system).

Complete Survey: After evaluating the system, users should complete the survey, which asks additional questions that may help elucidate their experience with the system.
To complete the survey, please click here.

5- Biocurator's Checklist

1-Get documentation and link to tool from Coordinator or Team

2-Install curatorlogger (download available at the end of this page)

3-Use Mozilla Firefox 4.0 or higher for the activity

4-Remember to select the play button when starting a curation session and the stop button when you are done

5-Even when the logger is used, time your activity independently

6-Practice on system with examples provided by Teams

7-Record results of manual curation in requested format (should be provided by Coordinator or Team)

8-Record results of system curation in requested format (should be provided by Coordinator or Team)

9-Complete survey

6- Important Dates

Item	Deadline	Submit via	Comment
Team Registration	November 15, 2011	web	Closed
Text Mining System Description	December 31, 2011	email to arighi@dbi.udel.edu	Subject:BioCreative-2012 Track III
System Benchmarking Results	March 5, 2012	email to arighi@dbi.udel.edu	Subject:BioCreative-2012 Track III-result
Submission of Practice Test	March 5, 2012	email to Coordinator	Subject:BioCreative-2012 Track III-test
Interface Available for Testing	March 5, 2012	email to Coordinator	Subject:BioCreative-2012 Track III-interface
Biocurator's Evaluation Results	March 20-25, 2012	email to Coordinator	Subject:BioCreative-2012 Track III-biocurator
Workshop	Noon April 4 - 5pm April 5, 2012		Results from Track III testing will be presented