Track Coordinators
Cecilia Arighi
Martin Krallinger
Kevin Cohen
John Wilbur
Ben Carterette
1- Invitation to Text Mining Teams
We invite text mining teams to submit system descriptions that highlight a text mining/NLP system applied to a specific biocuration task. The description should be biocurator-centric, providing clear examples of system input (such as a PMID, gene, or keyword) and output (a list of relevant articles, compound recognition, protein-protein interactions, etc.), and describing the context in which the system can be used effectively (e.g., the task applies only to articles about a given taxon group). The track is open to systems that process abstracts and/or full-length articles.
The description should be no longer than 6 pages including figures; Word or RTF format is preferred.
All submissions will be evaluated by the BioCreative Organizing Committee according to the following criteria:
- Relevance and Impact: Is the system currently being used in a biocuration task/workflow?
- Adaptability: Is it robust and adaptable to applications for other related biocuration tasks (i.e., can be utilized by multiple databases/resources)?
- Interactivity: Does it provide an interactive web interface for testing by biocurators?
- Performance: Can the system be benchmarked and provide precision and recall for the task?
For system evaluation, the participating teams should:
- Define a curation task according to the system capabilities
- Provide a set of annotated examples as a practice test for curators prior to the evaluation (30-50 examples). The idea is to provide some examples of correct system output for the task, so curators learn what to expect from the system.
- Suggest biocurator(s) who could annotate literature corpus as the gold standard for evaluation before the workshop (approximately 50 documents)
- Benchmark the system and submit appropriate metrics (precision/recall/F-measure/MAP) for the given task(s) by March 1, 2012. Note that here we request your own benchmarking, which may have already been published, to ensure that the systems have gone through some formal evaluation.
The list of selected teams will be posted with accompanying description on or about January 20, 2012. The BioCreative Organizing Committee along with the teams will identify and recruit biocurators that will participate in the evaluation of the system.
The evaluation will compare time-on-task for manual vs. system-assisted activities, as well as precision and recall for the uncurated and curated sets against the gold standard (generated by the suggested biocurator, blind to the systems).
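The benchmarking arithmetic behind these metrics can be sketched as follows. This is a minimal illustration, not the track's evaluation script; the document IDs and the `precision_recall_f` helper are hypothetical.

```python
# Sketch of precision/recall/F-measure for a system-selected set of
# documents compared against a biocurator-generated gold standard.
# All identifiers below are hypothetical examples.

def precision_recall_f(system: set, gold: set):
    """Compare a set of system-selected items to the gold standard."""
    true_positives = len(system & gold)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Hypothetical PMIDs flagged by the system vs. the gold standard.
system_output = {"PMID1", "PMID2", "PMID3", "PMID4"}
gold_standard = {"PMID2", "PMID3", "PMID5"}

p, r, f = precision_recall_f(system_output, gold_standard)
print(f"precision={p:.2f} recall={r:.2f} f-measure={f:.2f}")
# -> precision=0.50 recall=0.67 f-measure=0.57
```

The same computation applies whether the "items" are retrieved documents, extracted interactions, or recognized entities, as long as a gold-standard set is available for comparison.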
To assist teams with this activity, a document with the description of a system and a proposed task is found at the end of this page under Downloads.
At the workshop
The selected systems will be presented on the second day of the workshop. A demo session, where the users (biocurators) will have the opportunity to use the systems, will follow.
Finally, the results and observations from the system evaluation will be presented.
Dissemination
Accepted descriptions will be published in the workshop proceedings. Submission of documents should be done via EasyChair. Deadline: March 15.
Specifications
Systems should be web based and compatible with Mozilla Firefox 4.0 or higher.
2- Invitation to Biocurators
We invite biocurators to participate in a user study on the text mining system of their choice prior to and at the workshop. The user study will involve (i) manual curation and text-mining of the literature corpus by the biocurators; (ii) recording of the user interactions with the system (with logs of all queries and clicks); and (iii) a post-study survey.
3- Participating systems
System | Description | Article | Coordinator |
---|---|---|---|
TextPresso | Curation of subcellular localization using Gene ontology cellular component | Full-text | Cecilia Arighi |
PCS | Curation of Entity-Quality terms from phylogenetic literature using ontologies | N/A | Cecilia Arighi |
Tagtog | Protein/gene mentions recognition via interactive learning and annotation framework | Abstract | John Wilbur |
PubTator | Document triage (relevant documents for curation) and bioconcept annotation (gene, disease, chemicals) | Abstract | Kevin Cohen |
PPIinterfinder | Mining of protein-protein interactions for human proteins (abstracts and full-length articles): document classification and extraction of interacting proteins and keywords | Abstract | Martin Krallinger |
eFIP | Mining protein interactions of phosphorylated proteins from the literature: document classification and information extraction of the phosphorylated protein, protein binding partners, and impact keywords | Abstract | Martin Krallinger |
Acetylation | Document retrieval and ranking based on relevance to protein acetylation | Abstract | Cecilia Arighi |
T-HOD | Document triage for disease-related genes (relevant documents for curation) and bioconcept annotation (gene, disease and relation) | Abstract | Cecilia Arighi |
More details about the systems can be found here.
4- Biocurator's Tasks for Pre-workshop Evaluation
Prior to the workshop each biocurator will need to perform the following tasks:
Biocurators assigned to test a system should install a client-side web-browser add-on, curatorlogger.xpi (download and instructions available in Downloads at the end of this page), that tracks time and user activity during testing. Users will be informed of the nature of the data being collected and asked whether they want to opt out of data collection when the browser opens. The collected data will be sent automatically to one of the organizers (Ben Carterette) when a session is complete.
- Manual task: the user will be given a list of documents in the PubMed environment for manual processing; results should be stored in a spreadsheet in the format provided by the team.
- Using the system: the curator will validate the output provided by the system and store the information using the system or a spreadsheet (output format to be determined by each system).
To complete the survey, please click here
5- Biocurator's Checklist
- 1-Get documentation and link to tool from Coordinator or Team
- 2-Install curatorlogger (download available at the end of this page)
- 3-Use Mozilla Firefox 4.0 or higher for the activity
- 4-Remember to select the play button when starting a curation session, and select stop when you are done
- 5-Even when the logger is used, time your activity independently
- 6-Practice on system with examples provided by Teams
- 7-Record results of manual curation in requested format (should be provided by Coordinator or Team)
- 8-Record results of system curation in requested format (should be provided by Coordinator or Team)
- 9-Complete survey
6- Important Dates
Item | Deadline | Submit via | Comment |
---|---|---|---|
Team Registration | November 15, 2011 | web | Closed |
Text Mining System Description | December 31, 2011 | email to arighi@dbi.udel.edu | Subject: BioCreative-2012 Track III |
System Benchmarking Results | March 5, 2012 | email to arighi@dbi.udel.edu | Subject: BioCreative-2012 Track III-result |
Submission of Practice Test | March 5, 2012 | email to Coordinator | Subject: BioCreative-2012 Track III-test |
Interface Available for Testing | March 5, 2012 | email to Coordinator | Subject: BioCreative-2012 Track III-interface |
Biocurator's Evaluation Results | March 20-25, 2012 | email to Coordinator | Subject: BioCreative-2012 Track III-biocurator |
Workshop | Noon April 4 - 5pm April 5, 2012 | | Results from Track III testing will be presented |
All questions pertaining to this track should be directed to Cecilia Arighi (arighi@dbi.udel.edu).