Critical Assessment of Information Extraction in Biology - data sets are available from Resources/Corpora and require registration.

BioCreative VII

Track 3 - Automatic extraction of medication names in tweets [2020-01-22]

Task Motivation

Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step towards incorporating Twitter data in pharmacoepidemiological research is to automatically recognize medication mentions in tweets. A common approach is to search for tweets containing lexical matches of drug names from a manually compiled dictionary. Even allowing for variants and misspellings, this approach has several limitations. In our previous study [Weissenbacher et al., 2019], when using the lexical match approach on a corpus where drug names are rare, we retrieved only 71% of the tweets that we had manually identified as mentioning a drug, and more than 45% of the tweets retrieved were false positives. For example, tweets that mention Lyrica are predominantly about the singer Lyrica Anderson, not about the antiepileptic drug. In addition, descriptive phrases and medication class mentions (such as ‘my blood pressure med’ or ‘my anti-seizure pill’), as well as compounds and ‘street’ names for medications (‘the blue pill’), present additional challenges. This competition is an opportunity to go beyond the lexical match approach by providing new methods that improve the extraction of drugs mentioned in posts and enhance the utility of social media for public health research.
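
To make the limitations of lexical matching concrete, here is a minimal sketch of a dictionary-based matcher. It is an illustration only, not the system used in the cited study: the function name lexical_match, the toy lexicon, and the Lyrica tweet are assumptions made up for this example, while the other two tweets come from the example table in the task definition.

```python
import re
from typing import List, Set

# Toy lexicon, invented for illustration; a real dictionary would be far larger.
LEXICON: Set[str] = {"sudafed", "hydrocodone", "motrin", "lyrica"}


def lexical_match(text: str, lexicon: Set[str]) -> List[str]:
    """Return every alphabetic token whose lowercase form is in the lexicon."""
    return [m.group() for m in re.finditer(r"[A-Za-z]+", text)
            if m.group().lower() in lexicon]


# True positive: the drug name is spelled exactly as in the lexicon.
print(lexical_match("@user sudafed that I'm not sure I'm comfortable taking it.", LEXICON))

# Missed mentions: 'hydros' and 'moltrin' are variants absent from the lexicon.
print(lexical_match("@user no my body hurts, they prescribed me hydros and moltrin", LEXICON))

# False positive: an invented tweet about the singer, not the drug.
print(lexical_match("Lyrica Anderson's new single is out", LEXICON))
```

The second call returns an empty list and the third returns a hit, which mirrors the recall and precision problems described above.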

Task Definition: Automatic extraction of medication names in tweets

The goal of this task is to extract the spans that mention a medication or dietary supplement in tweets. The dataset consists of all tweets posted by 212 Twitter users during their pregnancy. This data represents the natural and highly imbalanced distribution of drug mentions on Twitter, with only approximately 0.2% of the tweets mentioning a medication. Training and evaluating a sequence labeler on this dataset will closely model how drugs are detected in tweets in practice.
  • Training data: ~140,000 tweets (~350 tweets mentioning at least one drug, ~140,000 tweets by the same 212 users, not mentioning drugs)
  • Test data: ~60,000 tweets
  • Evaluation metric: exact and partial F1-scores for the positive class (i.e., the correct spans of drug names)
  • Contact information: Davy Weissenbacher
  • Codalab: TBA
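
The exact and partial F1-scores listed above can be sketched as follows. This is not the official scorer: the function names are hypothetical, and the matching rules are assumptions (an exact match requires identical tweet ID, start and end; a partial match requires any character overlap within the same tweet). The organizers' evaluation script may define partial matches differently.

```python
from typing import Iterable, Tuple

# (tweet_id, start, end) with an inclusive end offset; this layout is an assumption.
Span = Tuple[str, int, int]


def _overlaps(a: Span, b: Span) -> bool:
    """True if both spans belong to the same tweet and share at least one character."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def span_f1(gold: Iterable[Span], pred: Iterable[Span], partial: bool = False) -> float:
    """F1 over drug spans (the positive class); tweets without drugs add no gold spans."""
    gold_set, pred_set = set(gold), set(pred)
    if partial:
        # A prediction counts if it overlaps some gold span, and vice versa.
        tp_pred = sum(any(_overlaps(p, g) for g in gold_set) for p in pred_set)
        tp_gold = sum(any(_overlaps(g, p) for p in pred_set) for g in gold_set)
    else:
        tp_pred = tp_gold = len(gold_set & pred_set)
    precision = tp_pred / len(pred_set) if pred_set else 0.0
    recall = tp_gold / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [("424441978835570688", 44, 49), ("424441978835570688", 55, 61)]
pred = [("424441978835570688", 44, 49), ("424441978835570688", 55, 58)]
print(span_f1(gold, pred))                # exact: only the first span matches
print(span_f1(gold, pred, partial=True))  # partial: both predictions overlap gold
```

Under these assumptions, any span predicted in a tweet that mentions no drug counts as a false positive, which is what makes the 0.2% positive rate challenging.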

For each tweet, the publicly available dataset contains: i. the tweet ID, ii. the text of the tweet, iii. the start and iv. the end position of the span, v. the text covered by the span in the tweet, vi. the normalized drug name (empty if the tweet did not mention a drug).
Note 1: if a tweet mentions two or more drugs, the tweet is repeated once per drug, with each repetition carrying one of the mentions, as shown below. The evaluation data will contain only the tweet IDs and the text of the tweets.
Note 2: participants will not be evaluated on the normalization task, only on the extraction task, i.e. retrieving the span positions.

tweet ID             text                                                                  Begin    End     span            drug normalized    
397783574797352960   Only 3 Arnica Balms left...                                           8        19      Arnica Balms    arnica balm        
404288692514078720   @user sudafed that I'm not sure I'm comfortable taking it.            7        13      sudafed         sudafed            
343961712334686205   I like this song!                                                     -        -       -               -
424441978835570688   @user no my body hurts, they prescribed me hydros and moltrin         44       49      hydros          hydrocodone        
424441978835570688   @user no my body hurts, they prescribed me hydros and moltrin         55       61      moltrin         motrin            
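
The row layout above, including the repetition of a tweet for each drug it mentions, can be regrouped per tweet as in the sketch below. It rests on several assumptions that the task description does not state: rows are already split into six string fields (the file format is not specified here), a field of '-' or an empty string means no mention, and offsets are 1-based with an inclusive end, which is inferred from the sudafed, hydros and moltrin rows. The function name group_mentions is hypothetical.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, str, str, str, str]   # id, text, begin, end, span, normalized
Mention = Tuple[int, int, str, str]         # begin, end, span, normalized

ROWS: List[Row] = [
    ("404288692514078720", "@user sudafed that I'm not sure I'm comfortable taking it.",
     "7", "13", "sudafed", "sudafed"),
    ("343961712334686205", "I like this song!", "-", "-", "-", "-"),
    ("424441978835570688", "@user no my body hurts, they prescribed me hydros and moltrin",
     "44", "49", "hydros", "hydrocodone"),
    ("424441978835570688", "@user no my body hurts, they prescribed me hydros and moltrin",
     "55", "61", "moltrin", "motrin"),
]


def group_mentions(rows: List[Row]) -> Dict[str, List[Mention]]:
    """Collect all mentions per tweet ID; tweets without drugs map to an empty list."""
    mentions: Dict[str, List[Mention]] = {}
    for tweet_id, text, begin, end, span, drug in rows:
        tweet_mentions = mentions.setdefault(tweet_id, [])
        if begin.strip() in ("", "-"):
            continue  # no drug mentioned in this tweet
        b, e = int(begin), int(end)
        # Assumed convention: 1-based start, inclusive end.
        if text[b - 1:e] != span:
            raise ValueError(f"offsets {b}-{e} do not cover {span!r} in tweet {tweet_id}")
        tweet_mentions.append((b, e, span, drug))
    return mentions


for tweet_id, found in group_mentions(ROWS).items():
    print(tweet_id, found)
```

Checking the offsets against the span text while loading is a cheap way to catch a mismatch between the assumed indexing convention and the released files.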

Important dates:

Training data available: March 15, 2020
Test data available: TBA
System predictions for test data due: TBA

Task organizers:

Graciela Gonzalez-Hernandez, University of Pennsylvania, USA
Davy Weissenbacher, University of Pennsylvania, USA
Ivan Flores, University of Pennsylvania, USA
Karen O’Connor, University of Pennsylvania, USA