#Input: Each file consists of a collection of text passages corresponding to the figure captions of a given article.
# Each document block is a caption text passage covering either a single panel or multiple panels.
# The BioC example below shows a couple of text passages from PMC4291477 Figure 1, panels A and B.
00000000sourcedata.key4291477 Figure_1-A134710.15252/embj.2014888964291477Figure 1-AFigure_1-A0Immunoblot of media and lysate collected from HEK293T cells overexpressing ERdj3WT or ERdj3KDEL. Fresh media was conditioned on cells for 24 h prior to harvest.4291477 Figure_1-B134710.15252/embj.2014888964291477Figure 1-BFigure_1-B0qPCR analysis of ERdj3 mRNA in HEK293T cells treated with or without thapsigargin (Tg; 6 h, 100 nM). Error bars represent the mean ± 95% confidence interval as calculated in DataAssist 2.0 (n = 3).
#Output: The goal is to provide bioentity annotations for the input text passages in BioC format.
# Teams can choose to annotate one or more of the following bioentity types: gene/protein (NCBI gene/UniProtKB), miRNA (Rfam), small molecules (ChEBI, PubChem), cellular components (GO CC), cell types and cell lines (Cellosaurus, Cell Ontology), tissues and organs (Uberon), and species (NCBI Taxon).
# For a given bioentity type there may be more than one type of ID source (e.g., ChEBI and PubChem for small molecules). SourceData assigns one of these as primary source for IDs.
# SourceData annotates proteins and genes separately, but we will treat them interchangeably. In the example below, ERdj3 is linked to UniProt in one mention and to NCBI gene in another. We would also accept it as correct if both mentions were linked to the same type of source ID.
# To help with this task, mappings between UniProtKB and NCBI gene, and between ChEBI and PubChem, are provided.
# Note that the bioentity type is implicit in the annotation: the source of the ID (e.g., Uniprot, NCBI gene, CHEBI) indicates which type of bioentity it is.
# For a given passage block, each annotation should be assigned a numerical id, starting with id="1".
00000000sourcedata.key4291477 Figure_1-A134710.15252/embj.2014888964291477Figure 1-AFigure_1-A0Immunoblot of media and lysate collected from HEK293T cells overexpressing ERdj3WT or ERdj3KDEL. Fresh media was conditioned on cells for 24 h prior to harvest.Uniprot:Q9UBS4ERdj3CVCL_0063HEK293TNCBI gene:51726ERdj34291477 Figure_1-B134710.15252/embj.2014888964291477Figure 1-BFigure_1-B0qPCR analysis of ERdj3 mRNA in HEK293T cells treated with or without thapsigargin (Tg; 6 h, 100 nM). Error bars represent the mean ± 95% confidence interval as calculated in DataAssist 2.0 (n = 3).CHEBI:9516thapsigarginNCBI gene:51726ERdj3CVCL_0063HEK293TCHEBI:9516Tg
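# The numbering and implicit-type conventions described above can be sketched in Python as follows.
# This is a minimal illustration, not part of the task tooling: the function names and the prefix table
# are hypothetical, and the table only covers the ID formats visible in the example above
# (Uniprot, NCBI gene, CHEBI, and Cellosaurus CVCL_ accessions).

```python
# Hypothetical prefix table: only the ID formats that appear in the example passages.
PREFIX_TO_TYPE = {
    "Uniprot:": "gene/protein",
    "NCBI gene:": "gene/protein",
    "CHEBI:": "small molecule",
    "CVCL_": "cell line",
}


def infer_type(source_id):
    """Return the bioentity type implied by the ID prefix, or None if the prefix is unknown."""
    for prefix, entity_type in PREFIX_TO_TYPE.items():
        if source_id.startswith(prefix):
            return entity_type
    return None


def number_annotations(mentions):
    """Assign ids "1", "2", ... to the mentions of a single passage block."""
    return [
        {
            "id": str(index),
            "source_id": source_id,
            "text": text,
            "type": infer_type(source_id),
        }
        for index, (source_id, text) in enumerate(mentions, start=1)
    ]


# Mentions taken from the Figure 1-A passage in the example above.
panel_a = [
    ("Uniprot:Q9UBS4", "ERdj3"),
    ("CVCL_0063", "HEK293T"),
    ("NCBI gene:51726", "ERdj3"),
]
annotations = number_annotations(panel_a)
```

# Numbering restarts at 1 for every passage block, so number_annotations would be called once per passage.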
#SourceData annotation: Passages with annotations from SourceData curators. Note that the infon keys of the type "sourcedata_figure_annot_id" and "sourcedata_article_annot_id" are internal to SourceData.
00000000sourcedata.key4291477 Figure_1-A134710.15252/embj.2014888964291477Figure 1-AFigure_1-A0Immunoblot of media and lysate collected from HEK293T cells overexpressing ERdj3WT or ERdj3KDEL. Fresh media was conditioned on cells for 24 h prior to harvest.Uniprot:Q9UBS41210ERdj3CVCL_00632212HEK293TNCBI gene:517263213ERdj34291477 Figure_1-B134710.15252/embj.2014888964291477Figure 1-BFigure_1-B0qPCR analysis of ERdj3 mRNA in HEK293T cells treated with or without thapsigargin (Tg; 6 h, 100 nM). Error bars represent the mean ± 95% confidence interval as calculated in DataAssist 2.0 (n = 3).CHEBI:95161215thapsigarginNCBI gene:517262217ERdj3CVCL_00633218HEK293TCHEBI:95164219Tg
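# Because gene and protein IDs are treated interchangeably (see the Output section), a predicted ID and a
# curated ID may need to be compared across sources. The sketch below shows one way this might be done.
# It is an assumption-laden illustration: the dictionary is a hypothetical in-memory stand-in for the
# provided UniProtKB-NCBI gene mapping, whose actual file format is not shown here, and the function
# names are invented. The single entry is the ERdj3 pair from the example passages.

```python
# Hypothetical stand-in for the provided UniProtKB-NCBI gene mapping.
UNIPROT_TO_NCBI_GENE = {
    "Uniprot:Q9UBS4": "NCBI gene:51726",
}


def canonical(source_id):
    """Map a UniProt ID to its NCBI gene ID when known; leave any other ID unchanged."""
    return UNIPROT_TO_NCBI_GENE.get(source_id, source_id)


def same_entity(predicted_id, curated_id):
    """Return True if both IDs resolve to the same canonical ID."""
    return canonical(predicted_id) == canonical(curated_id)


gene_protein_match = same_entity("Uniprot:Q9UBS4", "NCBI gene:51726")
cross_type_match = same_entity("CHEBI:9516", "NCBI gene:51726")
```

# The same approach would apply to ChEBI and PubChem IDs, given the corresponding mapping.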