BioNLP 2011 - Gene Regulation Event Extraction
Biomedical event extraction for gene regulation, based on the BioNLP 2011 Shared Task (Kim et al., ACL Workshop 2011). Annotators identify biological entities and mark regulatory events such as gene expression, transcription, and protein catabolism in scientific abstracts.
Configuration File: config.yaml
# BioNLP 2011 - Gene Regulation Event Extraction
# Based on Kim et al., ACL Workshop 2011
# Paper: https://aclanthology.org/W11-1801/
# Dataset: https://2011.bionlp-st.org/
#
# This task involves two annotation layers:
# 1. Entity annotation: Mark spans of Gene, Protein, and other biological entities
# 2. Event annotation: Identify regulatory events (expression, transcription, etc.)
#
# Event types include:
# - Gene_expression: Production of a gene product
# - Transcription: Synthesis of RNA from a DNA template
# - Protein_catabolism: Protein degradation
# - Localization: Movement of a protein to a cellular location
# - Binding: Physical interaction between a protein and another molecule (e.g., protein, DNA)
# - Positive_regulation: Upregulation of a process
# - Negative_regulation: Downregulation of a process
# - Regulation: General regulatory relationship
annotation_task_name: "BioNLP 2011: Gene Regulation Event Extraction"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: span
    name: entity_types
    description: "Highlight and label biological entities in the text"
    labels:
      - "Gene"
      - "Protein"
      - "Cell_type"
      - "Cell_line"
      - "DNA_domain"
      - "RNA"
    tooltips:
      "Gene": "A gene name or symbol (e.g., p53, BRCA1)"
      "Protein": "A protein name or product (e.g., NF-kB, interleukin-2)"
      "Cell_type": "A type of cell (e.g., T cell, macrophage)"
      "Cell_line": "A specific cell line (e.g., Jurkat, HeLa)"
      "DNA_domain": "A DNA region or domain (e.g., promoter, enhancer)"
      "RNA": "An RNA molecule (e.g., mRNA, miRNA)"
  - annotation_type: event_annotation
    name: regulation_events
    description: "Identify gene regulation events and their types in the text"
annotation_instructions: |
  Annotate biomedical text for gene regulation events:
  1. First, use the span tool to mark all biological entities (genes, proteins, cell types, etc.).
  2. Then, use the event annotation tool to identify regulatory events.
  3. Common event types: gene expression, transcription, binding, positive/negative regulation.
  4. Each event should have a trigger word and one or more entity arguments.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0fdf4; border: 1px solid #bbf7d0; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #166534;">Biomedical Text:</strong>
      <p style="font-size: 16px; line-height: 1.8; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 12px;">
      <p style="font-size: 13px; color: #713f12; margin: 0;"><strong>Instructions:</strong> First mark entity spans (Gene, Protein, etc.), then annotate regulation events.</p>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
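The two annotation layers described in the configuration (entity spans first, then events with a trigger and one or more entity arguments) can be pictured with a small data model. The following is a minimal sketch, not Potato's output format: the `Span` and `Event` classes, the `find_span` and `validate_event` helpers, the `"Trigger"` label, and the use of `Theme`/`Cause` as role names are all assumptions made for illustration. The event type names come from the config header comment, and the text is the first sentence of sample item `bionlp_001`.

```python
from dataclasses import dataclass

# Event types as listed in the config header comment.
EVENT_TYPES = {
    "Gene_expression", "Transcription", "Protein_catabolism", "Localization",
    "Binding", "Positive_regulation", "Negative_regulation", "Regulation",
}


@dataclass(frozen=True)
class Span:
    """A labeled character range in the text; `end` is exclusive."""
    start: int
    end: int
    label: str


@dataclass
class Event:
    """Hypothetical event record: one trigger span plus role-named arguments."""
    event_type: str
    trigger: Span
    arguments: dict  # role name (assumed: "Theme", "Cause") -> Span


def find_span(text, phrase, label):
    """Locate the first occurrence of `phrase` and wrap it as a Span."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in text")
    return Span(start, start + len(phrase), label)


def validate_event(event, entities):
    """Return a list of problems; an empty list means the event is well formed."""
    problems = []
    if event.event_type not in EVENT_TYPES:
        problems.append(f"unknown event type: {event.event_type}")
    if not event.arguments:
        problems.append("event has no arguments")
    for role, arg in event.arguments.items():
        if arg not in entities:
            problems.append(f"{role} argument is not a marked entity")
    return problems


text = ("Activation of NF-kappa B by IL-2 is mediated through the "
        "phosphorylation of I kappa B-alpha in human T lymphocytes.")

# Layer 1: entity spans.
nfkb = find_span(text, "NF-kappa B", "Protein")
il2 = find_span(text, "IL-2", "Protein")
entities = [nfkb, il2]

# Layer 2: an event whose arguments point at the entity spans.
trigger = find_span(text, "Activation", "Trigger")
event = Event("Positive_regulation", trigger, {"Theme": nfkb, "Cause": il2})
problems = validate_event(event, entities)
print(problems)
```

Keeping event arguments as references to already-marked spans mirrors the instruction order in the config: an event cannot be validated until its entities exist.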
Sample Data: sample-data.json
[
  {
    "id": "bionlp_001",
    "text": "Activation of NF-kappa B by IL-2 is mediated through the phosphorylation of I kappa B-alpha in human T lymphocytes. The transcription factor NF-kappa B plays a critical role in the regulation of gene expression in these cells."
  },
  {
    "id": "bionlp_002",
    "text": "The expression of interleukin-2 receptor alpha chain (IL-2R alpha) is rapidly induced on T cells following antigenic stimulation. This upregulation is controlled by the transcription factors NFAT and AP-1 binding to the IL-2R alpha promoter."
  }
]
// ... and 8 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/domain-specific/bionlp-gene-regulation-events
potato start config.yaml
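Before starting the server with a modified data file, it can help to confirm that every item carries the keys named under `item_properties` in the config (`id` and `text`). The sketch below is one way to do that with the standard library only. `check_items` is a hypothetical helper, not part of Potato, and the inline items are abbreviated copies of the two sample entries shown above.

```python
import json


def check_items(items, id_key="id", text_key="text"):
    """Return human-readable problems; an empty list means the items look usable."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        item_id = item.get(id_key)
        if item_id is None:
            problems.append(f"item {index}: missing '{id_key}'")
        elif item_id in seen_ids:
            problems.append(f"item {index}: duplicate id {item_id!r}")
        else:
            seen_ids.add(item_id)
        text = item.get(text_key)
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {index}: missing or empty '{text_key}'")
    return problems


# Abbreviated stand-ins for the sample items (texts shortened for illustration).
sample = json.loads("""
[
  {"id": "bionlp_001", "text": "Activation of NF-kappa B by IL-2 is mediated ..."},
  {"id": "bionlp_002", "text": "The expression of interleukin-2 receptor alpha chain ..."}
]
""")

sample_problems = check_items(sample)
print(sample_problems)

# A deliberately broken list: the first item lacks text, the second reuses an id.
broken = [{"id": "bionlp_001"}, {"id": "bionlp_001", "text": "x"}]
broken_problems = check_items(broken)
print(broken_problems)
```

To check the real file, the same function can be applied to `json.load(open("sample-data.json"))` from inside the design directory.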
Related Designs
Event Annotation
N-ary event annotation with trigger spans and typed argument roles. Annotate events like ATTACK, HIRE, and TRAVEL with constrained entity arguments and hub-spoke arc visualization.
Causal Medical Claim Detection and PICO Extraction
Detection of causal claims in medical texts and extraction of PICO (Population, Intervention, Comparator, Outcome) elements. Based on SemEval-2023 Task 8 (Khetan et al.).
ChemProt - Chemical-Protein Interaction Annotation
Identify chemical and gene/protein entities and classify their interaction types in biomedical text, based on the ChemProt corpus from BioCreative VI (Krallinger et al., 2017). Supports relation extraction for drug-target interaction mining from literature.