n2c2 2022: Social Determinants of Health Extraction
Extract social determinants of health (SDOH) from clinical notes. Annotators identify SDOH entities (Employment, Housing, Substance Use, etc.) and their attributes (status, type, temporal information). Based on the n2c2 2022 Track 2 shared task for SDOH extraction from clinical narratives.
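To make the two-part task concrete, the sketch below models one annotation as a labeled character span plus a status value, and checks that spans do not overlap (mirroring `allow_overlapping: false` in the configuration). `SdohSpan`, `make_span`, and `find_overlaps` are hypothetical names chosen for illustration; they are not part of Potato or the n2c2 data format. Pairing a status with each span is an assumption taken from the task description, since the configuration itself asks the status question once per item. The example text is condensed from the second sample item.

```python
from dataclasses import dataclass

# Label and status sets copied from config.yaml
ENTITY_LABELS = {"Employment", "LivingStatus", "Alcohol", "Drug", "Tobacco", "Insurance"}
STATUS_LABELS = {"Current", "Past", "None", "Unknown"}


@dataclass(frozen=True)
class SdohSpan:
    """Hypothetical record: one highlighted mention and its status."""
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str
    status: str

    def __post_init__(self):
        if self.label not in ENTITY_LABELS:
            raise ValueError(f"unknown entity label: {self.label}")
        if self.status not in STATUS_LABELS:
            raise ValueError(f"unknown status: {self.status}")
        if not 0 <= self.start < self.end:
            raise ValueError("span must satisfy 0 <= start < end")


def make_span(text, mention, label, status):
    """Locate the first occurrence of `mention` in `text` and wrap it as a span."""
    start = text.find(mention)
    if start < 0:
        raise ValueError(f"mention not found: {mention!r}")
    return SdohSpan(start, start + len(mention), label, status)


def find_overlaps(spans):
    """Return neighbouring spans (sorted by start) that overlap.

    The result is empty exactly when no two spans overlap.
    """
    ordered = sorted(spans, key=lambda s: (s.start, s.end))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b.start < a.end]


text = "Social Hx: Current smoker, 1 PPD x 25 years. Living in a shelter. No health insurance."
spans = [
    make_span(text, "Current smoker", "Tobacco", "Current"),
    make_span(text, "Living in a shelter", "LivingStatus", "Current"),
    make_span(text, "No health insurance", "Insurance", "None"),
]
overlaps = find_overlaps(spans)
print(overlaps)
```

Computing offsets with `str.find` instead of hard-coding them keeps the example robust if the text is edited; a real pipeline would take offsets from the annotation tool's output.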
Configuration file: config.yaml
# n2c2 2022 Track 2: Social Determinants of Health Extraction
# Based on Lybarger et al., JAMIA 2023
# Paper: https://academic.oup.com/jamia/article/30/8/1382/7184752
# Dataset: https://portal.dbmi.hms.harvard.edu/projects/n2c2-2022-track2/
#
# Task: Extract SDOH entities and attributes from clinical notes
# Annotators identify mentions of social determinants and classify
# their status attributes.
#
# SDOH Entity Types:
# - Employment: job status, occupation, work-related information
# - LivingStatus: housing, living arrangements, homelessness
# - Alcohol: alcohol use, drinking behaviors
# - Drug: illicit drug use, substance abuse
# - Tobacco: smoking, tobacco/nicotine use
# - Insurance: health insurance status, coverage information
#
# Status Attributes: Current, Past, None, Unknown
annotation_task_name: "n2c2 SDOH Extraction from Clinical Notes"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  # Step 1: Identify SDOH entity mentions
  - annotation_type: span
    name: sdoh_entity
    description: "Highlight mentions of social determinants of health in the clinical text"
    labels:
      - "Employment"
      - "LivingStatus"
      - "Alcohol"
      - "Drug"
      - "Tobacco"
      - "Insurance"
    label_colors:
      "Employment": "#22c55e"
      "LivingStatus": "#3b82f6"
      "Alcohol": "#ef4444"
      "Drug": "#f97316"
      "Tobacco": "#eab308"
      "Insurance": "#8b5cf6"
    tooltips:
      "Employment": "Mentions of job status, occupation, work, unemployment, retirement, or disability"
      "LivingStatus": "Mentions of housing situation, living arrangements, homelessness, or shelter"
      "Alcohol": "Mentions of alcohol consumption, drinking, or alcohol-related behaviors"
      "Drug": "Mentions of illicit drug use, substance abuse, or recreational drug use"
      "Tobacco": "Mentions of smoking, tobacco use, vaping, chewing tobacco, or nicotine"
      "Insurance": "Mentions of health insurance status, coverage type, or lack of insurance"
    allow_overlapping: false
  # Step 2: Status of the identified SDOH entity
  - annotation_type: radio
    name: entity_status
    description: "What is the current status of the identified SDOH factor?"
    labels:
      - "Current"
      - "Past"
      - "None"
      - "Unknown"
    keyboard_shortcuts:
      "Current": "c"
      "Past": "p"
      "None": "n"
      "Unknown": "u"
    tooltips:
      "Current": "The patient currently has this SDOH status (e.g., currently smokes, currently employed)"
      "Past": "The patient had this status in the past (e.g., former smoker, previously homeless)"
      "None": "The patient explicitly does not have this status (e.g., denies alcohol, never smoked)"
      "Unknown": "The status cannot be determined from the available text"
html_layout: |
  <div style="margin-bottom: 10px; padding: 8px; background: #f8f9fa; border-radius: 6px;">
  <strong>Note Type:</strong> {{note_type}}
  </div>
  <div style="padding: 10px; border: 1px solid #ddd; border-radius: 6px; line-height: 1.8; font-family: monospace; white-space: pre-wrap;">
  {{text}}
  </div>
allow_all_users: true
instances_per_annotator: 100
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
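The `item_properties` keys and the `{{note_type}}` placeholder in `html_layout` suggest that every data item needs `id`, `text`, and `note_type` fields (assuming the placeholder is filled from a field of the same name). A pre-flight check along the following lines can catch malformed items before starting the server. `validate_items` is a hypothetical helper, not part of Potato, and the inline JSON is a stand-in for the contents of sample-data.json, with one deliberately broken item.

```python
import json

# id_key, text_key, and the field assumed to feed the html_layout placeholder
REQUIRED_KEYS = ("id", "text", "note_type")


def validate_items(items):
    """Return a list of human-readable problems; an empty list means the data looks usable."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        # Every required field must be a non-empty string.
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty '{key}'")
        # Ids must be unique so annotations can be matched back to items.
        item_id = item.get("id")
        if isinstance(item_id, str):
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


# Stand-in for loading the real file with json.load(...)
raw = """[
  {"id": "n2c2_001", "text": "SOCIAL HISTORY: retired truck driver.", "note_type": "History and Physical"},
  {"id": "n2c2_001", "text": "", "note_type": "Admission Note"}
]"""
problems = validate_items(json.loads(raw))
for problem in problems:
    print(problem)
```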
Sample data: sample-data.json
[
{
"id": "n2c2_001",
"text": "SOCIAL HISTORY: The patient is a 58-year-old male, retired truck driver. He quit smoking approximately 10 years ago after a 30 pack-year history. He reports occasional alcohol use, approximately 2-3 beers per week. Denies any illicit drug use. He lives with his wife in a single-family home.",
"note_type": "History and Physical"
},
{
"id": "n2c2_002",
"text": "Social Hx: Current smoker, 1 PPD x 25 years. Drinks heavily, approximately 6 beers daily. Reports prior marijuana use but states he stopped 5 years ago. Currently unemployed, lost his job 3 months ago. Living in a shelter. No health insurance.",
"note_type": "Admission Note"
}
]
// ... and 8 more items
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/domain-specific/n2c2-sdoh-extraction
potato start config.yaml
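Because the configuration sets `annotation_per_instance: 2`, each item receives two `entity_status` judgments, and a natural follow-up is to measure how often the two annotators agree. The sketch below computes Cohen's kappa in plain Python. The two label lists are made-up illustration data, not real annotations, and reading labels from the `annotation_output/` directory is left out because its file layout is not described here.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical status labels from two annotators on six items
annotator_1 = ["Current", "Past", "None", "Current", "Unknown", "Past"]
annotator_2 = ["Current", "Past", "None", "Past", "Unknown", "Past"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because `Current` and `Past` are likely to dominate social history notes.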
Found a problem or want to improve this design? Open an issue.
Related designs
Clinical TempEval - Temporal Information Extraction from Clinical Notes
Extraction of temporal information from clinical text, identifying time expressions, event mentions, and their temporal relations. Based on SemEval-2016 Task 12 (Clinical TempEval).
Analysis of Clinical Text: Disorder Identification and Normalization
Identify disorder mentions and their attributes in clinical discharge summaries, based on SemEval-2015 Task 14 (Elhadad et al.). Annotators mark disorder spans, body locations, severity indicators, and classify the assertion status of each disorder.
Aspect-Based Sentiment Analysis
Identification of aspect terms in review text with sentiment polarity classification for each aspect. Based on SemEval-2016 Task 5 (ABSA).