DocBank Document Layout Detection
Document layout analysis benchmark (Li et al., COLING 2020). Detect and classify document elements including titles, abstracts, paragraphs, figures, tables, and captions.
Configuration File: config.yaml
# DocBank Document Layout Detection Configuration
# Based on Li et al., COLING 2020
annotation_task_name: "DocBank Document Layout Analysis"
task_dir: "."
data_files:
  - "sample-data.json"
item_properties:
  id_key: "id"
  text_key: "image_url"
  context_key: "context"
user_config:
  allow_all_users: true
annotation_schemes:
  - annotation_type: "multiselect"
    name: "document_elements"
    description: "Select all document elements visible"
    labels:
      - name: "abstract"
        tooltip: "Abstract section"
      - name: "author"
        tooltip: "Author names"
      - name: "caption"
        tooltip: "Figure/table captions"
      - name: "date"
        tooltip: "Publication date"
      - name: "equation"
        tooltip: "Mathematical equations"
      - name: "figure"
        tooltip: "Figures and images"
      - name: "footer"
        tooltip: "Page footer"
      - name: "header"
        tooltip: "Page header"
      - name: "list"
        tooltip: "Bulleted or numbered lists"
      - name: "paragraph"
        tooltip: "Body paragraphs"
      - name: "reference"
        tooltip: "Bibliography/references"
      - name: "section"
        tooltip: "Section headings"
      - name: "table"
        tooltip: "Tables"
      - name: "title"
        tooltip: "Document title"
  - annotation_type: "radio"
    name: "document_type"
    description: "What type of document is this?"
    labels:
      - name: "academic_paper"
        tooltip: "Academic/research paper"
      - name: "technical_report"
        tooltip: "Technical report"
      - name: "book_chapter"
        tooltip: "Book or chapter"
      - name: "news_article"
        tooltip: "News article"
      - name: "form"
        tooltip: "Form or application"
      - name: "invoice"
        tooltip: "Invoice or receipt"
      - name: "other"
        tooltip: "Other document type"
  - annotation_type: "radio"
    name: "layout_complexity"
    description: "How complex is the layout?"
    labels:
      - name: "single_column"
        tooltip: "Single column layout"
      - name: "double_column"
        tooltip: "Two column layout"
      - name: "multi_column"
        tooltip: "Three or more columns"
      - name: "mixed"
        tooltip: "Mixed column layouts"
  - annotation_type: "radio"
    name: "scan_quality"
    description: "Quality of the document image"
    labels:
      - name: "high"
        tooltip: "Clear, high-resolution"
      - name: "medium"
        tooltip: "Readable but some issues"
      - name: "low"
        tooltip: "Poor quality, hard to read"
  - annotation_type: "text"
    name: "element_boxes"
    description: "List bounding boxes: element_type,x1,y1,x2,y2 (one per line)"
interface_config:
  item_display_format: "<img src='{{text}}' style='max-width:100%; max-height:600px;'/><br/><small>{{context}}</small>"
output_annotation_format: "json"
output_annotation_dir: "annotations"
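The element_boxes scheme collects bounding boxes as free text, one "element_type,x1,y1,x2,y2" entry per line, so annotations will likely need validating before use. The following is a minimal Python sketch of such a check, not part of the design itself. The names ElementBox and parse_element_boxes are hypothetical, and the code assumes (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner, which the config does not state.

```python
from dataclasses import dataclass

# Label names copied from the "document_elements" scheme in config.yaml.
ELEMENT_TYPES = {
    "abstract", "author", "caption", "date", "equation", "figure", "footer",
    "header", "list", "paragraph", "reference", "section", "table", "title",
}


@dataclass(frozen=True)
class ElementBox:
    """One parsed bounding box (hypothetical helper type)."""
    element_type: str
    x1: float
    y1: float
    x2: float
    y2: float


def parse_element_boxes(text):
    """Parse 'element_type,x1,y1,x2,y2' lines into ElementBox objects.

    Raises ValueError naming the offending line when a line is malformed.
    """
    boxes = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between entries
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 5:
            raise ValueError(
                f"line {line_no}: expected 5 comma-separated fields, got {len(parts)}"
            )
        element_type = parts[0].lower()
        if element_type not in ELEMENT_TYPES:
            raise ValueError(f"line {line_no}: unknown element type {parts[0]!r}")
        try:
            x1, y1, x2, y2 = (float(p) for p in parts[1:])
        except ValueError:
            raise ValueError(f"line {line_no}: coordinates must be numeric") from None
        # Assumption: (x1, y1) is top-left and (x2, y2) is bottom-right.
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"line {line_no}: expected x1 < x2 and y1 < y2")
        boxes.append(ElementBox(element_type, x1, y1, x2, y2))
    return boxes
```

Restricting element types to the labels of the document_elements scheme keeps the free-text boxes consistent with the multiselect answers; drop that check if annotators may use other names.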
Sample Data: sample-data.json
[
  {
    "id": "doc_001",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Example_academic_paper.pdf/page1-424px-Example_academic_paper.pdf.jpg",
    "context": "Document page image. Identify all layout elements: title, abstract, sections, figures, tables, captions, etc."
  },
  {
    "id": "doc_002",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Schloss_Neuschwanstein_2013.jpg/1200px-Schloss_Neuschwanstein_2013.jpg",
    "context": "Analyze document layout structure. Mark headers, paragraphs, lists, and other elements."
  }
]
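Each item in the data file has to carry the keys named under item_properties in config.yaml ("id", "image_url", "context"). Before starting an annotation run it may help to check a data file against that shape. The sketch below is one way to do so in Python; the function names load_items and check_items are hypothetical, and the URL check is a simple assumption about what counts as a usable image_url, not a rule taken from Potato.

```python
import json
from urllib.parse import urlparse

# Keys referenced by id_key, text_key, and context_key in config.yaml.
REQUIRED_KEYS = ("id", "image_url", "context")


def load_items(path):
    """Read a JSON data file such as sample-data.json into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def check_items(items):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            problems.append(f"item {index}: missing keys {missing}")
            continue
        if item["id"] in seen_ids:
            problems.append(f"item {index}: duplicate id {item['id']!r}")
        seen_ids.add(item["id"])
        url = urlparse(item["image_url"])
        # Assumption: images are served over http(s), as in the sample data.
        if url.scheme not in ("http", "https") or not url.netloc:
            problems.append(f"item {index}: image_url is not an http(s) URL")
    return problems
```

A typical call would be check_items(load_items("sample-data.json")), printing any returned problems before running potato start.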
// ... and 1 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/image/specialized/docbank
potato start config.yaml
Related Designs
DocLayNet Document Layout Analysis
Document layout analysis with bounding box annotations. Annotators draw bounding boxes around layout elements (text blocks, tables, figures, headers, footers, lists) in document page images.
OmniDocBench Comprehensive Document Parsing
Comprehensive document parsing annotation covering layout detection, text recognition, table structure, and formula recognition. Annotators draw bounding boxes and provide text transcriptions for document elements.
FLUTE: Figurative Language Understanding through Textual Explanations
Figurative language understanding via NLI. Annotators classify figurative sentences (sarcasm, simile, metaphor, idiom) and provide textual explanations of the figurative meaning. The task combines natural language inference with fine-grained figurative language type classification.