intermediate, text

DocBank Document Layout Detection

Document layout analysis benchmark (Li et al., COLING 2020). Detect and classify document elements including titles, abstracts, paragraphs, figures, tables, and captions.


Configuration file: config.yaml

# DocBank Document Layout Detection Configuration
# Based on Li et al., COLING 2020

annotation_task_name: "DocBank Document Layout Analysis"
task_dir: "."

data_files:
  - "sample-data.json"

item_properties:
  id_key: "id"
  text_key: "image_url"
  context_key: "context"

user_config:
  allow_all_users: true

annotation_schemes:
  - annotation_type: "multiselect"
    name: "document_elements"
    description: "Select all document elements visible"
    labels:
      - name: "abstract"
        tooltip: "Abstract section"
      - name: "author"
        tooltip: "Author names"
      - name: "caption"
        tooltip: "Figure/table captions"
      - name: "date"
        tooltip: "Publication date"
      - name: "equation"
        tooltip: "Mathematical equations"
      - name: "figure"
        tooltip: "Figures and images"
      - name: "footer"
        tooltip: "Page footer"
      - name: "header"
        tooltip: "Page header"
      - name: "list"
        tooltip: "Bulleted or numbered lists"
      - name: "paragraph"
        tooltip: "Body paragraphs"
      - name: "reference"
        tooltip: "Bibliography/references"
      - name: "section"
        tooltip: "Section headings"
      - name: "table"
        tooltip: "Tables"
      - name: "title"
        tooltip: "Document title"

  - annotation_type: "radio"
    name: "document_type"
    description: "What type of document is this?"
    labels:
      - name: "academic_paper"
        tooltip: "Academic/research paper"
      - name: "technical_report"
        tooltip: "Technical report"
      - name: "book_chapter"
        tooltip: "Book or chapter"
      - name: "news_article"
        tooltip: "News article"
      - name: "form"
        tooltip: "Form or application"
      - name: "invoice"
        tooltip: "Invoice or receipt"
      - name: "other"
        tooltip: "Other document type"

  - annotation_type: "radio"
    name: "layout_complexity"
    description: "How complex is the layout?"
    labels:
      - name: "single_column"
        tooltip: "Single column layout"
      - name: "double_column"
        tooltip: "Two column layout"
      - name: "multi_column"
        tooltip: "Three or more columns"
      - name: "mixed"
        tooltip: "Mixed column layouts"

  - annotation_type: "radio"
    name: "scan_quality"
    description: "Quality of the document image"
    labels:
      - name: "high"
        tooltip: "Clear, high-resolution"
      - name: "medium"
        tooltip: "Readable but some issues"
      - name: "low"
        tooltip: "Poor quality, hard to read"

  - annotation_type: "text"
    name: "element_boxes"
    description: "List bounding boxes: element_type,x1,y1,x2,y2 (one per line)"

interface_config:
  item_display_format: "<img src='{{text}}' style='max-width:100%; max-height:600px;'/><br/><small>{{context}}</small>"

output_annotation_format: "json"
output_annotation_dir: "annotations"
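
The `element_boxes` scheme collects bounding boxes as free text, one `element_type,x1,y1,x2,y2` entry per line, so downstream code has to parse and check that text itself. The following is a minimal sketch of such a parser, not part of Potato. The names `ElementBox` and `parse_element_boxes` are hypothetical, and the checks that `x2 > x1` and `y2 > y1` assume corner coordinates with a top-left origin, which the config does not state.

```python
from dataclasses import dataclass
from typing import List

# Label names copied from the "document_elements" scheme in config.yaml.
ELEMENT_LABELS = {
    "abstract", "author", "caption", "date", "equation", "figure", "footer",
    "header", "list", "paragraph", "reference", "section", "table", "title",
}


@dataclass(frozen=True)
class ElementBox:
    """One parsed line of the free-text bounding box field (hypothetical type)."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float


def parse_element_boxes(text: str) -> List[ElementBox]:
    """Parse 'element_type,x1,y1,x2,y2' lines; blank lines are skipped."""
    boxes = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 5:
            raise ValueError(f"line {line_no}: expected 5 comma-separated fields")
        label = parts[0]
        if label not in ELEMENT_LABELS:
            raise ValueError(f"line {line_no}: unknown element type '{label}'")
        try:
            x1, y1, x2, y2 = (float(part) for part in parts[1:])
        except ValueError as exc:
            raise ValueError(f"line {line_no}: coordinates must be numeric") from exc
        # Assumption: (x1, y1) is the top-left corner and (x2, y2) the bottom-right.
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"line {line_no}: box has non-positive width or height")
        boxes.append(ElementBox(label, x1, y1, x2, y2))
    return boxes
```

For example, `parse_element_boxes("title,10,20,300,60")` returns a single `ElementBox` with label `title`. Rejecting unknown labels early keeps the free-text field consistent with the multiselect labels defined above.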

Sample data: sample-data.json

[
  {
    "id": "doc_001",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Example_academic_paper.pdf/page1-424px-Example_academic_paper.pdf.jpg",
    "context": "Document page image. Identify all layout elements: title, abstract, sections, figures, tables, captions, etc."
  },
  {
    "id": "doc_002",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Schloss_Neuschwanstein_2013.jpg/1200px-Schloss_Neuschwanstein_2013.jpg",
    "context": "Analyze document layout structure. Mark headers, paragraphs, lists, and other elements."
  }
]

// ... and 1 more item
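
Each item in the data file needs the three keys that `item_properties` in config.yaml points at: `id`, `image_url`, and `context`. As a rough sketch, the check below reports items with a missing or empty key and flags duplicate ids before the file is handed to the annotation tool. The helper names `validate_items` and `load_and_validate` are hypothetical and not part of Potato, and treating duplicate ids as an error is an assumption.

```python
import json
from typing import Any, Dict, List

# Keys referenced by id_key, text_key, and context_key in config.yaml.
REQUIRED_KEYS = ("id", "image_url", "context")


def validate_items(items: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means no issues were found."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty '{key}'")
        item_id = item.get("id")
        if isinstance(item_id, str) and item_id.strip():
            # Assumption: ids should be unique across the data file.
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


def load_and_validate(path: str = "sample-data.json") -> List[str]:
    """Load a JSON array of items from disk and validate it."""
    with open(path, encoding="utf-8") as handle:
        items = json.load(handle)
    return validate_items(items)
```

Calling `load_and_validate()` from the directory that holds `sample-data.json` returns the list of problems, which is empty when every item carries all three keys and a unique id.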

Download this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/image/specialized/docbank
potato start config.yaml

Details

Annotation types

multiselect, radio, text

Domain

Document AI, Layout Analysis

Use cases

Document Understanding, Layout Detection, Information Extraction

Tags

docbank, document, layout, ocr, coling2020

Found a problem or want to improve this design?

Open an issue