intermediate, text

DocLayNet Document Layout Analysis

Document layout analysis with bounding-box annotations. Annotators mark layout elements (text blocks, tables, figures, headers, footers, lists) in document page images. In this configuration, each box is entered as one text line in the form element_type,x,y,width,height.

Configuration file: config.yaml

# DocLayNet Document Layout Analysis Configuration
# Based on Pfitzmann et al., KDD 2022

annotation_task_name: "DocLayNet Document Layout Analysis"
task_dir: "."

data_files:
  - "sample-data.json"

item_properties:
  id_key: "id"
  text_key: "image_url"
  context_key: "document_type"

user_config:
  allow_all_users: true

annotation_schemes:
  - annotation_type: "text"
    name: "bounding_boxes"
    description: "Draw bounding boxes around each document layout element (format: element_type,x,y,width,height per line)"

  - annotation_type: "multiselect"
    name: "element_types"
    description: "Select all layout element types present in this document page"
    labels:
      - name: "Caption"
        tooltip: "Captions for figures, tables, or other floats"
      - name: "Footnote"
        tooltip: "Footnotes at the bottom of the page"
      - name: "Formula"
        tooltip: "Mathematical formulas or equations"
      - name: "List-item"
        tooltip: "Individual items in bulleted or numbered lists"
      - name: "Page-footer"
        tooltip: "Footer content (page numbers, running text at bottom)"
      - name: "Page-header"
        tooltip: "Header content (running titles, section info at top)"
      - name: "Picture"
        tooltip: "Images, diagrams, charts, or other visual elements"
      - name: "Section-header"
        tooltip: "Section or subsection headings"
      - name: "Table"
        tooltip: "Tabular data structures"
      - name: "Text"
        tooltip: "Regular paragraph text or body content"
      - name: "Title"
        tooltip: "Document title or major heading"

  - annotation_type: "radio"
    name: "document_category"
    description: "What category best describes this document?"
    labels:
      - name: "financial_report"
        tooltip: "Annual reports, earnings, financial statements"
      - name: "scientific_paper"
        tooltip: "Research papers, journal articles"
      - name: "legal_document"
        tooltip: "Contracts, patents, regulations"
      - name: "manual"
        tooltip: "Technical manuals, user guides"
      - name: "government"
        tooltip: "Government publications, policy documents"
      - name: "other"
        tooltip: "Documents not fitting the above categories"

  - annotation_type: "radio"
    name: "layout_complexity"
    description: "Rate the layout complexity of this document page"
    labels:
      - name: "simple"
        tooltip: "Single column, mostly text, few elements"
      - name: "moderate"
        tooltip: "Multiple elements, some tables or figures"
      - name: "complex"
        tooltip: "Multi-column, many different element types, dense layout"

interface_config:
  item_display_format: "<img src='{{text}}' style='max-width:100%; max-height:600px; border:1px solid #ccc;'/><br/><small>Document type: {{document_type}} | Page: {{page_number}}</small>"

output_annotation_format: "json"
output_annotation_dir: "annotations"
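
Because the "bounding_boxes" scheme stores boxes as free text, a downstream script will likely need to parse and validate them. The following is a minimal sketch, not part of Potato: the names Box, parse_boxes, and element_types_in are hypothetical, and it assumes coordinates are plain numbers (for example pixels measured from the top-left corner), which the configuration does not specify.

```python
from dataclasses import dataclass

# Labels copied from the "element_types" scheme in config.yaml.
ELEMENT_TYPES = {
    "Caption", "Footnote", "Formula", "List-item", "Page-footer",
    "Page-header", "Picture", "Section-header", "Table", "Text", "Title",
}


@dataclass(frozen=True)
class Box:
    """One layout element: a label plus an x,y,width,height rectangle."""
    element_type: str
    x: float
    y: float
    width: float
    height: float


def parse_boxes(raw: str) -> list[Box]:
    """Parse 'element_type,x,y,width,height' lines; blank lines are skipped."""
    boxes = []
    for line_no, line in enumerate(raw.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 5:
            raise ValueError(f"line {line_no}: expected 5 fields, got {len(parts)}")
        element_type, *numbers = parts
        if element_type not in ELEMENT_TYPES:
            raise ValueError(f"line {line_no}: unknown element type {element_type!r}")
        try:
            x, y, width, height = (float(n) for n in numbers)
        except ValueError:
            raise ValueError(f"line {line_no}: coordinates must be numeric") from None
        if width <= 0 or height <= 0:
            raise ValueError(f"line {line_no}: width and height must be positive")
        boxes.append(Box(element_type, x, y, width, height))
    return boxes


def element_types_in(boxes: list[Box]) -> set[str]:
    """Return the distinct labels used, for comparison with the multiselect answer."""
    return {box.element_type for box in boxes}
```

For instance, parse_boxes("Title,10,20,300,40\nTable,15,80,500,200") yields two Box records, and element_types_in on that result can be compared against the labels the annotator ticked under "element_types" to catch mismatches between the two schemes.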

Sample data: sample-data.json

[
  {
    "id": "doclaynet_001",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/8/8e/Pubmed_central_abstract.png/800px-Pubmed_central_abstract.png",
    "document_type": "scientific_paper",
    "page_number": 1
  },
  {
    "id": "doclaynet_002",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/5/5f/Wikiversity-logo-en.svg/800px-Wikiversity-logo-en.svg.png",
    "document_type": "educational_material",
    "page_number": 1
  }
]

// ... and 8 more items
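
Each item has to supply the fields that config.yaml refers to: the id_key, text_key, and context_key values under item_properties, plus page_number, which item_display_format uses. A small pre-flight check along these lines can catch missing fields before a task is started. This is a sketch under those assumptions; check_items is a hypothetical helper, not a Potato function.

```python
import json

# Field names taken from item_properties and item_display_format in config.yaml.
REQUIRED_KEYS = ("id", "image_url", "document_type", "page_number")


def check_items(items: list[dict]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        for key in REQUIRED_KEYS:
            if key not in item:
                problems.append(f"item {index}: missing '{key}'")
        item_id = item.get("id")
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


def check_json_text(text: str) -> list[str]:
    """Parse a JSON array of items from a string and validate it."""
    return check_items(json.loads(text))
```

To check the real file, read its contents into a string and pass that to check_json_text, then print any problems returned.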

Get this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/image/doclaynet-document-layout
potato start config.yaml

Details

Annotation types

multiselect, radio, text

Domain

Document AI, Layout Analysis

Use cases

Document Layout Detection, Element Classification, Document Digitization

Tags

doclaynet, document-layout, bounding-box, document-ai, kdd2022, ibm

Found an issue or want to improve this design?

Open an issue