Best Practices for Legal Document Annotation

Specialized techniques for annotating contracts, court documents, and regulatory filings with domain expertise.

Potato Team

Legal documents call for specialized annotation approaches because of their complex structure, domain-specific terminology, and the high cost of errors. This guide covers strategies for annotating legal text effectively. Several characteristics make legal text harder to annotate than general-purpose text:

  • Dense terminology: Legal jargon requires trained annotators
  • Long documents: Contracts can run to hundreds of pages
  • Cross-references: Sections refer to other sections
  • Precision required: Errors can have legal consequences
  • Context dependence: Meaning depends on the document type and the jurisdiction

Document Segmentation

Breaking Down Long Documents

yaml
annotation_task_name: "Legal Document Annotation"
 
display:
  # Segment by section
  segmentation:
    enabled: true
    method: section_headers
    pattern: '^\d+\.\s+[A-Z]'
 
  # Show document context
  context:
    show_previous_section: true
    show_section_hierarchy: true
 
  # Navigation
  navigation:
    show_outline: true
    jump_to_section: true
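
To see what header-based segmentation with this `pattern` does to raw text, here is a minimal Python sketch. The function name `split_by_section_headers` and the sample contract are hypothetical, and the code illustrates the regular expression only, not how Potato implements segmentation internally. Note that the pattern as written matches top-level headings such as `1. Definitions`; a subsection heading such as `3.2 License Grant` would need a broader pattern.

```python
import re

# Same pattern as in the config above: digits, a period, whitespace, a capital letter.
SECTION_HEADER = re.compile(r'^\d+\.\s+[A-Z]', re.MULTILINE)

def split_by_section_headers(document):
    """Split a document into chunks, each starting at a matched header line."""
    starts = [m.start() for m in SECTION_HEADER.finditer(document)]
    if not starts:
        # No header found: return the whole document as a single chunk.
        return [document.strip()] if document.strip() else []
    chunks = []
    # Text before the first header (title, preamble) is kept as its own chunk.
    preamble = document[:starts[0]].strip()
    if preamble:
        chunks.append(preamble)
    for begin, end in zip(starts, starts[1:] + [len(document)]):
        chunks.append(document[begin:end].strip())
    return chunks

sample = """SOFTWARE LICENSE AGREEMENT

1. Definitions
"Software" means the licensed program.

2. License Grant
The Licensor grants the Licensee a non-exclusive license.
"""

sections = split_by_section_headers(sample)
for section in sections:
    print(section.splitlines()[0])  # first line of each chunk
```

Keeping the preamble as its own chunk is a design choice in this sketch; depending on the task it may be better to attach it to the first section or drop it.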

Section-Level Annotation

yaml
data_files:
  - contracts.json
 
item_properties:
  id_key: id
  text_key: text
 
preprocessing:
  segment_by: sections
  preserve_metadata: true
  include_section_number: true
 
# Each section becomes an annotation item
# {
#   "id": "contract_001_section_3.2",
#   "text": "The Licensor grants...",
#   "section_number": "3.2",
#   "section_title": "License Grant",
#   "document_id": "contract_001"
# }
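
If the sections are prepared outside the annotation tool, items in the shape shown in the comment above could be produced with a short script. The following is a sketch under assumptions: the helper `build_section_items` is hypothetical, and the input is assumed to be a list of `(section_number, section_title, text)` tuples that some earlier step has already extracted.

```python
import json

def build_section_items(document_id, sections):
    """Turn (section_number, section_title, text) tuples into annotation items."""
    items = []
    for number, title, text in sections:
        items.append({
            # The id combines document and section so each item is unique.
            "id": f"{document_id}_section_{number}",
            "text": text,
            "section_number": number,
            "section_title": title,
            "document_id": document_id,
        })
    return items

items = build_section_items(
    "contract_001",
    [("3.2", "License Grant", "The Licensor grants...")],
)
print(json.dumps(items, indent=2))
```

Carrying `document_id` and `section_number` on every item makes it possible to reassemble section-level annotations into a per-document view later.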

Legal Entity Recognition

Contract-Specific Entities

yaml
annotation_schemes:
  - annotation_type: span
    name: legal_entities
    labels:
      - name: PARTY
        color: "#FECACA"
        description: "Contracting parties (Licensor, Licensee, Company, etc.)"
 
      - name: DEFINED_TERM
        color: "#FDE68A"
        description: "Defined terms (usually capitalized)"
 
      - name: DATE
        color: "#BBF7D0"
        description: "Dates and time periods"
 
      - name: MONETARY
        color: "#C4B5FD"
        description: "Dollar amounts, fees, penalties"
 
      - name: OBLIGATION
        color: "#BFDBFE"
        description: "Must, shall, will obligations"
 
      - name: CONDITION
        color: "#FED7AA"
        description: "If, unless, provided that conditions"
 
      - name: REFERENCE
        color: "#E0E7FF"
        description: "References to other sections or documents"

Obligation Detection

yaml
annotation_schemes:
  - annotation_type: multiselect
    name: obligation_type
    question: "What type of obligation is this?"
    options:
      - name: performance
        label: "Performance Obligation"
        description: "Party must do something"
 
      - name: payment
        label: "Payment Obligation"
        description: "Party must pay"
 
      - name: restriction
        label: "Restriction/Prohibition"
        description: "Party must not do something"
 
      - name: condition
        label: "Conditional Obligation"
        description: "Obligation triggered by condition"
 
      - name: warranty
        label: "Warranty/Representation"
        description: "Statement of fact or promise"

Clause Classification

Contract Clause Types

yaml
annotation_schemes:
  - annotation_type: radio
    name: clause_type
    question: "What type of clause is this?"
    options:
      - name: definitions
        label: "Definitions"
      - name: grant
        label: "Grant of Rights/License"
      - name: consideration
        label: "Consideration/Payment"
      - name: term
        label: "Term and Termination"
      - name: representations
        label: "Representations & Warranties"
      - name: indemnification
        label: "Indemnification"
      - name: limitation
        label: "Limitation of Liability"
      - name: confidentiality
        label: "Confidentiality"
      - name: ip
        label: "Intellectual Property"
      - name: dispute
        label: "Dispute Resolution"
      - name: boilerplate
        label: "Boilerplate/Miscellaneous"

Risk Assessment

yaml
annotation_schemes:
  - annotation_type: likert
    name: risk_level
    question: "Rate the risk level of this clause for [Party]"
    min_label: "Low Risk"
    max_label: "High Risk"
    size: 5
 
  - annotation_type: text
    name: risk_notes
    question: "Explain the risk factors"
    multiline: true
    required_if:
      field: risk_level
      operator: ">="
      value: 4
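
The `required_if` rule above makes the free-text explanation mandatory only for high risk ratings. A minimal Python sketch of that logic follows, for example to re-check exported annotations. The function names and the set of supported operators are assumptions for illustration; which operators the tool itself accepts is not stated in this guide.

```python
import operator

# Comparison operators a rule may name (an assumed set, for illustration).
OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}

def is_required(rule, answers):
    """Return True when the rule's condition holds for the given answers."""
    value = answers.get(rule["field"])
    if value is None:
        return False  # the controlling field has not been answered yet
    return OPERATORS[rule["operator"]](value, rule["value"])

def missing_required_notes(rule, answers, notes_field="risk_notes"):
    """Return True if notes are required by the rule but empty or absent."""
    notes = (answers.get(notes_field) or "").strip()
    return is_required(rule, answers) and not notes

rule = {"field": "risk_level", "operator": ">=", "value": 4}
print(missing_required_notes(rule, {"risk_level": 5, "risk_notes": ""}))
print(missing_required_notes(rule, {"risk_level": 2}))
```

The first call reports a problem because a rating of 5 requires notes; the second does not because a rating of 2 leaves the notes optional.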

Court Document Annotation

Case Information Extraction

yaml
annotation_schemes:
  - annotation_type: span
    name: case_entities
    labels:
      - name: CASE_NUMBER
        description: "Case identifier"
 
      - name: COURT
        description: "Court name and jurisdiction"
 
      - name: JUDGE
        description: "Presiding judge"
 
      - name: PLAINTIFF
        description: "Plaintiff/Petitioner"
 
      - name: DEFENDANT
        description: "Defendant/Respondent"
 
      - name: ATTORNEY
        description: "Attorneys/Legal representatives"
 
      - name: LEGAL_CITATION
        description: "Citations to cases, statutes, regulations"
 
      - name: RULING
        description: "Court's ruling or order"

Argument Structure

yaml
annotation_schemes:
  - annotation_type: span
    name: argument_structure
    labels:
      - name: CLAIM
        color: "#FECACA"
        description: "Main claim or assertion"
 
      - name: PREMISE
        color: "#BBF7D0"
        description: "Supporting premise"
 
      - name: EVIDENCE
        color: "#BFDBFE"
        description: "Evidence cited"
 
      - name: REBUTTAL
        color: "#FED7AA"
        description: "Counter-argument"
 
      - name: CONCLUSION
        color: "#E0E7FF"
        description: "Conclusion drawn"

Legal Term Highlighting

yaml
display:
  keyword_highlighting:
    enabled: true
 
    categories:
      - name: obligation_words
        color: "#FEE2E2"
        keywords:
          - shall
          - must
          - will
          - agrees to
          - is required to
          - is obligated to
 
      - name: permission_words
        color: "#D1FAE5"
        keywords:
          - may
          - is permitted to
          - has the right to
          - is entitled to
 
      - name: prohibition_words
        color: "#FEF3C7"
        keywords:
          - shall not
          - must not
          - may not
          - is prohibited from
 
      - name: condition_words
        color: "#DBEAFE"
        keywords:
          - if
          - unless
          - provided that
          - subject to
          - contingent upon
          - in the event that
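
Several keywords in these categories overlap: `shall` is an obligation word, while `shall not` is a prohibition. The sketch below shows one way to resolve such overlaps, by trying longer phrases first and requiring word boundaries. It is an illustration in Python with a shortened keyword list and hypothetical function names; how the tool itself resolves overlapping keywords is not stated in this guide.

```python
import re

# A shortened copy of the categories above, enough to show the overlap.
CATEGORIES = {
    "obligation_words": ["shall", "must", "will", "agrees to"],
    "permission_words": ["may", "is permitted to"],
    "prohibition_words": ["shall not", "must not", "may not"],
}

def build_matcher(categories):
    """Compile one regex that tries longer phrases first, plus a phrase-to-category map."""
    phrase_to_category = {
        phrase: name for name, phrases in categories.items() for phrase in phrases
    }
    # Longest first, so "shall not" is tried before "shall".
    ordered = sorted(phrase_to_category, key=len, reverse=True)
    pattern = re.compile(
        r'\b(?:' + '|'.join(re.escape(p) for p in ordered) + r')\b',
        re.IGNORECASE,
    )
    return pattern, phrase_to_category

def highlight(text, categories=CATEGORIES):
    """Return (matched phrase, category) pairs in order of appearance."""
    pattern, phrase_to_category = build_matcher(categories)
    return [
        (m.group(), phrase_to_category[m.group().lower()])
        for m in pattern.finditer(text)
    ]

print(highlight("Licensee shall not sublicense and may terminate; Licensor shall notify."))
```

The word boundary matters as much as the ordering: without it, `shall notify` would be read as `shall not` followed by stray letters and flagged as a prohibition.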

Quality Control

yaml
quality_control:
  # Require legal training
  qualification:
    required_training: legal_annotation_training
    training_accuracy: 0.85
 
  # Domain expertise check
  attention_checks:
    enabled: true
    items:
      - text: |
          "Notwithstanding any provision herein to the contrary,
          Licensee shall indemnify Licensor against all claims."
        expected:
          # indemnification is one of the clause_type options defined earlier
          clause_type: indemnification
          obligated_party: "Licensee"
        type: domain_knowledge
 
  # High agreement required
  redundancy:
    annotations_per_item: 3
    agreement_threshold: 0.8
    on_disagreement: expert_review
 
  # Expert review layer
  expert_review:
    enabled: true
    review_threshold: 0.7
    expert_users: [legal_expert_1, legal_expert_2]
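
The redundancy settings above send items to expert review when annotators disagree. The guide does not say how agreement is computed, so the Python sketch below assumes the simplest definition: the fraction of annotators who chose the most common label. The function names are hypothetical.

```python
from collections import Counter

def majority_agreement(labels):
    """Fraction of annotators who chose the most common label."""
    if not labels:
        raise ValueError("at least one annotation is required")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)

def route_item(labels, agreement_threshold=0.8):
    """Return 'accept' or 'expert_review' based on simple majority agreement."""
    if majority_agreement(labels) >= agreement_threshold:
        return "accept"
    return "expert_review"

print(route_item(["indemnification", "indemnification", "indemnification"]))
print(route_item(["indemnification", "indemnification", "limitation"]))
```

Under this definition, three annotators can only reach agreement values of 1/3, 2/3, or 1, so a threshold of 0.8 accepts unanimous items only and routes every two-to-one split to an expert. That is a reasonable setting for legal text, but it is worth budgeting expert time accordingly.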

Complete Example Configuration

yaml
annotation_task_name: "Contract Clause Analysis"
 
display:
  text_display: html
 
  # Section context
  context:
    show_document_metadata: true
    show_section_hierarchy: true
 
  # Legal term highlighting
  keyword_highlighting:
    enabled: true
    categories:
      - name: obligations
        color: "#FEE2E2"
        keywords: [shall, must, will, agrees]
      - name: conditions
        color: "#DBEAFE"
        keywords: [if, unless, provided that, subject to]
      - name: defined_terms
        pattern: '\b[A-Z][a-zA-Z]+(?:\s+[A-Z][a-zA-Z]+)*\b'
        color: "#FEF3C7"
 
annotation_schemes:
  # Clause type
  - annotation_type: radio
    name: clause_type
    question: "Classify this clause"
    options:
      - name: license_grant
        label: "License Grant"
      - name: payment
        label: "Payment/Consideration"
      - name: term
        label: "Term/Termination"
      - name: indemnification
        label: "Indemnification"
      - name: limitation
        label: "Limitation of Liability"
      - name: confidentiality
        label: "Confidentiality"
      - name: other
        label: "Other"
 
  # Entity spans
  - annotation_type: span
    name: entities
    labels:
      - name: PARTY
        color: "#FECACA"
      - name: DEFINED_TERM
        color: "#FDE68A"
      - name: MONETARY
        color: "#C4B5FD"
      - name: DATE
        color: "#BBF7D0"
      - name: OBLIGATION
        color: "#BFDBFE"
 
  # Risk assessment
  - annotation_type: likert
    name: risk
    question: "Risk level for the receiving party?"
    size: 5
    min_label: "Low"
    max_label: "High"
 
  # Key issues
  - annotation_type: text
    name: issues
    question: "Note any unusual or problematic language"
    multiline: true
 
quality_control:
  redundancy:
    annotations_per_item: 2
    agreement_threshold: 0.75
 
  qualification:
    required_training: true
    training_items: 20
    training_accuracy: 0.8
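
The `defined_terms` pattern in this configuration matches any run of capitalized words, which also catches ordinary sentence-initial words. The Python sketch below applies the same regular expression to a sample sentence and then strips a leading article or determiner. The `LEADING_WORDS` set and the `candidate_terms` helper are hypothetical additions for illustration, not part of the configuration.

```python
import re

# Same pattern as the defined_terms category above.
DEFINED_TERM = re.compile(r'\b[A-Z][a-zA-Z]+(?:\s+[A-Z][a-zA-Z]+)*\b')

# Capitalized words that usually start a sentence rather than a defined term (assumed list).
LEADING_WORDS = {"The", "An", "This", "Such"}

def candidate_terms(text):
    """Return capitalized runs, with a leading article or determiner removed."""
    terms = []
    for match in DEFINED_TERM.findall(text):
        words = match.split()
        if words[0] in LEADING_WORDS:
            words = words[1:]
        if words:
            terms.append(" ".join(words))
    return terms

sentence = "The Licensor grants Licensee a license to the Licensed Software."
print(DEFINED_TERM.findall(sentence))  # raw matches, including "The Licensor"
print(candidate_terms(sentence))       # after removing the leading "The"
```

Even with this filter, the pattern treats proper nouns such as company or place names as defined terms, so the highlighting is best read as a hint for annotators rather than a label.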

Example Annotator Guidelines

When writing guidelines for legal annotation:

  1. Define the scope: Which documents and which jurisdictions are covered
  2. Terminology glossary: Define legal terms for annotators
  3. Edge cases: How to handle ambiguous language
  4. Cross-references: When to annotate references and when to ignore them
  5. Precision requirements: Exact span boundaries
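
A rule about exact span boundaries is easier to follow when it can be checked mechanically. As an illustration, the Python sketch below normalizes a span so that it neither starts nor ends with whitespace or punctuation. The `trim_span` helper and its default character set are assumptions; the right rule depends on the guidelines a project adopts.

```python
def trim_span(text, start, end, strip_chars=" \t\n.,;:"):
    """Shrink [start, end) so the span neither starts nor ends with strip_chars."""
    while start < end and text[start] in strip_chars:
        start += 1
    while end > start and text[end - 1] in strip_chars:
        end -= 1
    return start, end

clause = "Payment is due to the Licensor, within thirty days."

# A selection as an annotator might drag it: leading space and trailing comma included.
raw_start = clause.index(" Licensor,")
raw_end = raw_start + len(" Licensor,")

start, end = trim_span(clause, raw_start, raw_end)
print(repr(clause[raw_start:raw_end]), "->", repr(clause[start:end]))
```

Stripping a trailing period would also shorten abbreviations such as "Inc.", so a guideline built on this rule should say how party names that end in punctuation are handled.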

Best Practices

  1. Use trained annotators: Legal annotation requires domain knowledge
  2. Segment long documents: Split them into manageable sections
  3. Highlight key terms: Direct attention to the legal language
  4. High redundancy: Legal errors are costly
  5. Expert review layer: Have lawyers review edge cases
  6. Clear guidelines: Define exactly what each label means
  7. Contextual annotation: Show the document structure and related sections

Full documentation is available at /docs/core-concepts/annotation-types.