Guides6 min read
Mejores Prácticas para la Anotación de Documentos Legales
Técnicas especializadas para anotar contratos, documentos judiciales y presentaciones regulatorias con experiencia en el dominio.
Potato Team·
Mejores Prácticas para la Anotación de Documentos Legales
Los documentos legales requieren enfoques de anotación especializados debido a su estructura compleja, terminología específica del dominio y las altas consecuencias de los errores. Esta guía cubre estrategias para la anotación efectiva de texto legal.
Desafíos en la Anotación Legal
- Terminología densa: La jerga legal requiere anotadores capacitados
- Documentos extensos: Los contratos pueden abarcar cientos de páginas
- Referencias cruzadas: Las secciones hacen referencia a otras secciones
- Precisión requerida: Los errores pueden tener consecuencias legales
- Dependencia del contexto: El significado depende del tipo de documento y la jurisdicción
Segmentación de Documentos
Desglose de Documentos Extensos
yaml
annotation_task_name: "Legal Document Annotation"
display:
# Segment by section
segmentation:
enabled: true
method: section_headers
pattern: '^\d+\.\s+[A-Z]'
# Show document context
context:
show_previous_section: true
show_section_hierarchy: true
# Navigation
navigation:
show_outline: true
jump_to_section: trueAnotación a Nivel de Sección
yaml
data_files:
- contracts.json
item_properties:
id_key: id
text_key: text
preprocessing:
segment_by: sections
preserve_metadata: true
include_section_number: true
# Each section becomes an annotation item
# {
# "id": "contract_001_section_3.2",
# "text": "The Licensor grants...",
# "section_number": "3.2",
# "section_title": "License Grant",
# "document_id": "contract_001"
# }Reconocimiento de Entidades Legales
Entidades Específicas de Contratos
yaml
annotation_schemes:
- annotation_type: span
name: legal_entities
labels:
- name: PARTY
color: "#FECACA"
description: "Contracting parties (Licensor, Licensee, Company, etc.)"
- name: DEFINED_TERM
color: "#FDE68A"
description: "Defined terms (usually capitalized)"
- name: DATE
color: "#BBF7D0"
description: "Dates and time periods"
- name: MONETARY
color: "#C4B5FD"
description: "Dollar amounts, fees, penalties"
- name: OBLIGATION
color: "#BFDBFE"
description: "Must, shall, will obligations"
- name: CONDITION
color: "#FED7AA"
description: "If, unless, provided that conditions"
- name: REFERENCE
color: "#E0E7FF"
description: "References to other sections or documents"Detección de Obligaciones
yaml
annotation_schemes:
- annotation_type: multiselect
name: obligation_type
question: "What type of obligation is this?"
options:
- name: performance
label: "Performance Obligation"
description: "Party must do something"
- name: payment
label: "Payment Obligation"
description: "Party must pay"
- name: restriction
label: "Restriction/Prohibition"
description: "Party must not do something"
- name: condition
label: "Conditional Obligation"
description: "Obligation triggered by condition"
- name: warranty
label: "Warranty/Representation"
description: "Statement of fact or promise"Clasificación de Cláusulas
Tipos de Cláusulas Contractuales
yaml
annotation_schemes:
- annotation_type: radio
name: clause_type
question: "What type of clause is this?"
options:
- name: definitions
label: "Definitions"
- name: grant
label: "Grant of Rights/License"
- name: consideration
label: "Consideration/Payment"
- name: term
label: "Term and Termination"
- name: representations
label: "Representations & Warranties"
- name: indemnification
label: "Indemnification"
- name: limitation
label: "Limitation of Liability"
- name: confidentiality
label: "Confidentiality"
- name: ip
label: "Intellectual Property"
- name: dispute
label: "Dispute Resolution"
- name: boilerplate
label: "Boilerplate/Miscellaneous"Evaluación de Riesgo
yaml
annotation_schemes:
- annotation_type: likert
name: risk_level
question: "Rate the risk level of this clause for [Party]"
min_label: "Low Risk"
max_label: "High Risk"
size: 5
- annotation_type: text
name: risk_notes
question: "Explain the risk factors"
multiline: true
required_if:
field: risk_level
operator: ">="
value: 4Anotación de Documentos Judiciales
Extracción de Información del Caso
yaml
annotation_schemes:
- annotation_type: span
name: case_entities
labels:
- name: CASE_NUMBER
description: "Case identifier"
- name: COURT
description: "Court name and jurisdiction"
- name: JUDGE
description: "Presiding judge"
- name: PLAINTIFF
description: "Plaintiff/Petitioner"
- name: DEFENDANT
description: "Defendant/Respondent"
- name: ATTORNEY
description: "Attorneys/Legal representatives"
- name: LEGAL_CITATION
description: "Citations to cases, statutes, regulations"
- name: RULING
description: "Court's ruling or order"Estructura de Argumentos
yaml
annotation_schemes:
- annotation_type: span
name: argument_structure
labels:
- name: CLAIM
color: "#FECACA"
description: "Main claim or assertion"
- name: PREMISE
color: "#BBF7D0"
description: "Supporting premise"
- name: EVIDENCE
color: "#BFDBFE"
description: "Evidence cited"
- name: REBUTTAL
color: "#FED7AA"
description: "Counter-argument"
- name: CONCLUSION
color: "#E0E7FF"
description: "Conclusion drawn"Resaltado de Términos Legales
yaml
display:
keyword_highlighting:
enabled: true
categories:
- name: obligation_words
color: "#FEE2E2"
keywords:
- shall
- must
- will
- agrees to
- is required to
- is obligated to
- name: permission_words
color: "#D1FAE5"
keywords:
- may
- is permitted to
- has the right to
- is entitled to
- name: prohibition_words
color: "#FEF3C7"
keywords:
- shall not
- must not
- may not
- is prohibited from
- name: condition_words
color: "#DBEAFE"
keywords:
- if
- unless
- provided that
- subject to
- contingent upon
- in the event thatControl de Calidad para el Ámbito Legal
yaml
quality_control:
# Require legal training
qualification:
required_training: legal_annotation_training
training_accuracy: 0.85
# Domain expertise check
attention_checks:
enabled: true
items:
- text: |
"Notwithstanding any provision herein to the contrary,
Licensee shall indemnify Licensor against all claims."
expected:
obligation_type: indemnification
obligated_party: "Licensee"
type: domain_knowledge
# High agreement required
redundancy:
annotations_per_item: 3
agreement_threshold: 0.8
on_disagreement: expert_review
# Expert review layer
expert_review:
enabled: true
review_threshold: 0.7
expert_users: [legal_expert_1, legal_expert_2]Configuración Completa de Anotación Legal
yaml
annotation_task_name: "Contract Clause Analysis"
display:
text_display: html
# Section context
context:
show_document_metadata: true
show_section_hierarchy: true
# Legal term highlighting
keyword_highlighting:
enabled: true
categories:
- name: obligations
color: "#FEE2E2"
keywords: [shall, must, will, agrees]
- name: conditions
color: "#DBEAFE"
keywords: [if, unless, provided that, subject to]
- name: defined_terms
pattern: '\b[A-Z][a-zA-Z]+(?:\s+[A-Z][a-zA-Z]+)*\b'
color: "#FEF3C7"
annotation_schemes:
# Clause type
- annotation_type: radio
name: clause_type
question: "Classify this clause"
options:
- name: license_grant
label: "License Grant"
- name: payment
label: "Payment/Consideration"
- name: term
label: "Term/Termination"
- name: indemnification
label: "Indemnification"
- name: limitation
label: "Limitation of Liability"
- name: confidentiality
label: "Confidentiality"
- name: other
label: "Other"
# Entity spans
- annotation_type: span
name: entities
labels:
- name: PARTY
color: "#FECACA"
- name: DEFINED_TERM
color: "#FDE68A"
- name: MONETARY
color: "#C4B5FD"
- name: DATE
color: "#BBF7D0"
- name: OBLIGATION
color: "#BFDBFE"
# Risk assessment
- annotation_type: likert
name: risk
question: "Risk level for the receiving party?"
size: 5
min_label: "Low"
max_label: "High"
# Key issues
- annotation_type: text
name: issues
question: "Note any unusual or problematic language"
multiline: true
quality_control:
redundancy:
annotations_per_item: 2
agreement_threshold: 0.75
qualification:
required_training: true
training_items: 20
training_accuracy: 0.8Ejemplo de Directrices para Anotadores
Al crear directrices para la anotación legal:
- Definir el alcance: Qué documentos, qué jurisdicciones
- Glosario de terminología: Definir términos legales para los anotadores
- Casos límite: Cómo manejar lenguaje ambiguo
- Referencias cruzadas: Cuándo anotar vs. ignorar referencias
- Requisitos de precisión: Límites exactos de los spans
Mejores Prácticas
- Usar anotadores capacitados: La anotación legal requiere conocimiento del dominio
- Segmentar documentos extensos: Dividir en secciones manejables
- Resaltar términos clave: Guiar la atención hacia el lenguaje legal
- Alta redundancia: Los errores legales son costosos
- Capa de revisión por expertos: Tener abogados revisando casos límite
- Directrices claras: Definir exactamente qué significa cada etiqueta
- Anotación contextual: Mostrar la estructura del documento y secciones relacionadas
Documentación completa en /docs/core-concepts/annotation-types.