Annotation History

Track every annotation action with timestamps for auditing and analysis.

Potato provides comprehensive tracking of all annotation actions with fine-grained timestamp metadata. This enables performance analysis, quality assurance, and detailed audit trails.

Overview

The annotation history system tracks:

  • Every annotation action: Label selections, span annotations, text inputs
  • Precise timestamps: Server and client-side timestamps
  • Action metadata: User, instance, schema, old/new values
  • Performance metrics: Processing times, action rates
  • Suspicious activity: Unusually fast or burst activity patterns

Action Tracking

Every annotation change is recorded as an AnnotationAction with:

Field                      Description
action_id                  Unique UUID for each action
timestamp                  Server-side timestamp
client_timestamp           Browser-side timestamp (if available)
user_id                    User who performed the action
instance_id                Instance being annotated
action_type                Type of action performed
schema_name                Annotation schema name
label_name                 Specific label within the schema
old_value                  Previous value (for updates/deletes)
new_value                  New value (for adds/updates)
span_data                  Span details for span annotations
server_processing_time_ms  Server processing time in milliseconds

Action Types

The system tracks these action types:

  • add_label - New label selection
  • update_label - Label value changed
  • delete_label - Label removed
  • add_span - New span annotation created
  • update_span - Span annotation modified
  • delete_span - Span annotation removed
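
The label action types line up with the old_value and new_value fields described above: an add has no previous value, a delete has no new value, and an update has both. A minimal sketch of that mapping follows. The helper name infer_label_action_type is hypothetical and not part of Potato, and Potato's own logic may differ.

```python
from typing import Any, Optional


def infer_label_action_type(old_value: Optional[Any], new_value: Optional[Any]) -> Optional[str]:
    """Hypothetical helper: choose a label action type from old/new values."""
    if old_value is None and new_value is None:
        return None  # nothing existed before or after, so nothing to record
    if old_value is None:
        return "add_label"  # no previous value: a new selection
    if new_value is None:
        return "delete_label"  # value removed
    if old_value != new_value:
        return "update_label"  # value changed
    return None  # same value on both sides: no change


print(infer_label_action_type(None, True))
print(infer_label_action_type("positive", "negative"))
print(infer_label_action_type(True, None))
```

Comparing against None rather than testing truthiness keeps values such as False or 0 from being mistaken for a missing value.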

Configuration

Annotation history tracking is enabled by default. No additional configuration is required.

Performance Metrics

The system calculates performance metrics from action history:

python
from potato.annotation_history import AnnotationHistoryManager
 
# `actions` is a list of AnnotationAction objects
metrics = AnnotationHistoryManager.calculate_performance_metrics(actions)
 
# Returns:
{
    'total_actions': 150,
    'average_action_time_ms': 45.2,
    'fastest_action_time_ms': 12,
    'slowest_action_time_ms': 234,
    'actions_per_minute': 8.5,
    'total_processing_time_ms': 6780
}
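
To make the metric names concrete, here is a standalone sketch that derives similar figures from plain dictionaries. It does not use Potato's implementation. It assumes the timing figures come from server_processing_time_ms and that actions_per_minute is the action count divided by the span between the first and last timestamp. The sample return value above is consistent with the first assumption (150 actions × 45.2 ms = 6780 ms), but the exact formulas are not stated here. The function name summarize_actions is hypothetical.

```python
from datetime import datetime


def summarize_actions(actions):
    """Illustrative only: summarize dicts with 'timestamp' and 'server_processing_time_ms'."""
    if not actions:
        return {"total_actions": 0}
    times = [a["server_processing_time_ms"] for a in actions]
    stamps = sorted(a["timestamp"] for a in actions)
    elapsed_minutes = (stamps[-1] - stamps[0]).total_seconds() / 60
    return {
        "total_actions": len(actions),
        "average_action_time_ms": sum(times) / len(times),
        "fastest_action_time_ms": min(times),
        "slowest_action_time_ms": max(times),
        # Assumed definition: count divided by elapsed wall-clock minutes
        "actions_per_minute": len(actions) / elapsed_minutes if elapsed_minutes > 0 else None,
        "total_processing_time_ms": sum(times),
    }


sample = [
    {"timestamp": datetime(2024, 1, 15, 10, 30, 0), "server_processing_time_ms": 12},
    {"timestamp": datetime(2024, 1, 15, 10, 31, 0), "server_processing_time_ms": 30},
    {"timestamp": datetime(2024, 1, 15, 10, 32, 0), "server_processing_time_ms": 48},
]
summary = summarize_actions(sample)
print(summary)
```

For the three sample actions this gives an average of 30.0 ms and 1.5 actions per minute over the two-minute span.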

Suspicious Activity Detection

The system can detect potentially problematic annotation patterns:

python
from potato.annotation_history import AnnotationHistoryManager
 
analysis = AnnotationHistoryManager.detect_suspicious_activity(
    actions,
    fast_threshold_ms=500,      # Actions faster than this are flagged
    burst_threshold_seconds=2   # Actions less than this many seconds apart are flagged
)
 
# Returns:
{
    'suspicious_actions': [...],
    'fast_actions_count': 5,
    'burst_actions_count': 12,
    'fast_actions_percentage': 3.3,
    'burst_actions_percentage': 8.0,
    'suspicious_score': 15.2,
    'suspicious_level': 'Low'
}
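
The burst check can be pictured as a comparison of consecutive timestamps. The sketch below is a standalone illustration, not Potato's code. It assumes an action counts as a burst action when it follows the previous one by less than burst_threshold_seconds, and that the percentage is taken over all actions, which matches the sample above (12 of 150 is 8.0%). The function name find_burst_actions is hypothetical, and the suspicious_score formula is not reproduced because it is not documented here.

```python
from datetime import datetime, timedelta


def find_burst_actions(timestamps, burst_threshold_seconds=2):
    """Return timestamps that follow the previous action by less than the threshold."""
    ordered = sorted(timestamps)
    bursts = []
    for previous, current in zip(ordered, ordered[1:]):
        gap_seconds = (current - previous).total_seconds()
        if gap_seconds < burst_threshold_seconds:
            bursts.append(current)
    return bursts


start = datetime(2024, 1, 15, 10, 30, 0)
offsets = [0, 1, 1.5, 10, 30, 30.5]  # seconds since the first action
stamps = [start + timedelta(seconds=s) for s in offsets]

bursts = find_burst_actions(stamps)
burst_percentage = 100 * len(bursts) / len(stamps)
print(len(bursts), burst_percentage)
```

Here the gaps are 1, 0.5, 8.5, 20, and 0.5 seconds, so three of the six actions fall under the 2-second threshold.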

Suspicious Levels

Score    Level      Interpretation
0-10     Normal     Typical annotation behavior
10-30    Low        Some fast actions, likely acceptable
30-60    Medium     Notable pattern, may warrant review
60-80    High       Concerning pattern, review recommended
80-100   Very High  Likely quality issue, immediate review
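
If you need to map a score to a level yourself, for example when re-bucketing exported results, a lookup like the following works. The ranges above share their boundary values, so this sketch assumes each range includes its lower bound and excludes its upper bound, with 100 counted as Very High. That boundary handling is an assumption, not documented behavior, and the function name suspicious_level is hypothetical.

```python
def suspicious_level(score: float) -> str:
    """Map a 0-100 score to a level (assumed: lower bound inclusive, upper exclusive)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    upper_bounds = [(10, "Normal"), (30, "Low"), (60, "Medium"), (80, "High")]
    for upper, level in upper_bounds:
        if score < upper:
            return level
    return "Very High"  # 80 through 100


print(suspicious_level(15.2))
```

With this mapping, the score of 15.2 from the sample return value lands in the Low level, matching the 'suspicious_level' shown there.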

API Reference

AnnotationAction

python
from datetime import datetime

from potato.annotation_history import AnnotationAction
 
action = AnnotationAction(
    action_id="uuid-here",
    timestamp=datetime.now(),
    user_id="annotator1",
    instance_id="doc_001",
    action_type="add_label",
    schema_name="sentiment",
    label_name="positive",
    old_value=None,
    new_value=True
)
 
# Serialize to dictionary
data = action.to_dict()
 
# Deserialize from dictionary
action = AnnotationAction.from_dict(data)

AnnotationHistoryManager

python
from datetime import datetime

from potato.annotation_history import AnnotationHistoryManager
 
# Create a new action with current timestamp
action = AnnotationHistoryManager.create_action(
    user_id="annotator1",
    instance_id="doc_001",
    action_type="add_label",
    schema_name="sentiment",
    label_name="positive",
    old_value=None,
    new_value=True
)
 
# Filter actions by time range
filtered = AnnotationHistoryManager.get_actions_by_time_range(
    actions,
    start_time=datetime(2024, 1, 1),
    end_time=datetime(2024, 1, 31)
)
 
# Filter actions by instance
instance_actions = AnnotationHistoryManager.get_actions_by_instance(
    actions, instance_id="doc_001"
)
 
# Calculate performance metrics
metrics = AnnotationHistoryManager.calculate_performance_metrics(actions)
 
# Detect suspicious activity
analysis = AnnotationHistoryManager.detect_suspicious_activity(actions)

Use Cases

Quality Assurance

Monitor annotator behavior for quality issues:

python
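from potato.annotation_history import AnnotationHistoryManager

# get_all_users, get_user_actions, and flag_for_review are placeholders
# for your own project code.
for user_id in get_all_users():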
    user_actions = get_user_actions(user_id)
    analysis = AnnotationHistoryManager.detect_suspicious_activity(user_actions)
 
    if analysis['suspicious_level'] in ['High', 'Very High']:
        flag_for_review(user_id, analysis)

Audit Trail

Track changes for regulatory compliance:

python
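import json

from potato.annotation_history import AnnotationHistoryManager

# `all_actions` is the full list of AnnotationAction objects to search
instance_actions = AnnotationHistoryManager.get_actions_by_instance(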
    all_actions, "doc_001"
)
 
audit_log = [action.to_dict() for action in instance_actions]
with open("audit_doc_001.json", "w") as f:
    json.dump(audit_log, f, indent=2)

Time Analysis

Understand annotation timing patterns:

python
from collections import Counter
 
hours = Counter(action.timestamp.hour for action in all_actions)
print("Peak annotation hours:", hours.most_common(5))

Data Storage

Annotation history is stored in the user state files:

text
output/
  annotations/
    user_state_annotator1.json  # Includes action history
    user_state_annotator2.json
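
Given the layout above, the state files can be collected per user with a filename pattern. The sketch below is illustrative and uses only the standard library. It assumes the files follow the user_state_<user_id>.json naming shown above, and the function name load_user_states is hypothetical. It returns each file's parsed JSON as-is because the key that holds the action history is not documented here.

```python
import json
from pathlib import Path


def load_user_states(annotation_dir):
    """Map user IDs to the parsed contents of user_state_<user_id>.json files."""
    prefix = "user_state_"
    states = {}
    for path in sorted(Path(annotation_dir).glob(f"{prefix}*.json")):
        user_id = path.stem[len(prefix):]  # strip the prefix; .stem drops ".json"
        with path.open(encoding="utf-8") as f:
            states[user_id] = json.load(f)
    return states
```

Calling load_user_states("output/annotations") on the directory shown above would return a dictionary keyed by annotator1 and annotator2.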

Export Format

Actions are serialized with ISO 8601 timestamps:

json
{
  "action_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T10:30:45.123456",
  "user_id": "annotator1",
  "instance_id": "doc_001",
  "action_type": "add_label",
  "schema_name": "sentiment",
  "label_name": "positive",
  "old_value": null,
  "new_value": true,
  "server_processing_time_ms": 23
}
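
An exported record can be read back with the standard library alone. The sketch below parses the record shown above and converts its timestamp with datetime.fromisoformat. The timestamp string carries no UTC offset, so the result is a naive datetime. Which time zone it represents is not stated here.

```python
import json
from datetime import datetime

exported = """
{
  "action_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T10:30:45.123456",
  "user_id": "annotator1",
  "instance_id": "doc_001",
  "action_type": "add_label",
  "schema_name": "sentiment",
  "label_name": "positive",
  "old_value": null,
  "new_value": true,
  "server_processing_time_ms": 23
}
"""

record = json.loads(exported)
# Convert the ISO 8601 string into a datetime for comparisons and arithmetic
record["timestamp"] = datetime.fromisoformat(record["timestamp"])

print(record["timestamp"].hour, record["timestamp"].microsecond)
print(record["timestamp"].tzinfo)
```

JSON null and true become Python None and True, so old_value and new_value round-trip without extra handling.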

Best Practices

  1. Regular monitoring: Check suspicious activity reports periodically
  2. Threshold tuning: Adjust detection thresholds based on task complexity
  3. Export backups: Regularly export history for long-term storage
  4. Privacy compliance: Consider data retention policies for timestamps
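
As one way to act on the retention point, exported records can be filtered by age before they go into long-term storage. This is a minimal sketch that assumes records are dictionaries in the export format with ISO 8601 timestamps. The function name drop_expired and the 365-day default are illustrative choices, not Potato settings.

```python
from datetime import datetime, timedelta


def drop_expired(records, now, retention_days=365):
    """Keep only records whose timestamp falls within the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [
        record
        for record in records
        if datetime.fromisoformat(record["timestamp"]) >= cutoff
    ]


records = [
    {"action_id": "a", "timestamp": "2022-06-01T09:00:00"},
    {"action_id": "b", "timestamp": "2024-01-15T10:30:45.123456"},
]
kept = drop_expired(records, now=datetime(2024, 2, 1))
print([record["action_id"] for record in kept])
```

Passing now explicitly, rather than calling datetime.now() inside the function, keeps the filter reproducible when it is rerun on the same export.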

Further Reading

For implementation details, see the source documentation.