Annotation History
Track every annotation action with timestamps for auditing and analysis.
Potato provides comprehensive tracking of all annotation actions with fine-grained timestamp metadata. This enables performance analysis, quality assurance, and detailed audit trails.
Overview
The annotation history system tracks:
- Every annotation action: Label selections, span annotations, text inputs
- Precise timestamps: Server and client-side timestamps
- Action metadata: User, instance, schema, old/new values
- Performance metrics: Processing times, action rates
- Suspicious activity: Unusually fast or burst activity patterns
Action Tracking
Every annotation change is recorded as an AnnotationAction with:
| Field | Description |
|---|---|
| action_id | Unique UUID for each action |
| timestamp | Server-side timestamp |
| client_timestamp | Browser-side timestamp (if available) |
| user_id | User who performed the action |
| instance_id | Instance being annotated |
| action_type | Type of action performed |
| schema_name | Annotation schema name |
| label_name | Specific label within the schema |
| old_value | Previous value (for updates/deletes) |
| new_value | New value (for adds/updates) |
| span_data | Span details for span annotations |
| server_processing_time_ms | Server processing time in milliseconds |
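To make the shape of a record concrete, the fields above can be pictured as a small dataclass. This is only an illustrative sketch: `ActionRecord` is a hypothetical stand-in written for this page, not Potato's `AnnotationAction` class, and the choice of which fields are optional is an assumption.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Any, Optional
import uuid


@dataclass
class ActionRecord:
    """Hypothetical stand-in that mirrors the documented fields (not Potato's class)."""
    action_id: str
    timestamp: datetime
    user_id: str
    instance_id: str
    action_type: str
    schema_name: str
    label_name: str
    old_value: Any = None
    new_value: Any = None
    client_timestamp: Optional[datetime] = None
    span_data: Optional[dict] = None
    server_processing_time_ms: Optional[int] = None

    def to_dict(self) -> dict:
        # Convert datetimes to ISO 8601 strings so the result is JSON-friendly.
        data = asdict(self)
        data["timestamp"] = self.timestamp.isoformat()
        if self.client_timestamp is not None:
            data["client_timestamp"] = self.client_timestamp.isoformat()
        return data

    @classmethod
    def from_dict(cls, data: dict) -> "ActionRecord":
        # Parse the ISO 8601 strings back into datetime objects.
        fields = dict(data)
        fields["timestamp"] = datetime.fromisoformat(fields["timestamp"])
        if fields.get("client_timestamp") is not None:
            fields["client_timestamp"] = datetime.fromisoformat(fields["client_timestamp"])
        return cls(**fields)


record = ActionRecord(
    action_id=str(uuid.uuid4()),
    timestamp=datetime(2024, 1, 15, 10, 30, 45, 123456),
    user_id="annotator1",
    instance_id="doc_001",
    action_type="add_label",
    schema_name="sentiment",
    label_name="positive",
    new_value=True,
)
restored = ActionRecord.from_dict(record.to_dict())
```

Round-tripping through a dictionary leaves the record unchanged, which is the property an audit log depends on.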
Action Types
The system tracks these action types:
- add_label: New label selection
- update_label: Label value changed
- delete_label: Label removed
- add_span: New span annotation created
- update_span: Span annotation modified
- delete_span: Span annotation removed
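The six action types pair an operation (add, update, delete) with a target (label or span). Together with the old_value and new_value fields, this suggests a simple rule for deciding which type applies. The helper below is a hypothetical sketch of that rule; `infer_action_type` is not part of Potato, and treating `None` as "no value" is an assumption.

```python
from typing import Any


def infer_action_type(old_value: Any, new_value: Any, is_span: bool = False) -> str:
    """Hypothetical helper: derive an action type from the old and new values."""
    kind = "span" if is_span else "label"
    if old_value is None and new_value is None:
        raise ValueError("an action needs an old value, a new value, or both")
    if old_value is None:
        return f"add_{kind}"      # nothing before, something now
    if new_value is None:
        return f"delete_{kind}"   # something before, nothing now
    return f"update_{kind}"       # both present: the value changed


added = infer_action_type(None, True)
updated = infer_action_type("positive", "negative")
removed_span = infer_action_type({"start": 0, "end": 5}, None, is_span=True)
```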
Configuration
Annotation history tracking is enabled by default. No additional configuration is required.
Performance Metrics
The system calculates performance metrics from action history:
from potato.annotation_history import AnnotationHistoryManager

metrics = AnnotationHistoryManager.calculate_performance_metrics(actions)
# Returns:
{
    'total_actions': 150,
    'average_action_time_ms': 45.2,
    'fastest_action_time_ms': 12,
    'slowest_action_time_ms': 234,
    'actions_per_minute': 8.5,
    'total_processing_time_ms': 6780
}

Suspicious Activity Detection
The system can detect potentially problematic annotation patterns:
from potato.annotation_history import AnnotationHistoryManager

analysis = AnnotationHistoryManager.detect_suspicious_activity(
    actions,
    fast_threshold_ms=500,       # Actions faster than this are flagged
    burst_threshold_seconds=2    # Actions closer than this are flagged
)
# Returns:
{
    'suspicious_actions': [...],
    'fast_actions_count': 5,
    'burst_actions_count': 12,
    'fast_actions_percentage': 3.3,
    'burst_actions_percentage': 8.0,
    'suspicious_score': 15.2,
    'suspicious_level': 'Low'
}

Suspicious Levels
| Score | Level | Interpretation |
|---|---|---|
| 0-10 | Normal | Typical annotation behavior |
| 10-30 | Low | Some fast actions, likely acceptable |
| 30-60 | Medium | Notable pattern, may warrant review |
| 60-80 | High | Concerning pattern, review recommended |
| 80-100 | Very High | Likely quality issue, immediate review |
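The table and the burst threshold can be illustrated with a short sketch. Both helpers are hypothetical and written for this page: the table's ranges share their endpoints, so treating each lower bound as inclusive is an assumption, and counting a burst as a gap between consecutive timestamps shorter than the threshold is one plausible reading of "actions closer than this", not a description of Potato's internal logic.

```python
from datetime import datetime, timedelta

# Lower bounds copied from the table; inclusive lower bounds are an assumption.
LEVEL_BANDS = [(80, "Very High"), (60, "High"), (30, "Medium"), (10, "Low")]


def level_for_score(score: float) -> str:
    """Map a 0-100 suspicious score to the level named in the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, level in LEVEL_BANDS:
        if score >= lower_bound:
            return level
    return "Normal"


def count_burst_actions(timestamps, burst_threshold_seconds=2.0):
    """Count actions that follow the previous action by less than the threshold."""
    ordered = sorted(timestamps)
    threshold = timedelta(seconds=burst_threshold_seconds)
    return sum(1 for prev, cur in zip(ordered, ordered[1:]) if cur - prev < threshold)


start = datetime(2024, 1, 15, 10, 30, 0)
# Gaps between consecutive actions: 1.0 s, 0.5 s, 8.5 s, 20.0 s
timestamps = [start + timedelta(seconds=s) for s in (0, 1, 1.5, 10, 30)]

burst_count = count_burst_actions(timestamps)
level = level_for_score(15.2)
```

With these inputs, two of the four gaps fall under the two-second threshold, and a score of 15.2 maps to "Low", matching the sample output shown earlier.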
API Reference
AnnotationAction
from datetime import datetime

from potato.annotation_history import AnnotationAction

action = AnnotationAction(
    action_id="uuid-here",
    timestamp=datetime.now(),
    user_id="annotator1",
    instance_id="doc_001",
    action_type="add_label",
    schema_name="sentiment",
    label_name="positive",
    old_value=None,
    new_value=True
)
# Serialize to dictionary
data = action.to_dict()
# Deserialize from dictionary
action = AnnotationAction.from_dict(data)

AnnotationHistoryManager
from datetime import datetime

from potato.annotation_history import AnnotationHistoryManager

# Create a new action with current timestamp
action = AnnotationHistoryManager.create_action(
    user_id="annotator1",
    instance_id="doc_001",
    action_type="add_label",
    schema_name="sentiment",
    label_name="positive",
    old_value=None,
    new_value=True
)
# Filter actions by time range
filtered = AnnotationHistoryManager.get_actions_by_time_range(
    actions,
    start_time=datetime(2024, 1, 1),
    end_time=datetime(2024, 1, 31)
)
# Filter actions by instance
instance_actions = AnnotationHistoryManager.get_actions_by_instance(
    actions, instance_id="doc_001"
)
# Calculate performance metrics
metrics = AnnotationHistoryManager.calculate_performance_metrics(actions)
# Detect suspicious activity
analysis = AnnotationHistoryManager.detect_suspicious_activity(actions)

Use Cases
Quality Assurance
Monitor annotator behavior for quality issues:
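# get_all_users, get_user_actions, and flag_for_review are placeholders for your own helpers
from potato.annotation_history import AnnotationHistoryManager

for user_id in get_all_users():
    user_actions = get_user_actions(user_id)
    analysis = AnnotationHistoryManager.detect_suspicious_activity(user_actions)
    if analysis['suspicious_level'] in ['High', 'Very High']:
        flag_for_review(user_id, analysis)

Audit Trail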
Track changes for regulatory compliance:
import json

from potato.annotation_history import AnnotationHistoryManager

instance_actions = AnnotationHistoryManager.get_actions_by_instance(
    all_actions, "doc_001"
)
audit_log = [action.to_dict() for action in instance_actions]
with open("audit_doc_001.json", "w") as f:
    json.dump(audit_log, f, indent=2)

Time Analysis
Understand annotation timing patterns:
from collections import Counter

hours = Counter(action.timestamp.hour for action in all_actions)
print("Peak annotation hours:", hours.most_common(5))

Data Storage
Annotation history is stored in the user state files:
output/
  annotations/
    user_state_annotator1.json  # Includes action history
    user_state_annotator2.json
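Given the layout above, a script can discover which annotators have stored history by scanning for the user_state_<user_id>.json naming pattern. The sketch below assumes only that pattern; `find_user_state_files` is a hypothetical helper, and it deliberately does not read the files, because their internal structure is not described here.

```python
import tempfile
from pathlib import Path

PREFIX = "user_state_"  # naming pattern taken from the layout above


def find_user_state_files(annotation_dir):
    """Return {user_id: path} for every user_state_<user_id>.json file found."""
    paths = sorted(Path(annotation_dir).glob(f"{PREFIX}*.json"))
    return {path.stem[len(PREFIX):]: path for path in paths}


# Demonstration against a throwaway directory rather than real project output.
with tempfile.TemporaryDirectory() as tmp:
    for user_id in ("annotator1", "annotator2"):
        (Path(tmp) / f"{PREFIX}{user_id}.json").write_text("{}")
    (Path(tmp) / "notes.txt").write_text("ignored")
    user_ids = list(find_user_state_files(tmp))

print(user_ids)
```

In a real project the call would look like `find_user_state_files("output/annotations")`, adjusted to wherever the output directory is configured.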
Export Format
Actions are serialized with ISO 8601 timestamps:
{
  "action_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T10:30:45.123456",
  "user_id": "annotator1",
  "instance_id": "doc_001",
  "action_type": "add_label",
  "schema_name": "sentiment",
  "label_name": "positive",
  "old_value": null,
  "new_value": true,
  "server_processing_time_ms": 23
}

Best Practices
- Regular monitoring: Check suspicious activity reports periodically
- Threshold tuning: Adjust detection thresholds based on task complexity
- Export backups: Regularly export history for long-term storage
- Privacy compliance: Consider data retention policies for timestamps
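As one way to act on the retention point, exported action dictionaries can be filtered by their ISO 8601 timestamp before long-term storage. This is a minimal sketch: `prune_old_actions` is a hypothetical helper, the 90-day window is an arbitrary example, and the records are assumed to carry timezone-naive timestamps like the one in the export example.

```python
from datetime import datetime, timedelta


def prune_old_actions(records, retention_days, now):
    """Keep exported action dicts whose timestamp falls inside the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if datetime.fromisoformat(r["timestamp"]) >= cutoff]


exported = [
    {"action_id": "a1", "timestamp": "2023-06-01T09:00:00", "user_id": "annotator1"},
    {"action_id": "a2", "timestamp": "2024-01-15T10:30:45.123456", "user_id": "annotator1"},
]

# A fixed "now" keeps the example reproducible; use datetime.now() in practice.
kept = prune_old_actions(exported, retention_days=90, now=datetime(2024, 2, 1))
kept_ids = [r["action_id"] for r in kept]
```

If exported timestamps include a UTC offset, `now` must be timezone-aware as well, since Python refuses to compare naive and aware datetimes.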
Further Reading
- Admin Dashboard - View annotation statistics
- Behavioral Tracking - Interaction-level tracking
- Quality Control - Automated quality checks
For implementation details, see the source documentation.