Trajectory Editing for SFT/DPO

Annotators rewrite the steps of an agent trace to fix a wrong reasoning step, correct a tool call, or strengthen the final answer, and Potato exports each original/corrected pair as supervised fine-tuning targets and DPO preference pairs.

The trajectory_edit schema lets annotators rewrite the steps of an agent trace, and exports each correction as training data. Fix a wrong reasoning step, correct a typo'd tool call, or strengthen the final answer, and Potato saves the corrected trajectory next to the original. The trajectory_correction exporter then turns each (original, corrected) pair into supervised fine-tuning (SFT) targets and direct preference optimization (DPO) preference pairs.

This makes Potato a training-data production tool, not only an evaluation tool. It is the editing counterpart to step-level scoring: instead of rating a trajectory, annotators repair it, and the repair becomes a learning signal.

Trajectory correction editor with live diff An agent step shown with a read-only original and an editable corrected box with a word-level diff

Quick start

Run the included example from the repository root:

bash

python potato/flask_server.py start examples/agent-traces/trajectory-correction/config.yaml -p 8000

How it works

Each agent step renders as a card showing the original text (read-only) and an editable corrected box pre-filled with the original. As the annotator types:

a live word-level diff highlights insertions (green) and deletions (red strikethrough),
the words and characters changed are counted, and
an "edited" flag appears on changed fields.

A "Reset" button restores the original per field. With edit_final_answer: true the final answer gets its own editor. Nothing is required: an unedited trace simply produces no training pair.

Configuration

yaml

annotation_schemes:
  - annotation_type: trajectory_edit
    name: corrected_trajectory
    description: "Fix any wrong steps, then correct the final answer"
    steps_key: steps          # instance field holding the step list
    step_text_key: action     # the default per-step editable field
    editable_fields:          # which fields get an editor
      - action
      # - thought             # add to also edit reasoning
    show_diff: true
    show_edit_distance: true
    allow_reset: true
    require_reason_on_edit: false   # add a per-field "reason" input
    edit_final_answer: true
    final_answer_key: final_answer

Option	Default	Description
`steps_key`	`steps`	Instance field holding the step list.
`step_text_key`	`action`	Default editable field per step.
`editable_fields`	`[step_text_key]`	Which step fields get an editor, e.g. `[action, thought]`.
`show_diff`	`true`	Show the live word-level diff.
`show_edit_distance`	`true`	Show words and characters changed.
`allow_reset`	`true`	Per-field "Reset to original" button.
`require_reason_on_edit`	`false`	Per-field "reason for edit" input.
`edit_final_answer`	`false`	Add an editor for the final answer.
`final_answer_key`	`final_answer`	Instance field holding the final answer.

Data format

The schema reads steps from the instance under steps_key. Each step is an object whose fields (action, thought, and so on) can be edited; bare-string steps are edited as the step_text_key field.

json

{
  "id": "traj_001",
  "task_description": "Find the weather in San Francisco.",
  "steps": [
    {"thought": "Look it up.", "action": "web_search(queyr='SF weather')"},
    {"thought": "Open it.",    "action": "open_url(results[0])"}
  ],
  "final_answer": "It is sunny."
}

Export

Run the trajectory_correction exporter. It writes three files:

trajectory_corrections.json — every record: the original_trace, the reconstructed corrected_trace, and per-field edits with edit distances and reasons.
trajectory_sft.jsonl — one line per edited trace: {"prompt": <task>, "completion": <corrected_trace>}.
trajectory_dpo.jsonl — one line per edited trace: {"prompt": <task>, "chosen": <corrected_trace>, "rejected": <original_trace>}.

Unedited traces are counted but excluded from SFT/DPO, since training on an unchanged trajectory adds nothing; the skipped count appears in the export stats. With multiple annotators, each annotator who edited a trace yields one SFT/DPO record.

Notes and limitations

The diff is word-level. For code-like tool calls with no spaces, a single token may show as wholly changed even for a one-character fix; the character-distance counter is the precise signal.
Pair with step-level scoring if you also want per-step correctness or an error taxonomy on the same trace.

Three-Pane Trace Evaluation — read-only reasoning, calls, and answer view
Agentic Annotation — agent-trace display and evaluation patterns
Export Formats — the full exporter reference

For implementation details, see the source documentation.