Navigate and debug with the TracePilot AI dashboard

The TracePilot AI dashboard gives you a live view of every agent run your SDK has instrumented. You can browse execution trees, click into individual spans to inspect inputs and outputs, monitor token costs and request latency, and fork any span to test a different input without touching your code.

The dashboard is available as soon as you make your first wrapOpenAI call. There is no extra configuration and no pipeline to set up: open tracepilotai.com/dashboard and your trace is already in the list.

What the dashboard shows

Each trace in the dashboard corresponds to one tp.startTrace call. Inside a trace, you see:

Execution tree — every span in parent-child order, matching the parentSpanId links you set in code
Latency — wall-clock duration for each span and the overall trace
Token usage — prompt tokens, completion tokens, and total per LLM span
Estimated cost — per-span API cost based on the model and token count
Errors — any span where the wrapped call threw is marked as failed with the full error message
RPS — requests per second across all active traces, visible in the metrics header
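
As a rough illustration of how parentSpanId links turn a flat list of spans into the execution tree, here is a minimal sketch. Only parentSpanId is named in this guide. The Span shape, the spanId and name fields, and the buildExecutionTree helper are assumptions made for the example, not part of the TracePilot SDK.

```typescript
// Hypothetical span shape: parentSpanId comes from the guide,
// spanId and name are assumed for this sketch.
interface Span {
  spanId: string;
  parentSpanId?: string;
  name: string;
}

interface SpanNode extends Span {
  children: SpanNode[];
}

// Group a flat list of spans into parent-child order.
// Spans with no parentSpanId, or with an unknown parent, become roots.
function buildExecutionTree(spans: Span[]): SpanNode[] {
  const nodes: SpanNode[] = spans.map((s): SpanNode => ({ ...s, children: [] }));
  const byId: { [id: string]: SpanNode | undefined } = {};
  for (const node of nodes) {
    byId[node.spanId] = node;
  }

  const roots: SpanNode[] = [];
  for (const node of nodes) {
    const parent = node.parentSpanId !== undefined ? byId[node.parentSpanId] : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

const exampleTree = buildExecutionTree([
  { spanId: "a", name: "plan" },
  { spanId: "b", parentSpanId: "a", name: "search" },
  { spanId: "c", parentSpanId: "a", name: "summarize" },
]);
```

In this example the "plan" span is the only root, and "search" and "summarize" appear as its children in the order they were recorded.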

Navigate a trace

Open the dashboard

Go to tracepilotai.com/dashboard. Sign in with the same GitHub or Google account you used to generate your API key.

Find your trace

Traces are listed in reverse chronological order. Each row shows the agent name you passed to tp.startTrace, the start time, total duration, and whether any spans failed. Click a trace row to open it.

Click a span

Inside the trace view, the execution tree is rendered as a collapsible hierarchy. Click any span to open the detail panel on the right. The panel shows:

The full input — messages array for LLM spans, function arguments for tool spans
The full output — completion response or function return value
Token breakdown (LLM spans only)
Latency in milliseconds
Estimated cost in USD
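
The estimated cost is derived from the model and the token counts. The sketch below shows one plausible way to compute such an estimate. The price table, the model name "example-model", and the estimateCostUsd helper are hypothetical and chosen only for illustration. They do not reflect real pricing or the formula TracePilot actually uses.

```typescript
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
}

// Assumed pricing shape: USD per one million tokens.
interface ModelPrice {
  promptPerMillion: number;
  completionPerMillion: number;
}

// Hypothetical price table; the values are made up for the example.
const EXAMPLE_PRICES: { [model: string]: ModelPrice | undefined } = {
  "example-model": { promptPerMillion: 2, completionPerMillion: 8 },
};

// Returns the estimated cost in USD, or undefined if the model has no known price.
function estimateCostUsd(model: string, usage: TokenUsage): number | undefined {
  const price = EXAMPLE_PRICES[model];
  if (!price) {
    return undefined;
  }
  const promptCost = usage.promptTokens * price.promptPerMillion;
  const completionCost = usage.completionTokens * price.completionPerMillion;
  return (promptCost + completionCost) / 1e6;
}

const exampleUsage: TokenUsage = { promptTokens: 1500, completionTokens: 500 };
const exampleTotalTokens = exampleUsage.promptTokens + exampleUsage.completionTokens;
const exampleCost = estimateCostUsd("example-model", exampleUsage);
```

With these made-up prices, 1,500 prompt tokens and 500 completion tokens come to (1500 × 2 + 500 × 8) / 1,000,000 = 0.007 USD.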

Inspect inputs and outputs

Inputs and outputs are displayed as formatted JSON you can copy directly. For LLM spans, the messages array is shown in conversation order so you can read the prompt exactly as the model received it.

Span badges

Two badges appear in the span tree to draw attention to spans that need review.

⚠ Destructive

Shown on any wrapToolCall span where you passed isDestructive: true. Use this to flag tool calls that modified external state: a database write, an outbound email, a payment charge, or any other irreversible action. The badge makes it easy to identify side effects during an incident review without reading every span.

Failed

Shown on any span where the wrapped call threw an error. The error message and stack trace are captured automatically and displayed in the detail panel. You do not need to add any error handling to get this information.

Fork and rerun a failing span

Fork & Rerun is the core debugging feature of TracePilot AI. When a span produces a bad output or throws an error, you can edit its input directly in the dashboard and re-execute it — without redeploying your agent.

Find the failing span

Open the trace and locate the span with the Failed badge or the unexpected output. Click it to open the detail panel.

Click Fork & Rerun

Click the Fork & Rerun button in the span detail panel. A new panel opens showing the span’s input — the messages array for an LLM span, or the function arguments for a tool span.

Edit the input

Modify the input directly in the editor. For an LLM span, you might adjust the system prompt, add context, or reword the user message. For a tool span, you might correct a malformed argument.

View the new result

Click Run. TracePilot executes the span with your edited input and displays the new output immediately in the same panel. The forked execution is saved as a separate trace so you can compare it to the original.

Warning: Fork & Rerun executes the span using your live API key. For destructive tool spans (those marked with ⚠), forking re-runs the tool and its side effects, such as sending another email or writing to the database again. Review the input carefully before running.

Tip: Use Fork & Rerun on intermediate spans, not just the last one. If step 2 of a 5-step agent produced a bad search query, fork step 2, fix the query, and see whether step 3 would have succeeded. You do not need to re-run the entire agent.

Monitor metrics across traces

The dashboard header shows aggregate metrics across all traces for the selected time window:

Metric	What it measures
RPS	Requests per second to your agents
Avg latency	Mean span duration across all LLM calls
Total tokens	Combined prompt and completion tokens
Total cost	Estimated OpenAI API spend
Error rate	Percentage of spans that failed

Use these metrics to spot regressions after a prompt change, identify expensive models, or find agents that are looping unexpectedly.
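
To make the definitions in the table concrete, the following sketch computes the same kind of aggregates from a list of spans. The SpanRecord shape and the computeHeaderMetrics helper are assumptions for illustration. In particular, counting every span as one request for RPS is a guess, since this guide does not define what counts as a request.

```typescript
// Hypothetical span record; all field names are assumed for this sketch.
interface SpanRecord {
  startMs: number; // span start time in milliseconds
  durationMs: number;
  isLlm: boolean; // true for LLM spans, false for tool spans
  promptTokens: number;
  completionTokens: number;
  failed: boolean;
}

interface HeaderMetrics {
  rps: number;
  avgLatencyMs: number;
  totalTokens: number;
  errorRatePct: number;
}

function computeHeaderMetrics(
  spans: SpanRecord[],
  windowStartMs: number,
  windowEndMs: number
): HeaderMetrics {
  // Keep only spans that started inside the selected time window.
  const inWindow = spans.filter((s) => s.startMs >= windowStartMs && s.startMs < windowEndMs);
  const llm = inWindow.filter((s) => s.isLlm);
  const failedCount = inWindow.filter((s) => s.failed).length;
  const windowSeconds = (windowEndMs - windowStartMs) / 1000;
  const sum = (xs: number[]): number => xs.reduce((a, b) => a + b, 0);

  return {
    // Assumption: each span counts as one request.
    rps: windowSeconds > 0 ? inWindow.length / windowSeconds : 0,
    // Mean duration across LLM spans only, as described in the table.
    avgLatencyMs: llm.length > 0 ? sum(llm.map((s) => s.durationMs)) / llm.length : 0,
    totalTokens: sum(inWindow.map((s) => s.promptTokens + s.completionTokens)),
    errorRatePct: inWindow.length > 0 ? (100 * failedCount) / inWindow.length : 0,
  };
}

const exampleMetrics = computeHeaderMetrics(
  [
    { startMs: 1000, durationMs: 200, isLlm: true, promptTokens: 100, completionTokens: 50, failed: false },
    { startMs: 2000, durationMs: 400, isLlm: true, promptTokens: 200, completionTokens: 100, failed: true },
    { startMs: 3000, durationMs: 50, isLlm: false, promptTokens: 0, completionTokens: 0, failed: false },
    { startMs: 4000, durationMs: 30, isLlm: false, promptTokens: 0, completionTokens: 0, failed: false },
    // Outside the 10-second window, so it is ignored.
    { startMs: 20000, durationMs: 999, isLlm: true, promptTokens: 500, completionTokens: 500, failed: true },
  ],
  0,
  10000
);
```

For this sample data, four spans fall inside the 10-second window, giving 0.4 requests per second, an average LLM latency of 300 ms, 450 total tokens, and a 25% error rate.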

Filter and search traces

Use the search bar at the top of the trace list to filter by agent name (the value you passed to tp.startTrace). You can also filter by time range, status (all / failed only), and model name.
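
The filters combine, so a trace must satisfy every active filter to appear in the list. The sketch below models that behavior on local data. The TraceRow and TraceFilter shapes and the filterTraces helper are hypothetical, and the case-insensitive substring match on agent name is an assumption about how the search bar behaves.

```typescript
// Hypothetical trace row; field names are assumed for this sketch.
interface TraceRow {
  agentName: string;
  startMs: number;
  hasFailedSpan: boolean;
  models: string[];
}

interface TraceFilter {
  agentName?: string; // assumed: case-insensitive substring match
  fromMs?: number;
  toMs?: number;
  failedOnly?: boolean;
  model?: string;
}

// Apply every active filter, then list results newest first.
function filterTraces(rows: TraceRow[], f: TraceFilter): TraceRow[] {
  const needle = (f.agentName || "").toLowerCase();
  return rows
    .filter((r) => r.agentName.toLowerCase().indexOf(needle) !== -1)
    .filter((r) => f.fromMs === undefined || r.startMs >= f.fromMs)
    .filter((r) => f.toMs === undefined || r.startMs <= f.toMs)
    .filter((r) => !f.failedOnly || r.hasFailedSpan)
    .filter((r) => f.model === undefined || r.models.indexOf(f.model) !== -1)
    .sort((a, b) => b.startMs - a.startMs);
}

const exampleRows: TraceRow[] = [
  { agentName: "support-bot", startMs: 100, hasFailedSpan: true, models: ["model-a"] },
  { agentName: "support-bot", startMs: 300, hasFailedSpan: false, models: ["model-a"] },
  { agentName: "research-agent", startMs: 200, hasFailedSpan: true, models: ["model-b"] },
];

const failedSupportTraces = filterTraces(exampleRows, { agentName: "support", failedOnly: true });
const allTracesNewestFirst = filterTraces(exampleRows, {});
```

Here failedSupportTraces contains only the support-bot trace that started at 100, and an empty filter returns all three traces in reverse chronological order.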

Error spans and automatic capture

You do not need to add any special error handling to get error spans. If the function you pass to wrapOpenAI or wrapToolCall throws, TracePilot captures the error automatically, marks the span as failed, and re-throws the original error so your existing error handling still runs.
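
The capture-and-re-throw behavior described above follows a common wrapper pattern, sketched here with a local stand-in. The traced function, the RecordedSpan shape, and the recordedSpans array are hypothetical and are not the TracePilot SDK. The sketch is synchronous for brevity. An asynchronous wrapper would await the call inside the same try/catch.

```typescript
// Hypothetical record of a finished span; field names are assumed.
interface RecordedSpan {
  name: string;
  status: "ok" | "failed";
  durationMs: number;
  errorMessage?: string;
}

// Stand-in for wherever a real SDK would send its span data.
const recordedSpans: RecordedSpan[] = [];

// Run fn, record the outcome, and re-throw the original error on failure
// so the caller's own error handling still runs.
function traced<T>(name: string, fn: () => T): T {
  const start = Date.now();
  try {
    const result = fn();
    recordedSpans.push({ name, status: "ok", durationMs: Date.now() - start });
    return result;
  } catch (err) {
    recordedSpans.push({
      name,
      status: "failed",
      durationMs: Date.now() - start,
      errorMessage: err instanceof Error ? err.message : String(err),
    });
    throw err; // the same error object, not a wrapped copy
  }
}

// Usage: the caller's catch block still sees the original error.
const originalError = new Error("tool unavailable");
let caughtByCaller: unknown;
try {
  traced("lookup-order", () => {
    throw originalError;
  });
} catch (err) {
  caughtByCaller = err;
}
```

After this runs, recordedSpans holds one failed span with the message "tool unavailable", and caughtByCaller is the very same error object that was thrown.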
