Guide · 2026-05-12

How to backtest a trading strategy with Claude Code

End-to-end walkthrough: add the Bagtester MCP server to Claude Code, ask the agent to draft an SMA crossover, run the backtest, read the results, and iterate with optimize_strategy and walk_forward, all without leaving the terminal.

The setup, in 60 seconds

Claude Code supports MCP natively. To add Bagtester, you need an API key (generate one at bagtester.com/sign-up) and one shell command:

bash

claude mcp add bagtester \
  --transport http \
  https://bagtester.com/api/mcp \
  --header "Authorization: Bearer bag_YOUR_KEY"

Verify by asking Claude in a fresh session to list bagtester tools. You should see 20 tools, including submit_strategy, optimize_strategy, and manage_subscription.

Your first backtest

The fastest path is to ask Claude for a strategy and let it submit:

prompt

backtest a simple SMA crossover (fast=20, slow=50) on BTCUSDT for 2024

Claude writes the Python code, calls submit_strategy over MCP, waits for the result, and shows you:

A headline like +24.7% return, Sharpe 1.83, max DD -14.2% over 87 trades
Unicode sparklines rendered inline: ▁▂▂▃▅▆▇▇█▇▇▆▅▆▆▇█▇▇█
A share URL you can cmd-click to open
Any quality_flags with high severity, with reasons attached

Expect roughly 7-15 seconds of wall time for a 1-year minute-mode crypto run.
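For orientation, the crossover logic behind that prompt can be sketched in plain Python. This is only the indicator and signal logic, not the code Claude generates. The guide does not show the strategy interface that submit_strategy expects, so the function names sma and crossover_signals are hypothetical.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            # Drop the price that just left the window.
            running -= prices[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signals(prices: List[float], fast: int = 20, slow: int = 50) -> List[int]:
    """Return +1 where the fast SMA crosses above the slow SMA,
    -1 where it crosses below, and 0 everywhere else."""
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    signals = [0] * len(prices)
    for i in range(1, len(prices)):
        if None in (fast_ma[i], slow_ma[i], fast_ma[i - 1], slow_ma[i - 1]):
            continue  # not enough history for both averages yet
        prev_diff = fast_ma[i - 1] - slow_ma[i - 1]
        curr_diff = fast_ma[i] - slow_ma[i]
        if prev_diff <= 0 < curr_diff:
            signals[i] = 1
        elif prev_diff >= 0 > curr_diff:
            signals[i] = -1
    return signals
```

A typical strategy goes long on +1 and exits (or goes short) on -1. How fills, fees, and position sizing are modeled depends on the backtest engine and is not covered by this sketch.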

Read the quality flags

The schema-v2.0 result includes 12 boolean flags computed from your backtest's metrics — things like small_sample, time_concentration, regime_dependent, and underperformed_benchmark. Each is { triggered, reason, severity: "low"|"medium"|"high" }.

Claude reads these directly — you don't have to ask. If a result looks great but the agent says "by the way, 62% of the PnL came from Q2 alone", that's the time_concentration flag talking. Useful for deciding whether to trust a number.
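If you post-process results yourself, filtering the flags might look like the sketch below. It assumes quality_flags arrives as a mapping from flag name to the { triggered, reason, severity } object described above. The guide gives only the per-flag shape, so the container layout, the helper name triggered_flags, and the sample values are all illustrative.

```python
from typing import Dict, List, Tuple

# Ordering used to compare severities; the three levels come from the schema above.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def triggered_flags(
    quality_flags: Dict[str, dict], min_severity: str = "high"
) -> List[Tuple[str, str]]:
    """Return (name, reason) pairs for triggered flags at or above min_severity."""
    threshold = SEVERITY_RANK[min_severity]
    hits = [
        (name, flag["reason"])
        for name, flag in quality_flags.items()
        if flag["triggered"] and SEVERITY_RANK[flag["severity"]] >= threshold
    ]
    return sorted(hits)


# Made-up sample payload, for illustration only.
sample = {
    "small_sample": {"triggered": False, "reason": "", "severity": "low"},
    "time_concentration": {
        "triggered": True,
        "reason": "62% of PnL came from Q2",
        "severity": "high",
    },
    "regime_dependent": {
        "triggered": True,
        "reason": "gains concentrated in one regime",
        "severity": "medium",
    },
}

for name, reason in triggered_flags(sample):
    print(f"{name}: {reason}")
```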

Iterate: optimize_strategy

Once a strategy looks alive, you usually want to test parameter sensitivity. With Bagtester, ask:

prompt

sweep fast in [10, 20, 30, 50] and slow in [100, 150, 200]

Claude calls optimize_strategy with that grid. The tool auto-creates a param_sweep experiment, runs all 12 combinations, and returns:

A ranked list by your primary_metric (default Sharpe)
An experiment_id you can revisit with get_experiment
A heatmap URL for the 2D parameter grid

Quant tier supports up to 1,000 combinations per sweep; Trader caps at 50. Each run charges credits proportional to its mode.
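To see where the 12 comes from, and to check a grid against the tier caps before spending credits, a small local sketch helps. The helper names build_grid and fits_tier are hypothetical and are not part of the Bagtester API. The caps are the ones quoted above.

```python
from itertools import product
from typing import Dict, List

# Per-sweep combination caps quoted in this guide.
TIER_CAPS = {"trader": 50, "quant": 1000}


def build_grid(fast_values: List[int], slow_values: List[int]) -> List[Dict[str, int]]:
    """Cartesian product of the two parameter lists, one dict per combination."""
    return [{"fast": f, "slow": s} for f, s in product(fast_values, slow_values)]


def fits_tier(grid: List[Dict[str, int]], tier: str) -> bool:
    """True if the grid size is within the tier's per-sweep cap."""
    return len(grid) <= TIER_CAPS[tier]


grid = build_grid([10, 20, 30, 50], [100, 150, 200])
print(len(grid))                  # 4 fast values x 3 slow values
print(fits_tier(grid, "trader"))
```

Grid size grows multiplicatively with each added parameter, so a third parameter with five values would already put this sweep at 60 combinations, above the Trader cap.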

Spot overfit: walk_forward

A strategy that looks great on the full sample but falls apart on unseen data is the classic overfit trap. Walk-forward analysis re-optimizes parameters every in_sample_days and tests them on the next out_sample_days. Ask:

prompt

run walk_forward on the best params with 90-day in-sample and 30-day out-of-sample
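The windowing behind a request like this can be sketched as follows. The sketch assumes rolling (not anchored) windows that step forward by out_sample_days and use half-open date ranges. The guide does not say how the walk_forward tool actually slices the data, so treat the scheme and the function name walk_forward_windows as illustrative.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (in_sample_start, in_sample_end, out_sample_start, out_sample_end), ends exclusive
Window = Tuple[date, date, date, date]


def walk_forward_windows(
    start: date, end: date, in_sample_days: int = 90, out_sample_days: int = 30
) -> List[Window]:
    """Build rolling in-sample/out-of-sample windows that fit inside [start, end)."""
    windows: List[Window] = []
    is_start = start
    while True:
        is_end = is_start + timedelta(days=in_sample_days)
        oos_end = is_end + timedelta(days=out_sample_days)
        if oos_end > end:
            break  # the next out-of-sample block would run past the data
        windows.append((is_start, is_end, is_end, oos_end))
        # Step by the out-of-sample length so test blocks do not overlap.
        is_start += timedelta(days=out_sample_days)
    return windows


windows = walk_forward_windows(date(2024, 1, 1), date(2025, 1, 1))
for is_start, is_end, oos_start, oos_end in windows:
    print(f"fit {is_start}..{is_end}  test {oos_start}..{oos_end}")
```

Under these assumptions each out-of-sample block starts where the previous one ended, so the test periods tile the later part of the year without overlap.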

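Compare the aggregate out-of-sample Sharpe to the in-sample Sharpe. The retention ratio is the out-of-sample Sharpe divided by the in-sample Sharpe: a value below ~0.5 typically signals overfit, while a value above ~0.7 is a reasonable sign that the strategy generalizes.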
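As a worked example of that rule of thumb, the sketch below takes the retention ratio to be out-of-sample Sharpe divided by in-sample Sharpe. That definition is an assumption, the function names are hypothetical, and the thresholds are the approximate ones quoted above.

```python
def retention_ratio(in_sample_sharpe: float, out_sample_sharpe: float) -> float:
    """Fraction of in-sample Sharpe that survives out of sample."""
    if in_sample_sharpe <= 0:
        # The ratio is not meaningful when the in-sample result is not positive.
        raise ValueError("in-sample Sharpe must be positive")
    return out_sample_sharpe / in_sample_sharpe


def interpret(ratio: float) -> str:
    """Apply the rough 0.5 / 0.7 thresholds from the text."""
    if ratio < 0.5:
        return "likely overfit"
    if ratio > 0.7:
        return "likely generalizes"
    return "inconclusive"


ratio = retention_ratio(in_sample_sharpe=2.0, out_sample_sharpe=0.8)
print(f"{ratio:.2f} -> {interpret(ratio)}")
```

Values between the two thresholds are a gray zone. More out-of-sample windows or a different window length can help resolve them.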

Compare runs

After a few iterations you want a side-by-side. Ask:

prompt

compare the last three backtests

Claude calls compare_backtests with the relevant job_ids and gets a markdown table plus a winner_per_metric map — so the agent can say "best Sharpe is run 2, best max DD is run 1." You also get a stacked-equity PNG URL.
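A map like winner_per_metric can be sketched as below. The sketch assumes that higher is better for every metric, with max drawdown stored as a negative number (as in the -14.2% headline earlier), so the shallowest drawdown is the largest value. The input shape, the job ids, and the numbers are made up for illustration, and the real tool output may differ.

```python
from typing import Dict


def winner_per_metric(runs: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each metric name to the job id with the highest value for it."""
    metrics = sorted({metric for values in runs.values() for metric in values})
    return {
        metric: max(runs, key=lambda job_id: runs[job_id][metric])
        for metric in metrics
    }


# Hypothetical job ids and metric values.
runs = {
    "run_1": {"sharpe": 1.40, "max_dd": -9.8, "total_return": 18.2},
    "run_2": {"sharpe": 1.83, "max_dd": -14.2, "total_return": 24.7},
    "run_3": {"sharpe": 1.10, "max_dd": -21.5, "total_return": 12.9},
}

print(winner_per_metric(runs))
```

If a metric is stored so that lower is better, such as a drawdown kept as a positive percentage, that metric needs min instead of max.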

Manage your subscription from the agent

Out of credits? Want to upgrade? Just ask Claude:

prompt

what's my Bagtester plan and how do I upgrade?

Claude calls manage_subscription, which returns the current plan, credits remaining, the last few invoices, and (for paying users) a one-shot Stripe customer-portal URL valid for about 1 hour. Open the URL to upgrade, downgrade, change payment method, or cancel, so you never need to dig through the dashboard.

What's next

Once you have a workflow that fits your team, there is more to build on:

Use search_backtests to find old runs by symbol, tag, or date — no more dashboard hopping
Save promising strategies to your library with create_experiment to group asset-matrix runs
Embed share URLs in PRs and Slack — the OG card renders the equity curve and headline metrics

The MCP server is at https://bagtester.com/api/mcp. Free tier ships 500 credits/month so you can prove the workflow before committing.
