Guide · 2026-05-12
How to backtest a trading strategy with Claude Code
End-to-end walkthrough: add the Bagtester MCP server to Claude Code, ask the agent to draft an SMA crossover, run the backtest, read the results, iterate with optimize_strategy and walk_forward — without leaving the terminal.
The setup, in 60 seconds
Claude Code supports MCP natively. To add Bagtester, you need an API key (generate one at bagtester.com/sign-up) and one shell command:
claude mcp add bagtester \
  --transport http \
  https://bagtester.com/api/mcp \
  --header "Authorization: Bearer bag_YOUR_KEY"
Verify by asking Claude in a fresh session to list bagtester tools. You should see 14 tools, including submit_strategy, optimize_strategy, and manage_subscription.
Your first backtest
The fastest path is to ask Claude for a strategy and let it submit:
backtest a simple SMA crossover (fast=20, slow=50) on BTCUSDT for 2024
Claude writes the Python code, calls submit_strategy over MCP, waits for the result, and shows you:
- A headline like +24.7% return, Sharpe 1.83, max DD -14.2% over 87 trades
- Unicode sparklines rendered inline: ▁▂▂▃▅▆▇▇█▇▇▆▅▆▆▇█▇▇█
- A share URL you can cmd-click open
- Any quality_flags with high severity, with reasons attached
Expect roughly 7-15 seconds of wall time for a 1-year, minute-mode crypto run.
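For orientation, the signal logic behind an SMA crossover can be sketched in plain Python. This is not the code Claude generates and not Bagtester's strategy interface, which this guide does not show; the function names sma and crossover_signals are hypothetical, and the sketch only marks the bars where the fast average crosses the slow one.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the price leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signals(prices: List[float], fast: int = 20, slow: int = 50) -> List[int]:
    """+1 where the fast SMA crosses above the slow SMA, -1 where it crosses
    below, 0 elsewhere (hypothetical helper, for illustration only)."""
    f, s = sma(prices, fast), sma(prices, slow)
    signals = [0] * len(prices)
    for i in range(1, len(prices)):
        if None in (f[i], s[i], f[i - 1], s[i - 1]):
            continue  # not enough history for both averages yet
        prev_diff = f[i - 1] - s[i - 1]
        diff = f[i] - s[i]
        if prev_diff <= 0 < diff:
            signals[i] = 1
        elif prev_diff >= 0 > diff:
            signals[i] = -1
    return signals


# Toy series with short windows so the crossings are easy to verify by hand.
toy_prices = [5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 1]
print(crossover_signals(toy_prices, fast=2, slow=4))
```

With fast=20 and slow=50 on real BTCUSDT bars the same logic applies; only the windows and the input series change.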
Read the quality flags
The schema-v2.0 result includes 12 boolean flags computed from your backtest's metrics — things like small_sample, time_concentration, regime_dependent, and underperformed_benchmark. Each is { triggered, reason, severity: "low"|"medium"|"high" }.
Claude reads these directly — you don't have to ask. If a result looks great but the agent says "by the way, 62% of the PnL came from Q2 alone", that's the time_concentration flag talking. Useful for deciding whether to trust a number.
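If you want to triage flags yourself, a small filter is enough. The sketch below assumes quality_flags arrives as a mapping from flag name to the { triggered, reason, severity } object described above; that layout, the helper name triggered_flags, and the sample values are assumptions made for illustration, not the documented schema.

```python
from typing import Dict, List, Tuple

# Severity levels as described in the guide, ranked for comparison.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def triggered_flags(
    quality_flags: Dict[str, dict], min_severity: str = "high"
) -> List[Tuple[str, str, str]]:
    """Return (name, severity, reason) for triggered flags at or above min_severity."""
    threshold = SEVERITY_RANK[min_severity]
    hits = [
        (name, flag["severity"], flag["reason"])
        for name, flag in quality_flags.items()
        if flag["triggered"] and SEVERITY_RANK[flag["severity"]] >= threshold
    ]
    # Most severe first, then alphabetical by flag name.
    return sorted(hits, key=lambda h: (-SEVERITY_RANK[h[1]], h[0]))


# Illustrative payload only; values and the dict-by-name layout are assumptions.
example_flags = {
    "small_sample": {"triggered": False, "reason": "", "severity": "low"},
    "time_concentration": {
        "triggered": True,
        "reason": "62% of the PnL came from Q2 alone",
        "severity": "high",
    },
    "underperformed_benchmark": {
        "triggered": True,
        "reason": "return below buy-and-hold",
        "severity": "medium",
    },
}

for name, severity, reason in triggered_flags(example_flags):
    print(f"[{severity}] {name}: {reason}")
```

Lowering min_severity to "medium" widens the list, which can help when a result looks too good and you want every warning on the table.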
Iterate: optimize_strategy
Once a strategy looks alive, you usually want to test parameter sensitivity. With Bagtester, ask:
sweep fast in [10, 20, 30, 50] and slow in [100, 150, 200]
Claude calls optimize_strategy with that grid. The tool auto-creates a param_sweep experiment, runs all 12 combinations, and returns:
- A ranked list by your primary_metric (default Sharpe)
- An experiment_id you can revisit with get_experiment
- A heatmap URL for the 2D parameter grid
Quant tier supports up to 1,000 combinations per sweep; Trader caps at 50. Each run charges credits proportional to its mode.
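To check a grid against your tier's cap before spending credits, you can count the combinations locally. The sketch below is a rough illustration: build_grid and fits_tier are hypothetical helpers, the fast < slow filter is an added assumption (the guide does not say whether optimize_strategy applies one), and only the caps of 50 and 1,000 come from the text above.

```python
from itertools import product
from typing import Dict, List

# Per-sweep combination caps quoted in the guide; the lowercase keys are illustrative.
TIER_CAPS = {"trader": 50, "quant": 1000}


def build_grid(fast_values: List[int], slow_values: List[int]) -> List[Dict[str, int]]:
    """Cartesian product of the two parameter lists, skipping fast >= slow."""
    return [
        {"fast": fast, "slow": slow}
        for fast, slow in product(fast_values, slow_values)
        if fast < slow
    ]


def fits_tier(grid: List[Dict[str, int]], tier: str) -> bool:
    """True if the grid size is within the tier's per-sweep cap."""
    return len(grid) <= TIER_CAPS[tier]


grid = build_grid([10, 20, 30, 50], [100, 150, 200])
print(len(grid), fits_tier(grid, "trader"))  # prints: 12 True
```

Four fast values times three slow values gives the 12 combinations mentioned above; adding a third swept parameter multiplies the count again, which is how a sweep outgrows the Trader cap quickly.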
Spot overfit: walk_forward
A strategy that looks great on the full sample but falls apart on unseen data is the classic overfit trap. Walk-forward analysis re-optimizes parameters over a window of in_sample_days, tests them on the next out_sample_days, then rolls forward and repeats. Ask:
run walk_forward on the best params with 90-day in-sample and 30-day out-of-sample
Compare the aggregate out-of-sample Sharpe to the in-sample Sharpe. A retention ratio (out-of-sample Sharpe divided by in-sample Sharpe) below ~0.5 typically signals overfit; above ~0.7 is a reasonable sign the strategy generalizes.
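The windowing and the retention check can be sketched as follows. This assumes a rolling scheme in which each window advances by out_sample_days, and it takes the retention ratio to be out-of-sample Sharpe divided by in-sample Sharpe; Bagtester's actual walk_forward scheduling may differ, and both function names are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple


def walk_forward_windows(
    start: date, end: date, in_sample_days: int = 90, out_sample_days: int = 30
) -> List[Tuple[date, date, date]]:
    """Return (in_sample_start, out_sample_start, out_sample_end) triples.

    Intervals are half-open: in-sample is [in_sample_start, out_sample_start)
    and out-of-sample is [out_sample_start, out_sample_end). `end` is exclusive.
    """
    windows = []
    is_start = start
    while True:
        oos_start = is_start + timedelta(days=in_sample_days)
        oos_end = oos_start + timedelta(days=out_sample_days)
        if oos_end > end:
            break  # not enough data left for a full out-of-sample window
        windows.append((is_start, oos_start, oos_end))
        is_start += timedelta(days=out_sample_days)  # assumed rolling step
    return windows


def retention_ratio(in_sample_sharpe: float, out_sample_sharpe: float) -> float:
    """Out-of-sample Sharpe as a fraction of in-sample Sharpe (assumed definition)."""
    if in_sample_sharpe <= 0:
        raise ValueError("retention is only meaningful for a positive in-sample Sharpe")
    return out_sample_sharpe / in_sample_sharpe


windows = walk_forward_windows(date(2024, 1, 1), date(2025, 1, 1))
print(len(windows), windows[0])
print(retention_ratio(2.0, 1.0))
```

Under these assumptions a 90/30 split over 2024 yields nine windows, so the aggregate out-of-sample figure rests on nine separate 30-day tests rather than one.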
Compare runs
After a few iterations you want a side-by-side. Ask:
compare the last three backtests
Claude calls compare_backtests with the relevant job_ids and gets a markdown table plus a winner_per_metric map — so the agent can say "best Sharpe is run 2, best max DD is run 1." You also get a stacked-equity PNG URL.
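To make the winner_per_metric idea concrete, here is a rough sketch of how such a map could be derived from per-run metrics. The input layout, the metric keys, and the sample numbers are all hypothetical, and the sketch assumes higher is better for every metric, with max drawdown stored as a negative percentage so the shallowest drawdown wins.

```python
from typing import Dict


def winner_per_metric(runs: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each metric name to the run id with the highest value for it."""
    metrics = next(iter(runs.values())).keys()
    return {
        metric: max(runs, key=lambda run_id: runs[run_id][metric])
        for metric in metrics
    }


# Made-up numbers for illustration; not real backtest output.
example_runs = {
    "run_1": {"sharpe": 1.20, "max_dd_pct": -9.8},
    "run_2": {"sharpe": 1.83, "max_dd_pct": -14.2},
    "run_3": {"sharpe": 0.90, "max_dd_pct": -20.1},
}

print(winner_per_metric(example_runs))
```

A split verdict like this one (run 2 on Sharpe, run 1 on drawdown) is a useful reminder that no single run is best until you decide which metric matters most.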
Manage your subscription from the agent
Out of credits? Want to upgrade? Just ask Claude:
what's my Bagtester plan and how do I upgrade?
Claude calls manage_subscription, which returns the current plan, credits remaining, the last few invoices, and (for paying users) a one-shot Stripe customer-portal URL good for ~1 hour. Click the URL to upgrade, downgrade, change payment method, or cancel, so you never need to dig through the dashboard.
What's next
Once you have a workflow that fits your team, there is more to build on:
- Use search_backtests to find old runs by symbol, tag, or date, with no more dashboard hopping
- Save promising strategies to your library with create_experiment to group asset-matrix runs
- Embed share URLs in PRs and Slack; the OG card renders the equity curve and headline metrics
The MCP server is at https://bagtester.com/api/mcp. Free tier ships 500 credits/month so you can prove the workflow before committing.