Claude Code · Polymarket
Backtest Polymarket bots from Claude Code
Claude Code has native MCP support. Add Bagtester once and every session gets a seven-tool Polymarket surface: submit a bot, list past runs, compare results, and browse strategy templates. Everything runs from natural-language prompts against on-chain trade data, and Brier and calibration results come back synchronously.
Setup (2 minutes)
Get an API key
Sign up at bagtester.com/sign-up. Open the dashboard, create an API key. Save the bag_… token somewhere safe.
Add the MCP
claude mcp add bagtester \
  --transport http https://bagtester.com/api/mcp \
  --header "Authorization: Bearer bag_YOUR_KEY"
Verify Polymarket tools are exposed
In Claude Code, ask: "list polymarket tools". You should see seven polymarket_* tools. For any new bot idea, call polymarket_list_strategy_templates first: it returns ten ready-to-go templates (cheap-YES fade, news-spike fade, time-decay carry, sports favorite, vectorized momentum, etc.).
Ask for a backtest
> backtest a mean-reversion bot on Polymarket politics markets resolved in 2024 with at least 100k volume
Claude will call polymarket_list_strategy_templates, pick the news-spike-fade template, then submit it via polymarket_submit_bot with market_filter.tags=["Politics"] and min_volume_usdc=100000. The result arrives inline: Brier score, calibration curve, by-tag PnL breakdown, equity curve, and 11 Polymarket-specific (PM) quality flags.
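For readers curious what that tool call might look like on the wire, here is a minimal sketch of an MCP `tools/call` request carrying the filter above. The JSON-RPC envelope follows the MCP specification. The argument layout is an assumption: the `template` key is hypothetical, and placing `min_volume_usdc` inside `market_filter` is a guess, since the exact schema is not given here. Claude Code builds the real request for you, so this is illustrative only.

```python
import json


def build_submit_call(tags, min_volume_usdc, template, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for polymarket_submit_bot."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "polymarket_submit_bot",
            "arguments": {
                # Hypothetical key: how a template is referenced is not documented here.
                "template": template,
                "market_filter": {
                    "tags": list(tags),
                    # Assumption: the volume floor sits inside market_filter.
                    "min_volume_usdc": min_volume_usdc,
                },
            },
        },
    }


request = build_submit_call(["Politics"], 100_000, "news-spike-fade")
print(json.dumps(request, indent=2))
```

Nothing is sent over the network; the sketch only prints the payload so you can see how the prompt's constraints map onto tool arguments.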
Try these prompts
- "Test a time-decay carry strategy on Polymarket sports markets, hold YES under $0.15 to resolution"
- "Find Polymarket crypto markets above $10M volume that resolved in 2024, then backtest a vectorized momentum strategy on them"
- "Compare three Polymarket strategies (cheap-yes fade, favorite momentum, news-spike fade) head-to-head on the 2024 politics universe"
What you get back
Every polymarket_submit_bot response follows the same pm_v1.0 schema: summary.brier_score, summary.calibration_curve (10 buckets), summary.by_tag (per-tag PnL and Brier), summary.equity_curve_usdc, and quality_flags_pm._summary (the 11 flags), plus next_steps with agent-friendly suggestions (refactor-to-vectorized, broaden-filter, focus-on-best-tag).
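To make the two headline metrics concrete, the sketch below computes a Brier score and a 10-bucket calibration curve from forecast probabilities and resolved 0/1 outcomes. It uses the standard definitions and assumes equal-width buckets; how Bagtester actually buckets or weights forecasts is not stated here, so treat the function names and output shape as illustrative rather than as the pm_v1.0 format.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def calibration_curve(probs, outcomes, n_buckets=10):
    """Group forecasts into equal-width buckets; skip buckets with no forecasts."""
    buckets = [[] for _ in range(n_buckets)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_buckets), n_buckets - 1)  # p == 1.0 lands in the last bucket
        buckets[idx].append((p, o))
    curve = []
    for idx, items in enumerate(buckets):
        if not items:
            continue
        curve.append({
            "bucket": idx,
            "mean_forecast": sum(p for p, _ in items) / len(items),
            "observed_rate": sum(o for _, o in items) / len(items),
            "count": len(items),
        })
    return curve


# Toy data: four forecasts for YES and whether each market resolved YES (1) or NO (0).
probs = [0.15, 0.95, 0.85, 0.35]
outcomes = [0, 1, 1, 0]
score = brier_score(probs, outcomes)
curve = calibration_curve(probs, outcomes)
print(score)
print(curve)
```

A lower Brier score is better, and a well-calibrated strategy has observed_rate close to mean_forecast in every populated bucket.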