Summary
The backtesting framework reports drawdown metrics that are inconsistent with the total_value series in the portfolio_snapshots it produces. Independently computing max drawdown from the snapshot total_value field yields significantly different results.
Expected values (from framework)
{
  "max_drawdown": 0.1384753616852372,
  "max_drawdown_absolute": 406.69151474500177,
  "max_daily_drawdown": 0.1384753616852372,
  "max_drawdown_duration": 240
}
Actual values (computed from portfolio_snapshots[].total_value)
| Metric | Framework reports | Computed from snapshots | Delta |
| --- | --- | --- | --- |
| max_drawdown | 13.85% | 9.19% | +4.66pp |
| max_drawdown_absolute | 406.69 | 230.50 | +176.19 |
| max_daily_drawdown | 13.85% | 7.84% | +6.01pp |
| max_drawdown_duration | 240 days | 241 days | -1 day |
Analysis
1. Framework uses a different equity curve than total_value
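The framework's absolute drawdown (406.69) divided by its fractional drawdown (0.1385) implies a peak equity of ~2,937, assuming both figures describe the same peak-to-trough move. However, the actual peak total_value in the snapshots is ~2,508. Even if the two maxima came from different moves, a 406.69 drop from a peak of at most ~2,508 would be a drawdown of at least ~16.2%, which exceeds the reported 13.85%. This indicates that the framework is computing drawdown from a different equity series than the one stored in total_value.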
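The arithmetic behind this comparison can be checked directly. The sketch below uses only the numbers quoted in this report; the snapshot peak of 2,508 is the approximate value stated above, not a value re-read from the file.

```python
# Values reported by the framework (copied from the JSON in this report).
max_dd = 0.1384753616852372
max_dd_abs = 406.69151474500177

# If both maxima come from the same peak-to-trough move,
# then peak = absolute drawdown / fractional drawdown.
implied_peak = max_dd_abs / max_dd

# Approximate peak total_value seen in the snapshots (figure quoted in this report).
snapshot_peak = 2508.0

# Smallest fractional drawdown that a 406.69 drop could represent
# if it had happened on the snapshot equity curve.
min_fraction_on_snapshots = max_dd_abs / snapshot_peak

print(round(implied_peak))                  # 2937
print(round(min_fraction_on_snapshots, 3))  # 0.162
```

Because 0.162 is larger than the reported max_drawdown of 0.1385, the two framework figures cannot both come from the total_value series.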
Possible causes:
- The framework may be summing fields differently (e.g. unallocated + pending_value + unrealized instead of using the pre-computed total_value).
- The framework may be revaluing positions at current market prices independently of the snapshot, producing a different equity curve.
- There may be a mismatch between the equity curve used internally for metrics and the one serialized to portfolio_snapshots.
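One way to test the first hypothesis is to compare a component sum against total_value for every snapshot. This is a minimal sketch: the helper name find_equity_mismatches is hypothetical, and the field names unallocated, pending_value, and unrealized are taken from the example above, so whether they exist in the snapshot schema is an assumption.

```python
import math

def find_equity_mismatches(snapshots,
                           fields=("unallocated", "pending_value", "unrealized"),
                           rel_tol=1e-9):
    """Return (created_at, total_value, component_sum) for snapshots where
    the sum of the given fields differs from the stored total_value."""
    mismatches = []
    for snap in snapshots:
        # Missing fields count as 0.0; the field names are assumptions.
        component_sum = sum(snap.get(name, 0.0) for name in fields)
        if not math.isclose(component_sum, snap["total_value"],
                            rel_tol=rel_tol, abs_tol=1e-9):
            mismatches.append((snap.get("created_at"), snap["total_value"], component_sum))
    return mismatches

# Made-up data for illustration only, not taken from backtest_run_three.json.
example = [
    {"created_at": "2024-01-01", "total_value": 1000.0,
     "unallocated": 400.0, "pending_value": 100.0, "unrealized": 500.0},
    {"created_at": "2024-01-02", "total_value": 1000.0,
     "unallocated": 400.0, "pending_value": 100.0, "unrealized": 650.0},
]
print(find_equity_mismatches(example))  # [('2024-01-02', 1000.0, 1150.0)]
```

If the component sum reproduces a peak near 2,937 while total_value peaks near 2,508, that would point to the metrics being computed from the summed fields.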
2. max_daily_drawdown equals max_drawdown — likely a bug
The framework reports max_daily_drawdown = 0.1384753616852372, which is identical to max_drawdown. This is almost certainly wrong:
- max_daily_drawdown should represent the largest single-period (day-to-day) decline, which is typically much smaller than the peak-to-trough drawdown.
- From the snapshot data, the largest single-day drop is 7.84%, not 13.85%.
- If max_daily_drawdown truly equals max_drawdown, it would mean the entire 13.85% drawdown happened in a single snapshot interval, which the snapshot data does not show and which is hard to reconcile with the reported max_drawdown_duration of 240 days.
Likely cause: max_daily_drawdown is being assigned the same value as max_drawdown instead of being computed independently as the worst single-period return.
3. max_drawdown_duration is close but off by 1 day
The durations differ by a single day (240 vs 241), which is consistent with an off-by-one in how the framework counts the start/end day (inclusive vs exclusive). This is minor.
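The inclusive/exclusive difference can be illustrated with a small sketch. It assumes that duration means the number of days from a peak to the last day still below that peak; the framework's actual definition is not stated in this report, and the helper name longest_drawdown_days is hypothetical.

```python
from datetime import date

def longest_drawdown_days(points, inclusive=False):
    """points: list of (date, total_value) pairs sorted by date.
    Returns the longest span, in days, spent below a previous peak."""
    peak_value = float("-inf")
    peak_day = None
    longest = 0
    for day, value in points:
        if value >= peak_value:
            # New high or full recovery: the drawdown clock restarts here.
            peak_value = value
            peak_day = day
        else:
            # Exclusive counting measures the gap between the two dates;
            # inclusive counting also counts the peak day itself.
            span = (day - peak_day).days + (1 if inclusive else 0)
            longest = max(longest, span)
    return longest

# Made-up series for illustration only.
points = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 2), 90.0),
    (date(2024, 1, 3), 95.0),
    (date(2024, 1, 4), 101.0),
]
print(longest_drawdown_days(points))                  # 2
print(longest_drawdown_days(points, inclusive=True))  # 3
```

The two conventions always differ by exactly one day, which matches the 240 vs 241 gap.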
How to reproduce
Using backtest_run_three.json:
    import json

    with open("backtest_run_three.json") as f:
        data = json.load(f)

    # Order snapshots chronologically (assumes created_at sorts correctly as stored).
    snaps = sorted(data["portfolio_snapshots"], key=lambda s: s["created_at"])

    # Max drawdown from total_value, tracked against the running peak
    peak = 0
    max_dd = 0
    max_dd_abs = 0
    for s in snaps:
        tv = s["total_value"]
        if tv > peak:
            peak = tv
        if peak > 0:
            dd = (peak - tv) / peak
            if dd > max_dd:
                max_dd = dd
                max_dd_abs = peak - tv  # absolute drop at the worst fractional drawdown

    print(f"max_drawdown: {max_dd}")               # 0.0919 (framework reports 0.1385)
    print(f"max_drawdown_absolute: {max_dd_abs}")  # 230.50 (framework reports 406.69)
Suggested fix
- Verify that the equity curve used for drawdown calculation matches the total_value written to portfolio_snapshots. If they diverge, either fix the metric calculation or fix the snapshot serialization.
- Fix max_daily_drawdown to compute the worst single-period return independently:

    max_daily_dd = max(
        (snaps[i-1]["total_value"] - snaps[i]["total_value"]) / snaps[i-1]["total_value"]
        for i in range(1, len(snaps))
        if snaps[i-1]["total_value"] > 0
    )

- Review the off-by-one in max_drawdown_duration (inclusive vs exclusive day counting).