Skip to content

Commit 56c71fa

Browse files
igerberclaude
andcommitted
Add jackknife variance estimation to SyntheticDiD
Implement Algorithm 3 from Arkhangelsky et al. (2021) as a third variance_method option ("jackknife") matching R's synthdid::vcov(method="jackknife"). Delete-1 jackknife over all units with fixed weights - no Frank-Wolfe re-estimation, making it the fastest variance method. Validated against R golden values with SE matching to machine precision (5.5e-15). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent dbff8a5 commit 56c71fa

7 files changed

Lines changed: 528 additions & 14 deletions

File tree

benchmarks/R/benchmark_synthdid.R

Lines changed: 12 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -73,14 +73,21 @@ weights <- attr(tau_hat, "weights")
7373
unit_weights <- weights$omega
7474
time_weights <- weights$lambda
7575

76-
# Compute SE via placebo (jackknife)
76+
# Compute SE via placebo
7777
message("Computing standard errors...")
7878
se_start <- Sys.time()
7979
se_matrix <- vcov(tau_hat, method = "placebo")
8080
se <- as.numeric(sqrt(se_matrix[1, 1])) # Extract scalar SE
8181
se_time <- as.numeric(difftime(Sys.time(), se_start, units = "secs"))
8282

83-
total_time <- estimation_time + se_time
83+
# Compute SE via jackknife (Algorithm 3)
84+
message("Computing jackknife standard errors...")
85+
se_jk_start <- Sys.time()
86+
se_jk_matrix <- vcov(tau_hat, method = "jackknife")
87+
se_jackknife <- as.numeric(sqrt(se_jk_matrix[1, 1]))
88+
se_jk_time <- as.numeric(difftime(Sys.time(), se_jk_start, units = "secs"))
89+
90+
total_time <- estimation_time + se_time + se_jk_time
8491

8592
# Compute noise level and regularization (to match Python's auto-computed values)
8693
N0 <- setup$N0
@@ -98,6 +105,7 @@ results <- list(
98105
# Point estimate and SE
99106
att = as.numeric(tau_hat),
100107
se = se,
108+
se_jackknife = se_jackknife,
101109

102110
# Weights (full precision)
103111
unit_weights = as.numeric(unit_weights),
@@ -111,7 +119,8 @@ results <- list(
111119
# Timing
112120
timing = list(
113121
estimation_seconds = estimation_time,
114-
se_seconds = se_time,
122+
se_placebo_seconds = se_time,
123+
se_jackknife_seconds = se_jk_time,
115124
total_seconds = total_time
116125
),
117126

benchmarks/python/benchmark_synthdid.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,7 @@ def parse_args():
4545
)
4646
parser.add_argument(
4747
"--variance-method", type=str, default="placebo",
48-
choices=["bootstrap", "placebo"],
48+
choices=["bootstrap", "jackknife", "placebo"],
4949
help="Variance estimation method (default: placebo to match R)"
5050
)
5151
parser.add_argument(

diff_diff/results.py

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -662,7 +662,7 @@ class SyntheticDiDResults:
662662
att : float
663663
Average Treatment effect on the Treated (ATT).
664664
se : float
665-
Standard error of the ATT estimate (bootstrap or placebo-based).
665+
Standard error of the ATT estimate (bootstrap, jackknife, or placebo-based).
666666
t_stat : float
667667
T-statistic for the ATT estimate.
668668
p_value : float
@@ -684,7 +684,12 @@ class SyntheticDiDResults:
684684
post_periods : list
685685
List of post-treatment period identifiers.
686686
variance_method : str
687-
Method used for variance estimation: "bootstrap" or "placebo".
687+
Method used for variance estimation: "bootstrap", "jackknife", or "placebo".
688+
placebo_effects : np.ndarray, optional
689+
Method-specific per-iteration estimates: placebo treatment effects
690+
(for "placebo"), bootstrap ATT estimates (for "bootstrap"), or
691+
leave-one-out estimates (for "jackknife"). The ``variance_method``
692+
field disambiguates the contents.
688693
"""
689694

690695
att: float

diff_diff/synthetic_did.py

Lines changed: 199 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -53,10 +53,15 @@ class SyntheticDiD(DifferenceInDifferences):
5353
Implements Algorithm 4 from Arkhangelsky et al. (2021). This is R's default.
5454
- "bootstrap": Bootstrap at unit level with fixed weights matching R's
5555
synthdid::vcov(method="bootstrap").
56+
- "jackknife": Jackknife variance matching R's synthdid::vcov(method="jackknife").
57+
Implements Algorithm 3 from Arkhangelsky et al. (2021). Deterministic
58+
(N_control + N_treated iterations), uses fixed weights (no re-estimation).
59+
The ``n_bootstrap`` parameter is ignored for this method.
5660
n_bootstrap : int, default=200
5761
Number of replications for variance estimation. Used for both:
5862
- Bootstrap: Number of bootstrap samples
5963
- Placebo: Number of random permutations (matches R's `replications` argument)
64+
Ignored when ``variance_method="jackknife"``.
6065
seed : int, optional
6166
Random seed for reproducibility. If None (default), results
6267
will vary between runs.
@@ -163,15 +168,15 @@ def __init__(
163168
self.n_bootstrap = n_bootstrap
164169
self.seed = seed
165170

166-
# Validate n_bootstrap
167-
if n_bootstrap < 2:
171+
# Validate n_bootstrap (irrelevant for jackknife, which is deterministic)
172+
if n_bootstrap < 2 and variance_method != "jackknife":
168173
raise ValueError(
169174
f"n_bootstrap must be >= 2 (got {n_bootstrap}). At least 2 "
170175
f"iterations are needed to estimate standard errors."
171176
)
172177

173178
# Validate variance_method
174-
valid_methods = ("bootstrap", "placebo")
179+
valid_methods = ("bootstrap", "jackknife", "placebo")
175180
if variance_method not in valid_methods:
176181
raise ValueError(
177182
f"variance_method must be one of {valid_methods}, " f"got '{variance_method}'"
@@ -269,18 +274,19 @@ def fit( # type: ignore[override]
269274
f"Got '{resolved_survey.weight_type}'."
270275
)
271276

272-
# Reject placebo + full survey design (strata/PSU/FPC are silently ignored)
277+
# Reject non-bootstrap + full survey design (strata/PSU/FPC need Rao-Wu)
273278
if (
274279
resolved_survey is not None
275280
and (
276281
resolved_survey.strata is not None
277282
or resolved_survey.psu is not None
278283
or resolved_survey.fpc is not None
279284
)
280-
and self.variance_method == "placebo"
285+
and self.variance_method != "bootstrap"
281286
):
282287
raise NotImplementedError(
283-
"SyntheticDiD with variance_method='placebo' does not support strata/PSU/FPC. "
288+
f"SyntheticDiD with variance_method='{self.variance_method}' does not "
289+
"support strata/PSU/FPC. "
284290
"Use variance_method='bootstrap' for full survey design support."
285291
)
286292

@@ -510,6 +516,20 @@ def fit( # type: ignore[override]
510516
)
511517
placebo_effects = bootstrap_estimates
512518
inference_method = "bootstrap"
519+
elif self.variance_method == "jackknife":
520+
# Fixed-weight jackknife (R's synthdid Algorithm 3)
521+
se, jackknife_estimates = self._jackknife_se(
522+
Y_pre_control,
523+
Y_post_control,
524+
Y_pre_treated,
525+
Y_post_treated,
526+
unit_weights,
527+
time_weights,
528+
w_treated=w_treated,
529+
w_control=w_control,
530+
)
531+
placebo_effects = jackknife_estimates
532+
inference_method = "jackknife"
513533
else:
514534
# Use placebo-based variance (R's synthdid Algorithm 4)
515535
se, placebo_effects = self._placebo_variance_se(
@@ -528,7 +548,14 @@ def fit( # type: ignore[override]
528548

529549
# Compute test statistics
530550
t_stat, p_value_analytical, conf_int = safe_inference(att, se, alpha=self.alpha)
531-
if len(placebo_effects) > 0 and np.isfinite(t_stat):
551+
# Empirical p-value for placebo/bootstrap (null-distribution draws).
552+
# Jackknife pseudo-values are NOT null-distribution draws, so use
553+
# analytical (normal) p-value instead.
554+
if (
555+
inference_method != "jackknife"
556+
and len(placebo_effects) > 0
557+
and np.isfinite(t_stat)
558+
):
532559
p_value = max(
533560
np.mean(np.abs(placebo_effects) >= np.abs(att)),
534561
1.0 / (len(placebo_effects) + 1),
@@ -1106,6 +1133,171 @@ def _placebo_variance_se(
11061133

11071134
return se, placebo_estimates
11081135

1136+
def _jackknife_se(
1137+
self,
1138+
Y_pre_control: np.ndarray,
1139+
Y_post_control: np.ndarray,
1140+
Y_pre_treated: np.ndarray,
1141+
Y_post_treated: np.ndarray,
1142+
unit_weights: np.ndarray,
1143+
time_weights: np.ndarray,
1144+
w_treated=None,
1145+
w_control=None,
1146+
) -> Tuple[float, np.ndarray]:
1147+
"""Compute jackknife standard error matching R's synthdid Algorithm 3.
1148+
1149+
Delete-1 jackknife over all units (control + treated) with **fixed**
1150+
weights. For each leave-one-out sample the original omega is subsetted
1151+
and renormalized; lambda stays unchanged. No Frank-Wolfe
1152+
re-estimation, making this the fastest variance method.
1153+
1154+
This matches R's ``synthdid::vcov(method="jackknife")`` which sets
1155+
``update.omega=FALSE, update.lambda=FALSE``.
1156+
1157+
Parameters
1158+
----------
1159+
Y_pre_control : np.ndarray
1160+
Control outcomes in pre-treatment periods, shape (n_pre, n_control).
1161+
Y_post_control : np.ndarray
1162+
Control outcomes in post-treatment periods, shape (n_post, n_control).
1163+
Y_pre_treated : np.ndarray
1164+
Treated outcomes in pre-treatment periods, shape (n_pre, n_treated).
1165+
Y_post_treated : np.ndarray
1166+
Treated outcomes in post-treatment periods, shape (n_post, n_treated).
1167+
unit_weights : np.ndarray
1168+
Unit weights from Frank-Wolfe optimization, shape (n_control,).
1169+
time_weights : np.ndarray
1170+
Time weights from Frank-Wolfe optimization, shape (n_pre,).
1171+
w_treated : np.ndarray, optional
1172+
Survey probability weights for treated units.
1173+
w_control : np.ndarray, optional
1174+
Survey probability weights for control units.
1175+
1176+
Returns
1177+
-------
1178+
tuple
1179+
(se, jackknife_estimates) where se is the standard error and
1180+
jackknife_estimates is a length-N array of leave-one-out estimates
1181+
(first n_control entries are control-LOO, last n_treated are
1182+
treated-LOO).
1183+
1184+
References
1185+
----------
1186+
Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S.
1187+
(2021). Synthetic Difference-in-Differences. American Economic Review,
1188+
111(12), 4088-4118. Algorithm 3.
1189+
"""
1190+
n_control = Y_pre_control.shape[1]
1191+
n_treated = Y_pre_treated.shape[1]
1192+
n = n_control + n_treated
1193+
1194+
# --- Early-return NaN: matches R's NA conditions ---
1195+
if n_treated <= 1:
1196+
warnings.warn(
1197+
"Jackknife variance requires more than 1 treated unit. "
1198+
"Use variance_method='placebo' for single treated unit.",
1199+
UserWarning,
1200+
stacklevel=3,
1201+
)
1202+
return np.nan, np.array([])
1203+
1204+
if np.sum(unit_weights > 0) <= 1:
1205+
warnings.warn(
1206+
"Jackknife variance requires more than 1 control unit with "
1207+
"nonzero weight. Consider variance_method='placebo'.",
1208+
UserWarning,
1209+
stacklevel=3,
1210+
)
1211+
return np.nan, np.array([])
1212+
1213+
jackknife_estimates = np.empty(n)
1214+
1215+
# --- Precompute treated means (constant across control-LOO) ---
1216+
if w_treated is not None:
1217+
treated_pre_mean = np.average(Y_pre_treated, axis=1, weights=w_treated)
1218+
treated_post_mean = np.average(Y_post_treated, axis=1, weights=w_treated)
1219+
else:
1220+
treated_pre_mean = np.mean(Y_pre_treated, axis=1)
1221+
treated_post_mean = np.mean(Y_post_treated, axis=1)
1222+
1223+
# --- Precompute omega composed with survey weights (for treated-LOO) ---
1224+
if w_control is not None:
1225+
omega_eff_full = unit_weights * w_control
1226+
omega_eff_full = omega_eff_full / omega_eff_full.sum()
1227+
else:
1228+
omega_eff_full = unit_weights
1229+
1230+
# --- Leave-one-out over control units ---
1231+
mask = np.ones(n_control, dtype=bool)
1232+
for j in range(n_control):
1233+
mask[j] = False
1234+
1235+
# Subset and renormalize omega
1236+
omega_jk = _sum_normalize(unit_weights[mask])
1237+
1238+
# Compose with survey weights if present
1239+
if w_control is not None:
1240+
omega_jk = omega_jk * w_control[mask]
1241+
omega_jk = omega_jk / omega_jk.sum()
1242+
1243+
jackknife_estimates[j] = compute_sdid_estimator(
1244+
Y_pre_control[:, mask],
1245+
Y_post_control[:, mask],
1246+
treated_pre_mean,
1247+
treated_post_mean,
1248+
omega_jk,
1249+
time_weights,
1250+
)
1251+
1252+
mask[j] = True # restore for next iteration
1253+
1254+
# --- Leave-one-out over treated units ---
1255+
mask = np.ones(n_treated, dtype=bool)
1256+
for k in range(n_treated):
1257+
mask[k] = False
1258+
1259+
# Recompute treated means from remaining units
1260+
if w_treated is not None:
1261+
w_t_jk = w_treated[mask]
1262+
t_pre_mean = np.average(
1263+
Y_pre_treated[:, mask], axis=1, weights=w_t_jk
1264+
)
1265+
t_post_mean = np.average(
1266+
Y_post_treated[:, mask], axis=1, weights=w_t_jk
1267+
)
1268+
else:
1269+
t_pre_mean = np.mean(Y_pre_treated[:, mask], axis=1)
1270+
t_post_mean = np.mean(Y_post_treated[:, mask], axis=1)
1271+
1272+
jackknife_estimates[n_control + k] = compute_sdid_estimator(
1273+
Y_pre_control,
1274+
Y_post_control,
1275+
t_pre_mean,
1276+
t_post_mean,
1277+
omega_eff_full,
1278+
time_weights,
1279+
)
1280+
1281+
mask[k] = True # restore for next iteration
1282+
1283+
# --- Check for non-finite estimates (propagate NaN like R's var()) ---
1284+
if not np.all(np.isfinite(jackknife_estimates)):
1285+
warnings.warn(
1286+
"Some jackknife leave-one-out estimates are non-finite. "
1287+
"Standard error cannot be computed.",
1288+
UserWarning,
1289+
stacklevel=3,
1290+
)
1291+
return np.nan, jackknife_estimates
1292+
1293+
# --- Jackknife SE formula: sqrt((n-1)/n * sum((u - ubar)^2)) ---
1294+
# Matches R's: sqrt(((n-1)/n) * (n-1) * var(u))
1295+
u_bar = np.mean(jackknife_estimates)
1296+
ss = np.sum((jackknife_estimates - u_bar) ** 2)
1297+
se = np.sqrt((n - 1) / n * ss)
1298+
1299+
return se, jackknife_estimates
1300+
11091301
def get_params(self) -> Dict[str, Any]:
11101302
"""Get estimator parameters."""
11111303
return {

docs/methodology/REGISTRY.md

Lines changed: 24 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1472,6 +1472,23 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
14721472
5. Compute SDID estimate with renormalized ω and original λ
14731473
6. `SE = sd(bootstrap_estimates, ddof=1)`
14741474

1475+
- Alternative: Jackknife variance (matching R's `synthdid::vcov(method="jackknife")`)
1476+
Implements Algorithm 3 from Arkhangelsky et al. (2021):
1477+
1. For each control unit j=1,...,N_co:
1478+
- Remove unit j, renormalize omega: `ω_jk = _sum_normalize(ω[remaining])`
1479+
- Keep λ unchanged, keep treated means unchanged
1480+
- Compute SDID estimate τ_{(-j)}
1481+
2. For each treated unit k=1,...,N_tr:
1482+
- Keep ω and λ unchanged
1483+
- Recompute treated mean from remaining N_tr-1 treated units
1484+
- Compute SDID estimate τ_{(-k)}
1485+
3. `SE = sqrt( ((n-1)/n) × Σ (τ_{(-i)} - τ̄)² )` where n = N_co + N_tr
1486+
1487+
Fixed weights: No Frank-Wolfe re-estimation (`update.omega=FALSE, update.lambda=FALSE`).
1488+
Returns NaN SE for single treated unit or single nonzero-weight control.
1489+
Deterministic: exactly N_co + N_tr iterations, no replications parameter.
1490+
P-value: analytical (normal distribution), not empirical.
1491+
14751492
*Edge cases:*
14761493
- **Frank-Wolfe non-convergence**: Returns current weights after max_iter iterations. No warning emitted; the convergence check `vals[t-1] - vals[t] < min_decrease²` simply does not trigger early exit, and the final iterate is returned.
14771494
- **`_sparsify` all-zero input**: If `max(v) <= 0`, returns uniform weights `ones(len(v)) / len(v)`.
@@ -1490,7 +1507,10 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
14901507
- **Varying treatment within unit**: Raises `ValueError`. SDID requires block treatment (constant within each unit). Suggests CallawaySantAnna or ImputationDiD for staggered adoption.
14911508
- **Unbalanced panel**: Raises `ValueError`. SDID requires all units observed in all periods. Suggests `balance_panel()`.
14921509
- **Poor pre-treatment fit**: Warns (`UserWarning`) when `pre_fit_rmse > std(treated_pre_outcomes, ddof=1)`. Diagnostic only; estimation proceeds.
1493-
- **Note:** Survey support: weights, strata, PSU, and FPC are all supported. Full-design surveys use Rao-Wu rescaled bootstrap (Phase 6); `variance_method="placebo"` requires weights-only (strata/PSU/FPC require bootstrap). Both sides weighted per WLS regression interpretation: treated-side means are survey-weighted (Frank-Wolfe target and ATT formula); control-side synthetic weights are composed with survey weights post-optimization (ω_eff = ω * w_co, renormalized). Frank-Wolfe optimization itself is unweighted — survey importance enters after trajectory-matching. Covariate residualization uses WLS with survey weights. Placebo and bootstrap SE preserve survey weights on both sides.
1510+
- **Jackknife with single treated unit**: Returns NaN SE. Cannot leave-one-out with N_tr=1; R returns NA for the same condition.
1511+
- **Jackknife with single nonzero-weight control**: Returns NaN SE. Leaving out the only effective control is not meaningful.
1512+
- **Jackknife with non-finite LOO estimate**: Returns NaN SE. Unlike bootstrap/placebo, jackknife is deterministic and cannot skip failed iterations; NaN propagates through `var()` (matches R behavior).
1513+
- **Note:** Survey support: weights, strata, PSU, and FPC are all supported. Full-design surveys use Rao-Wu rescaled bootstrap (Phase 6); non-bootstrap variance methods (`variance_method="placebo"` or `"jackknife"`) require weights-only (strata/PSU/FPC require bootstrap). Both sides weighted per WLS regression interpretation: treated-side means are survey-weighted (Frank-Wolfe target and ATT formula); control-side synthetic weights are composed with survey weights post-optimization (ω_eff = ω * w_co, renormalized). Frank-Wolfe optimization itself is unweighted — survey importance enters after trajectory-matching. Covariate residualization uses WLS with survey weights. Placebo, jackknife, and bootstrap SE preserve survey weights on both sides.
14941514

14951515
**Reference implementation(s):**
14961516
- R: `synthdid::synthdid_estimate()` (Arkhangelsky et al.'s official package)
@@ -1505,6 +1525,9 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi
15051525
- [x] Placebo SE formula: sqrt((r-1)/r) * sd(placebo_estimates)
15061526
- [x] Placebo SE: re-estimates omega and lambda per replication (matching R's update.omega=TRUE, update.lambda=TRUE)
15071527
- [x] Bootstrap: fixed weights (original lambda unchanged, omega renormalized for resampled controls)
1528+
- [x] Jackknife SE: fixed weights, LOO all units, formula `sqrt((n-1)/n * sum((u-ubar)^2))`
1529+
- [x] Jackknife: NaN SE for single treated or single nonzero-weight control
1530+
- [x] Jackknife: analytical p-value (not empirical)
15081531
- [x] Returns both unit and time weights for interpretation
15091532
- [x] Column centering (intercept=True) in Frank-Wolfe optimization
15101533

0 commit comments

Comments
 (0)