Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ The package `rdrobust` implements estimation, inference, and graphical procedure
- `rdrobust`: point estimation and robust bias-corrected inference.
- `rdbwselect`: data-driven bandwidth selection for RD estimation and inference.
- `rdplot`: data-driven RD plots based on binned means and local polynomial fits.
- Stata also includes `rdrobustplot`, a postestimation diagnostic plot after `rdrobust`.


## Python Implementation
Expand Down Expand Up @@ -36,9 +37,9 @@ To install/update in Stata type:
net install rdrobust, from(https://raw.githubusercontent.com/rdpackages/rdrobust/main/stata) replace
```

- Help: [rdrobust](stata/rdrobust.pdf), [rdbwselect](stata/rdbwselect.pdf), [rdplot](stata/rdplot.pdf).
- Help: [rdrobust](stata/rdrobust.pdf), [rdbwselect](stata/rdbwselect.pdf), [rdplot](stata/rdplot.pdf), [rdrobustplot](stata/rdrobustplot.pdf).

- Replication: [do-file](stata/rdrobust_illustration.do), [rdplot illustration](stata/rdplot_illustration.do), [senate data](stata/rdrobust_senate.dta).
- Replication: [rdrobust illustration](stata/rdrobust_illustration.do), [rdplot illustration](stata/rdplot_illustration.do), [new features illustration](stata/rdrobust_illustration_new.do), [senate data](stata/rdrobust_senate.dta).


## References
Expand Down
325 changes: 231 additions & 94 deletions stata/rdbwselect.ado

Large diffs are not rendered by default.

Binary file modified stata/rdbwselect.pdf
Binary file not shown.
28 changes: 21 additions & 7 deletions stata/rdbwselect.sthlp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
{smcl}
{* *!version 10.0.0 2026-05-15}{...}
{* *!version 11.0.0 2026-05-13}{...}
{viewerjumpto "Syntax" "rdbwselect##syntax"}{...}
{viewerjumpto "Description" "rdbwselect##description"}{...}
{viewerjumpto "Options" "rdbwselect##options"}{...}
Expand Down Expand Up @@ -59,6 +59,8 @@ and {browse "https://rdpackages.github.io/references/Calonico-Cattaneo-Farrell-T

{p 8 8}{browse "https://rdpackages.github.io/":https://rdpackages.github.io/}{p_end}

{p 4 8}{it:Requires Stata 16 or later.}{p_end}


{marker options}{...}
{title:Options}
Expand Down Expand Up @@ -86,7 +88,7 @@ Default is {cmd:q(2)} (local quadratic regression).{p_end}

{p 4 8}{cmd:covs(}{it:covars}{cmd:)} specifies additional covariates to be used for estimation and inference.{p_end}

{p 4 8}{cmd:covs_drop(}{it:covsdropoption}{cmd:)} assesses collinearity in additional covariates used for estimation and inference. Options {opt pinv} (default choice) and {opt invsym} drop collinear additional covariates, differing only in the type of inverse function used. Option {opt off} omits the check for collinear additional covariates.{p_end}
{p 4 8}{cmd:covs_drop(}{it:covsdropoption}{cmd:)} assess collinearity in additional covariates used for estimation and inference. Options {opt pinv} (default choice) and {opt invsym} drops collinear additional covariates, differing only in the type of inverse function used. Option {opt off} omits the check for collinear additional covariates.{p_end}

{p 4 8}{cmd:kernel(}{it:kernelfn}{cmd:)} specifies the kernel function used to construct the local-polynomial estimator(s). Options are: {opt tri:angular}, {opt epa:nechnikov}, and {opt uni:form}.
Default is {cmd:kernel(triangular)}.{p_end}
Expand Down Expand Up @@ -142,8 +144,11 @@ Options are:{p_end}
{p 8 12}{cmd:vce(hc1)} for heteroskedasticity-robust plug-in residuals variance estimator with {it:hc1} weights.{p_end}
{p 8 12}{cmd:vce(hc2)} for heteroskedasticity-robust plug-in residuals variance estimator with {it:hc2} weights.{p_end}
{p 8 12}{cmd:vce(hc3)} for heteroskedasticity-robust plug-in residuals variance estimator with {it:hc3} weights.{p_end}
{p 8 12}{cmd:vce(nncluster }{it:clustervar [nnmatch]}{cmd:)} for cluster-robust nearest neighbor variance estimation, with {it:clustervar} indicating the cluster ID variable and {it:nnmatch} indicating the minimum number of neighbors to be used.{p_end}
{p 8 12}{cmd:vce(cluster }{it:clustervar}{cmd:)} for cluster-robust plug-in residuals variance estimation with degrees-of-freedom weights and {it:clustervar} indicating the cluster ID variable.{p_end}
{p 8 12}{cmd:vce(cluster }{it:clustervar}{cmd:)} for cluster-robust plug-in residuals variance estimation with degrees-of-freedom weights; equivalent to {cmd:vce(cr1 }{it:clustervar}{cmd:)}.{p_end}
{p 8 12}{cmd:vce(cr1 }{it:clustervar}{cmd:)} for cluster-robust plug-in residuals variance estimator with degrees-of-freedom correction.{p_end}
{p 8 12}{cmd:vce(cr2 }{it:clustervar}{cmd:)} for the Bell-McCaffrey (2002) bias-reduced cluster-robust variance estimator (CRV2).{p_end}
{p 8 12}{cmd:vce(cr3 }{it:clustervar}{cmd:)} for the Pustejovsky-Tipton (2018) cluster-robust variance estimator (CRV3), approximately unbiased with few clusters.{p_end}
{p 8 12}The CR2/CR3 leverage correction applies to both the conventional and the robust bias-corrected standard errors, including when the point-estimation bandwidth h differs from the bias-correction bandwidth b; in that case the cluster leverage is computed from the bias (b) regression.{p_end}
{p 8 12}Default is {cmd:vce(nn 3)}.{p_end}

{hline}
Expand All @@ -159,7 +164,7 @@ Options are:{p_end}
{p 4 8}MSE bandwidth selection procedure{p_end}
{p 8 8}{cmd:. rdbwselect vote margin}{p_end}

{p 4 8}All bandwidth selection procedures{p_end}
{p 4 8}All bandwidth bandwidth selection procedures{p_end}
{p 8 8}{cmd:. rdbwselect vote margin, all}{p_end}


Expand All @@ -170,11 +175,13 @@ Options are:{p_end}

{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Scalars}{p_end}
{synopt:{cmd:e(N)}}number of observations{p_end}
{synopt:{cmd:e(N_l)}}number of observations to the left of the cutoff{p_end}
{synopt:{cmd:e(N_r)}}number of observations to the right of the cutoff{p_end}
{synopt:{cmd:e(c)}}cutoff value{p_end}
{synopt:{cmd:e(p)}}order of the polynomial used for estimation of the regression function{p_end}
{synopt:{cmd:e(q)}}order of the polynomial used for estimation of the bias of the regression function estimator{p_end}
{synopt:{cmd:e(n_clust)}}number of clusters (only when {cmd:vce(cluster}|{cmd:cr1}|{cmd:cr2}|{cmd:cr3} ...{cmd:)} is specified){p_end}

{synopt:{cmd:e(h_mserd)}} MSE-optimal bandwidth selector for the RD treatment effect estimator.{p_end}
{synopt:{cmd:e(h_msetwo_l)}} MSE-optimal bandwidth selectors below the cutoff for the RD treatment effect estimator.{p_end}
Expand Down Expand Up @@ -209,14 +216,21 @@ Options are:{p_end}
{synopt:{cmd:e(b_cercomb2_r)}} for median({opt certwo_r},{opt cerrd},{opt cersum}), above the cutoff.{p_end}

{p2col 5 20 24 2: Macros}{p_end}
{synopt:{cmd:e(cmd)}}{cmd:rdbwselect}{p_end}
{synopt:{cmd:e(cmdline)}}command as typed{p_end}
{synopt:{cmd:e(title)}}title ({cmd:RD bandwidth selection (sharp/fuzzy)}, optionally {cmd:, covariate-adjusted}){p_end}
{synopt:{cmd:e(depvar)}}name of dependent (outcome) variable{p_end}
{synopt:{cmd:e(runningvar)}}name of running variable{p_end}
{synopt:{cmd:e(outcomevar)}}name of outcome variable{p_end}
{synopt:{cmd:e(clustvar)}}name of cluster variable{p_end}
{synopt:{cmd:e(covs)}}name of covariates{p_end}
{synopt:{cmd:e(covs)}}names of additional covariates{p_end}
{synopt:{cmd:e(vce_select)}}vcetype specified in vce(){p_end}
{synopt:{cmd:e(bwselect)}}bandwidth selection choice{p_end}
{synopt:{cmd:e(kernel)}}kernel choice{p_end}

{p2col 5 20 24 2: Matrices}{p_end}
{synopt:{cmd:e(mat_h)}}1x2 matrix of bandwidths for the RD treatment effect estimator (left, right){p_end}
{synopt:{cmd:e(mat_b)}}1x2 matrix of bandwidths for the bias of the RD treatment effect estimator (left, right){p_end}


{marker references}{...}
{title:References}
Expand Down
128 changes: 103 additions & 25 deletions stata/rdplot.ado
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,21 @@
* RDROBUST STATA PACKAGE -- rdplot
* Authors: Sebastian Calonico, Matias D. Cattaneo, Max H. Farrell, Rocio Titiunik
********************************************************************************
*!version 10.0.0 2026-05-15
*!rdrobust Stata package v11.0.0 2026-05-13

capture program drop rdplot
program define rdplot, eclass
version 16.0
syntax anything [if] [, c(real 0) p(integer 4) nbins(string) covs(string) covs_eval(string) covs_drop(string) binselect(string) scale(string) kernel(string) weights(string) h(string) support(string) masspoints(string) genvars hide ci(real 0) shade graph_options(string asis) nochecks *]

marksample touse

* Snapshot Mata externals so we can drop only the variables WE create,
* leaving the user's Mata workspace and the loaded rdrobust_*.mo
* library functions untouched.
mata: _mtx = direxternal("*"); st_local("_mata_before", rows(_mtx) ? invtokens(_mtx') : "")
mata: mata drop _mtx

tokenize "`anything'"
local y `1'
local x `2'
Expand Down Expand Up @@ -71,30 +79,38 @@ program define rdplot, eclass
}

*****************************************
preserve

* M1: do all work in an isolated temp frame instead of preserve+keep.
* Only the touse=1 subset is copied (in-memory, no disk I/O), and the
* user's frame is restored even on error or Ctrl-Break via nobreak.
local _orig_frame `c(frame)'
tempname _work_frame
capture frame drop `_work_frame'
frame put * if `touse', into(`_work_frame')

nobreak {
cwf `_work_frame'
capture noisily {
sort `x', stable
qui keep if `touse'


*************************************************************
**** DROP MISSINGS ******************************************
*************************************************************
qui drop if mi(`y') | mi(`x')
if ("`covs'"~="") {
qui ds `covs'
local covs_list = r(varlist)
local ncovs: word count `covs_list'
foreach z in `covs_list' {
qui drop if mi(`z')
local ncovs: word count `covs_list'
if (`p'>1) {
di as text "Note: covs() is meant for plotting rdrobust estimates (local linear). With p>1, results may not be visually compatible with the binned means depicted due to the underlying assumptions used."
}

di as error "{err}covs() option is meant to be used when plotting RDROBUST estimates. If the option is used for global polynomial fitting, it may deliver results that are not visually compatible with the local binned means depicted due to the underlying assumptions used."

}
if ("`weights'"~="") {
qui drop if mi(`weights')
qui drop if `weights'<=0
* Combine all missing-value drops into a single scan.
local drop_cond "mi(`y') | mi(`x')"
foreach z of local covs_list {
local drop_cond "`drop_cond' | mi(`z')"
}
if ("`weights'"!="") local drop_cond "`drop_cond' | mi(`weights') | `weights'<=0"
qui drop if `drop_cond'

**** CHECK colinearity ******************************************
local covs_drop_coll = 0
Expand Down Expand Up @@ -186,23 +202,44 @@ program define rdplot, eclass
local binselect = "esmv"
}

if ("`nochecks'"=="") {
if ("`checks'"=="") {
if (`c'<=`x_min' | `c'>=`x_max'){
di as error "{err}{cmd:c()} should be set within the range of `x'"
exit 125
}

if ("`p'"<"0" | "`nbins_l'"<"0" | "`nbins_r'"<"0"){
di as error "{err}{cmd:p()} and {cmd:nbins()} should be a positive integers"
di as error "{err}{cmd:p()} and {cmd:nbins()} should be a positive integers"
exit 411
}

if ("`masspoints'" != "" & ///
!inlist("`masspoints'", "check", "adjust", "off", "false")) {
di as error "{err}{cmd:masspoints()} must be one of check, adjust, off"
exit 125
}


if (`n'<20){
di as error "{err}Not enough observations to perform bin calculations"
di as error "{err}Not enough observations to perform bin calculations"
exit 2001
}
}
}
* End of validation-only capture block. Splitting validation from the
* Mata-work block sidesteps a Stata quirk where `exit N` inside an `if{}`
* inside `capture noisily { ... }` does NOT halt the noisily block when
* later `mata { ... }` blocks exist at the same scope; control jumps past
* the first mata block and lands in the second, leaking transient Mata
* externals and surfacing a misleading rc=3499 instead of the real
* validation error. See rdrobust.ado for the same pattern.
local _rc = _rc
if `_rc' {
cwf `_orig_frame'
frame drop `_work_frame'
exit `_rc'
}
capture noisily {

*******************************
****** Start MATA *************
Expand Down Expand Up @@ -729,17 +766,26 @@ if ("`covs_eval'"=="mean" & "`covs'"!="") {
qui getmata x_plot_l x_plot_r y_plot_l y_plot_r rdplot_id rdplot_mean_bin rdplot_mean_x rdplot_mean_y rdplot_N rdplot_min_bin rdplot_max_bin rdplot_se_y rdplot_ci_l rdplot_ci_r, replace force

ereturn clear
ereturn scalar N_l = `n_l'
ereturn scalar N_r = `n_r'
ereturn scalar N = `n'
ereturn scalar N_l = `n_l'
ereturn scalar N_r = `n_r'
ereturn scalar N_h_l = `n_h_l'
ereturn scalar N_h_r = `n_h_r'
ereturn scalar c = `c'
ereturn scalar p = `p'
ereturn scalar J_star_l = J_star_l
ereturn scalar J_star_r = J_star_r
ereturn matrix coef_l = gamma_p1_l
ereturn matrix coef_r = gamma_p1_r
if ("`covs'"!="") {
ereturn matrix coef_covs = gamma_p
}
ereturn local binselect = "`binselect'"
ereturn local binselect = "`binselect'"
ereturn local depvar "`y'"
ereturn local runningvar "`x'"
ereturn local cmdline "rdplot `0'"
ereturn local title "RD plot"
ereturn local cmd "rdplot"

****** polynomial equation for plots ******************
mat coef_l = e(coef_l)
Expand Down Expand Up @@ -822,8 +868,22 @@ if ("`covs_eval'"=="mean" & "`covs'"!="") {
}
}
}

restore

}
local _rc = _rc
cwf `_orig_frame'
frame drop `_work_frame'
if `_rc' {
* Error path: best-effort per-name Mata cleanup before reraising.
capture mata: _mtx = direxternal("*"); st_local("_mata_after", rows(_mtx) ? invtokens(_mtx') : "")
capture mata: mata drop _mtx
local _mata_new : list _mata_after - _mata_before
foreach _mname of local _mata_new {
capture mata mata drop `_mname'
}
error `_rc'
}
}



Expand Down Expand Up @@ -853,7 +913,25 @@ if ("`genvars'"!="") {
}
}

mata mata clear
* Drop transient matrices/scalars used as Mata-to-Stata transport buffers
* so they don't leak into the caller's namespace.
cap matrix drop coef_l coef_r
cap matrix drop J_es_hat_dw J_qs_hat_dw J_es_chk_dw J_qs_chk_dw
cap matrix drop J_es_hat_mv J_qs_hat_mv J_es_chk_mv J_qs_chk_mv
cap scalar drop M_l M_r nbins_l nbins_r
cap scalar drop J_star_l J_star_r J_star_l_orig J_star_r_orig
cap scalar drop bin_avg_l bin_avg_r bin_med_l bin_med_r
cap scalar drop J_star_l_IMSE J_star_r_IMSE J_star_l_MV J_star_r_MV
cap scalar drop scale_l scale_r

* Drop only the Mata externals we created (set difference vs the
* entry-time snapshot). Library functions stay loaded; the user's
* own Mata variables are preserved.
mata: _mtx = direxternal("*"); st_local("_mata_after", rows(_mtx) ? invtokens(_mtx') : "")
mata: mata drop _mtx
local _mata_new : list _mata_after - _mata_before
if `"`_mata_new'"' != "" mata mata drop `_mata_new'

end


Expand Down
Binary file modified stata/rdplot.pdf
Binary file not shown.
19 changes: 16 additions & 3 deletions stata/rdplot.sthlp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
{smcl}
{* *!version 10.0.0 2026-05-15}{...}
{* *!version 11.0.0 2026-05-13}{...}
{viewerjumpto "Syntax" "rdplot##syntax"}{...}
{viewerjumpto "Description" "rdplot##description"}{...}
{viewerjumpto "Options" "rdplot##options"}{...}
Expand Down Expand Up @@ -42,7 +42,7 @@
{marker description}{...}
{title:Description}

{p 4 8}{cmd:rdplot} implements several data-driven Regression Discontinuity (RD) plots, using either evenly-spaced or quantile-spaced partitioning. Two types of RD plots are constructed: (i) RD plots with binned sample means tracing out the underlying regression function, and (ii) RD plots with binned sample means
{p 4 8}{cmd:rdplot} implements several data-driven Regression Discontinuity (RD) plots, using either evenly-spaced or quantile-spaced partitioning. Two type of RD plots are constructed: (i) RD plots with binned sample means tracing out the underlying regression function, and (ii) RD plots with binned sample means
mimicking the underlying variability of the data. For technical and methodological details see
{browse "https://rdpackages.github.io/references/Calonico-Cattaneo-Titiunik_2015_JASA.pdf":Calonico, Cattaneo and Titiunik (2015a)}.{p_end}

Expand All @@ -57,6 +57,8 @@ and {browse "https://rdpackages.github.io/references/Calonico-Cattaneo-Farrell-T

{p 8 8}{browse "https://rdpackages.github.io":https://rdpackages.github.io}{p_end}

{p 4 8}{it:Requires Stata 16 or later.}{p_end}


{marker options}{...}
{title:Options}
Expand Down Expand Up @@ -114,7 +116,7 @@ Default is {cmd:kernel(uniform)} (i.e., equal/no weighting to all observations o

{p 4 8}{cmd:covs_eval(}{it:covars_eval}{cmd:)} sets the evaluation points for the additional covariates, when included in the estimation. Options are: {opt 0} and {opt mean} (default).

{p 4 8}{cmd:covs_drop(}{it:covsdropoption}{cmd:)} assesses collinearity in additional covariates used for estimation and inference. Options {opt pinv} (default choice) and {opt invsym} drop collinear additional covariates, differing only in the type of inverse function used. Option {opt off} omits the check for collinear additional covariates.{p_end}
{p 4 8}{cmd:covs_drop(}{it:covsdropoption}{cmd:)} assess collinearity in additional covariates used for estimation and inference. Options {opt pinv} (default choice) and {opt invsym} drops collinear additional covariates, differing only in the type of inverse function used. Option {opt off} omits the check for collinear additional covariates.{p_end}

{dlgtab:Plot Options}

Expand Down Expand Up @@ -164,20 +166,31 @@ Default is {cmd:kernel(uniform)} (i.e., equal/no weighting to all observations o

{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Scalars}{p_end}
{synopt:{cmd:e(N)}}number of observations{p_end}
{synopt:{cmd:e(N_l)}}original number of observations to the left of the cutoff{p_end}
{synopt:{cmd:e(N_r)}}original number of observations to the right of the cutoff{p_end}
{synopt:{cmd:e(N_h_l)}}effective number of observations (given the bandwidth) to the left of the cutoff{p_end}
{synopt:{cmd:e(N_h_r)}}effective number of observations (given the bandwidth) to the right of the cutoff{p_end}
{synopt:{cmd:e(c)}}cutoff value{p_end}
{synopt:{cmd:e(p)}}order of the polynomial used for the fit{p_end}
{synopt:{cmd:e(J_star_l)}}selected number of bins to the left of the cutoff{p_end}
{synopt:{cmd:e(J_star_r)}}selected number of bins to the right of the cutoff{p_end}

{p2col 5 20 24 2: Macros}{p_end}
{synopt:{cmd:e(cmd)}}{cmd:rdplot}{p_end}
{synopt:{cmd:e(cmdline)}}command as typed{p_end}
{synopt:{cmd:e(title)}}title ({cmd:RD plot}){p_end}
{synopt:{cmd:e(depvar)}}name of dependent (outcome) variable{p_end}
{synopt:{cmd:e(runningvar)}}name of running variable{p_end}
{synopt:{cmd:e(binselect)}}method used to compute the optimal number of bins{p_end}

{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Matrices}{p_end}
{synopt:{cmd:e(coef_l)}}coefficients of the {it:p}-th order polynomial estimated to the left of the cutoff{p_end}
{synopt:{cmd:e(coef_r)}}coefficients of the {it:p}-th order polynomial estimated to the right of the cutoff{p_end}
{synopt:{cmd:e(coef_covs)}}coefficients of the additional covariates, only returned when {cmd:covs()} are used{p_end}
{synopt:{cmd:e(eq_l)}}polynomial equation (left of cutoff) as a string, suitable for overlaying the fit via {cmd:twoway function}{p_end}
{synopt:{cmd:e(eq_r)}}polynomial equation (right of cutoff) as a string, suitable for overlaying the fit via {cmd:twoway function}{p_end}


{marker references}{...}
Expand Down
Loading
Loading