Statistical Methods Guide
automation edited this page Aug 8, 2025 · 1 revision
- IQR: robust to non-normal data; configure lower/upper factors.
- Z-score: assumes approximate normality; configurable threshold.
- Modified Z-score: robust via MAD; default threshold 3.5.
- Mahalanobis distance: detects multivariate outliers using covariance structure.
  - `chi2_threshold`: percentile in (0,1] or absolute chi-square statistic.
  - `use_shrinkage=True` to enable Ledoit–Wolf covariance if scikit-learn is available.
- Grubbs' test: single outlier detection with p-value and critical value.
- Dixon's Q-test: for small n (<30); approximate p-value reported.
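The three univariate rules at the top of this list (IQR, Z-score, modified Z-score) can be sketched with the Python standard library alone. This is a minimal illustration, not the guide's implementation: the function names, the IQR factors of 1.5, and the Z-score threshold of 3.0 are assumptions; only the modified Z-score threshold of 3.5 comes from the list above.

```python
import statistics

def iqr_outliers(values, lower_factor=1.5, upper_factor=1.5):
    """Flag points outside [Q1 - lower_factor*IQR, Q3 + upper_factor*IQR].

    The 1.5 factors are the conventional Tukey fences (an assumption here).
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - lower_factor * iqr, q3 + upper_factor * iqr
    return [v < low or v > high for v in values]

def zscore_outliers(values, threshold=3.0):
    """Flag points whose |z| exceeds the threshold (3.0 is an assumed default)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [abs(v - mean) / sd > threshold for v in values]

def modified_zscore_outliers(values, threshold=3.5):
    """Flag points using the median and MAD instead of mean and std."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case: more than half the points equal the median.
        return [False] * len(values)
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # under normality.
    return [0.6745 * abs(v - med) / mad > threshold for v in values]

data = [10, 12, 11, 13, 12, 11, 10, 12, 100]
flags_iqr = iqr_outliers(data)
flags_z = zscore_outliers(data)
flags_mod = modified_zscore_outliers(data)
```

On this small sample the plain Z-score does not flag the value 100, because the outlier itself inflates the mean and standard deviation, while the IQR and MAD-based rules both flag it. That masking effect is the practical reason to prefer the robust rules on small or contaminated samples.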
- Box-Cox (positive data): optimal lambda estimated; preserves NaNs.
- Log (natural, base 10, base 2): shifts applied for non-positive values.
- Square-root: shifts applied for negatives.
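The transformation behaviour described above can be sketched with NumPy and SciPy. The helper names are hypothetical, and the shift amounts are assumptions: the list says shifts are applied but not how large they are, so this sketch moves the minimum to 1 before a log and to 0 before a square root. `scipy.stats.boxcox` estimates lambda when none is supplied.

```python
import numpy as np
from scipy import stats

def boxcox_preserve_nan(x):
    """Box-Cox on the non-NaN entries; NaNs stay in their original positions."""
    x = np.asarray(x, dtype=float)
    valid = ~np.isnan(x)
    if np.any(x[valid] <= 0):
        raise ValueError("Box-Cox requires strictly positive data")
    transformed, lam = stats.boxcox(x[valid])  # lambda estimated by SciPy
    out = np.full_like(x, np.nan)
    out[valid] = transformed
    return out, lam

def shifted_log(x, base=np.e):
    """Log transform; if any value is <= 0, shift so the minimum becomes 1."""
    x = np.asarray(x, dtype=float)
    xmin = np.nanmin(x)
    shift = 1.0 - xmin if xmin <= 0 else 0.0  # assumed shift rule
    return np.log(x + shift) / np.log(base), shift

def shifted_sqrt(x):
    """Square root; if any value is negative, shift so the minimum becomes 0."""
    x = np.asarray(x, dtype=float)
    xmin = np.nanmin(x)
    shift = -xmin if xmin < 0 else 0.0  # assumed shift rule
    return np.sqrt(x + shift), shift

bc, lam = boxcox_preserve_nan([1.0, 2.0, np.nan, 4.0, 8.0])
logged, log_shift = shifted_log([-3.0, 0.0, 5.0], base=10)
rooted, sqrt_shift = shifted_sqrt([-4.0, 0.0, 5.0])
```

Returning the shift (and lambda) alongside the transformed values lets a caller invert the transformation later.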
Best practices: drop NaNs before running tests that cannot handle them; for large datasets, run the Shapiro-Wilk test on a random subsample rather than the full data.
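A minimal sketch of that preparation step, using only the standard library. The function name, the cap of 5000 points, and the fixed seed are assumptions; the cap is chosen because SciPy's documentation notes that the Shapiro-Wilk p-value may be inaccurate above that size.

```python
import math
import random

def prepare_for_normality_test(values, max_n=5000, seed=0):
    """Drop NaN/None entries, then subsample without replacement if too large."""
    clean = [v for v in values if v is not None and not math.isnan(v)]
    if len(clean) > max_n:
        # A seeded generator keeps the subsample reproducible between runs.
        clean = random.Random(seed).sample(clean, max_n)
    return clean

raw = [float(i) for i in range(10_000)] + [float("nan")] * 5
sample = prepare_for_normality_test(raw)

# With SciPy installed, the cleaned sample could then be tested with:
#   from scipy import stats
#   statistic, p_value = stats.shapiro(sample)
```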
Warning: Dixon’s Q-test is recommended only for small sample sizes (n < 30).
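One way to enforce that limit is to reject large samples before computing the statistic. The sketch below is an illustration under stated assumptions: the function name is hypothetical, it computes only the simplest gap-over-range ratio (often called r10) for the lowest and highest values, and it does not include critical values or the approximate p-value, which would have to come from a published table or the tool itself.

```python
def dixon_q_statistics(values, max_n=30):
    """Return (Q_low, Q_high), the gap/range ratios for the two extreme values.

    Raises ValueError when the sample is too small to define the ratio or is
    at or above the small-sample limit (n < 30) stated in this guide.
    """
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("Dixon's Q-test needs at least 3 observations")
    if n >= max_n:
        raise ValueError(f"Dixon's Q-test is intended for n < {max_n}; got n = {n}")
    spread = x[-1] - x[0]
    if spread == 0:
        raise ValueError("all values are identical; Q is undefined")
    q_low = (x[1] - x[0]) / spread     # gap below the second-smallest value
    q_high = (x[-1] - x[-2]) / spread  # gap above the second-largest value
    return q_low, q_high

measurements = [0.189, 0.167, 0.187, 0.183, 0.186,
                0.182, 0.181, 0.184, 0.181, 0.177]
q_low, q_high = dixon_q_statistics(measurements)
```

The larger of the two ratios points at the more suspicious end of the sample; here that is the lowest value, 0.167.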