Quick Start Tutorial

automation edited this page Aug 8, 2025 · 1 revision

Basic Usage

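import pandas as pd
from statclean import StatClean

# Small sample with one obvious extreme value (100)
df = pd.DataFrame({'values': [1, 2, 3, 100, 4, 5, 6]})
cleaner = StatClean(df)

# Flag outliers in the 'values' column by z-score, then count the flagged rows
outliers = cleaner.detect_outliers_zscore('values')
print(f"Outliers detected: {outliers.sum()}")

# Remove the flagged rows; the cleaned data is available as clean_df
cleaner.remove_outliers_zscore('values')
cleaned_df = cleaner.clean_df
print(f"Cleaned shape: {cleaned_df.shape}")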
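
For intuition about what a z-score check computes on this sample, here is a minimal sketch in plain Python using only the standard library. It does not call StatClean. The cutoffs 3.0 and 2.0 are illustrative assumptions rather than StatClean's documented defaults, and the use of the sample standard deviation is likewise an assumption about how the library scales deviations.

```python
from statistics import mean, stdev

values = [1, 2, 3, 100, 4, 5, 6]

mu = mean(values)
sigma = stdev(values)  # sample standard deviation (n - 1 denominator); an assumption

# z-score: distance of each value from the mean, in standard deviations
z_scores = [(v - mu) / sigma for v in values]

def flag(scores, threshold):
    """Return True for every score whose magnitude exceeds the threshold."""
    return [abs(z) > threshold for z in scores]

flags_at_3 = flag(z_scores, 3.0)  # hypothetical strict cutoff
flags_at_2 = flag(z_scores, 2.0)  # hypothetical looser cutoff

print(f"z-score of 100: {z_scores[3]:.2f}")
print(f"Flagged at 3.0: {sum(flags_at_3)}")
print(f"Flagged at 2.0: {sum(flags_at_2)}")
```

With the sample standard deviation, no z-score in a sample of n points can exceed (n - 1) / sqrt(n), which is about 2.27 for n = 7. A cutoff of 3 therefore cannot flag anything in a sample this small, so it may be worth checking the threshold in use if the example above reports zero outliers.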

Statistical Testing

result = cleaner.grubbs_test('values', alpha=0.05)
print(f"P-value: {result['p_value']:.6f}")
print(f"Outlier detected: {result['is_outlier']}")
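
As background, Grubbs' test examines the single most extreme value in a sample that is assumed to be approximately normal. The sketch below computes only the test statistic G = max|x_i - mean| / s in plain Python; it does not call StatClean and does not reproduce the p-value, which requires a t-distribution quantile that the standard library does not provide. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

values = [1, 2, 3, 100, 4, 5, 6]
n = len(values)
mu = mean(values)
s = stdev(values)  # sample standard deviation

# The suspect point is the value farthest from the mean
suspect = max(values, key=lambda v: abs(v - mu))

# Grubbs' statistic: the largest absolute deviation, in units of s
g_stat = abs(suspect - mu) / s

# Upper bound on G for any sample of size n: (n - 1) / sqrt(n)
g_max = (n - 1) / sqrt(n)

print(f"Suspect value: {suspect}")
print(f"G = {g_stat:.3f} (upper bound for n={n}: {g_max:.3f})")
```

The test then compares G with a critical value that depends on n and alpha; a G above that value marks the suspect point as an outlier at the chosen significance level.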

Method Chaining

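# Each call returns the cleaner, so steps can be chained;
# clean_df at the end holds the resulting DataFrame
result = (cleaner
          .set_thresholds(zscore_threshold=2.5)
          .winsorize_outliers_iqr('values')
          .clean_df)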
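
To show what IQR-based winsorizing does to the sample data, here is a minimal standard-library sketch. It does not call StatClean. The 1.5 multiplier is the conventional Tukey fence and the "inclusive" quantile method is one common choice; both are assumptions here, and StatClean may compute its bounds differently.

```python
from statistics import quantiles

values = [1, 2, 3, 100, 4, 5, 6]

# Quartiles with linear interpolation over the observed range
q1, _median, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Tukey fences; 1.5 is the conventional multiplier (an assumption here)
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Winsorizing caps values at the fences instead of dropping them
winsorized = [min(max(v, lower), upper) for v in values]

print(f"Bounds: [{lower}, {upper}]")
print(f"Winsorized: {winsorized}")
```

Unlike removal, winsorizing keeps every row: the extreme value 100 is pulled in to the upper fence, so the sample size is unchanged.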
