A loadable extension that adds 249 statistical functions to SQLite3.
sqlite3-stats (formerly sqlite3StatisticalLibrary) is a SQLite3 loadable extension that provides 249 statistical functions callable directly from SQL. Built on statcpp (a C++17 header-only statistics library with 524 functions), this extension exposes a curated subset of statistical capabilities as native SQL functions.
All 249 functions are verified by 266 integration tests and 388 Google Test automated tests.
- 249 SQL functions covering descriptive statistics, hypothesis testing, distributions, resampling, and more
- Zero dependencies at runtime — a single shared library (
.dylib/.so/.dll) - Standard SQL interface — use with
SELECT,GROUP BY,HAVING, subqueries, etc. - Window function support — rolling statistics, moving averages, outlier detection
- JSON output — complex results (multi-value, frequency tables, test results) returned as JSON
- Cross-platform — macOS, Linux, and Windows (MSVC)
| Category | Count | Description |
|---|---|---|
| Basic aggregates | 24 | Single-column: SELECT stat_xxx(col) FROM t |
| Parameterized aggregates | 20 | With parameters: SELECT stat_xxx(col, param) FROM t |
| Two-column aggregates | 27 | Two-column input: SELECT stat_xxx(col1, col2) FROM t |
| Window functions | 23 | Full-scan window functions returning one value per row |
| Complex aggregates | 32 | JSON-returning aggregates, two-sample tests, survival analysis |
| Scalar — tests/helpers | 40 | Distribution functions, proportion tests, multiple testing corrections |
| Scalar — distributions | 83 | Continuous/discrete distributions, effect size conversions, power analysis |
- CMake 3.20+
- C++17 compiler: Apple Clang 17+, GCC 13+, or MSVC (Visual Studio 2022+)
- SQLite3 (downloaded automatically by CMake)
- statcpp (downloaded automatically by CMake)
git clone https://github.com/mitsuruk/sqlite3-stats.git
cd sqlite3-stats
cmake -B build
cmake --build buildgit clone https://github.com/mitsuruk/sqlite3-stats.git
cd sqlite3-stats
cmake -B build
cmake --build build --config ReleaseThis produces:
ext_funcs.dylib(macOS) /ext_funcs.so(Linux) /ext_funcs.dll(Windows) — the loadable extensiona.out/a.out.exe— test runner (266 tests)
./build/a.out
# [OK] All tests completed (266 tests)..\build\Release\a.out.exe
# [OK] All tests completed (266 tests).Requires Google Test (brew install googletest on macOS).
Note: Google Test is not supported on Windows (MSVC) due to DLL boundary issues with dynamic linking.
cmake -B build -DSTAT_TESTS=true
cmake --build build
# Run via CTest
ctest --test-dir build --output-on-failure
# Run the Google Test executable directly
./build/stat_tests
# Run a specific test suite
./build/stat_tests --gtest_filter="BasicAggregates.*"| Test file | Functions | Tests |
|---|---|---|
| basic_aggregates_test.cpp | 24 | 55 |
| parameterized_aggregates_test.cpp | 20 | 46 |
| two_column_aggregates_test.cpp | 27 | 57 |
| window_functions_test.cpp | 23 | 41 |
| complex_aggregates_test.cpp | 32 | 63 |
| scalar_tests_helpers_test.cpp | 40 | 43 |
| scalar_distributions_test.cpp | 83 | 83 |
| Total | 249 | 388 |
-- .load command (recommended)
.load ./ext_funcs sqlite3_ext_funcs_init
-- Or via load_extension()
SELECT load_extension('./ext_funcs', 'sqlite3_ext_funcs_init');sqlite3_enable_load_extension(db, 1);
// macOS
sqlite3_load_extension(db, "./ext_funcs.dylib", "sqlite3_ext_funcs_init", &errmsg);
// Linux
sqlite3_load_extension(db, "./ext_funcs.so", "sqlite3_ext_funcs_init", &errmsg);
// Windows
sqlite3_load_extension(db, "./ext_funcs.dll", "sqlite3_ext_funcs_init", &errmsg);Note: The entry point
sqlite3_ext_funcs_initmust be specified explicitly. SQLite's auto-detection strips underscores and will not find it automatically.
SELECT stat_mean(score), stat_stdev(score), stat_median(score)
FROM exam_results;
SELECT class, stat_mean(score), stat_cv(score)
FROM exam_results
GROUP BY class;-- One-sample t-test (H0: mu = 50)
SELECT stat_t_test(score, 50) FROM exam_results;
-- Two-sample Welch's t-test (returns JSON)
SELECT stat_t_test_welch(value, group_id) FROM measurements;SELECT date, price,
stat_rolling_mean(price, 7) OVER (ORDER BY date) AS ma7,
stat_ema(price, 0.3) OVER (ORDER BY date) AS ema
FROM stock_prices;-- Normal distribution quantile (z-value for 95%)
SELECT stat_normal_quantile(0.975, 0, 1); -- 1.96
-- Power analysis: required sample size for two-sample t-test
SELECT stat_n_t2(0.5, 0.05, 0.80); -- effect size=0.5, alpha=0.05, power=0.80- NULL handling: NULL values are silently ignored in all aggregate/window functions.
- Insufficient data: Returns
NULLwhen there are not enough non-NULL values (e.g., variance requires n >= 2). - Numeric precision: IEEE 754 double precision (64-bit).
NaNandInfare mapped toNULL. - JSON output: Complex aggregates return results as JSON strings (objects or arrays).
Full documentation for all 249 functions:
sqlite3-stats/
├── CMakeLists.txt
├── README.md # This file (English)
├── README-ja.md # Japanese README
├── tests/
│ ├── test_helpers.hpp # Google Test fixture & helpers
│ ├── basic_aggregates_test.cpp # 24 functions, 55 tests
│ ├── parameterized_aggregates_test.cpp # 20 functions, 46 tests
│ ├── two_column_aggregates_test.cpp # 27 functions, 57 tests
│ ├── window_functions_test.cpp # 23 functions, 41 tests
│ ├── complex_aggregates_test.cpp # 32 functions, 63 tests
│ ├── scalar_tests_helpers_test.cpp # 40 functions, 43 tests
│ └── scalar_distributions_test.cpp # 83 functions, 83 tests
├── src/
│ ├── ext_funcs.cpp # Extension: 249 SQL functions
│ ├── main.cpp # Legacy test runner (266 tests)
│ └── include/ # Local headers
├── doc/
│ ├── function_reference.md # Function reference hub (English)
│ ├── function_reference-ja.md # Function reference hub (Japanese)
│ ├── sqlite3lib_LOAD_EXTENSION.md # Extension implementation guide (English)
│ ├── sqlite3lib_LOAD_EXTENSION-ja.md # Extension implementation guide (Japanese)
│ └── ref/ # Per-category function details
│ ├── basic_aggregates.md # English
│ ├── basic_aggregates-ja.md # Japanese
│ ├── parameterized_aggregates.md
│ ├── parameterized_aggregates-ja.md
│ ├── two_column_aggregates.md
│ ├── two_column_aggregates-ja.md
│ ├── window_functions.md
│ ├── window_functions-ja.md
│ ├── complex_aggregates.md
│ ├── complex_aggregates-ja.md
│ ├── scalar_tests_helpers.md
│ ├── scalar_tests_helpers-ja.md
│ ├── scalar_distributions.md
│ └── scalar_distributions-ja.md
├── cmake/
│ ├── sqlite3.cmake # SQLite3 auto-download
│ └── statcpp.cmake # statcpp auto-download
└── download/ # Auto-downloaded dependencies
├── sqlite3/
└── statcpp/
sqlite3-stats is part of the statcpp family. Choose the one that fits your use case:
| Use case | Repository | Description |
|---|---|---|
| C++ library | statcpp | C++17 header-only statistics library (524 functions) |
| UNIX CLI | statcppCLI | Command-line interface for UNIX pipelines |
| SQL (SQLite3) | sqlite3-stats (this repo) | SQLite3 loadable extension (249 functions) |
Required statcpp version: v0.2.0 or later
- macOS + Apple Clang 21.0.0
- Ubuntu + GCC 13
- Windows 11 (ARM64, Parallels) + MSVC 19.x (Visual Studio 2026 Community)
This project is licensed under the MIT License.
| Library | License |
|---|---|
| statcpp | MIT License |
| SQLite3 | Public Domain |
This project uses the following tools and AI assistants:
- statcpp — C++17 header-only statistics library (524 functions)
- Claude Code for VS Code (Opus 4.5/4.6) — Code generation, refactoring, documentation
- OpenAI ChatGPT 5.2 — Documentation review
- LM Studio google/gemma-2-27b — Documentation review
Disclaimer: This library is not intended to replace commercial statistical software in terms of numerical stability or edge-case handling. When using results in research or production, we recommend cross-validating with established tools such as R or Python/SciPy.