@@ -6,16 +6,16 @@
 
 # 2024 Landscape Analysis
 
-Python is widely adopted in data science, and its use for statistics is expanding rapidly---particularly in education and applied research.
+Python is widely adopted in data science, and its use for statistics is expanding rapidly—particularly in education and applied research.
 The statistical ecosystem in Python is currently anchored by six major libraries:
 
 - [numpy](https://www.numpy.org/), which provides fast, flexible array and numerical operations, and underpins nearly all statistical and scientific computing in Python.
 It supports descriptive statistics, correlation and covariance computations, random sampling, and tools for constructing histograms and binning data.
 - [pandas](https://www.pandas.org/), which offers intuitive, high-performance data structures for tabular and time series data, making data cleaning, wrangling, reshaping, aggregation, and exploratory analysis straightforward and efficient.
-- [scipy](https://www.scipy.org/), which builds on NumPy to deliver a broad range of scientific and statistical functionality---including, in its [`scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html) submodule, a comprehensive suite of probability distributions, summary statistics, and basic statistical tests.
+- [scipy](https://www.scipy.org/), which builds on NumPy to deliver a broad range of scientific and statistical functionality—including, in its [`scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html) submodule, a comprehensive suite of probability distributions, summary statistics, and basic statistical tests.
 It also provides modules for clustering, optimization, interpolation, and signal processing.
 - [matplotlib](https://matplotlib.org/), the foundational plotting library in Python, which enables the creation of high-quality static, animated, and interactive visualizations, and serves as the basis for many higher-level plotting and statistical graphics libraries.
-- [statsmodels](https://www.statsmodels.org/), which offers tools for econometrics, classical statistics, and statistical modeling---including linear and generalized linear models, time series analysis, survival analysis, and hypothesis testing, with extensive support for model diagnostics and statistical inference.
+- [statsmodels](https://www.statsmodels.org/), which offers tools for econometrics, classical statistics, and statistical modeling—including linear and generalized linear models, time series analysis, survival analysis, and hypothesis testing, with extensive support for model diagnostics and statistical inference.
 - [scikit-learn](https://scikit-learn.org/), which is best known for machine learning but also supports statistical modeling, offering a consistent API for regression, classification, clustering, model evaluation, statistical preprocessing, and dimensionality reduction.
 
 These core libraries are generally well-tested, reliable, and uphold high software engineering standards, making them trusted foundations for research and application.