docs(bench): set default to 5k groups; document 30% outlier scenario
- Default benchmark: 5 rows/group, 5k groups (faster, still representative)
- Added 30% outlier scenario to examples; clarified that response-only outliers
don’t trigger slow robust re-fits
- Updated example tables for Mac and Linux with new per-1k-group timings
- (optional) bench CLI default --groups=5000
Adds the following section under **“Interpretation”** in `groupby_regression.md`:
---
### Cross-Platform Comparison (Mac vs Linux)
Benchmark results on a Linux server (Apptainer, Python 3.11, joblib 1.4) show similar scaling but roughly **2–2.5× longer wall times** than on a MacBook Pro (i7).
For the baseline case of 50k rows / 10k groups (~5 rows/group):
| Scenario | Mac (s / 1k groups) | Linux (s / 1k groups) | Ratio (Linux / Mac) |
Parallel efficiency on Linux (≈5× speed-up from 1 to 10 jobs) matches the Mac results exactly.
The wall-time difference reflects platform-specific factors such as CPU frequency, BLAS implementation, and process-spawn overhead in Apptainer, not algorithmic changes.
Overall, **scaling behavior and outlier stability are identical across platforms.**
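The normalization behind the comparison can be sketched as follows. This is a minimal illustration, not the benchmark's own code: the helper names (`seconds_per_1k_groups`, `platform_ratio`, `parallel_speedup`) are hypothetical, and the timings passed in are placeholders chosen for easy arithmetic, not measured results.

```python
def seconds_per_1k_groups(wall_time_s: float, n_groups: int) -> float:
    """Normalize a total wall time to seconds per 1,000 groups."""
    if n_groups <= 0:
        raise ValueError("n_groups must be positive")
    return wall_time_s * 1000.0 / n_groups


def platform_ratio(linux_s: float, mac_s: float) -> float:
    """Linux / Mac ratio of two timings expressed in the same unit."""
    return linux_s / mac_s


def parallel_speedup(t_serial_s: float, t_parallel_s: float, n_jobs: int):
    """Return (speed-up, efficiency), where efficiency = speed-up / n_jobs."""
    speedup = t_serial_s / t_parallel_s
    return speedup, speedup / n_jobs


# Placeholder timings for illustration only (NOT measured results).
mac = seconds_per_1k_groups(wall_time_s=4.0, n_groups=10_000)
linux = seconds_per_1k_groups(wall_time_s=9.0, n_groups=10_000)
print(f"Mac: {mac:.2f} s/1k groups, Linux: {linux:.2f} s/1k groups, "
      f"ratio: {platform_ratio(linux, mac):.2f}")

speedup, efficiency = parallel_speedup(t_serial_s=10.0, t_parallel_s=2.0, n_jobs=10)
print(f"speed-up: {speedup:.1f}x, efficiency: {efficiency:.2f}")
```

Under this definition, a 5× speed-up on 10 jobs corresponds to a parallel efficiency of 0.5. Expressing timings per 1k groups keeps runs with different group counts comparable.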
---
### Future Work
A future extension will introduce **leverage‑outlier** generation (outliers in X and Y) to replicate the observed 25× slowdown and allow comparative testing of different robust fitters.
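One way such a generator might look is sketched below, using only the standard library. Everything here is an assumption for illustration: the function name `make_grouped_data`, the linear model `y = 1 + 2x`, the noise scales, and the choice to apply the outlier fraction per row rather than per group are not taken from the existing benchmark.

```python
import random


def make_grouped_data(n_groups=5000, rows_per_group=5, outlier_frac=0.30,
                      leverage=False, seed=0):
    """Return a list of (group, x, y, is_outlier) rows.

    Hypothetical generator: with leverage=False only the response y is
    contaminated; with leverage=True the outlier rows are also displaced
    in x, which makes them leverage points (outliers in X and Y).
    """
    rng = random.Random(seed)
    rows = []
    for g in range(n_groups):
        for _ in range(rows_per_group):
            x = rng.gauss(0.0, 1.0)
            # Assumed clean model: y = 1 + 2x plus small Gaussian noise.
            y = 1.0 + 2.0 * x + rng.gauss(0.0, 0.1)
            is_outlier = rng.random() < outlier_frac
            if is_outlier:
                y += rng.gauss(0.0, 10.0)      # response outlier
                if leverage:
                    x += rng.gauss(0.0, 10.0)  # also an outlier in x
            rows.append((g, x, y, is_outlier))
    return rows


# Small example: same settings, with and without leverage outliers.
response_only = make_grouped_data(n_groups=1000, seed=1)
with_leverage = make_grouped_data(n_groups=1000, leverage=True, seed=1)
print(len(response_only), sum(r[3] for r in response_only))
print(max(abs(r[1]) for r in with_leverage if r[3]))
```

Feeding both datasets to the same robust fitter would let the two outlier types be timed side by side. Whether this construction reproduces the observed slowdown is not established here.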