| 1 | +# Changelog |
| 2 | + |
| 3 | +All notable changes to AliasDataFrame will be documented in this file. |
| 4 | + |
| 5 | +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## [Unreleased] |
| 10 | + |
| 11 | +## [1.1.0] - 2025-01-09 |
| 12 | + |
| 13 | +### Added |
| 14 | +- **Selective compression mode (Pattern 2)** - Compress specific columns from a larger schema |
| 15 | + - New API: `compress_columns(spec, columns=['dy', 'dz'])` |
| 16 | + - Enables incremental compression workflows |
| 17 | + - Only specified columns are registered and compressed |
| 18 | +- **Idempotent compression** - Re-compressing with the same schema is safe (no-op) |
| 19 | + - Prevents errors in automation and scripting |
| 20 | + - Useful for incremental data collection |
| 21 | +- **Schema updates** - Update compression schema for specific columns |
| 22 | + - Works for SCHEMA_ONLY and DECOMPRESSED states |
| 23 | + - Errors on COMPRESSED state (must decompress first) |
| 24 | +- **Enhanced validation** - Column existence checked before compression |
| 25 | + - Clear error messages with available columns listed |
| 26 | + - Validates that requested columns are present in the compression spec |
| 27 | +- **Pattern mixing support** - Combine Pattern 1 and Pattern 2 |
| 28 | + - Pattern 1: Schema-first (define all, compress incrementally) |
| 29 | + - Pattern 2: On-demand (compress as needed) |
| 30 | + - Column-local schema semantics (schemas can diverge) |
| 31 | + |
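The schema-update rule above (allowed in SCHEMA_ONLY and DECOMPRESSED, rejected in COMPRESSED) can be illustrated with a small guard. This is a minimal sketch, not AliasDataFrame's actual code: the `CompressionState` enum and the `check_schema_update` helper are hypothetical names introduced only for illustration; only the three state names come from this changelog.

```python
from enum import Enum


class CompressionState(Enum):
    # The three states named in this changelog.
    SCHEMA_ONLY = "SCHEMA_ONLY"
    COMPRESSED = "COMPRESSED"
    DECOMPRESSED = "DECOMPRESSED"


def check_schema_update(column: str, state: CompressionState) -> None:
    """Hypothetical guard: raise if the column's schema cannot be updated."""
    if state is CompressionState.COMPRESSED:
        # Matches the changelog rule: a compressed column must be
        # decompressed before its schema is changed.
        raise ValueError(
            f"Cannot update schema for '{column}' while COMPRESSED; "
            "decompress first."
        )
    # SCHEMA_ONLY and DECOMPRESSED fall through: the update is allowed.


check_schema_update("dy", CompressionState.DECOMPRESSED)  # passes silently
```

Keeping the check per column reflects the column-local schema semantics mentioned above, where each column carries its own state.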
| 32 | +### Changed |
| 33 | +- `compress_columns()` now supports 5 modes (previously 3): |
| 34 | + 1. Schema-only definition: `compress_columns(spec, columns=[])` |
| 35 | + 2. Apply existing schema: `compress_columns(columns=['dy'])` |
| 36 | + 3. Compress all in spec: `compress_columns(spec)` |
| 37 | + 4. **Selective compression (NEW)**: `compress_columns(spec, columns=['dy', 'dz'])` |
| 38 | + 5. Auto-compress eligible: `compress_columns()` |
| 39 | +- Improved error messages for compression failures |
| 40 | + - Specific guidance for state transition errors |
| 41 | + - Clear suggestions for resolution |
| 42 | +- Updated documentation with comprehensive examples |
| 43 | + |
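As a rough illustration of how the five call shapes listed above differ, the following sketch maps the `spec` and `columns` arguments to a mode name. It is an assumption-laden stand-in, not the library's dispatch logic: `classify_mode` is a hypothetical helper, and the contents of the example `spec` dict are placeholders because this changelog does not describe the spec format.

```python
from typing import Optional, Sequence


def classify_mode(spec: Optional[dict] = None,
                  columns: Optional[Sequence[str]] = None) -> str:
    """Hypothetical helper: name the mode implied by the argument pattern."""
    if spec is None and columns is None:
        return "auto-compress eligible"       # compress_columns()
    if spec is None:
        # compress_columns(columns=['dy']); the changelog does not say what
        # happens for spec=None with an empty list, so it is not special-cased.
        return "apply existing schema"
    if columns is None:
        return "compress all in spec"         # compress_columns(spec)
    if len(columns) == 0:
        return "schema-only definition"       # compress_columns(spec, columns=[])
    return "selective compression"            # compress_columns(spec, columns=['dy', 'dz'])


# Placeholder spec; the real per-column entries are not shown in this changelog.
spec = {"dy": {}, "dz": {}, "dx": {}}
print(classify_mode(spec, columns=["dy", "dz"]))
```

Note that an empty list and an omitted `columns` argument select different modes, so callers building the list dynamically may want to check for the empty case explicitly.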
| 44 | +### Fixed |
| 45 | +- None (fully backward compatible) |
| 46 | + |
| 47 | +### Performance |
| 48 | +- Negligible overhead from new validation (O(1) dict lookups) |
| 49 | +- No regression in existing compression performance |
| 50 | +- Validated with 9.6M row TPC residual dataset |
| 51 | + |
| 52 | +### Documentation |
| 53 | +- Added `docs/COMPRESSION_GUIDE.md` with comprehensive usage guide |
| 54 | +- Updated method docstrings with Pattern 2 examples |
| 55 | +- Added state machine documentation |
| 56 | +- Added troubleshooting section |
| 57 | + |
| 58 | +### Testing |
| 59 | +- Added 10 comprehensive tests for selective compression mode |
| 60 | +- All 61 tests passing |
| 61 | +- Test coverage: ~95% |
| 62 | +- No regression in existing functionality |
| 63 | + |
| 64 | +### Use Case |
| 65 | +Enables incremental compression for TPC residual analysis: |
| 66 | +- 9.6M cluster-track residuals |
| 67 | +- 8 compressed columns |
| 68 | +- 508 MB → 330 MB (35% file size reduction) |
| 69 | +- Sub-micrometer precision maintained |
| 70 | +- Compress columns incrementally as data is collected |
| 71 | + |
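The quoted 35% figure follows directly from the two file sizes. A quick check, using only the numbers stated above (the helper name `size_reduction_percent` is made up for this example):

```python
def size_reduction_percent(before_mb: float, after_mb: float) -> float:
    """Relative size reduction, in percent of the original size."""
    return 100.0 * (before_mb - after_mb) / before_mb


reduction = size_reduction_percent(508, 330)
print(f"{reduction:.1f}%")  # 35.0%
```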
| 72 | +--- |
| 73 | + |
| 74 | +## [1.0.0] - 2024-XX-XX |
| 75 | + |
| 76 | +### Added |
| 77 | +- Initial compression/decompression implementation |
| 78 | +- State machine with 3 states (COMPRESSED, DECOMPRESSED, SCHEMA_ONLY) |
| 79 | +- Bidirectional compression with mathematical transforms |
| 80 | +- Lazy decompression via aliases |
| 81 | +- Precision measurement (RMSE, max error, mean error) |
| 82 | +- Schema persistence across save/load cycles |
| 83 | +- Forward declaration support ("zero pointer" pattern) |
| 84 | +- Collision detection for compressed column names |
| 85 | +- ROOT TTree export with compression aliases |
| 86 | +- Comprehensive test suite |
| 87 | + |
| 88 | +### Features |
| 89 | +- Compress columns using expression-based transforms |
| 90 | +- Decompress columns with optional schema retention |
| 91 | +- Measure compression quality metrics |
| 92 | +- Save/load compressed DataFrames |
| 93 | +- Export to ROOT with decompression aliases |
| 94 | +- Recompress after modification |
| 95 | + |
| 96 | +### Documentation |
| 97 | +- Complete API documentation |
| 98 | +- Usage examples |
| 99 | +- State machine explanation |
| 100 | + |
| 101 | +--- |
| 102 | + |
| 103 | +## Version Numbering |
| 104 | + |
| 105 | +This project uses [Semantic Versioning](https://semver.org/): |
| 106 | +- **MAJOR** version for incompatible API changes |
| 107 | +- **MINOR** version for new functionality (backward compatible) |
| 108 | +- **PATCH** version for bug fixes (backward compatible) |
| 109 | + |
| 110 | +--- |
| 111 | + |
| 112 | +## Contributing |
| 113 | + |
| 114 | +When adding entries to this changelog: |
| 115 | +1. Add new changes to the [Unreleased] section |
| 116 | +2. Move entries to a versioned section on release |
| 117 | +3. Follow the format: Added / Changed / Deprecated / Removed / Fixed / Security |
| 118 | +4. Include use cases and examples for major changes |
| 119 | +5. Note backward compatibility status |
| 120 | + |
| 121 | +--- |
| 122 | + |
| 123 | +**Last Updated:** 2025-01-09 |