fix: LightGBM 4.0+ compatibility for early_stopping_rounds=None by Olcmyk · Pull Request #2232 · microsoft/qlib

Olcmyk · 2026-05-23T14:08:42Z

Description

This PR fixes LightGBM 4.0+ compatibility issue where TypeError: early_stopping_round should be an integer. Got 'NoneType' is raised when early_stopping_rounds=None is passed to the model.

Starting from LightGBM 4.0, the lgb.early_stopping() function no longer accepts None as a parameter and requires an integer value. This PR modifies the code to only create the early stopping callback when early_stopping_rounds is not None.

Changes:

Modified qlib/contrib/model/gbdt.py: Build callbacks list dynamically, only add early_stopping callback when rounds is not None
Modified qlib/contrib/model/highfreq_gdbt_model.py: Apply same pattern for consistency and robustness

Motivation and Context

Related Issues:

Fixes issue UnpicklingError: Forbidden class: qlib.contrib.data.handler.Alpha158 #2130 (partially - requires PR fix(unpickler): allow Alpha158/Alpha360 handlers and the standard dataset chain #2213 and PR Fix/pickle whitelist zscore #2230 first)
Depends on PR Fix/pickle whitelist zscore #2230 (zscore pickle whitelist fix)

Problem:
After applying PR #2213 and PR #2230 (which fix the pickle whitelist issues), the DDG-DA workflow fails with:

TypeError: early_stopping_round should be an integer. Got 'NoneType'

Why this change is required:

LightGBM 4.0+ changed the API to require an integer for early_stopping_rounds
The DDG-DA workflow explicitly sets early_stopping_rounds=None to disable early stopping
The code directly passes None to lgb.early_stopping(), which is no longer allowed in LightGBM 4.0+

Root Cause:
In qlib/contrib/model/gbdt.py line 71-73, the code unconditionally creates an early stopping callback:

early_stopping_callback = lgb.early_stopping(
    self.early_stopping_rounds if early_stopping_rounds is None else early_stopping_rounds
)

When the value is None, LightGBM 4.0+ raises a TypeError.

Solution:
Only create the early stopping callback when the value is not None:

callbacks = []
early_stop_rounds = self.early_stopping_rounds if early_stopping_rounds is None else early_stopping_rounds
if early_stop_rounds is not None:
    callbacks.append(lgb.early_stopping(early_stop_rounds))

This pattern is already used in qlib/contrib/model/double_ensemble.py (lines 110-111).

How Has This Been Tested?

Pass the test by running: pytest qlib/tests/test_all_pipeline.py under upper directory of qlib.
If you are adding a new feature, test on your own test scripts.

Additional Testing:

Tested with LightGBM 4.6.0 (latest version)
Verified DDG-DA workflow proceeds past the LightGBM error
Confirmed early stopping still works when a valid integer is provided
Confirmed training works when early_stopping_rounds=None (no early stopping)

Test Environment:

Python: 3.8.10
OS: Linux (Ubuntu 22.04)
LightGBM version: 4.6.0
qlib version: 0.9.8.dev33

Test Command:

cd examples/benchmarks_dynamic/DDG-DA
rm -rf mlruns
python workflow.py run

Screenshots of Test Results (if appropriate):

Pipeline test: ✅ Passed
Your own tests:

Before the fix:

TypeError: early_stopping_round should be an integer. Got 'NoneType'
File "qlib/contrib/model/gbdt.py", line 71, in fit
    early_stopping_callback = lgb.early_stopping(...)

After the fix:

train tasks: 100%|████████████████████████████████████| 154/154 [05:31<00:00,  2.15s/it]
calc: 100%|█████████████████████████████████████████| 154/154 [00:01<00:00, 100.48it/s]

The LightGBM TypeError is fixed and training completes all 154 tasks successfully.

Note: After fixing this issue, the DDG-DA workflow encounters another bug (TypeError: unhashable type: 'slice'), which is a separate issue in the data selection module and will be addressed in a follow-up PR.

Types of changes

Fix bugs
Add new feature
Update documentation

Additional Notes

Backward Compatibility:
This fix is fully backward compatible and works with both LightGBM < 4.0 and >= 4.0.

Dependencies:
This PR should be merged after:

PR Fix/pickle whitelist zscore #2230 (zscore pickle whitelist)

Without these PRs, the zscore unpickling error will occur before reaching this LightGBM issue.

References:

LightGBM 4.0 Release Notes: https://github.com/microsoft/LightGBM/releases/tag/v4.0.0
LightGBM early_stopping API: https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.early_stopping.html

⚠️ Important: This PR depends on #2230

This PR is based on PR Fix/pickle whitelist zscore #2230 and should be reviewed/merged after Fix/pickle whitelist zscore #2230 is merged.
The commit history includes commits from Fix/pickle whitelist zscore #2230 because this bug only appears after that fix is applied.

…aset chain The RestrictedUnpickler safelist introduced by the recent security hardening (microsoft#2099 / microsoft#2076 / microsoft#2153) only covered the abstract ``DataHandler`` / ``DataHandlerLP`` classes plus ``StaticDataLoader``. Any rolling workflow that pickles a real Dataset (the default for ``Rolling._train_rolling_tasks``) walks into one of the contrib stock handlers and now crashes on reload (issue microsoft#2130): UnpicklingError: Forbidden class: qlib.contrib.data.handler.Alpha158. Only whitelisted classes are allowed for security reasons. ... Unrolling workflows happened to use a path that did not go through the restricted loader, which is why downgrading to 0.9.7 hid the issue. Extend ``SAFE_PICKLE_CLASSES`` with the qlib-internal classes that sit on the standard recorder pickle graph: * The four shipped contrib handlers: ``Alpha158``, ``Alpha158vwap``, ``Alpha360``, ``Alpha360vwap``. * The dataset wrappers (``Dataset``, ``DatasetH``, ``TSDatasetH``) and the additional concrete loaders (``DataLoader``, ``DLWParser``, ``QlibDataLoader``, ``NestedDataLoader``, ``DataLoaderDH``). * Every concrete ``Processor`` defined in ``qlib.data.dataset.processor`` -- they show up in every realistic ``learn_processors`` / ``infer_processors`` chain. These are all classes already shipped inside qlib itself, so adding them does not weaken the threat model the safelist was designed against (arbitrary code execution through external pickle payloads). Add regression tests pinning each added entry plus an end-to-end check that ``RestrictedUnpickler.find_class`` actually resolves ``Alpha158`` and that other unknown classes are still rejected. Fixes microsoft#2130

PR microsoft#2213 added Alpha158/Alpha360 handlers to the pickle whitelist but missed qlib.utils.data.zscore, which is also required by the DDG-DA workflow. Without this, DDG-DA fails with: UnpicklingError: Forbidden class: qlib.utils.data.zscore This commit adds zscore to the whitelist and includes a test to prevent regression. Fixes microsoft#2130 (supplement to PR microsoft#2213)

LightGBM 4.0+ no longer accepts None for early_stopping_rounds parameter. This commit modifies the code to only create the early_stopping callback when early_stopping_rounds is not None. Changes: - qlib/contrib/model/gbdt.py: Build callbacks list dynamically, only add early_stopping callback when rounds is not None - qlib/contrib/model/highfreq_gdbt_model.py: Apply same pattern for consistency and robustness This fix allows DDG-DA workflow to proceed past the LightGBM TypeError when early_stopping is disabled by setting early_stopping_rounds=None. Fixes the error: TypeError: early_stopping_round should be an integer. Got 'NoneType' Note: This fix requires PR microsoft#2230 (zscore whitelist) to be applied first, otherwise the zscore unpickling error will occur before this issue. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…sues This commit fixes two critical bugs that prevented DDG-DA workflow from running: 1. **Unhashable list type error in InternalData.setup()** - Problem: data_key was a list [start_date, end_date], which cannot be used as dictionary keys or DataFrame column names - Fix: Convert list to tuple to make it hashable (line 99-100) 2. **Incorrect pandas indexing in _calc_perf()** - Problem: Used wrong syntax df.loc(axis=0)[:, "pred"] and group_keys=False caused loss of datetime index, leading to droplevel error - Fix: Remove group_keys=False and use df.xs("label", level=1) to correctly select from MultiIndex (line 112-114) 3. **Missing InternalData in pickle whitelist** - Problem: InternalData class was not whitelisted, causing UnpicklingError - Fix: Add InternalData to SAFE_PICKLE_CLASSES (pickle_utils.py line 91) Changes: - qlib/contrib/meta/data_selection/dataset.py: * Convert list to tuple for hashable dictionary keys * Fix _calc_perf to use correct pandas MultiIndex selection - qlib/utils/pickle_utils.py: * Add InternalData to pickle whitelist Testing: ✅ DDG-DA workflow now runs successfully to completion ✅ All 154 training tasks complete without errors ✅ Meta-learning data selection works correctly ✅ Final backtest results generated successfully This is a WORKING VERSION - DDG-DA workflow runs end-to-end! Related issues: - Depends on PR microsoft#2230 (zscore whitelist) - Depends on LightGBM 4.0+ compatibility fix Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

genisis0x and others added 4 commits May 12, 2026 15:43

Olcmyk closed this May 23, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: LightGBM 4.0+ compatibility for early_stopping_rounds=None#2232

fix: LightGBM 4.0+ compatibility for early_stopping_rounds=None#2232
Olcmyk wants to merge 4 commits into
microsoft:mainfrom
Olcmyk:fix/lightgbm-4.0-compatibility

Olcmyk commented May 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Olcmyk commented May 23, 2026

Description

Motivation and Context

How Has This Been Tested?

Screenshots of Test Results (if appropriate):

Types of changes

Additional Notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants