Closed
Labels: bug, serverside (These issues are present in the REST API and not fixable by the Python package.)
Description
The estimation_procedure_id does not always seem to correspond with the displayed estimation procedure. I came across this when reproducing tasks from existing datasets to new datasets.
Steps/Code to Reproduce
For example, take some tasks for the dataset 'credit-approval' (data ID 29):
import openml
task_df = openml.tasks.list_tasks(data_id=29, output_format='dataframe').iloc[:5]
print(task_df[['tid', 'estimation_procedure']])
print(openml.tasks.get_task(29).estimation_procedure_id)
print(openml.tasks.get_task(259).estimation_procedure_id)
print(openml.tasks.get_task(1793).estimation_procedure_id)
print(openml.tasks.get_task(88).estimation_procedure_id)
print(openml.tasks.get_task(1728).estimation_procedure_id)
gives:
       tid             estimation_procedure
29      29          10-fold Crossvalidation
88      88  10 times 10-fold Learning Curve
259    259                  33% Holdout set
1728  1728           10-fold Learning Curve
1793  1793   5 times 2-fold Crossvalidation
1
1
1
13
13
Expected Results
The first three tasks printed (29, 259 and 1793) should have estimation_procedure_id 1, 6 and 2, respectively.
The last two (88 and 1728) should have estimation_procedure_id 3 and 13.
Actual Results
The first three all have id 1, while the last two both have id 13.
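To make the discrepancy easier to check, here is a minimal offline sketch that compares the listed procedure names against the reported IDs. The name-to-ID mapping is an assumption inferred from the expected results in this report, not fetched from the server, and `find_mismatches` is a hypothetical helper rather than part of the openml package.

```python
# Assumed name -> ID mapping, inferred from the "Expected Results" in this
# report. It is NOT fetched from the OpenML server.
EXPECTED_IDS = {
    "10-fold Crossvalidation": 1,
    "33% Holdout set": 6,
    "5 times 2-fold Crossvalidation": 2,
    "10 times 10-fold Learning Curve": 3,
    "10-fold Learning Curve": 13,
}


def find_mismatches(listed_names, reported_ids, expected_ids=EXPECTED_IDS):
    """Return (tid, name, expected_id, reported_id) for each disagreeing task.

    Hypothetical helper: tasks whose listed name is missing from the mapping
    are skipped rather than flagged.
    """
    mismatches = []
    for tid, name in listed_names.items():
        expected = expected_ids.get(name)
        reported = reported_ids.get(tid)
        if expected is not None and expected != reported:
            mismatches.append((tid, name, expected, reported))
    return mismatches


# Procedure names as shown by list_tasks in the output above.
listed_names = {
    29: "10-fold Crossvalidation",
    88: "10 times 10-fold Learning Curve",
    259: "33% Holdout set",
    1728: "10-fold Learning Curve",
    1793: "5 times 2-fold Crossvalidation",
}

# estimation_procedure_id values as printed by get_task in the output above.
reported_ids = {29: 1, 259: 1, 1793: 1, 88: 13, 1728: 13}

mismatches = find_mismatches(listed_names, reported_ids)
for tid, name, expected, reported in mismatches:
    print(f"task {tid}: listed as {name!r}, expected id {expected}, got {reported}")
```

With these values, tasks 88, 259 and 1793 are flagged, while 29 and 1728 agree. Against a live server, `reported_ids` could instead be filled from `openml.tasks.get_task(tid).estimation_procedure_id`, as in the reproduction code.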
Versions
Windows-10-10.0.19043-SP0
Python 3.10.1 (tags/v3.10.1:2cd268a, Dec 6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)]
NumPy 1.22.0
SciPy 1.8.0
Scikit-Learn 1.0.2
OpenML 0.12.2