Commit 483f467

maint: update documentation and contributor guide
Co-authored-by: Pieter Gijsbers <p.gijsbers@tue.nl>
1 parent d36593d commit 483f467

6 files changed: +135 -192 lines changed

.all-contributorsrc

Lines changed: 0 additions & 36 deletions
This file was deleted.

CONTRIBUTING.md

Lines changed: 96 additions & 140 deletions
@@ -1,9 +1,9 @@
1+
# Contributing to `openml-python`
12
This document describes the workflow on how to contribute to the openml-python package.
23
If you are interested in connecting a machine learning package with OpenML (i.e.
34
write an openml-python extension) or want to find other ways to contribute, see [this page](https://openml.github.io/openml-python/main/contributing.html#contributing).
45

5-
Scope of the package
6-
--------------------
6+
## Scope of the package
77

88
The scope of the OpenML Python package is to provide a Python interface to
99
the OpenML platform which integrates well with Python's scientific stack, most
@@ -15,66 +15,112 @@ in Python, [scikit-learn](http://scikit-learn.org/stable/index.html).
1515
Thereby it will automatically be compatible with many machine learning
1616
libraries written in Python.
1717

18-
We aim to keep the package as light-weight as possible and we will try to
18+
We aim to keep the package as light-weight as possible, and we will try to
1919
keep the number of potential installation dependencies as low as possible.
2020
Therefore, the connection to other machine learning libraries such as
2121
*pytorch*, *keras* or *tensorflow* should not be done directly inside this
2222
package, but in a separate package using the OpenML Python connector.
2323
More information on OpenML Python connectors can be found [here](https://openml.github.io/openml-python/main/contributing.html#contributing).
2424

25-
Reporting bugs
26-
--------------
27-
We use GitHub issues to track all bugs and feature requests; feel free to
28-
open an issue if you have found a bug or wish to see a feature implemented.
29-
30-
It is recommended to check that your issue complies with the
31-
following rules before submitting:
32-
33-
- Verify that your issue is not being currently addressed by other
34-
[issues](https://github.com/openml/openml-python/issues)
35-
or [pull requests](https://github.com/openml/openml-python/pulls).
36-
37-
- Please ensure all code snippets and error messages are formatted in
38-
appropriate code blocks.
39-
See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks).
40-
41-
- Please include your operating system type and version number, as well
42-
as your Python, openml, scikit-learn, numpy, and scipy versions. This information
43-
can be found by running the following code snippet:
44-
```python
45-
import platform; print(platform.platform())
46-
import sys; print("Python", sys.version)
47-
import numpy; print("NumPy", numpy.__version__)
48-
import scipy; print("SciPy", scipy.__version__)
49-
import sklearn; print("Scikit-Learn", sklearn.__version__)
50-
import openml; print("OpenML", openml.__version__)
51-
```
25+
## Determine what contribution to make
5226

53-
Determine what contribution to make
54-
-----------------------------------
5527
Great! You've decided you want to help out. Now what?
56-
All contributions should be linked to issues on the [Github issue tracker](https://github.com/openml/openml-python/issues).
28+
All contributions should be linked to issues on the [GitHub issue tracker](https://github.com/openml/openml-python/issues).
5729
In particular for new contributors, the *good first issue* label should help you find
58-
issues which are suitable for beginners. Resolving these issues allow you to start
30+
issues which are suitable for beginners. Resolving these issues allows you to start
5931
contributing to the project without much prior knowledge. Your assistance in this area
6032
will be greatly appreciated by the more experienced developers as it helps free up
6133
their time to concentrate on other issues.
6234

63-
If you encountered a particular part of the documentation or code that you want to improve,
35+
If you encounter a particular part of the documentation or code that you want to improve,
6436
but there is no related open issue yet, open one first.
6537
This is important since you can first get feedback or pointers from experienced contributors.
6638

6739
To let everyone know you are working on an issue, please leave a comment that states you will work on the issue
6840
(or, if you have the permission, *assign* yourself to the issue). This avoids double work!
6941

70-
General git workflow
71-
--------------------
42+
## Contributing Workflow Overview
43+
To contribute to the openml-python package, follow these steps:
44+
45+
0. Determine how you want to contribute (see above).
46+
1. Set up your local development environment.
47+
   1. Fork and clone the `openml-python` repository. Then, create a new branch from the ``develop`` branch. If you are new to `git`, see our [detailed documentation](#basic-git-workflow), or rely on your favorite IDE.
48+
   2. [Install the local dependencies](#install-local-dependencies) to run the tests for your contribution.
49+
   3. [Test your installation](#testing-your-installation) to ensure everything is set up correctly.
50+
4. Implement your contribution. If contributing to the documentation, see [here](#contributing-to-the-documentation).
51+
5. [Create a pull request](#pull-request-checklist).
52+
53+
### Install Local Dependencies
54+
55+
We recommend following the instructions below to install all requirements locally.
56+
However, it is also possible to use the [openml-python docker image](https://github.com/openml/openml-python/blob/main/docker/readme.md) for testing and building documentation. Moreover, feel free to use any alternative package managers, such as `pip`.
57+
58+
59+
1. To ensure a smooth development experience, we recommend using the `uv` package manager, so install `uv` first. If a Python version already exists on your system, follow the steps below; otherwise, see [here](https://docs.astral.sh/uv/getting-started/installation/).
60+
```bash
61+
pip install uv
62+
```
63+
2. Create a virtual environment using `uv` and activate it. This will ensure that the dependencies for `openml-python` do not interfere with other Python projects on your system.
64+
```bash
65+
uv venv --seed --python 3.8 ~/.venvs/openml-python
66+
source ~/.venvs/openml-python/bin/activate
67+
pip install uv # Install uv within the virtual environment
68+
```
69+
3. Then install openml with its test dependencies by running
70+
```bash
71+
uv pip install -e .[test]
72+
```
73+
from the repository folder.
74+
Then install the [pre-commit](#pre-commit-details) hooks, which run style checks and code formatting before each commit, through:
75+
```bash
76+
pre-commit install
77+
```
78+
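Step 2 above creates and activates a virtual environment. As a minimal sketch, one way to confirm that the active interpreter really belongs to a virtual environment is to compare `sys.prefix` with `sys.base_prefix`. The helper name `in_virtualenv` is hypothetical and not part of `openml-python`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    status = "inside" if in_virtualenv() else "outside"
    print(f"Running {status} a virtual environment: {sys.prefix}")
```

If this reports "outside" after activation, the `source` command from step 2 was probably run in a different shell session.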
79+
### Testing (Your Installation)
80+
To test your installation and run the tests for the first time, run the following from the repository folder:
81+
```bash
82+
pytest tests
83+
```
84+
For Windows systems, you may need to add `pytest` to PATH before executing the command.
85+
86+
Executing a specific unit test can be done by specifying the module, test case, and test.
87+
You may then run a specific module, test case, or unit test respectively:
88+
```bash
89+
pytest tests/test_datasets/test_dataset.py
90+
pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest
91+
pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest::test_get_data
92+
```
93+
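The three commands above differ only in how much of the pytest node ID is given. A small sketch of how such an ID is composed from a module path, a test case, and a test name follows; the function `pytest_node_id` is a hypothetical helper for illustration, not something provided by pytest or `openml-python`.

```python
from typing import Optional


def pytest_node_id(
    module_path: str,
    test_case: Optional[str] = None,
    test: Optional[str] = None,
) -> str:
    """Join the given parts with '::', the separator pytest uses in node IDs."""
    # Skip the parts that were not provided, so a bare module path is valid too.
    parts = [module_path] + [part for part in (test_case, test) if part]
    return "::".join(parts)


if __name__ == "__main__":
    print(pytest_node_id("tests/test_datasets/test_dataset.py"))
    print(pytest_node_id("tests/test_datasets/test_dataset.py", "OpenMLDatasetTest"))
    print(
        pytest_node_id(
            "tests/test_datasets/test_dataset.py", "OpenMLDatasetTest", "test_get_data"
        )
    )
```

The printed strings are what you would pass to `pytest` on the command line.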
94+
To test your new contribution, add [unit tests](https://github.com/openml/openml-python/tree/develop/tests), and, if needed, [examples](https://github.com/openml/openml-python/tree/develop/examples) for any new functionality being introduced. Some notes on unit tests and examples:
95+
* If a unit test contains an upload to the test server, please ensure that it is followed by a file collection for deletion, to prevent the test server from bulking up. For example, `TestBase._mark_entity_for_removal('data', dataset.dataset_id)`, `TestBase._mark_entity_for_removal('flow', (flow.flow_id, flow.name))`.
96+
* Please ensure that the example is run on the test server by beginning with the call to `openml.config.start_using_configuration_for_example()`, which is done by default for tests derived from `TestBase`.
97+
* Add the `@pytest.mark.sklearn` marker to your unit tests if they have a dependency on scikit-learn.
98+
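The first note above asks that every upload in a unit test be followed by marking the uploaded entity for deletion. The sketch below illustrates that pattern only. `FakeTestBase` is a hypothetical stand-in written for this example and is not the real `TestBase` from `openml-python`; the dataset ID is made up, and no server is contacted.

```python
import unittest
from typing import Dict, List


class FakeTestBase(unittest.TestCase):
    """Hypothetical stand-in for a base test class; NOT openml's TestBase."""

    # Entities created during tests, grouped by type, awaiting deletion.
    marked_for_removal: Dict[str, List[object]] = {}

    @classmethod
    def _mark_entity_for_removal(cls, entity_type: str, entity_id: object) -> None:
        # Record the entity so a later cleanup step can delete it.
        cls.marked_for_removal.setdefault(entity_type, []).append(entity_id)


class UploadTest(FakeTestBase):
    def test_upload_is_tracked(self) -> None:
        dataset_id = 123  # hypothetical ID, as if returned by an upload
        # Mark right after the (simulated) upload, before any further assertions.
        FakeTestBase._mark_entity_for_removal("data", dataset_id)
        self.assertIn(dataset_id, FakeTestBase.marked_for_removal["data"])
```

Marking immediately after the upload, rather than at the end of the test, means the entity is still recorded if a later assertion fails.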
99+
### Pull Request Checklist
100+
101+
You can go to the `openml-python` GitHub repository to create the pull request by [comparing the branch](https://github.com/openml/openml-python/compare) from your fork with the `develop` branch of the `openml-python` repository. When creating a pull request, make sure to follow the comments and structure provided by the template on GitHub.
102+
103+
**An incomplete contribution** -- where you expect to do more work before
104+
receiving a full review -- should be submitted as a `draft`. These may be useful
105+
to: indicate you are working on something to avoid duplicated work,
106+
request broad review of functionality or API, or seek collaborators.
107+
Drafts often benefit from the inclusion of a
108+
[task list](https://github.com/blog/1375-task-lists-in-gfm-issues-pulls-comments)
109+
in the PR description.
110+
111+
---
112+
113+
# Appendix
114+
115+
## Basic `git` Workflow
72116
73117
The preferred workflow for contributing to openml-python is to
74118
fork the [main repository](https://github.com/openml/openml-python) on
75119
GitHub, clone, check out the branch `develop`, and develop on a new
76120
branch. Steps:
77121
122+
0. Make sure you have git installed, and a GitHub account.
123+
78124
1. Fork the [project repository](https://github.com/openml/openml-python)
79125
by clicking on the 'Fork' button near the top right of the page. This creates
80126
a copy of the code under your GitHub user account. For more details on
@@ -84,20 +130,20 @@ branch. Steps:
84130
local disk:
85131
86132
```bash
87-
$ git clone git@github.com:YourLogin/openml-python.git
88-
$ cd openml-python
133+
git clone git@github.com:YourLogin/openml-python.git
134+
cd openml-python
89135
```
90136
91137
3. Switch to the ``develop`` branch:
92138
93139
```bash
94-
$ git checkout develop
140+
git checkout develop
95141
```
96142
97143
3. Create a ``feature`` branch to hold your development changes:
98144
99145
```bash
100-
$ git checkout -b feature/my-feature
146+
git checkout -b feature/my-feature
101147
```
102148
103149
Always use a ``feature`` branch. It's good practice to never work on the ``main`` or ``develop`` branch!
@@ -106,98 +152,24 @@ local disk:
106152
4. Develop the feature on your feature branch. Stage changed files with ``git add`` and then commit them with ``git commit``:
107153
108154
```bash
109-
$ git add modified_files
110-
$ git commit
155+
git add modified_files
156+
git commit
111157
```
112158
113159
to record your changes in Git, then push the changes to your GitHub account with:
114160
115161
```bash
116-
$ git push -u origin my-feature
162+
git push -u origin feature/my-feature
117163
```
118164
119165
5. Follow [these instructions](https://help.github.com/articles/creating-a-pull-request-from-a-fork)
120-
to create a pull request from your fork. This will send an email to the committers.
166+
to create a pull request from your fork.
121167
122168
(If any of the above seems like magic to you, please look up the
123169
[Git documentation](https://git-scm.com/documentation) on the web, or ask a friend or another contributor for help.)
124170
125-
Pull Request Checklist
126-
----------------------
127-
128-
We recommended that your contribution complies with the
129-
following rules before you submit a pull request:
130-
131-
- Follow the
132-
[pep8 style guide](https://www.python.org/dev/peps/pep-0008/).
133-
With the following exceptions or additions:
134-
- The max line length is 100 characters instead of 80.
135-
- When creating a multi-line expression with binary operators, break before the operator.
136-
- Add type hints to all function signatures.
137-
(note: not all functions have type hints yet, this is work in progress.)
138-
- Use the [`str.format`](https://docs.python.org/3/library/stdtypes.html#str.format) over [`printf`](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting) style formatting.
139-
E.g. use `"{} {}".format('hello', 'world')` not `"%s %s" % ('hello', 'world')`.
140-
(note: old code may still use `printf`-formatting, this is work in progress.)
141-
142-
- If your pull request addresses an issue, please use the pull request title
143-
to describe the issue and mention the issue number in the pull request description. This will make sure a link back to the original issue is
144-
created. Make sure the title is descriptive enough to understand what the pull request does!
145-
146-
- An incomplete contribution -- where you expect to do more work before
147-
receiving a full review -- should be submitted as a `draft`. These may be useful
148-
to: indicate you are working on something to avoid duplicated work,
149-
request broad review of functionality or API, or seek collaborators.
150-
Drafts often benefit from the inclusion of a
151-
[task list](https://github.com/blog/1375-task-lists-in-gfm-issues-pulls-comments)
152-
in the PR description.
153-
154-
- Add [unit tests](https://github.com/openml/openml-python/tree/develop/tests) and [examples](https://github.com/openml/openml-python/tree/develop/examples) for any new functionality being introduced.
155-
- If an unit test contains an upload to the test server, please ensure that it is followed by a file collection for deletion, to prevent the test server from bulking up. For example, `TestBase._mark_entity_for_removal('data', dataset.dataset_id)`, `TestBase._mark_entity_for_removal('flow', (flow.flow_id, flow.name))`.
156-
- Please ensure that the example is run on the test server by beginning with the call to `openml.config.start_using_configuration_for_example()`.
157-
- Add the `@pytest.mark.sklearn` marker to your unit tests if they have a dependency on scikit-learn.
158-
159-
- All tests pass when running `pytest`. On
160-
Unix-like systems, check with (from the toplevel source folder):
161-
162-
```bash
163-
$ pytest
164-
```
165-
166-
For Windows systems, execute the command from an Anaconda Prompt or add `pytest` to PATH before executing the command.
167-
168-
- Documentation and high-coverage tests are necessary for enhancements to be
169-
accepted. Bug-fixes or new features should be provided with
170-
[non-regression tests](https://en.wikipedia.org/wiki/Non-regression_testing).
171-
These tests verify the correct behavior of the fix or feature. In this
172-
manner, further modifications on the code base are granted to be consistent
173-
with the desired behavior.
174-
For the Bug-fixes case, at the time of the PR, this tests should fail for
175-
the code base in develop and pass for the PR code.
176-
177-
- If any source file is being added to the repository, please add the BSD 3-Clause license to it.
178-
179-
180-
*Note*: We recommend to follow the instructions below to install all requirements locally.
181-
However it is also possible to use the [openml-python docker image](https://github.com/openml/openml-python/blob/main/docker/readme.md) for testing and building documentation.
182-
This can be useful for one-off contributions or when you are experiencing installation issues.
183-
184-
First install openml with its test dependencies by running
185-
```bash
186-
$ pip install -e .[test]
187-
```
188-
from the repository folder.
189-
Then configure pre-commit through
190-
```bash
191-
$ pre-commit install
192-
```
193-
This will install dependencies to run unit tests, as well as [pre-commit](https://pre-commit.com/).
194-
To run the unit tests, and check their code coverage, run:
195-
```bash
196-
$ pytest --cov=. path/to/tests_for_package
197-
```
198-
Make sure your code has good unittest **coverage** (at least 80%).
199-
200-
Pre-commit is used for various style checking and code formatting.
171+
## Pre-commit Details
172+
[Pre-commit](https://pre-commit.com/) is used for various style checking and code formatting.
201173
Before each commit, it will automatically run:
202174
- [ruff](https://docs.astral.sh/ruff/), a code formatter and linter.
203175
This will automatically format your code.
@@ -216,23 +188,7 @@ $ pre-commit run --all-files
216188
```
217189
Make sure to do this at least once before your first commit to check your setup works.
218190
219-
Executing a specific unit test can be done by specifying the module, test case, and test.
220-
You may then run a specific module, test case, or unit test respectively:
221-
```bash
222-
$ pytest tests/test_datasets/test_dataset.py
223-
$ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest
224-
$ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest::test_get_data
225-
```
226-
227-
*NOTE*: In the case the examples build fails during the Continuous Integration test online, please
228-
fix the first failing example. If the first failing example switched the server from live to test
229-
or vice-versa, and the subsequent examples expect the other server, the ensuing examples will fail
230-
to be built as well.
231-
232-
Happy testing!
233-
234-
Documentation
235-
-------------
191+
## Contributing to the Documentation
236192
237193
We are glad to accept any sort of documentation: function docstrings,
238194
reStructuredText documents, tutorials, etc.
@@ -247,9 +203,9 @@ information.
247203
248204
For building the documentation, you will need to install a few additional dependencies:
249205
```bash
250-
$ pip install -e .[examples,docs]
206+
uv pip install -e .[examples,docs]
251207
```
252208
When dependencies are installed, run
253209
```bash
254-
$ sphinx-build -b html doc YOUR_PREFERRED_OUTPUT_DIRECTORY
210+
sphinx-build -b html doc YOUR_PREFERRED_OUTPUT_DIRECTORY
255211
```

ISSUE_TEMPLATE.md

Lines changed: 18 additions & 2 deletions
@@ -1,3 +1,15 @@
1+
<!--
2+
It is recommended to check that your issue complies with the
3+
following rules before submitting:
4+
5+
- Verify that your issue is not being currently addressed by other
6+
issues (https://github.com/openml/openml-python/issues)
7+
or pull requests (https://github.com/openml/openml-python/pulls).
8+
9+
- Please ensure all code snippets and error messages are formatted in
10+
appropriate code blocks. See https://help.github.com/articles/creating-and-highlighting-code-blocks
11+
-->
12+
113
#### Description
214
<!-- Example: Joblib Error thrown when calling fit on LatentDirichletAllocation with evaluate_every > 0-->
315

@@ -20,7 +32,10 @@ it in the issue: https://gist.github.com
2032

2133
#### Versions
2234
<!--
23-
Please run the following snippet and paste the output below.
35+
Please include your operating system type and version number, as well
36+
as your Python, openml, scikit-learn, numpy, and scipy versions. This information
37+
can be found by running the following code snippet:
38+
2439
import platform; print(platform.platform())
2540
import sys; print("Python", sys.version)
2641
import numpy; print("NumPy", numpy.__version__)
@@ -30,4 +45,5 @@ import openml; print("OpenML", openml.__version__)
3045
-->
3146

3247

33-
<!-- Thanks for contributing! -->
48+
<!-- Thanks for contributing! -->
49+
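The version snippet in the issue template above fails at the first `import` if one of the packages is missing. As a hedged alternative, the sketch below reads installed versions through the standard-library `importlib.metadata` module (Python 3.8+) and reports missing packages instead of raising. The function name `collect_versions` is hypothetical, and the distribution names passed to it are assumed to be the PyPI names of the packages listed in the template.

```python
import platform
import sys
from importlib import metadata
from typing import Dict, Iterable


def collect_versions(distributions: Iterable[str]) -> Dict[str, str]:
    """Map platform, Python, and each distribution name to a version string."""
    report = {
        "Platform": platform.platform(),
        "Python": sys.version.split()[0],
    }
    for name in distributions:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report the gap rather than aborting the whole listing.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    names = ["numpy", "scipy", "scikit-learn", "openml"]
    for name, version in collect_versions(names).items():
        print(f"{name}: {version}")
```

The output can be pasted into the "Versions" section of an issue in place of the one-liner output.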
