docs: Improve getting started and testing guides for humans and agents #20970
base: main
Changes from all commits
e9fceb7
2ed084f
20db17d
83e2793
350b910
014b72e
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,7 +21,38 @@ | |
|
|
||
| This section describes how you can get started at developing DataFusion. | ||
|
|
||
| ## Windows setup | ||
| ## Quick Start | ||
|
|
||
| For the fastest path to a working local environment, follow these steps | ||
| from the repository root: | ||
|
|
||
| ```shell | ||
| # 1. Install Rust (https://rust-lang.org/tools/install/) and verify the active toolchain with | ||
| rustup show | ||
|
|
||
| # 2. Install protoc 3.15+ (see details below) | ||
| protoc --version | ||
|
|
||
| # 3. Download test data used by examples and many tests | ||
| git submodule update --init --recursive | ||
|
|
||
| # 4. Build the workspace | ||
| cargo build | ||
|
|
||
| # 5. Verify that Rust integration tests can be run | ||
| cargo test -p datafusion --test parquet_integration | ||
|
|
||
| # 6. Verify that sqllogictests can run | ||
| cargo test --profile=ci --test sqllogictests | ||
| ``` | ||
|
|
||
| Notes: | ||
|
|
||
| - The pinned Rust version is defined in `rust-toolchain.toml`. | ||
| - `protoc` is required to compile DataFusion from source. | ||
| - Some tests and examples rely on git submodule data being present locally. | ||
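The `protoc 3.15+` requirement from step 2 can be checked mechanically rather than by eye. The following is a minimal sketch, assuming bash and assuming `protoc --version` prints a line of the form `libprotoc 3.15.0`. The helper names `version_at_least` and `check_protoc` are hypothetical and are not part of the DataFusion repository.

```shell
# Hypothetical helper: succeeds when version $1 >= version $2.
# Only the major and minor components are compared.
version_at_least() {
  local have_major have_minor want_major want_minor
  IFS=. read -r have_major have_minor _ <<< "$1"
  IFS=. read -r want_major want_minor _ <<< "$2"
  have_minor=${have_minor:-0}
  want_minor=${want_minor:-0}
  if (( have_major != want_major )); then
    (( have_major > want_major ))
    return
  fi
  (( have_minor >= want_minor ))
}

# Hypothetical helper: reports whether the protoc on PATH is 3.15 or newer.
# Assumes the version number is the second field of `protoc --version`.
check_protoc() {
  if ! command -v protoc >/dev/null 2>&1; then
    echo "protoc not found on PATH"
    return 1
  fi
  local found
  found=$(protoc --version | awk '{print $2}')
  if version_at_least "$found" "3.15"; then
    echo "protoc $found satisfies the 3.15+ requirement"
  else
    echo "protoc $found is older than 3.15"
    return 1
  fi
}
```

After sourcing these functions, running `check_protoc` from the repository root gives a quick pass/fail answer before attempting `cargo build`.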
|
|
||
| ## Windows Setup | ||
|
|
||
| ```shell | ||
| wget https://az792536.vo.msecnd.net/vms/VMBuild_20190311/VirtualBox/MSEdge/MSEdge.Win10.VirtualBox.zip | ||
|
|
@@ -34,19 +65,19 @@ cargo build | |
|
|
||
| DataFusion has support for [dev containers](https://containers.dev/) which may be used for | ||
| developing DataFusion in an isolated environment either locally or remote if desired. Using dev containers for developing | ||
| DataFusion is not a requirement by any means but is available for those where doing local development could be tricky | ||
| DataFusion is not a requirement but is available where doing local development could be tricky | ||
|
Contributor
Author
Drive-by cleanup to make this more concise.
||
| such as with Windows and WSL2, those with older hardware, etc. | ||
|
|
||
| For specific details on IDE support for dev containers see the documentation for [Visual Studio Code](https://code.visualstudio.com/docs/devcontainers/containers), | ||
| [IntelliJ IDEA](https://www.jetbrains.com/help/idea/connect-to-devcontainer.html), | ||
| [Rust Rover](https://www.jetbrains.com/help/rust/connect-to-devcontainer.html), and | ||
| [GitHub Codespaces](https://docs.github.com/en/codespaces/setting-up-your-project-for-codespaces/adding-a-dev-container-configuration/introduction-to-dev-containers). | ||
|
|
||
| ## Protoc Installation | ||
| ## `protoc` Installation | ||
|
|
||
| Compiling DataFusion from sources requires an installed version of the protobuf compiler, `protoc`. | ||
|
|
||
| On most platforms this can be installed from your system's package manager | ||
| On most platforms this can be installed from your system's package manager. For example: | ||
|
|
||
| ``` | ||
| # Ubuntu | ||
|
|
@@ -71,7 +102,7 @@ libprotoc 3.15.0 | |
|
|
||
| Alternatively a binary release can be downloaded from the [Release Page](https://github.com/protocolbuffers/protobuf/releases) or [built from source](https://github.com/protocolbuffers/protobuf/blob/main/src/README.md). | ||
|
|
||
| ## Bootstrap environment | ||
| ## Bootstrap Environment | ||
|
Contributor
Author
Made heading consistent.
||
|
|
||
| DataFusion is written in Rust and it uses a standard rust toolkit: | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -32,8 +32,10 @@ community as well as get more familiar with Rust and the relevant codebases. | |||||||||||||
|
|
||||||||||||||
| ## Development Environment | ||||||||||||||
|
|
||||||||||||||
| Setup your development environment [here](development_environment.md), and learn | ||||||||||||||
| how to test the code [here](testing.md). | ||||||||||||||
| Start with the [Development Environment Quick Start](development_environment.md#quick-start). | ||||||||||||||
|
|
||||||||||||||
| For more detail, see the full [development environment guide](development_environment.md) | ||||||||||||||
| and the [testing guide](testing.md). | ||||||||||||||
|
|
||||||||||||||
| ## Finding and Creating Issues to Work On | ||||||||||||||
|
|
||||||||||||||
|
|
@@ -99,6 +101,19 @@ If you are concerned that a larger design will be lost in a string of small PRs, | |||||||||||||
|
|
||||||||||||||
| Note all commits in a PR are squashed when merged to the `main` branch so there is one commit per PR after merge. | ||||||||||||||
|
|
||||||||||||||
| ## Before Submitting a PR | ||||||||||||||
|
|
||||||||||||||
| Before submitting a PR, run the standard formatting and lint checks and fix any | ||||||||||||||
| issues they report: | ||||||||||||||
|
|
||||||||||||||
| ```bash | ||||||||||||||
| ./ci/scripts/rust_fmt.sh | ||||||||||||||
| ./ci/scripts/rust_clippy.sh | ||||||||||||||
|
Comment on lines +110 to +111
Contributor
Suggested change
This script is the entry point for all non-functional tests. It includes the previous two scripts as well as several others.
||||||||||||||
| ``` | ||||||||||||||
|
|
||||||||||||||
| These scripts are the same checks run in CI for Rust formatting and clippy. | ||||||||||||||
| You should also run any relevant commands from the [testing quick start](testing.md#testing-quick-start). | ||||||||||||||
|
|
||||||||||||||
| ## Conventional Commits & Labeling PRs | ||||||||||||||
|
|
||||||||||||||
| We generate change logs for each release using an automated process that will categorize PRs based on the title | ||||||||||||||
|
|
||||||||||||||
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -23,6 +23,38 @@ Tests are critical to ensure that DataFusion is working properly and | |||||||
| is not accidentally broken during refactorings. All new features | ||||||||
| should have test coverage and the entire test suite is run as part of CI. | ||||||||
|
|
||||||||
| ## Testing Quick Start | ||||||||
|
Contributor
Author
This is based on what started in AGENTS.md, but I made it slightly easier to understand.
||||||||
|
|
||||||||
| While developing a feature or bug fix, best practice is to run the smallest set | ||||||||
| of tests that gives confidence for your change, then expand as needed. | ||||||||
|
|
||||||||
| Initially, run the tests in the crates you changed. For example, if you made changes | ||||||||
| to files in `datafusion-optimizer/src`, run the corresponding crate tests: | ||||||||
|
|
||||||||
| ```shell | ||||||||
| cargo test -p datafusion-optimizer | ||||||||
| ``` | ||||||||
|
|
||||||||
| Then, run the `sqllogictest` suite, which is the main regression suite for SQL | ||||||||
| behavior and covers most DataFusion features. | ||||||||
|
Comment on lines +38 to +39
Contributor
Suggested change
||||||||
|
|
||||||||
| ```shell | ||||||||
| cargo test --profile=ci --test sqllogictests | ||||||||
| ``` | ||||||||
|
|
||||||||
| Finally, before submitting a PR, run the tests for the core `datafusion` and | ||||||||
| `datafusion-cli` crates: | ||||||||
|
|
||||||||
| ```shell | ||||||||
| cargo test -p datafusion | ||||||||
| cargo test -p datafusion-cli | ||||||||
| ``` | ||||||||
|
|
||||||||
| Some integration tests require optional external services such as Docker-backed | ||||||||
| containers and may skip when unavailable. | ||||||||
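The staged sequence described in this quick start (changed crates first, then `sqllogictests`, then the core crates) could be wrapped in a small script. This is only a sketch, assuming bash. The function names `run_step` and `staged_tests` and the `DRY_RUN` variable are hypothetical and do not exist in the repository. Only the `cargo` commands are taken from the guide.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: run the three test stages in order and stop at the
# first failure. Arguments are the names of the crates you changed.
staged_tests() {
  local crate
  # Stage 1: tests for each changed crate
  for crate in "$@"; do
    run_step cargo test -p "$crate" || return 1
  done
  # Stage 2: the sqllogictest regression suite
  run_step cargo test --profile=ci --test sqllogictests || return 1
  # Stage 3: the core crates, before submitting a PR
  run_step cargo test -p datafusion || return 1
  run_step cargo test -p datafusion-cli || return 1
}
```

For example, `DRY_RUN=1 staged_tests datafusion-optimizer` prints the four commands without running them, and `staged_tests datafusion-optimizer` executes them in order.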
|
|
||||||||
| ## Testing Overview | ||||||||
|
|
||||||||
| DataFusion has several levels of tests in its [Test Pyramid] and tries to follow | ||||||||
| the Rust standard [Testing Organization] described in [The Book]. | ||||||||
|
|
||||||||
|
|
||||||||
I pulled most of this out of agents.md and left a link there instead