176 changes: 176 additions & 0 deletions sentry_streams/docs/source/devenv.rst
@@ -0,0 +1,176 @@
How to set up the development environment
=========================================

This guide helps you to set up the development environment and troubleshoot common problems.

.. warning::

There is still a lot of work to be done to automate the creation of the development
environment for sentry_streams, and that work is ongoing. At present some parts cannot
be validated automatically.

Requirements
------------

This repo contains two libraries: `sentry_flink` and `sentry_streams`. We will focus on
`sentry_streams` as it is the only one that is packaged and released.

The `sentry_streams` library contains both Python and Rust code. Everything is packaged as
a Python package that also contains a native library (the Rust one).

In order to have a working development environment, you need both a Python venv
and a Rust toolchain. The Rust library is built via `maturin` and the Python library is,
at the time of writing, managed via `uv`.

This is what we require:

- Python >= 3.11
- Rust >= 1.83.0
- direnv
- cmake >= 3.5 (needed to build librdkafka)
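
The requirements above can be checked with a small script. The following is a minimal sketch and not part of the repository: the helper names (`parse_version`, `meets_minimum`, `check_tools`) are hypothetical, and it assumes each tool prints a dotted version number in response to `--version`.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions copied from the requirements list in this guide.
MINIMUMS = {"rustc": (1, 83, 0), "cmake": (3, 5)}


def parse_version(text):
    """Extract the first dotted version number from a tool's --version output."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    padded_found = found + (0,) * (width - len(found))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_found >= padded_minimum


def check_tools():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 11):
        problems.append("Python >= 3.11 is required")
    for tool, minimum in MINIMUMS.items():
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
            continue
        result = subprocess.run([tool, "--version"], capture_output=True, text=True)
        found = parse_version(result.stdout)
        if found is None or not meets_minimum(found, minimum):
            wanted = ".".join(str(part) for part in minimum)
            problems.append(f"{tool} >= {wanted} is required")
    if shutil.which("direnv") is None:
        problems.append("direnv not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_tools():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when every check passes. It only inspects what is on `PATH`, so it does not replace running the test suites.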

Set up the environment
----------------------

1. Allow direnv to work on the project root. We use direnv to manage multiple environment
variables to make maturin and pyo3 work. Without those env vars set you will not
be able to build the library or run tests.

2. Install the Rust toolchain with `rustup <https://rustup.rs/>`_.

3. Have Python >= 3.11 installed.

a. macOS note: There are multiple ways to install Python on macOS.
We successfully tested installation via `brew`. Other ways, like `pyenv` and
`uv`, should work as well, but we saw incompatibilities with the way `maturin` expects
environment variables to be set. See below for more details if you want to use
`pyenv` or `uv` to install Python.
Comment on lines +43 to +47
Member

this should be handled by the envrc and direnv allow. if 3.11 does not exist uv will download it. if that leads to problems let's treat it like any other bug

Collaborator Author

if that leads to problems let's treat it like any other bug

We are treating it as any other bug. https://linear.app/getsentry/issue/STREAM-284/automate-all-the-validation-step-in-the-dev-env-guide . Still, until that is fixed, people should at least know a way to make it work; otherwise nobody would try to fix the issue.


4. Run `make install-dev` from the root of the repo. This will install `uv` and build
Member

this is done by direnv allow

Member

Suggested change
4. Run `make install-dev` from the root of the repo. This will instal `uv` and build
4. Run `make install-dev` from the root of the repo. This will install `uv` and build

both `sentry_streams` and `sentry_flink`.

That should be all. Next, run the Python tests with `make tests-streams` and the
Rust tests with `cargo test`. If either fails, your environment is not set up correctly.
Read below to fix it.

If you need to clean everything up and restart: `make reset`.


What can go wrong
-----------------

In this repo we both call Rust code from Python with PyO3 and call Python code from Rust.
This has a number of implications:

- The Rust library needs to compile successfully for the Python code to run and the
tests to succeed.

- The Rust code uses the already started Python interpreter when we run the pipeline or
when we run Python tests. This case is generally not problematic: no matter how the
virtual environment is configured, as long as the Python tests can run, the Rust code can
call the Python code.

- The Rust tests need to start a Python interpreter when we test parts of the Rust code
that call into Python code. This is where things can break, depending on how the interpreter
is installed and how the venv is set up.

Some environment variables are needed to successfully build the library. These are set by direnv.
They are explained here to give a better understanding in case of issues.

- `CMAKE_POLICY_VERSION_MINIMUM=3.5` - This is needed to build librdkafka on
Member

this is also definitely set by envrc.

newer versions of cmake.

Environment variables needed for Rust to start a Python interpreter properly:

- `PYTHONPATH=.` - This is needed in order to make the Python interpreter started by
Member

@untitaker untitaker Jun 30, 2025

this envvar is already set (STREAMS_TEST_PYTHONPATH)

specifically overwriting this envvar like this will fix one bug while re-introducing another. i think if PYTHONPATH needs to be set for cargo test, this points to an issue with direnv. otherwise it's a new bug to be investigated and should be discussed.

also since we now overwrite sys.path within the rust process i'm no longer sure PYTHONPATH has any effect at all

Rust find the virtual environment.

- `PYO3_PYTHON` - This is needed to make the Python interpreter find the right `site-packages`
directory when started by the Rust code. See `rust-envvars <https://github.com/getsentry/streams/blob/main/scripts/rust-envvars>`_.
Comment on lines +77 to +89
Member

all of this is set by the envrc

Collaborator Author

Yes, this paragraph is not part of the happy path. You should only land here if something went wrong. If something goes wrong, people need to know which variables are needed for what in order to try to fix the issue.

Member

if this just "neutrally" documents what the envrc does, shouldn't it be possible to document it next to the code that sets those envvars, i.e. as code comments?

Collaborator Author

That makes sense to me. I'd move the explanation into the code and reference that from here.



Common Errors
-------------

These are common problems we have observed.

`sys.path` does not contain the venv `site-packages` directory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

panic message: `"Unable to import: PyErr { type: <class 'ModuleNotFoundError'>, value: ModuleNotFoundError(\"No module named 'sentry_kafka_schemas'\"),
traceback: Some(\"Traceback (most recent call last):
File \\\"<string>\\\", line 1, in <module>
File \\\"/Users/filippopacifici/code/streams/sentry_streams/sentry_streams/pipeline/__init__.py\\\", line 1, in <module>
from sentry_streams.pipeline.chain import (
File \\\"/Users/filippopacifici/code/streams/sentry_streams/sentry_streams/pipeline/chain.py\\\", line 29, in <module>
from sentry_streams.pipeline.msg_codecs import (
File \\\"/Users/filippopacifici/code/streams/sentry_streams/sentry_streams/pipeline/msg_codecs.py\\\", line 6, in <module>
from sentry_kafka_schemas import get_codec\\n\") }"`


Any other error where the sentry_streams Python code fails to import a
third-party package points to the same issue.

It means that the interpreter started by Rust is able to find the project
code, since the failure only occurs when the sentry_streams code tries to import
a third-party package. What it cannot find is the venv `site-packages` directory.

This should only happen when Rust starts the Python interpreter, that is, in the Rust tests.

We need to manually set the `sys.path` variable in the Python interpreter when
Rust starts it. This is done in `testutils.rs <https://github.com/getsentry/streams/blob/main/sentry_streams/src/testutils.rs#L140-L141>`_.
Both `STREAMS_TEST_PYTHONEXECUTABLE` and `STREAMS_TEST_PYTHONPATH` have to be set for this to work.
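
When debugging this error, it can help to confirm that the two variables are actually exported in your shell and that the venv interpreter itself sees a `site-packages` directory. The following is a minimal sketch under stated assumptions: `missing_vars` and `has_site_packages` are hypothetical helpers, not part of the repository, and the script inspects the interpreter you run it with, not the one that the Rust tests start.

```python
import os

# Variable names taken from this guide.
REQUIRED_VARS = ("STREAMS_TEST_PYTHONEXECUTABLE", "STREAMS_TEST_PYTHONPATH")


def missing_vars(environ, names=REQUIRED_VARS):
    """Return the names that are unset or empty in the given environment mapping."""
    return [name for name in names if not environ.get(name)]


def has_site_packages(paths):
    """Return True if any non-empty entry is a path whose last component is site-packages."""
    return any(
        os.path.basename(os.path.normpath(path)) == "site-packages"
        for path in paths
        if path
    )


if __name__ == "__main__":
    import sys

    for name in missing_vars(os.environ):
        print(f"{name} is not set; check that direnv is allowed for this repo")
    if not has_site_packages(sys.path):
        print("sys.path has no site-packages entry:", sys.path)
```

Run it with the venv's Python from the project root. If it prints nothing while `cargo test` still panics, the variables are exported and the problem is more likely in how the embedded interpreter is configured.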

Cannot find Python standard library packages
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This generally presents itself as not being able to load some standard library
package.

.. code-block:: bash

Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = '.'
program name = 'python3'
isolated = 0
environment = 1
user site = 1
safe_path = 0
import site = 1
is in build tree = 0
stdlib dir = '/install/lib/python3.11'
sys._base_executable = '/Users/untitaker/projects/streams/sentry_streams/target/debug/deps/rust_streams-7b3bb705f1a0bf53'
sys.base_prefix = '/install'
sys.base_exec_prefix = '/install'
sys.platlibdir = 'lib'
sys.executable = '/Users/untitaker/projects/streams/sentry_streams/target/debug/deps/rust_streams-7b3bb705f1a0bf53'
sys.prefix = '/install'
sys.exec_prefix = '/install'
sys.path = [
'/Users/untitaker/projects/streams/sentry_streams',
'/install/lib/python311.zip',
'/install/lib/python3.11',
'/install/lib/python3.11/lib-dynload',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'


The `PYTHONHOME` environment variable should prevent this, and it is
set by direnv. For a `uv`-installed Python, the `.python-version` file should make
the environment point to the right version.

Still, we have occasionally seen this with a `uv`-installed Python. Check which Python
versions `uv` knows about via `uv python list`. Verify that the one you are using refers
to a directory that exists.

If the problem persists, consider using a Python interpreter that was not installed by
`uv`.
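
The "directory that exists" check can also be done from Python. The following is a minimal sketch, not part of the repository: `stdlib_looks_usable` is a hypothetical helper, and it assumes a regular on-disk standard library (not a zipped or frozen one), where the `encodings` package named in the error above is a plain directory.

```python
import os
import sys
import sysconfig


def stdlib_looks_usable(stdlib_dir):
    """Return True if the directory contains the `encodings` package."""
    return os.path.isfile(os.path.join(stdlib_dir, "encodings", "__init__.py"))


if __name__ == "__main__":
    # Report where this interpreter expects its installation and standard library.
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    print("base prefix:", sys.base_prefix, "exists:", os.path.isdir(sys.base_prefix))
    print("stdlib dir:", stdlib_dir, "usable:", stdlib_looks_usable(stdlib_dir))
```

Run it with the interpreter your venv uses. A base prefix that does not exist, or a standard library directory without `encodings`, would be consistent with the error above.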
1 change: 1 addition & 0 deletions sentry_streams/docs/source/index.rst
@@ -12,3 +12,4 @@
configure_pipeline
runtime/arroyo
deployment
devenv