-
-
Notifications
You must be signed in to change notification settings - Fork 0
Make the rust adapter finalize chains when a sink is missing #218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a8739e0 to
d24f21c
Compare
d24f21c to
aecbbad
Compare
| from sentry_streams.adapters.arroyo.rust_arroyo import RustArroyoAdapter | ||
| from sentry_streams.adapters.stream_adapter import RuntimeTranslator | ||
| from sentry_streams.pipeline.message import Message | ||
| from sentry_streams.pipeline.pipeline import ( | ||
| Map, | ||
| Pipeline, | ||
| streaming_source, | ||
| ) | ||
| from sentry_streams.runner import iterate_edges | ||
| from sentry_streams.rust_streams import ( |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
.github/workflows/build-streams.yaml
Outdated
| args: --release --out dist --find-interpreter --features=extension-module | ||
| sccache: ${{ !startsWith(github.ref, 'refs/tags/') }} | ||
| manylinux: "2014" | ||
| manylinux: "2_28" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished `release` profile [optimized + debuginfo] target(s) in 52.75s
💥 maturin failed
Caused by: Error ensuring manylinux_2_17 compliance
Caused by: Your library is not manylinux_2_17 (aka manylinux2014) compliant because it depends on black-listed symbols: ["libc.so.6: __cxa_thread_atexit_impl"]
Error: The process '/usr/bin/docker' failed with exit code 1
at ExecState._setResult (/home/runner/work/_actions/PyO3/maturin-action/aef21716ff3dcae8a1c301d23ec3e4446972a6e3/dist/index.js:1823:25)
at ExecState.CheckComplete (/home/runner/work/_actions/PyO3/maturin-action/aef21716ff3dcae8a1c301d23ec3e4446972a6e3/dist/index.js:1806:18)
at ChildProcess.<anonymous> (/home/runner/work/_actions/PyO3/maturin-action/aef21716ff3dcae8a1c301d23ec3e4446972a6e3/dist/index.js:1700:27)
at ChildProcess.emit (node:events:524:28)
at maybeClose (node:internal/child_process:1104:16)
at ChildProcess._handle.onexit (node:internal/child_process:304:5)
Error: The process '/usr/bin/docker' failed with exit code 1
Some symbols required by some rust dependencies do not work in the libc version in manylinux2014.
Upgrading the manylinux version
evanh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a feature we actually want to support? Shouldn't all pipelines have a sink?
|
|
||
| def terminate(self, stream: Route) -> None: | ||
| """ | ||
| Performs all the operations to complete the construciton of the graph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Performs all the operations to complete the construciton of the graph | |
| Performs all the operations to complete the construction of the graph |
| @abstractmethod | ||
| def terminate(self, stream: Union[StreamT, StreamSinkT]) -> None: | ||
| """ | ||
| Performs all the operations to complete the construciton of the graph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Performs all the operations to complete the construciton of the graph | |
| Performs all the operations to complete the construction of the graph |
untitaker
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do both cargo and uv lockfiles change? I don't see you intentionally adding any new dependencies. I think you might have outdated uv and/or outdated Rust.
This might also explain why the workflow needs changing?
Either this or the pipeline generation should fail if a sink is not present. @evanh what do you think ? If we require a sink we need a NoopSink as it is not given that the pipeline should provide results. Though at that point, I am not sure the NoopSink provides a great value considering the current DSL makes it very clear whene the pipeline ends. |
I'm OK with that. I would prefer if pipelines always ended with Sinks that do something. The pipeline doesn't need to produce something, but ideally it should end with a |
We have scenarios where none of this is applicable. THink about the subscription results consumer. It does trigger some tasks but they are in the middle of a custom strategy. Maybe one day that can be refactored into a task but that will be quite far away. So, in that case, asking to add a sink would require them to add a NoopSink. Then the question is: Should the noopSink be explicit ? |
What does it do after the tasks? I feel like we're missing some abstraction here: It doesn't make sense to me to have a Pipeline end with a Map or a Filter step. Those steps don't make sense to me if they have no output. So if someone wants to execute some code on every message, without outputting anything, then that feels like some kind of Sink.
I think so, yes. As I said above, I feel like there is an abstraction missing and it should be possible to model all our pipelines with a Sink. I think it's fine for now to have a NoopSink while we figure out the abstractions. Having the NoopSink be explicit also gives us an easy signal on pipelines that need to be improved in the future. |
|
I think that if we implicitly add NoopSinks I would question why we have sinks in the first place, and not just transform steps and sources (kind of like arroyo, just that some steps return |
There are two reasons to have sinks and I believe we should keep them:
@evanh I think you convinced me. All pipelines should have a sink and it should be explicit, so we can tell when a user forgot it or where the pipeline is actually supposed to end. It makes no sense, semantically speaking, to have a pipeline that terminates with a filter. As long as that is allowed there should be a sink that terminates the pipeline. |
This is an alternative approach to #218. The rust adapter chains map and filter steps together in one step in rust. This means that, when a map or filter is processed, it is not added to the Rust consumer immediately. This only happens when the chain is finalized. that happens when we encounter a step that cannot be chained or a sink. Problem: if there is a sequence of map operations and no sink, these were never added to the rust consumer. This PR requires aa Sink to the end of each branch. This idea depends on the assumption that, if you terminate a branch with anything that can produce output, is likely wrong. If a step can produce an output and it is at the end of the branch, what should we do with those messages ? Requiring a sink implies asking this question to the author of the pipeline and letting them decide.
The rust adapter chains map and filter steps together in one step in rust.
This means that, when a map or filter is processed, it is not added to the
Rust consumer immediately. This only happens when the chain is
finalized.that happens when we encounter a step that cannot be chained or a sink.
Problem: if there is a sequence of map operations and no sink, these were
never added to the rust consumer.
See the new test added for a demonstration.
This PR adds a finalzie method on the adapter that is called by the
code that iterates through the pipeline so that every pending data
structure is finalized before we start using the adapter.
A better option would be to make the translator able to understand when a
chain needs to be closed but it does nort know whether there is a sink or not.