compute: tokenize mz_join_core #20600
fbbdcf8 to 957b326
Performance Measurements
We are interested in testing two performance aspects of tokenization: the impact on normal processing and the time to shutdown.
As a test case I used a simple cross-join:
materialize=> create table t (a int);
CREATE TABLE
materialize=> explain select t1.a + t2.a from t t1, t t2;
Optimized Plan
------------------------------------------
Explained Query: +
Return +
Project (#2) +
Map ((#0 + #1)) +
CrossJoin type=differential +
Get l0 +
Get l0 +
With +
cte l0 = +
ArrangeBy keys=[[]] +
ReadStorage materialize.public.t+
Impact on Normal Processing
Run the join query to completion and measure the time that takes.
-- setup
create table t (a int);
insert into t select generate_series(1, 20000);
-- experiment
\timing
select t1.a + t2.a from t t1, t t2 limit 1;
-- result on main
Time: 245638.411 ms (04:05.638)
-- result on this branch
Time: 242369.678 ms (04:02.370)
There is no significant difference in run times, so we can assume the token check's impact on join processing to be negligible.
Time to Shutdown
Cancel the join query after 1s and use a subscribe to measure the time until the dataflow stops showing up in the introspection sources.
-- setup
create table t (a int);
insert into t select * from generate_series(1, 20000);
-- first SQL session
copy (subscribe mz_internal.mz_dataflows with (progress)) to stdout;
-- second SQL session
set statement_timeout = '1s';
insert into t select t1.a + t2.a from t t1, t t2;
-- result on main
1697819133000 f 1 215 Dataflow: oneshot-select-t5113
1697819134000 t \N \N \N
1697819135000 t \N \N \N
...
1697819378000 t \N \N \N
1697819378000 f -1 215 Dataflow: oneshot-select-t5113
-- => 245s
-- result on this branch
1697810777000 f 1 329 Dataflow: oneshot-select-t7929
1697810778000 t \N \N \N
1697810778000 f -1 329 Dataflow: oneshot-select-t7929
-- => 1s
The token check makes the dataflow shut down more or less instantly.
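The `=> 245s` and `=> 1s` figures are the difference between the timestamps of the `1` and `-1` rows for the dataflow. A minimal Rust sketch of that arithmetic follows. It assumes whitespace-separated columns in the order shown in the output (timestamp, progress flag, diff, id, name), and `dataflow_lifetime_ms` is a hypothetical helper, not part of Materialize.

```rust
/// Hypothetical helper: returns the milliseconds between the insertion (diff `1`)
/// and the retraction (diff `-1`) of `dataflow_id` in subscribe output.
/// Assumes columns: timestamp, progress flag, diff, id, name.
pub fn dataflow_lifetime_ms(output: &str, dataflow_id: &str) -> Option<u64> {
    let mut created = None;
    let mut dropped = None;
    for line in output.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        // Progress rows carry `t` in the second column and no data; skip them,
        // along with rows that belong to other dataflows.
        if fields.len() < 4 || fields[1] != "f" || fields[3] != dataflow_id {
            continue;
        }
        let ts: u64 = fields[0].parse().ok()?;
        match fields[2] {
            "1" => created = Some(ts),
            "-1" => dropped = Some(ts),
            _ => {}
        }
    }
    // `None` if either row is missing or the retraction precedes the insertion.
    dropped?.checked_sub(created?)
}
```

Called with the output from main and id `215`, this yields 245000 ms; with the output from this branch and id `329`, it yields 1000 ms.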
Looks great, but one ask: can you move the second commit to a different PR? It's a good change, but significantly larger than the actual change of functionality.
957b326 to d7489ee
This commit adds a shutdown token check to the `mz_join_core` linear join implementation. When the dataflow is shutting down, this makes the operator discard all its existing work and new input data, rather than processing it. As a result, differential join operators shut down faster and emit less data, which in turn speeds up shutdown of downstream operators.
Unfortunately, we can't make the same change for the DD join operator. We could add a token check into the result closure we pass to that operator, but the shutdown check would interfere with the fueling of the DD join operator. Fuel is consumed based on the number of updates emitted. When the token is dropped, the join closure stops producing updates, which means the operator stops consuming fuel, so it does not yield anymore until it has drained all its inputs. If there are many inputs left, the replica may not accept commands for potentially quite a long time.
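The difference between the two approaches can be illustrated with a small model. The sketch below is not the actual `mz_join_core` or DD code. `ShutdownProbe`, `shutdown_token`, `LinearJoin`, and `step_fueled_by_output` are hypothetical names, and modeling the token as an `Rc<()>`/`Weak<()>` pair is an assumption made for illustration.

```rust
use std::collections::VecDeque;
use std::rc::{Rc, Weak};

/// Hypothetical probe: operators hold a `Weak<()>` and check whether the
/// token held by the dataflow owner still exists.
pub struct ShutdownProbe(Weak<()>);

impl ShutdownProbe {
    /// True once the token has been dropped.
    pub fn in_shutdown(&self) -> bool {
        self.0.upgrade().is_none()
    }
}

/// Creates a token together with a probe observing it.
pub fn shutdown_token() -> (Rc<()>, ShutdownProbe) {
    let token = Rc::new(());
    let probe = ShutdownProbe(Rc::downgrade(&token));
    (token, probe)
}

/// Toy operator with the token check inside the operator itself.
pub struct LinearJoin {
    pending: VecDeque<(i64, i64)>,
    probe: ShutdownProbe,
}

impl LinearJoin {
    pub fn new(pending: VecDeque<(i64, i64)>, probe: ShutdownProbe) -> Self {
        LinearJoin { pending, probe }
    }

    /// One scheduling step: processes at most `fuel` pending pairs.
    /// During shutdown, all pending work is discarded instead.
    pub fn step(&mut self, fuel: usize) -> Vec<i64> {
        if self.probe.in_shutdown() {
            self.pending.clear();
            return Vec::new();
        }
        let n = fuel.min(self.pending.len());
        self.pending.drain(..n).map(|(a, b)| a + b).collect()
    }

    pub fn has_pending(&self) -> bool {
        !self.pending.is_empty()
    }
}

/// Toy operator whose fuel is charged per *emitted* update, with the token
/// check only inside the result closure `logic`. Returns how many inputs
/// were processed before the step yielded.
pub fn step_fueled_by_output(
    pending: &mut VecDeque<(i64, i64)>,
    mut fuel: usize,
    mut logic: impl FnMut(i64, i64) -> Option<i64>,
) -> usize {
    let mut processed = 0;
    while fuel > 0 {
        match pending.pop_front() {
            Some((a, b)) => {
                processed += 1;
                // Fuel only drops when the closure emits something.
                if logic(a, b).is_some() {
                    fuel -= 1;
                }
            }
            None => break,
        }
    }
    processed
}
```

In this model, `LinearJoin::step` returns immediately once the token is dropped. `step_fueled_by_output` with a closure that returns `None` during shutdown never spends fuel, so a single step runs until `pending` is empty, which mirrors the failure to yield described above.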
d7489ee to 4c7e240
Done: #22563
antiguru
left a comment
Looks good!
vmarcos
left a comment
LGTM!
TFTRs!
This PR adds a shutdown token check to the `mz_join_core` linear join implementation. When the dataflow is shutting down, this makes the operator discard all its existing work and new input data, rather than processing it. As a result, differential join operators shut down faster and emit less data, which in turn speeds up shutdown of downstream operators.
Unfortunately, we can't make the same change for the DD join operator. See #18927 for details.
Performance tests below.
Motivation
Joins can consume resources and impact interactivity even after their dataflows have been dropped.
Checklist
$T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.