feat(berlinmod): scaffold the full BerlinMOD-9 × 3-form parity matrix on NebulaStream (33 YAMLs, 27/27 cells) #15
Draft
estebanzimanyi wants to merge 1 commit into
a5ff0f0 to
6d3f885
Compare
… on NebulaStream (33 YAMLs, 27/27 cells)

Additive scaffold for the BerlinMOD-9 × 3 streaming-form parity contract on MobilityNebula, sibling to the existing SNCB Q-series and matching the MobilityFlink MobilityDB#3 / MobilityKafka MobilityDB#1 streaming-form definitions.

All 27 cells covered:

  Q1 'which vehicles have appeared': full (continuous + windowed + snapshot)
  Q2 'where is vehicle X at time T': full
  Q3 'vehicles within 5 km of P': full
  Q4 'vehicles inside region R (polygon)': full
  Q5 'pairs of vehicles meeting near P': partial (emit per-vehicle trajectories near P; consumer joins)
  Q6 'cumulative distance per vehicle': partial (emit TEMPORAL_SEQUENCE; consumer computes length)
  Q7 'first passage of vehicle through POI' × {POI1, POI2, POI3}: full (per-POI fan-out)
  Q8 'vehicles within d of LINESTRING': full (edwithin_tgeo_geo with LINESTRING geometry)
  Q9 'distance between X and Y at time T': partial (emit X and Y trajectories; consumer joins)

18 of 27 cells are FULL (the BerlinMOD-Q semantic is computed entirely inside NebulaStream). 9 cells are PARTIAL: NebulaStream emits the per-window inputs (trajectory, candidate vehicles) and a consumer post-processes for the final BerlinMOD-Q answer. The partial pattern is the natural expression of these queries in NebulaStream's current SQL surface; the path to FULL is documented per-Q in docs/berlinmod-streaming-forms.md (a stream-self-join for Q5/Q9, a temporal_length scalar function for Q6).

Form mapping to NebulaStream windows:

  continuous: SLIDING(time_utc, SIZE 1 SEC, ADVANCE BY 1 SEC)
  windowed:   TUMBLING(time_utc, SIZE 10 SEC)
  snapshot:   TUMBLING(time_utc, SIZE 5 SEC)

MEOS-side surface consumed (already exposed by PR MobilityDB#14 + follow-ups):

  edwithin_tgeo_geo: Q3 (POINT predicate), Q4 (POLYGON, d=0.0), Q5 (POINT predicate), Q7 (per-POI POINT), Q8 (LINESTRING predicate)
  TEMPORAL_SEQUENCE: Q2 / Q5 / Q6 / Q9 (per-window per-vehicle trajectory)

No new MEOS PhysicalFunction classes added; no C++ changes; no SNCB Q-series modifications. All 33 YAMLs are additive in a new Queries/berlinmod/ subdirectory.

Add (additions):

  Queries/berlinmod/q1_{continuous,windowed,snapshot}.yaml (3)
  Queries/berlinmod/q2_{continuous,windowed,snapshot}.yaml (3)
  Queries/berlinmod/q3_{continuous,windowed,snapshot}.yaml (3)
  Queries/berlinmod/q4_{continuous,windowed,snapshot}.yaml (3)
  Queries/berlinmod/q5_{continuous,windowed,snapshot}.yaml (3, partial)
  Queries/berlinmod/q6_{continuous,windowed,snapshot}.yaml (3, partial)
  Queries/berlinmod/q7_poi{1,2,3}_{continuous,windowed,snapshot}.yaml (9, full via fan-out)
  Queries/berlinmod/q8_{continuous,windowed,snapshot}.yaml (3, LINESTRING predicate)
  Queries/berlinmod/q9_{continuous,windowed,snapshot}.yaml (3, partial)
  Input/input_berlinmod.csv (sample data: 3 vehicles × 21 events, 14 simulated seconds)
  docs/berlinmod-streaming-forms.md

Validation: every YAML parses cleanly via python3 yaml.safe_load. Runtime verification gated on the NebulaStream test harness.

Coverage: 27 of 27 cells (100 %), with 18 FULL and 9 PARTIAL annotated explicitly per Q. Path to FULL for the 9 PARTIAL cells is one MobilityNebula C++ PhysicalFunction class each (or a NebulaStream upstream stream-self-join), documented in docs/berlinmod-streaming-forms.md.
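The form-to-window mapping in the commit message can be illustrated with a small, engine-independent sketch. It only models which `[start, end)` windows contain a given event timestamp; the helper name `window_bounds`, the `FORMS` table, and the assumption that windows are aligned to time 0 are illustrative and not taken from NebulaStream.

```python
from math import floor

def window_bounds(t, size, advance):
    """Return the [start, end) bounds of every window that contains t.

    Assumes (hypothetically) that windows of length `size` start every
    `advance` seconds, aligned to time 0. All values are in seconds.
    """
    bounds = []
    start = floor(t / advance) * advance  # latest window start at or before t
    while start > t - size:               # window still covers t
        if start >= 0:
            bounds.append((start, start + size))
        start -= advance
    return sorted(bounds)

# The three streaming forms from the commit message, as (size, advance) pairs.
# A tumbling window is modelled as a sliding window whose advance equals its size.
FORMS = {
    "continuous": {"size": 1, "advance": 1},    # SLIDING(SIZE 1 SEC, ADVANCE BY 1 SEC)
    "windowed": {"size": 10, "advance": 10},    # TUMBLING(SIZE 10 SEC)
    "snapshot": {"size": 5, "advance": 5},      # TUMBLING(SIZE 5 SEC)
}

if __name__ == "__main__":
    for form, params in FORMS.items():
        print(form, window_bounds(12.3, **params))
```

Under this model the continuous form assigns each event to exactly one one-second window, because a sliding window whose advance equals its size does not overlap with its neighbours.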
6d3f885 to
395e364
Compare
Contributor
These are great additions, happy to review when you're ready!
This was referenced May 21, 2026
Draft. Additive. BerlinMOD-9 × 3-form streaming-parity scaffold on MobilityNebula, sibling to the existing SNCB Q-series. 27 of 27 cells (Q1..Q9 × 3 forms): 18 cells full (the BerlinMOD-Q semantic is computed entirely inside NebulaStream) and 9 cells partial (NebulaStream emits the per-window inputs; a consumer post-processes them for the final answer).

No changes to existing SNCB queries and no MEOS C++ PhysicalFunction changes: strictly new YAMLs in a `Queries/berlinmod/` subdirectory, plus a sample CSV and docs.

Coverage

27 / 27 cells covered. The 9 partial cells reflect the current NebulaStream SQL surface: they need either a stream-self-join (Q5, Q9) or a `temporal_length` scalar function (Q6) to graduate to full, both of which are documented as one-PR additions in `docs/berlinmod-streaming-forms.md`.

Form mapping to NebulaStream windows

| Form | NebulaStream window |
|---|---|
| continuous | `SLIDING(time_utc, SIZE 1 SEC, ADVANCE BY 1 SEC)` |
| windowed | `TUMBLING(time_utc, SIZE 10 SEC)` |
| snapshot | `TUMBLING(time_utc, SIZE 5 SEC)` |

What "partial" means concretely

| Q | NebulaStream emits | Consumer step |
|---|---|---|
| Q5 | `TEMPORAL_SEQUENCE` trajectory for each vehicle near P | joins |
| Q6 | `TEMPORAL_SEQUENCE` trajectory | computes length |
| Q9 | `TEMPORAL_SEQUENCE` filtered to `vehicle_id ∈ {100, 200}` | joins |

Each partial cell is a valid, runnable NebulaStream query that emits useful BerlinMOD-shaped output; the final BerlinMOD-Q answer is one consumer-side reduction step beyond what NebulaStream returns.
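As an illustration of the consumer-side reduction for Q6, here is a minimal sketch that turns one emitted trajectory into a length. It assumes the consumer has already decoded the `TEMPORAL_SEQUENCE` output into a list of `(lon, lat, t)` samples (the actual output encoding is not described in this PR), and it uses a spherical haversine distance as a stand-in for whatever metric the real `temporal_length` function would use. The names `haversine_m` and `trajectory_length_m` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def trajectory_length_m(samples):
    """Sum the segment lengths of one trajectory.

    `samples` is an iterable of (lon, lat, t) tuples (an assumed decoding of
    the emitted trajectory); they are ordered by t before summing.
    """
    ordered = sorted(samples, key=lambda s: s[2])
    return sum(
        haversine_m(a[0], a[1], b[0], b[1])
        for a, b in zip(ordered, ordered[1:])
    )

if __name__ == "__main__":
    # Hypothetical three-sample trajectory for one vehicle in one window.
    window = [(13.400, 52.520, 0), (13.401, 52.520, 1), (13.402, 52.521, 2)]
    print(f"{trajectory_length_m(window):.1f} m")
```

A consumer that wants a cumulative distance across windows would keep a running total per vehicle; whether the segment that spans two adjacent windows is counted depends on how the consumer stitches windows together, which this sketch leaves out.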
What this PR adds
- `Queries/berlinmod/q1..q9_{continuous,windowed,snapshot}.yaml`
- `Queries/berlinmod/q7_poi{1,2,3}_{continuous,windowed,snapshot}.yaml`
- `Input/input_berlinmod.csv`
- `docs/berlinmod-streaming-forms.md`

Total: 35 new files, 33 YAMLs validated via `python3 yaml.safe_load`.

MEOS-side surface consumed (no additions requested)

All predicates use operators already exposed by PR #14 and follow-ups:

| Operator | Used by |
|---|---|
| `edwithin_tgeo_geo(lon, lat, t, geom, d)` | Q3, Q4, Q5, Q7, Q8 |
| `TEMPORAL_SEQUENCE(lon, lat, t)` | Q2, Q5, Q6, Q9 |

No new PhysicalFunction classes; no C++ changes.
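To make the Q3-style predicate concrete, the sketch below mimics an "ever within distance d of a point" check over sampled positions. It is not the MEOS implementation: it assumes the `e` prefix in `edwithin_tgeo_geo` denotes "ever", assumes `d` is in metres, handles only a POINT target, ignores interpolation between samples, and uses a local equirectangular approximation for distance. The names `planar_distance_m` and `ever_within_point` are hypothetical.

```python
from math import cos, hypot, radians

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation

def planar_distance_m(lon1, lat1, lon2, lat2):
    """Approximate distance in metres using a local equirectangular projection.

    Reasonable for short distances (a few km); not a geodesic computation.
    """
    mean_lat = radians((lat1 + lat2) / 2)
    dx = radians(lon2 - lon1) * cos(mean_lat) * EARTH_RADIUS_M
    dy = radians(lat2 - lat1) * EARTH_RADIUS_M
    return hypot(dx, dy)

def ever_within_point(samples, p_lon, p_lat, d_m):
    """True if any (lon, lat, t) sample lies within d_m metres of the point P."""
    return any(
        planar_distance_m(lon, lat, p_lon, p_lat) <= d_m
        for lon, lat, _t in samples
    )

if __name__ == "__main__":
    P = (13.405, 52.520)  # illustrative point, not taken from the PR's YAMLs
    trajectory = [(13.500, 52.520, 0), (13.410, 52.520, 1)]
    print(ever_within_point(trajectory, *P, d_m=5_000))
```

The real operator also accepts POLYGON and LINESTRING geometries (Q4 and Q8 in this PR), which this point-only sketch does not attempt to cover.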
Sibling parity work in the ecosystem
No-touch boundary
- Existing SNCB queries (`Queries/Query0..Query5.yaml` + `Queries/sncb_brake_monitoring.yaml`): untouched
- `nes-physical-operators/src/Functions/Meos/`: untouched
- `nes-physical-operators/src/Aggregation/Function/Meos/`: untouched
- `berlinmod_stream` and TCP port `32325` chosen distinct from SNCB's `sncb_stream` / port `32324` to avoid coexistence conflicts