codegen(meos): generate tier-aware MEOS facade for the full JMEOS 1.4 surface (stacks on #4) #5
Open
estebanzimanyi wants to merge 4 commits into
…matrix on MobilityFlink
All nine BerlinMOD reference queries × three streaming forms each
(continuous, windowed, snapshot) on MobilityFlink — the complete 27-cell
stream-layers parity-matrix row, locally verified end-to-end with no
external dependencies (no Kafka, no Docker, no MEOS native lib, no
JMEOS call).
Queries:
Q1 which vehicles have appeared in the stream
Q2 where is vehicle X at time T
Q3 which vehicles within d of P at time T
Q4 which vehicles entered region R, and when
Q5 pairs of vehicles meeting near point P
Q6 cumulative distance per vehicle
Q7 first passage of vehicles through POIs
Q8 vehicles close to a road segment
Q9 distance between vehicles X and Y at time T
Each query has three form classes (Q<N>{Continuous,Windowed,Snapshot}Function)
and a companion BerlinMODQ<N>LocalTest driver running the three forms
through a Flink mini-cluster against a hardcoded synthetic corpus.
Spatial predicates today are pure Java — Haversine distance for
point-to-point (Q3, Q5, Q6, Q9), point-in-box for region containment
(Q4), and a planar-projection point-to-line-segment distance (Q8). Each
spatial call site is marked TODO(meos) for migration to the JMEOS
bridge of the corresponding MEOS operator once the in-flight MEOS 1.4
bump has settled (Q3 edwithin_tgeo_geo; Q4 STBox eintersects; Q5
NAD / edwithin_tgeo_tgeo; Q6 trajectory length; Q7 edwithin_tgeo_geo;
Q8 distance(tgeompoint, geometry(LINESTRING)); Q9 tdistance). Q1 and
Q2 have no spatial predicate.
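As a rough illustration of the three pure-Java predicates named above, the following is a minimal sketch, not the code from this commit. The class name, method names, the mean Earth radius constant, and the choice of a local equirectangular projection for the segment distance are all assumptions made for the example.

```java
// Illustrative sketch only; names are hypothetical and not taken from the PR's sources.
final class SpatialFallbackSketch {
    // Mean Earth radius in metres (assumed constant for this sketch).
    static final double EARTH_RADIUS_M = 6_371_000.0;

    private SpatialFallbackSketch() {}

    /** Haversine great-circle distance in metres between two lon/lat points in degrees. */
    static double haversineMetres(double lon1, double lat1, double lon2, double lat2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDPhi = Math.sin(Math.toRadians(lat2 - lat1) / 2.0);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2.0);
        double a = sinHalfDPhi * sinHalfDPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLambda * sinHalfDLambda;
        return 2.0 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Point-in-box containment with inclusive bounds (a Q4-style region test). */
    static boolean inBox(double lon, double lat,
                         double minLon, double minLat, double maxLon, double maxLat) {
        return lon >= minLon && lon <= maxLon && lat >= minLat && lat <= maxLat;
    }

    /**
     * Point-to-segment distance in metres. The segment endpoints are projected onto a
     * local plane centred on the query point (equirectangular approximation), then the
     * usual clamped projection onto the segment is applied.
     */
    static double pointToSegmentMetres(double pLon, double pLat,
                                       double aLon, double aLat,
                                       double bLon, double bLat) {
        double metresPerDegree = Math.toRadians(1.0) * EARTH_RADIUS_M;
        double cosLat = Math.cos(Math.toRadians(pLat));
        // Endpoints relative to the query point, which sits at the origin.
        double ax = (aLon - pLon) * cosLat * metresPerDegree;
        double ay = (aLat - pLat) * metresPerDegree;
        double bx = (bLon - pLon) * cosLat * metresPerDegree;
        double by = (bLat - pLat) * metresPerDegree;
        double dx = bx - ax;
        double dy = by - ay;
        double lengthSquared = dx * dx + dy * dy;
        // Parameter of the closest point on the infinite line, clamped to [0, 1].
        double t = lengthSquared == 0.0
                ? 0.0
                : Math.max(0.0, Math.min(1.0, -(ax * dx + ay * dy) / lengthSquared));
        return Math.hypot(ax + t * dx, ay + t * dy);
    }
}
```

The planar approximation is only reasonable for segments that are short relative to the Earth's radius, which is presumably why the commit marks these call sites for migration to MEOS operators.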
State patterns exercised:
- keyed simple flag (Q1)
- keyed last-known position (Q2, Q8)
- keyed transition + entry log (Q4)
- keyed accumulator (Q6)
- keyed first-passage map (Q7)
- shared key-by-constant state (Q9 pair-wise, Q5 multi-pair MapState)
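To make one of these patterns concrete, here is a hedged plain-Java sketch of the keyed accumulator behind a Q6-style cumulative distance. Ordinary `HashMap`s stand in for Flink keyed state so the example runs without a cluster; the class and method names are hypothetical and the real functions in this commit may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: HashMaps stand in for Flink keyed state.
final class KeyedAccumulatorSketch {
    private static final double EARTH_RADIUS_M = 6_371_000.0; // assumed mean radius

    // Per-vehicle last known position as {lon, lat}.
    private final Map<Integer, double[]> lastPosition = new HashMap<>();
    // Per-vehicle running total in metres.
    private final Map<Integer, Double> totalMetres = new HashMap<>();

    /** Consumes one position event and returns the vehicle's cumulative distance so far. */
    double onPosition(int vehicleId, double lon, double lat) {
        double[] previous = lastPosition.put(vehicleId, new double[] {lon, lat});
        double total = totalMetres.getOrDefault(vehicleId, 0.0);
        if (previous != null) {
            // Only the second and later events of a key add a leg to the total.
            total += haversineMetres(previous[0], previous[1], lon, lat);
        }
        totalMetres.put(vehicleId, total);
        return total;
    }

    /** Haversine distance in metres between two lon/lat points in degrees. */
    static double haversineMetres(double lon1, double lat1, double lon2, double lat2) {
        double sinHalfDPhi = Math.sin(Math.toRadians(lat2 - lat1) / 2.0);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2.0);
        double a = sinHalfDPhi * sinHalfDPhi
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * sinHalfDLambda * sinHalfDLambda;
        return 2.0 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

In a real job the two maps would be replaced by state scoped to the current key, so each vehicle's total survives across events without the function holding a global map.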
Verified output counts (see PR description for the exact-line excerpts):
Q | continuous | windowed | snapshot
---|------------|----------|---------
Q1 | 3 | 2 | 9
Q2 | 7 | 2 | 3
Q3 | 21 | 2 | 6
Q4 | 4 | 5 | 9
Q5 | 14 | 2 | 3 (only pair (100,200) qualifies for our P + radii)
Q6 | 21 | 6 | 9 (drift corpus; v100=601m, v200=300m, v300=1205m)
Q7 | 3 | 6 | 9 (3 (vehicle, POI) first-passages; intra-window scope)
Q8 | 21 | 2 | 6 (same shape as Q3 with segment-distance)
Q9 | 7 | 2 | 3 (X=100, Y=200; distance 4124m = ~4.1km)
Build verification: mvn clean package green; all nine LocalTests run to
completion (Flink mini-cluster, parallelism=1) producing exactly the
expected output shapes.
… 1.4 MEOSBridge
Introduce MEOSBridge as the runtime spatial-predicate surface for all
BerlinMOD-9 × 3-form streaming cells. The bridge calls into MEOS via
JMEOS 1.4 (geog_dwithin over WGS84 geographies) when libmeos is loadable
and falls back to the pure-Java Haversine / SegmentDistance utilities
when it is not — the fallback path is what the BerlinMODQ*LocalTest
mini-cluster drivers exercise (system property mobilityflink.meos.enabled=false).
- New berlinmod/MEOSBridge.java with the dwithinMetres /
dwithinSegmentMetres / distanceMetres surface and a fail-soft
static init that flips MEOS_AVAILABLE to false on UnsatisfiedLinkError.
- All BerlinMOD-9 × 3-form spatial predicates rewritten to call
MEOSBridge instead of Haversine / SegmentDistance directly. 27 cells,
one bridge call surface, identical predicate semantics.
- JMEOS.jar updated to the JMEOS#15 regen branch artefact (478 305
bytes); this is the JMEOS 1.4 regen build that exposes geog_dwithin /
geom_in / geom_to_geog / edwithin_tgeo_geo / nad_tgeo_geo / tpoint_length.
- aisdata/Main.java and aisdata/TrajectoryWindowFunction.java adapted
to the JMEOS 1.4 meos_initialize() / meos_initialize_timezone()
split (the old two-arg meos_initialize(String, error_handler_fn)
signature is gone in JMEOS#15).
- All nine BerlinMODQ*LocalTest mini-cluster drivers set
mobilityflink.meos.enabled=false at main() entry so they remain
green-CI without libmeos.so on the runtime path.
- target/ build artefacts gitignored.
The README's spatial-predicate paragraph is updated to describe the
MEOSBridge route as the production path; the TODO(meos) markers across
the BerlinMOD cells are gone.
Build: mvn clean package -DskipTests green.
Verify: BerlinMODQ{1,3,5,8}LocalTest all finish with FINISHED state
on the mini-cluster fallback path.
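The fail-soft pattern described in this commit can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of berlinmod/MEOSBridge.java: the class name, the library name passed to `System.loadLibrary`, the default value of the system property, and the placeholder native path are all hypothetical. Only the property name and the idea of flipping MEOS_AVAILABLE on UnsatisfiedLinkError come from the commit message.

```java
// Illustrative sketch only; not the PR's MEOSBridge implementation.
final class MeosBridgeSketch {
    private static final double EARTH_RADIUS_M = 6_371_000.0; // assumed mean radius

    /** Decided once, when the class is initialized. */
    static final boolean MEOS_AVAILABLE;

    static {
        boolean available = false;
        // Assumption: the property defaults to "true" when it is not set.
        if (Boolean.parseBoolean(System.getProperty("mobilityflink.meos.enabled", "true"))) {
            try {
                System.loadLibrary("meos"); // library name is an assumption
                available = true;
            } catch (UnsatisfiedLinkError | SecurityException e) {
                available = false; // fail soft: keep running on the pure-Java path
            }
        }
        MEOS_AVAILABLE = available;
    }

    private MeosBridgeSketch() {}

    /** Distance in metres between two lon/lat points given in degrees. */
    static double distanceMetres(double lon1, double lat1, double lon2, double lat2) {
        if (MEOS_AVAILABLE) {
            // Placeholder: a real bridge would call into JMEOS here.
            throw new UnsupportedOperationException("native path not wired in this sketch");
        }
        return haversineMetres(lon1, lat1, lon2, lat2);
    }

    /** True when the two points are within the given distance of each other. */
    static boolean dwithinMetres(double lon1, double lat1, double lon2, double lat2,
                                 double metres) {
        return distanceMetres(lon1, lat1, lon2, lat2) <= metres;
    }

    private static double haversineMetres(double lon1, double lat1, double lon2, double lat2) {
        double sinHalfDPhi = Math.sin(Math.toRadians(lat2 - lat1) / 2.0);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2.0);
        double a = sinHalfDPhi * sinHalfDPhi
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * sinHalfDLambda * sinHalfDLambda;
        return 2.0 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

Because the flag is computed in a static initializer, a driver that wants the fallback has to set the property before the class is first touched, which matches the commit's note that the LocalTest drivers set it at main() entry.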
…s + extended types + utils.spatial)
Updates the bundled `flink-processor/jar/JMEOS.jar` to a combined build
of JMEOS PR #19 (regen against MEOS-API meos-idl.json, 2,699 methods
including extended types) AND PR #18 (utils.spatial.Haversine +
utils.spatial.PointToSegment wrappers that MEOSBridge.java imports).
Surface delta vs the previous bundled jar:
- public static methods: 2 699 (was 1 685)
- utils.spatial.Haversine.distance(lon1, lat1, lon2, lat2) → double
- utils.spatial.PointToSegment.distance(pLon, pLat, s1Lon, s1Lat, s2Lon, s2Lat) → double
- tnpoint_ methods: 50
- tcbuffer / tpose / trgeo: now exposed
- sha: a5895c9b94… size: 1,210,863 B
Unblocks the MEOSBridge.java import path (line 116) — previously the
jar shipped PR #19's GeneratedFunctions but not PR #18's utils.spatial,
so base-branch mvn compile was RED. Both PRs now coalesced into a
single jar built by:
mvn -pl codegen,jmeos-core compile -Dmaven.test.skip=true
cd jmeos-core/target/classes && jar cf JMEOS.jar .
Unblocks codegen/flink-meos-ops wedge stacked on this branch.
… surface
Add a generated, tier-aware Java facade over the MEOS public API,
organized as one Java class per MEOS object-model class plus one per
public MEOS header for free functions:
- 50 `MeosOps<Class>` classes (751 methods): one per MEOS object-model
class (TFloat, TInt, TBool, TText, TGeomPoint, TGeogPoint, TCbuffer,
TNpoint, TPose, TRGeometry, TBox, STBox, Set, Span, SpanSet, …).
- 6 `MeosOpsFree<Header>` classes (1,346 methods): one per public MEOS
header for functions not assigned to any object-model class
(MeosOpsFreeCore, MeosOpsFreeGeo, MeosOpsFreeCbuffer, MeosOpsFreeNpoint,
MeosOpsFreePose, MeosOpsFreeRgeo).
- 1 shared `MeosOpsRuntime` (single `MEOS_AVAILABLE` static-init across
all 56 facades).
Each emitted method forwards to `functions.GeneratedFunctions.<name>(...)`
after probing the shared `MeosOpsRuntime.MEOS_AVAILABLE` flag. Each
method carries a Javadoc tier marker (stateless / bounded-state /
windowed / cross-stream / io-meta) so consumers know the per-method
wiring shape.
Total emit: 2,097 of JMEOS PR #19's 2,699-method surface (77.7%);
remainder is the JMEOS-deliberately-omitted type-catalog helpers plus
the streaming-relevance-baseline ambiguous (59) and sequence-only (14)
buckets, both surfaced separately for design decisions before emit.
Two generators under flink-processor/tools/codegen/:
- codegen-oo.py: reads JMEOS jar signatures via javap -p +
streaming-relevance baseline + MEOS object model → emits per-OO-class
facades.
- codegen-free.py: same shape, but for functions not in the OO model →
emits per-header facades.
Both are ~250 LOC, deterministic, audit-by-regeneration. Manifests
record provenance (JMEOS method total, baseline target count, emit
count, per-tier breakdown, per-class/per-header method count, sample of
functions absent from JMEOS).
Coexists with the existing berlinmod.MEOSBridge hand-written
BerlinMOD-scoped bridge (high-level, query-shaped); the generated
MeosOps* facades expose the raw MEOS surface tier-by-tier (low-level,
catalog-shaped). Both share the same MEOS_AVAILABLE discipline and
`functions.GeneratedFunctions` delegation.
Stacks on feat/jmeos-bridge-swap; additive-only; touches no existing
file. Locally compile-verified against the union of JMEOS PR #19's
jmeos-core + PR #18's utils.spatial (the latter needed by MEOSBridge,
separately tracked).
fad7fed to
e5707ac
Coordination confirmation: rebased onto the post-union-jar refresh ( Local verification: Full module now compiles green — the codegen wedge sits on top of Coordination item resolved. Thanks for the union-jar refresh. |
This was referenced May 21, 2026
Open
Add a generated, tier-aware Java facade over the full MEOS public API surface, so downstream Flink-side parity work can stop hand-wiring per-operator JMEOS calls and instead consume one mechanical facade per MEOS object-model class (or per public header for free functions).
What is generated
- `MeosOps<Class>`: one per MEOS object-model class (generator: `tools/codegen/codegen-oo.py`)
- `MeosOpsFree<Header>`: one per public MEOS header for functions not assigned to any OO class (generator: `tools/codegen/codegen-free.py`)
- `MeosOpsRuntime`: singleton `MEOS_AVAILABLE` static init across all 56 facades

Each emitted method forwards verbatim to `functions.GeneratedFunctions.<name>(...)` after probing `MeosOpsRuntime.MEOS_AVAILABLE` (set once per JVM). Each method carries a Javadoc tier marker:

Tier | Flink wiring shape
---|---
`stateless` | `ScalarFunction` / direct call in `MapFunction`
`bounded-state` | `ScalarFunction` (state in MEOS handle)
`windowed` | `AggregateFunction` over `TUMBLE` / `HOP`
`cross-stream` | `CoProcessFunction` / `IntervalJoin`
`io-meta` | `format` clause

Tier breakdown of the 2,097 emitted methods: 804 stateless · 797 bounded-state · 161 windowed · 140 cross-stream · 195 io-meta.
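The shape of one emitted method might look roughly like the sketch below. Everything here is hypothetical: the stub class stands in for `functions.GeneratedFunctions` (whose real methods take native handles), the `long` handle parameter, the tier shown in the Javadoc, the `meos.sketch.available` property, and the choice to throw `IllegalStateException` when MEOS is unavailable are assumptions for illustration, since the PR text does not say what the facade does in that case.

```java
// Illustrative sketch only; not generator output from this PR.
final class MeosOpsRuntimeSketch {
    /** Set once per JVM in the static initializer. */
    static final boolean MEOS_AVAILABLE;

    static {
        // Stand-in probe: a hypothetical property replaces a real native-library check.
        MEOS_AVAILABLE = Boolean.getBoolean("meos.sketch.available");
    }

    private MeosOpsRuntimeSketch() {}
}

/** Stand-in for functions.GeneratedFunctions; the signature here is invented. */
final class GeneratedFunctionsStub {
    private GeneratedFunctionsStub() {}

    static double tpoint_length(long handle) {
        return 42.0; // fixed value so the forwarding is observable
    }
}

final class MeosOpsTGeomPointSketch {
    private MeosOpsTGeomPointSketch() {}

    /**
     * Forwards to the generated binding of the same name.
     *
     * <p>Tier: bounded-state (assumed for this example).
     */
    static double tpoint_length(long handle) {
        if (!MeosOpsRuntimeSketch.MEOS_AVAILABLE) {
            // Assumed behaviour when the native library could not be loaded.
            throw new IllegalStateException("MEOS native library is not available");
        }
        return GeneratedFunctionsStub.tpoint_length(handle);
    }
}
```

Keeping the probe in a single runtime class means all facades agree on availability, and a consumer can read the tier marker to decide whether the call belongs in a map-style function, a window aggregate, or a two-input operator.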
What's not emitted (honest gap)
- type-catalog helpers deliberately omitted by JMEOS (`*_basetype`, `*_type`, `*_spantype`, …)
- `sequence-only` tier (14 methods): inherently non-streamable, marked as honest "cannot satisfy" pending an emission-shape decision
- streaming-relevance-baseline ambiguous bucket (59 methods): `streamingSemantics` facet RFC for MEOS-API

Coexistence with `berlinmod.MEOSBridge`
`MEOSBridge.java` (hand-written, BerlinMOD-scoped, introduced on this branch's parent `feat/jmeos-bridge-swap`) and the generated `MeosOps*` facades coexist by design:
- `MEOSBridge` keeps the per-BerlinMOD-query intent (Haversine fallback, `dwithinSegmentMetres`, etc.): high-level, query-shaped.
- `MeosOps*` exposes the raw MEOS surface tier-by-tier: low-level, catalog-shaped.

Both share the same `MEOS_AVAILABLE` discipline (via `MeosOpsRuntime`) and the same `functions.GeneratedFunctions` delegation.

How to regenerate
Both generators are ~250 LOC, deterministic, audit-by-regeneration. Manifests under `tools/codegen/` record per-class / per-header / per-tier breakdowns + absent-from-JMEOS audit.

Stacking
This PR stacks on `feat/jmeos-bridge-swap`. Additive-only: 57 new Java files + 5 files under `tools/codegen/`. No existing file is touched (no diff to `MEOSBridge.java`, `Main.java`, `TrajectoryWindowFunction.java`, `pom.xml`, or `jar/JMEOS.jar`).

Note on the base branch's current compile state
`feat/jmeos-bridge-swap`'s `MEOSBridge.java:116` imports `utils.spatial.PointToSegment` from JMEOS PR #18's `feat/spatial-haversine` branch. The recent bundled-jar refresh on this branch (commit `0a57c07`, JMEOS PR #19's `jmeos-core` jar) brought in the 2,699-method `functions.GeneratedFunctions` surface but did not include PR #18's `utils.spatial.*` wrappers. As a result, the base-branch `mvn compile` currently fails on `MEOSBridge.java`.

This PR's own diff is green in isolation (`javac` of just `org.mobilitydb.flink.meos.*` succeeds against the refreshed jar) and green in the full module when the bundled jar is the union of JMEOS PR #19's `jmeos-core` + PR #18's `utils.spatial.*` (locally verified: 123 .class files compile clean, including all 57 new `MeosOps*`).

Recipe to produce the union jar (~2 minutes):
Once the bundled jar is refreshed with the union, the base branch + this PR compile together cleanly.