notebooks: MCAP robotics DataFrame demo #37
Walks daft.read_mcap against a 1.19 GB MCAP from the DapengFeng/MCAP
dataset on HuggingFace (FAST-LIVO/hku1: Livox LiDAR + IMU + stereo
compressed cameras, 29,702 messages over 127.7s).
Covers schema inference, per-topic groupby, topic+time pushdown,
topic_start_time_resolver (per-file per-topic keyframe alignment, new
in v0.7.2), and a 50ms-bucket join between LiDAR sweeps and IMU samples.
Direct HTTP reads (daft.read_mcap("https://...")) tracked in
Eventual-Inc/Daft#6983 — notebook downloads via huggingface_hub for now.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30af30239a
" .with_column(\"bucket\", daft.col(\"lidar_ts\") // BUCKET_NS)\n",
")\n",
"\n",
"joined = lidar.join(imu, on=\"bucket\").select(\n",
Match adjacent buckets for ±50ms window
The bucket equality join only matches IMU/LiDAR rows that fall in the exact same 50ms epoch bucket, which misses valid pairs near bucket boundaries even when they are within ±50ms (for example, timestamps 2ms apart on opposite sides of a boundary). Because this section claims to return IMU samples within ±50ms of each LiDAR frame, the current logic silently drops qualifying matches and can skew downstream alignment/fusion analyses.
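The boundary case described above can be reproduced without Daft. The sketch below is a plain-Python illustration, not the notebook's code: `same_bucket_pairs` and `windowed_pairs` are hypothetical helper names, and timestamps are assumed to be integer nanoseconds. It contrasts the same-bucket equality match with one that also probes the two neighboring buckets and then filters on the exact time difference.

```python
from collections import defaultdict

BUCKET_NS = 50_000_000  # 50 ms bucket width in nanoseconds
WINDOW_NS = 50_000_000  # +/-50 ms match window (assumed equal to the bucket width)


def _index_by_bucket(timestamps):
    """Group timestamps by their integer bucket id."""
    by_bucket = defaultdict(list)
    for ts in timestamps:
        by_bucket[ts // BUCKET_NS].append(ts)
    return by_bucket


def same_bucket_pairs(lidar_ts, imu_ts):
    """Equality match on bucket id: the behaviour flagged in the review."""
    imu_by_bucket = _index_by_bucket(imu_ts)
    return [
        (l, i)
        for l in lidar_ts
        for i in imu_by_bucket.get(l // BUCKET_NS, [])
    ]


def windowed_pairs(lidar_ts, imu_ts):
    """Probe buckets b-1, b, b+1, then keep only pairs within WINDOW_NS."""
    imu_by_bucket = _index_by_bucket(imu_ts)
    pairs = []
    for l in lidar_ts:
        b = l // BUCKET_NS
        for neighbor in (b - 1, b, b + 1):
            for i in imu_by_bucket.get(neighbor, []):
                if abs(i - l) <= WINDOW_NS:
                    pairs.append((l, i))
    return pairs


# One LiDAR frame at 101 ms; IMU samples at 99 ms, 120 ms, and 160 ms.
lidar = [101_000_000]
imu = [99_000_000, 120_000_000, 160_000_000]

strict = same_bucket_pairs(lidar, imu)  # misses the 99 ms sample (2 ms away)
windowed = windowed_pairs(lidar, imu)   # keeps 99 ms and 120 ms, drops 160 ms
```

Probing three buckets is sufficient only because the window is no wider than one bucket; a wider window would need more neighbors. In a DataFrame engine the same idea amounts to emitting each LiDAR row under its three candidate bucket ids before the equality join, then filtering on the absolute time difference.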
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
A walkthrough notebook for `daft.read_mcap` against a real robotics recording: a 1.19 GB MCAP from the DapengFeng/MCAP dataset on HuggingFace (FAST-LIVO/hku1: Livox LiDAR + IMU + stereo compressed cameras, 29,702 messages over 127.7s).

Covers the things a robotics user would actually ask of an MCAP DataFrame:
- `topics=["/livox/imu"]` avoids materializing camera payloads
- `start_time` / `end_time`
- `topic_start_time_resolver`: the new per-file per-topic keyframe alignment from Daft #5886 (v0.7.2)

Known limitation
Direct HTTP reads (`daft.read_mcap("https://...")`) currently fail with `TypeError: Expected a FileSystemHandler instance, got HTTPFileSystem` because `daft/filesystem.py` wraps fsspec in `PyFileSystem` without going through `FSSpecHandler`. Tracked in Eventual-Inc/Daft#6983. Notebook uses `huggingface_hub.hf_hub_download` as the workaround; will update once #6983 lands.

Placement
Dropped in `notebooks/` alongside the existing format-walkthrough notebooks (`getting_started_with_common_crawl.ipynb`, `window_functions.ipynb`). Happy to convert to a PEP 723 script under `examples/io/` in a follow-up if that fits the repo direction better.
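For reference, the download workaround named under Known limitation could look roughly like the sketch below. It is a minimal illustration, not the notebook's actual cell: `read_topics_from_hub` is a hypothetical helper name, the file path inside the dataset repo is left as a caller-supplied argument because this description does not state it, and the only `daft.read_mcap` argument used is `topics`, as mentioned above.

```python
def read_topics_from_hub(filename, topics, repo_id="DapengFeng/MCAP"):
    """Download one MCAP file from a HuggingFace dataset repo and open it with Daft.

    `filename` is the path of the file inside the dataset repo; it is an
    assumption left to the caller, since it is not given in this PR text.
    """
    # Deferred imports: the function can be defined without either package installed.
    import daft
    from huggingface_hub import hf_hub_download

    # hf_hub_download caches the file locally and returns its local path.
    local_path = hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        repo_type="dataset",
    )

    # Read from the local copy instead of an https:// URL (see Eventual-Inc/Daft#6983).
    return daft.read_mcap(local_path, topics=topics)
```

A call such as `read_topics_from_hub(<path in repo>, ["/livox/imu"])` would return a DataFrame restricted to the IMU topic. Once direct HTTP reads work, the download step could presumably be dropped.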
Test plan
🤖 Generated with Claude Code