Persistent Memory Growth / Failure to Deallocate under high Object Store Load (DataFusion + delta-rs) #846

@eshaanmangal-tf

Description:

I am experiencing a memory "leak" (unbounded growth) when using snmalloc as the global allocator in a Rust-based query server. The application uses DataFusion and delta-rs to query object storage.

The Issue:

Memory is not being released back to the OS (or reused effectively) at the end of a REST request lifecycle. This leads to a steady climb in RSS until the process is OOM killed.
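The RSS climb could be quantified per request with something like the following. This is a minimal, Linux-only sketch that reads `VmRSS` from `/proc/self/status`; the helper names `parse_vm_rss_kib` and `current_rss_kib` are made up for illustration and are not part of the server, snmalloc, or DataFusion.

```rust
use std::fs;

/// Extract the resident set size (the `VmRSS` field, in KiB) from the text
/// of /proc/<pid>/status. Returns None if the field is missing or malformed.
/// Hypothetical helper, for illustration only.
fn parse_vm_rss_kib(status: &str) -> Option<u64> {
    status
        .lines()
        // Keep only the remainder of the line that starts with "VmRSS:".
        .find_map(|line| line.strip_prefix("VmRSS:"))
        // The first whitespace-separated token is the number; "kB" follows it.
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|n| n.parse().ok())
}

/// Read the current process RSS in KiB. Returns None on non-Linux systems
/// or if /proc is unavailable.
fn current_rss_kib() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_rss_kib(&status)
}
```

Logging `current_rss_kib()` at the start and end of each REST request handler, once with each allocator, should show whether RSS returns toward a baseline between requests or only ever grows.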

With jemalloc: Memory is reclaimed/recycled correctly.
With snmalloc: Memory usage climbs linearly until the process is OOM killed.
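To tell whether the growth comes from allocations that are still live or from freed memory that the allocator retains, one option is a counting wrapper around the global allocator. The sketch below wraps the standard library's `System` allocator so that it is self-contained; `Counting`, `LIVE`, and `live_bytes` are hypothetical names, and substituting the snmalloc or jemalloc allocator type as the inner allocator is an assumption not shown here.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently requested from the inner allocator and not yet freed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Wrapper that forwards to an inner allocator and tracks live bytes.
/// Hypothetical type, for illustration only.
struct Counting<A>(A);

unsafe impl<A: GlobalAlloc> GlobalAlloc for Counting<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { self.0.alloc(layout) };
        if !ptr.is_null() {
            LIVE.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { self.0.dealloc(ptr, layout) };
        LIVE.fetch_sub(layout.size(), Ordering::Relaxed);
    }
    // `realloc` and `alloc_zeroed` use the trait's default implementations,
    // which go through `alloc`/`dealloc` above, so the counter stays consistent.
}

// `System` stands in for the real allocator here.
#[global_allocator]
static GLOBAL: Counting<System> = Counting(System);

/// Snapshot of the live byte count.
fn live_bytes() -> usize {
    LIVE.load(Ordering::Relaxed)
}
```

If `live_bytes()` returns to its baseline after a request while RSS keeps climbing, the application is freeing its memory and the growth sits in the allocator's retention behaviour. If `live_bytes()` itself climbs, the buffers are still held somewhere in the application or its dependencies.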

Environment:

OS: Linux
snmalloc-rs version: 0.3.8
Relevant Crates: DataFusion, delta-rs (which relies heavily on arrow and FFI).
Runtime: Tokio 1.48

Observations:

Interestingly, in the snmalloc trace, the OOM occurs well before the "Limits" (32GB) defined in our monitoring. It seems the allocator is struggling with the specific allocation patterns of DataFusion's execution plan (large buffers for Arrow record batches).

Request:

Are there specific configurations for snmalloc or known issues with the way it interacts with large, short-lived Arrow buffers that might prevent timely deallocation?

Reference links:

delta-io/delta-rs#4241 (comment)
