
Query and version endpoints load unbounded results into memory #252

@thehabes

Summary

Several RERUM endpoints load all matching documents into memory at once with no pagination or limit:

  1. queryHeadRequest calls db.find(props).toArray() with no limit. A broad query like {"type": "Annotation"} loads every matching document into memory before sending the response.

  2. getAllVersions — Loads all version documents for an object at once. An object with 100+ versions loads all of them into a single array.

  3. /history/{id} and /since/{id} — Both traverse the version chain and accumulate all results in memory.
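
For contrast with the toArray() pattern described above, here is a minimal sketch of consuming results one document at a time instead of accumulating them. It assumes the source is any async iterable (MongoDB Node.js driver cursors can be iterated with for await). The names streamJsonArray and write are hypothetical and not part of RERUM. Backpressure handling is left out for brevity.

```javascript
// Hypothetical helper: emit documents as a JSON array without building
// the whole array in memory first.
// `cursor` is assumed to be any async iterable (for example a driver cursor).
// `write` is assumed to be any function that accepts a string chunk,
// such as a bound res.write in an HTTP handler.
async function streamJsonArray(cursor, write) {
  let count = 0
  write("[")
  for await (const doc of cursor) {
    // Only one document is serialized and held at a time.
    write((count === 0 ? "" : ",") + JSON.stringify(doc))
    count += 1
  }
  write("]")
  return count
}
```

With this shape, peak memory per request is tied to the size of a single document rather than to the size of the full result set.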

Evidence

From load testing (Run 4):

  • Phase 9 (DDoS), Query Flood scenario: 200 VUs sending POST /v1/api/query with {"type":"Annotation"} for 3 minutes. Each request loaded ALL matching annotations into memory. RERUM survived but memory grew from 62MB to 276MB per worker.

  • Phase 9, History Tree Attack: 50 VUs hitting /v1/history/{id} on an object with 100+ versions for 3 minutes. Each request loaded 100+ documents into memory simultaneously.

  • Phase 9, Query Amplification: 100 VUs alternating between {"type":"Annotation"} and {"label":{"$exists":true}} — broad queries returning large result sets, stressing serialization and memory.

All scenarios contributed to the unbounded memory growth documented in #251.

Recommendation

  1. Add default pagination to query endpoints: db.find(props).limit(100).toArray() with a limit query parameter (default 100, max 1000)
  2. Add a limit to getAllVersions — either cap at a reasonable number or stream results
  3. Consider cursor-based pagination for large result sets to avoid loading everything at once
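
A rough sketch of how recommendations 1 and 3 might fit together, assuming `collection` is a MongoDB Node.js driver Collection and that paging on `_id` is acceptable. The helper names (clampLimit, nextPageFilter, findPage) and the `afterId` parameter are hypothetical. Only the default of 100 and the maximum of 1000 come from the recommendation above.

```javascript
// Defaults taken from the recommendation; the constant names are hypothetical.
const DEFAULT_LIMIT = 100
const MAX_LIMIT = 1000

// Parse a user-supplied `limit` value and clamp it to [1, MAX_LIMIT].
// Missing, non-numeric, or non-positive values fall back to the default.
function clampLimit(raw) {
  const n = Number.parseInt(raw, 10)
  if (!Number.isFinite(n) || n < 1) return DEFAULT_LIMIT
  return Math.min(n, MAX_LIMIT)
}

// Build the filter for the next page (keyset pagination on _id).
// `afterId` is the _id of the last document of the previous page, if any.
function nextPageFilter(props, afterId) {
  if (afterId === undefined || afterId === null) return { ...props }
  return { $and: [props, { _id: { $gt: afterId } }] }
}

// Fetch a single bounded page instead of every matching document.
async function findPage(collection, props, rawLimit, afterId) {
  const limit = clampLimit(rawLimit)
  const docs = await collection
    .find(nextPageFilter(props, afterId))
    .sort({ _id: 1 })
    .limit(limit)
    .toArray()
  // A full page suggests there may be more; hand back the last _id as the cursor.
  const next = docs.length === limit ? docs[docs.length - 1]._id : null
  return { docs, next }
}
```

A caller would pass the returned `next` value back as `afterId` to fetch the following page. Each request then holds at most MAX_LIMIT documents in memory.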
