@huseynsnmz

Summary

This PR fixes a panic in the disk buffer that crashes the entire Vector process when encountering I/O errors (such as missing buffer files).

The Problem:
When the disk buffer reader encounters I/O errors like NotFound (file missing), it panics at receiver.rs:59 with:

thread 'vector-worker' panicked at receiver.rs:59:33:
Reader encountered unrecoverable error: Io { source: Os { code: 2, kind: NotFound, message: "No such file or directory" } }

This panic crashes Vector completely, stopping all pipelines across all components.
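As a rough illustration of where such an error comes from, the sketch below opens a file that does not exist and returns the resulting `std::io::Error`. The path and the `open_missing` helper are hypothetical and are not part of Vector; the only point is that a missing file surfaces as an ordinary `io::Error` with kind `NotFound`, which a caller can inspect instead of unwrapping.

```rust
use std::fs::{self, File};
use std::io;
use std::path::PathBuf;

/// Hypothetical path standing in for a disk buffer data file.
fn missing_path() -> PathBuf {
    std::env::temp_dir().join(format!("buffer-data-missing-{}.dat", std::process::id()))
}

/// Try to open a file that is guaranteed not to exist and return the error.
fn open_missing() -> io::Error {
    let path = missing_path();
    // Make sure the file really is absent; ignore the result if it already is.
    let _ = fs::remove_file(&path);
    File::open(&path).expect_err("file should not exist")
}
```

Calling `open_missing().kind()` yields `io::ErrorKind::NotFound`, the same kind shown in the panic message above.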

The Solution:
Treat I/O errors the same way corruption and checksum errors are already treated: emit an error event, log it, update metrics, and skip to the next buffer file. The disk buffer reader already has roll_to_next_data_file() logic for advancing past problematic files, so the change simply removes the panic and emits a BufferReadError instead.
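The skip-and-continue idea can be sketched with a toy reader. This is a minimal model under stated assumptions, not the vector-buffers implementation: `ReaderError`, `ToyReader`, `drain`, and the `checksum_mismatch` label are all hypothetical names, and popping the next queue entry stands in for roll_to_next_data_file(). Only the `io_error` label comes from this PR.

```rust
use std::collections::{HashMap, VecDeque};
use std::fmt;
use std::io;

/// Hypothetical stand-in for the reader's error type.
#[derive(Debug)]
enum ReaderError {
    Io(io::Error),
    Checksum,
}

impl ReaderError {
    /// Label for the error counter. "io_error" is from the PR text;
    /// "checksum_mismatch" is an assumed name.
    fn error_code(&self) -> &'static str {
        match self {
            ReaderError::Io(_) => "io_error",
            ReaderError::Checksum => "checksum_mismatch",
        }
    }
}

impl fmt::Display for ReaderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ReaderError::Io(e) => write!(f, "I/O error: {e}"),
            ReaderError::Checksum => write!(f, "checksum mismatch"),
        }
    }
}

/// Each "data file" either yields its records or fails to read.
type DataFile = Result<Vec<String>, ReaderError>;

struct ToyReader {
    files: VecDeque<DataFile>,
    errors: HashMap<&'static str, u64>,
}

impl ToyReader {
    fn new(files: Vec<DataFile>) -> Self {
        Self { files: files.into(), errors: HashMap::new() }
    }

    /// Read every file in order. A failed file is logged, counted, and
    /// skipped; its events are lost, but later files are still read.
    fn drain(&mut self) -> Vec<String> {
        let mut out = Vec::new();
        while let Some(file) = self.files.pop_front() {
            match file {
                Ok(records) => out.extend(records),
                Err(err) => {
                    eprintln!("buffer read error ({}): {}", err.error_code(), err);
                    *self.errors.entry(err.error_code()).or_insert(0) += 1;
                }
            }
        }
        out
    }
}

/// Small demo: a good file, a missing file, a corrupted file, a good file.
fn demo() -> (Vec<String>, HashMap<&'static str, u64>) {
    let missing = io::Error::new(io::ErrorKind::NotFound, "No such file or directory");
    let mut reader = ToyReader::new(vec![
        Ok(vec!["a".to_string()]),
        Err(ReaderError::Io(missing)),
        Err(ReaderError::Checksum),
        Ok(vec!["b".to_string()]),
    ]);
    let events = reader.drain();
    (events, reader.errors)
}
```

In this model both failure kinds take the same path: the reader never panics, the events from the two good files are still delivered, and each failure shows up once in the per-code counter.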

Behavior Change:

  • Before: I/O error → panic → entire Vector process crashes → manual restart required
  • After: I/O error → log ERROR + emit metric → skip the affected file → pipeline continues

Events in the corrupted or missing file are lost, as with corruption handling today. Users can monitor these errors via the buffer_errors_total{error_code="io_error"} metric.

Vector configuration

No specific configuration needed for testing. The fix applies to any Vector configuration using disk buffers:

sinks:
  example:
    type: http
    inputs: [source]
    buffer:
      type: disk
      max_size: 1073741824  # 1 GiB

How did you test this PR?

  1. Unit Tests: All 73 existing buffer tests pass (run with --test-threads=1 due to a pre-existing test isolation issue)
  2. Linting: cargo clippy --package vector-buffers -- -D warnings
  3. Formatting: cargo fmt --check

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
    • Changelog added: changelog.d/disk-buffer-io-error-panic.fix.md

References

@huseynsnmz huseynsnmz requested a review from a team as a code owner January 21, 2026 15:33