Description
Overview
Running base/node in production environments requires careful handling of node data (execution data, state, and configuration). However, the repository documentation does not clearly explain recommended backup and recovery practices for node operators.
Problem Description
Node operators may be unsure about:
- which data directories are safe and useful to back up
- which data can be re-synced instead of restored
- how to recover from disk failure or corrupted state
- how backups differ between mainnet and testnet setups
Without guidance, operators may either back up data that could simply be re-synced or risk losing critical state that cannot be regenerated.
Expected Behavior
- Operators should understand what data is worth backing up.
- Recovery expectations after failure should be documented.
- Guidance should help reduce downtime and operational risk.
Steps to Reproduce
- Deploy base/node in a long-running environment.
- Consider disaster recovery or migration to a new host.
- Look for official guidance on backup and restore procedures.
- Observe that this information is not clearly documented.
Proposed Solution
- Add a documentation section covering backup and recovery best practices.
- Clearly describe:
- which directories contain critical vs reproducible data
- recommended backup frequency and methods
- recovery steps for common failure scenarios
- Keep guidance high-level and infrastructure-agnostic.
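To illustrate the critical vs reproducible split the documentation could describe, here is a minimal sketch in Python. The directory names (`config`, `secrets`, `chaindata`) and their classification are assumptions made up for this example, not the actual base/node layout; the real list would come from the documentation this issue asks for.

```python
import tarfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DataDir:
    path: str       # path relative to the node's data root (hypothetical)
    critical: bool  # True if the data cannot be regenerated by re-syncing


# Hypothetical layout for illustration only; not the real base/node layout.
LAYOUT = [
    DataDir("config", critical=True),      # assumed: operator-written settings
    DataDir("secrets", critical=True),     # assumed: keys or tokens
    DataDir("chaindata", critical=False),  # assumed: can be re-synced
]


def backup_targets(layout):
    """Return the relative paths marked as critical, in layout order."""
    return [d.path for d in layout if d.critical]


def archive_critical(root: Path, dest: Path, layout=LAYOUT) -> list[str]:
    """Write the critical directories that exist under root to a gzip tarball.

    Returns the relative paths that were actually archived.
    """
    archived = []
    with tarfile.open(dest, "w:gz") as tar:
        for rel in backup_targets(layout):
            src = root / rel
            if src.exists():
                # arcname keeps paths in the archive relative to the data root.
                tar.add(src, arcname=rel)
                archived.append(rel)
    return archived
```

Calling `archive_critical(Path("/path/to/node-data"), Path("/path/to/backup.tar.gz"))` would archive only the directories marked critical and skip the ones assumed to be re-syncable. Keeping the classification in one small table is the point of the sketch: the documentation could publish an equivalent table without prescribing any particular backup tool.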
Acceptance Criteria
- Documentation clearly explains backup and recovery expectations.
- Operators can plan backups without deep inspection of the codebase.
- No changes are made to node behavior or configuration.
- Guidance is concise and suitable for long-term maintenance.