Write ContextCache atomically with owner-only file and directory mode#69070
Open
co-cy wants to merge 1 commit intosaltstack:3006.xfrom
Open
Write ContextCache atomically with owner-only file and directory mode#69070co-cy wants to merge 1 commit intosaltstack:3006.xfrom
co-cy wants to merge 1 commit intosaltstack:3006.xfrom
Conversation
`salt.utils.cache.ContextCache.cache_context` opened the cache file through `salt.utils.files.fopen(self.cache_path, "w+b")` and let the process umask pick the file mode; the parent directory was likewise created via plain `os.mkdir()` with no mode argument. On default Linux installs the file ended up `0o644` (world-readable) inside a `0o755` directory. Pillar context can carry credentials (passwords, vault tokens, API keys sourced from external pillar backends), so any local user could read them out of the file. Even after the file mode is tightened the parent directory remained world-traversable, so a local user could still `ls` the cache and learn which modules and external-pillar backends were in use (filenames look like ``salt.loaded.ext.<module>.<name>.p``). Two changes: 1. Switch the file write to `tempfile.mkstemp` + `os.replace`. mkstemp creates the temporary file with mode `0o600` by default (per its documented contract: "the file is readable and writable only by the creating user ID"); `os.replace` atomically renames it into place, so the final cache file inherits the restricted mode from the first byte and never sits on disk world-readable. The atomic rename also closes a separate but related rough edge: the previous `fopen(..., "w+b")` truncated the file before writing it back, so a concurrent reader could observe a half-written or empty file. After this change readers see either the old content or the new one -- never a partial write. 2. Pass `mode=stat.S_IRWXU` (`0o700`) to `os.mkdir` for the parent `context/` directory. On default umask this would have been `0o755`. Existing directories from earlier installs are left untouched -- operators upgrading may want to `chmod 0700` their cache context directory once. Note on hard crashes (`SIGKILL` / OOM / power loss) between `mkstemp` and `os.replace`: the `.tmp_*` file remains on disk. It is mode `0o600` so the leak is operational hygiene, not security; operators can clean stale temporaries with e.g. `find <cachedir>/context/ -name '.tmp_*' -mtime +1 -delete`. Add three behavioural tests: - file mode is `0o600` after write, - parent directory is created with mode `0o700`, - successive writes use distinct inodes (proving atomic rename rather than in-place truncate). Refs: saltstack#69069
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
salt.utils.cache.ContextCache.cache_contextopens the cache file throughsalt.utils.files.fopen(self.cache_path, "w+b")and lets the process umask pick the file mode; the parent directory is created via plainos.mkdir()with nomodeargument. On default Linux installs the file ends up0o644(world-readable) inside a0o755directory.The cache holds pillar context — pillar values plus surrounding state, including credentials sourced from external pillar backends (
vault,git_pillar,consul). With the default modes any local user can:ls /var/cache/salt/master/context/→ see filenames likesalt.loaded.ext.pillar.vault.p,salt.loaded.ext.pillar.git_pillar.p— leaks which external-pillar backends and modules are in use;cat <file>→ read the credentials directly.A separate but related rough edge: the same code path rewrites the cache in place via
"w+b"mode, which truncates the file before writing back. A concurrent reader can observe a half-written file and fail to deserialize.This PR makes two changes:
tempfile.mkstemp+os.replace.mkstempcreates the temporary file with mode0o600by default (per its documented contract: "the file is readable and writable only by the creating user ID");os.replaceatomically renames it into place. The final cache file therefore has the correct mode from the first byte, and concurrent readers never see a partial write.context/directory is created withmode=stat.S_IRWXU(0o700).What issues does this PR fix or reference?
Fixes #69069
Previous Behavior
$ ls -ld /var/cache/salt/master/context/ drwxr-xr-x 2 salt salt context/ $ ls -l /var/cache/salt/master/context/*.p -rw-r--r-- 1 salt salt jinja2.pNew Behavior
$ ls -ld /var/cache/salt/master/context/ drwx------ 2 salt salt context/ $ ls -l /var/cache/salt/master/context/*.p -rw------- 1 salt salt jinja2.pMigration note
Existing
context/directories from earlier installs are left untouched — auto-changing their mode would be surprising (some operators may have deliberately set them differently for monitoring tools). On upgrade, operators can runchmod 0700on their cache context directory once. New installs and freshly-created directories get the right mode automatically.Crash-safety note
If the process is killed (
SIGKILL/ OOM / power loss) betweenmkstempandos.replace, the.tmp_*file remains on disk. It is mode0o600so this is operational hygiene rather than a security issue; operators can clean stale temporaries withfind <cachedir>/context/ -name '.tmp_*' -mtime +1 -delete.Merge requirements satisfied?
changelog/69069.fixed.mdtests/pytests/unit/utils/test_cache.py:test_cache_context_writes_with_owner_only_mode— file is0o600.test_cache_context_creates_parent_directory_with_owner_only_mode— parent is0o700.test_cache_context_overwrites_atomically_via_replace— successive writes use distinct inodes (proves atomic rename rather than in-place truncate).Commits signed with GPG?
No.