mantle/system/nproc: account for page cache in cgroup available memory#4494
dustymabe merged 2 commits into coreos:main
Conversation
Code Review
The pull request effectively addresses the issue of cgroup available memory calculation by accounting for page cache, which significantly improves the accuracy of GetCurrentMemAvailableMiB(). The new getCgroupMemoryStatField helper function is a well-encapsulated addition for parsing cgroup memory statistics. The changes in harness.go to introduce conditional warning logs for memory reservation waits also enhance the clarity and debuggability of the test harness.
Force-pushed 845f338 to f83ee02
wait... hmm, I thought that golangci-lint was managed by repo-templates for this repo. It seems to be out of sync with the go.mod build version.
Force-pushed f83ee02 to 95410f8
It will be soon. I'm working to get a few PRs merged before I pivot back to #4468, where there will be a lot of cleanups done for golangci-lint. I don't want to do those cleanups yet: I want to get #3989 merged first so I can backport to older branches with less pain. For now I just dropped the existing linters in e117447 and rebased this PR.
prestist
left a comment
LGTM - the page cache fix makes sense. Curious though: how bad was the underestimation? Was the page cache eating up a big chunk of memory.current and causing tests to sit around waiting for memory?
Yeah. Basically there were processes (like …). Look at the logs from the x86_64 kola run in https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/blue/organizations/jenkins/bump-lockfile/detail/bump-lockfile/567/pipeline/542 and you can see it.
Force-pushed 5e57106 to 95410f8
The cgroup available memory calculation used memory.current (total
cgroup usage) directly, which includes page cache (file-backed memory).
Since inactive page cache is reclaimable by the kernel under memory
pressure, it should not count as unavailable. This caused
GetCurrentMemAvailableMiB() to significantly underestimate available
memory, making QEMU instance scheduling overly conservative.
Read the "inactive_file" field from /sys/fs/cgroup/memory.stat, which
reports the page cache size that can be reclaimed easily in bytes, and
subtract it from current usage before computing available memory. The
effective formula becomes:
available = limit - (current - inactive_file)
This mirrors how /proc/meminfo computes MemAvailable by considering
reclaimable caches.
A new helper getCgroupMemoryStatField() is added for parsing individual
fields from memory.stat, returning 0 gracefully if the file or field is
absent.
Written-by: <anthropic/claude-opus-4.6>
Let's pass in a boolean and also warn on the first wait and then periodically after that.
Force-pushed 95410f8 to afcec9a
OK, I had to adjust the strategy here to use … Luckily CI on the previous version was failing, which made me look into the calculation more.
The cgroup available memory calculation used memory.current (total
cgroup usage) directly, which includes page cache (file-backed memory).
Since page cache is reclaimable by the kernel under memory pressure, it
should not count as unavailable. This caused GetCurrentMemAvailableMiB()
to significantly underestimate available memory, making QEMU instance
scheduling overly conservative.
Read the "file" field from /sys/fs/cgroup/memory.stat, which reports
the page cache size in bytes, and subtract it from current usage before
computing available memory. The effective formula becomes:
available = limit - (current - page_cache)
This mirrors how /proc/meminfo computes MemAvailable by considering
reclaimable caches.
A new helper getCgroupMemoryStatField() is added for parsing individual
fields from memory.stat, returning 0 gracefully if the file or field is
absent.
Written-by: <anthropic/claude-opus-4.6>
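The corrected formula above (available = limit - (current - page_cache)) can be sketched as a small helper. All names here are illustrative, not the actual mantle code; inputs are in bytes and the result is in MiB, matching what GetCurrentMemAvailableMiB() is described as returning.

```go
package main

import "fmt"

// availableMiB computes available = limit - (current - pageCache),
// clamped so that reclaimable page cache larger than current usage,
// or usage above the limit, never underflows the unsigned math.
func availableMiB(limit, current, pageCache uint64) uint64 {
	var used uint64
	if pageCache < current {
		used = current - pageCache // reclaimable cache doesn't count as used
	} // else: all usage is reclaimable cache, so effective usage is 0
	if used >= limit {
		return 0
	}
	return (limit - used) / (1024 * 1024)
}

func main() {
	// 8 GiB limit, 6 GiB current usage of which 4 GiB is page cache:
	// effective usage is 2 GiB, so 6144 MiB is reported available.
	fmt.Println(availableMiB(8<<30, 6<<30, 4<<30)) // 6144
}
```

The clamping is the important detail: without it, a cgroup whose usage is mostly cache (pageCache close to or above current) could wrap around and report a huge bogus availability.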
Also a second commit in there to warn when any test initially gets delayed due to memory availability.