mantle/system/nproc: account for page cache in cgroup available memory #4494

Merged
dustymabe merged 2 commits into coreos:main from dustymabe:dusty-page-cache on Mar 19, 2026

Conversation

@dustymabe
Member

The cgroup available memory calculation used memory.current (total
cgroup usage) directly, which includes page cache (file-backed memory).
Since page cache is reclaimable by the kernel under memory pressure, it
should not count as unavailable. This caused GetCurrentMemAvailableMiB()
to significantly underestimate available memory, making QEMU instance
scheduling overly conservative.

Read the "file" field from /sys/fs/cgroup/memory.stat, which reports
the page cache size in bytes, and subtract it from current usage before
computing available memory. The effective formula becomes:

available = limit - (current - page_cache)

This mirrors how /proc/meminfo computes MemAvailable by considering
reclaimable caches.

A new helper getCgroupMemoryStatField() is added for parsing individual
fields from memory.stat, returning 0 gracefully if the file or field is
absent.

Written-by: <anthropic/claude-opus-4.6>


Also a second commit in there to warn when any test initially gets delayed due to memory availability.
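The field-parsing helper described in the commit message could be sketched roughly as below. This is an illustrative sketch, not the actual mantle code: the function name, taking the file content as a string parameter (rather than reading /sys/fs/cgroup/memory.stat directly), and the exact parsing details are assumptions; only the "<field> <value>" line format of memory.stat and the return-0-when-absent behavior come from the description above.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMemoryStatField scans memory.stat-style content ("<field> <value>"
// per line) and returns the value for the requested field. A missing or
// malformed field yields 0, matching the "return 0 gracefully" behavior
// described above. In the real helper the content would come from
// /sys/fs/cgroup/memory.stat; here it is passed in for illustration.
func parseMemoryStatField(content, field string) uint64 {
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		parts := strings.Fields(scanner.Text())
		if len(parts) == 2 && parts[0] == field {
			if v, err := strconv.ParseUint(parts[1], 10, 64); err == nil {
				return v
			}
		}
	}
	return 0
}

func main() {
	stat := "anon 1048576\nfile 4194304\ninactive_file 2097152\n"
	fmt.Println(parseMemoryStatField(stat, "file"))    // 4194304
	fmt.Println(parseMemoryStatField(stat, "missing")) // 0
}
```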

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request effectively addresses the issue of cgroup available memory calculation by accounting for page cache, which significantly improves the accuracy of GetCurrentMemAvailableMiB(). The new getCgroupMemoryStatField helper function is a well-encapsulated addition for parsing cgroup memory statistics. The changes in harness.go to introduce conditional warning logs for memory reservation waits also enhance the clarity and debuggability of the test harness.

@prestist
Contributor

 Running [/home/runner/golangci-lint-1.64.4-linux-amd64/golangci-lint run --out-format=github-actions --timeout=5m] in [] ...
  level=warning msg="[config_reader] The output format `github-actions` is deprecated, please use `colored-line-number`"
  Error: can't load config: the Go language version (go1.24) used to build golangci-lint is lower than the targeted Go version (1.25.0)
  Failed executing command with error: can't load config: the Go language version (go1.24) used to build golangci-lint is lower than the targeted Go version (1.25.0)

wait... hmm I thought that golangci-lint was managed by repo-templates for this repo. Seems to be out of sync with the go.mod build version.

@dustymabe
Copy link
Member Author

wait... hmm I thought that golangci-lint was managed by repo-templates for this repo

it will be soon. I'm working to get a few PRs merged before I pivot back to #4468 where there will be a lot of cleanups done for golangci-lint.

I don't want to do those cleanups yet: I want to get #3989 merged before all the new golangci-lint cleanups so that backporting to older branches is less painful.

I just dropped the existing linters for now in e117447 and rebased this PR.

prestist previously approved these changes Mar 18, 2026

Contributor

@prestist prestist left a comment

LGTM - the page cache fix makes sense. Curious, though: how bad was the underestimation? Was the page cache eating up a big chunk of memory.current and causing tests to sit around waiting for memory?

@dustymabe
Member Author

was the page cache eating up a big chunk of memory.current and causing tests to sit around waiting for memory?

yeah. basically there were processes (like kola itself) that ate up a bunch of memory (virt) and it never got reclaimed. The RSS wasn't high, but it was all just sitting in page cache I think.

Look at the logs from the x86_64 kola run in https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/blue/organizations/jenkins/bump-lockfile/detail/bump-lockfile/567/pipeline/542 and you can see it

prestist previously approved these changes Mar 19, 2026
The cgroup available memory calculation used memory.current (total
cgroup usage) directly, which includes page cache (file-backed memory).
Since inactive page cache is reclaimable by the kernel under memory
pressure, it should not count as unavailable. This caused GetCurrentMemAvailableMiB()
to significantly underestimate available memory, making QEMU instance
scheduling overly conservative.

Read the "inactive_file" field from /sys/fs/cgroup/memory.stat, which
reports the page cache size that can be reclaimed easily in bytes, and
subtract it from current usage before computing available memory. The
effective formula becomes:

  available = limit - (current - inactive_file)

This mirrors how /proc/meminfo computes MemAvailable by considering
reclaimable caches.

A new helper getCgroupMemoryStatField() is added for parsing individual
fields from memory.stat, returning 0 gracefully if the file or field is
absent.

Written-by: <anthropic/claude-opus-4.6>

Let's pass in a boolean and also warn on the first wait and then
periodically after that.
@dustymabe
Member Author

ok i had to adjust the strategy here to use inactive_file instead of file. inactive_file is page cache that's currently not being used (i.e. actively used page cache can't actually be evicted, so the earlier code was overestimating how many bytes we could subtract from the reported current usage).

luckily CI on the previous version was failing and made me look into the calculation more.
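The revised formula, available = limit - (current - inactive_file), could be sketched as follows. Function and parameter names are illustrative, not the actual mantle API; the underflow guards are defensive assumptions.

```go
package main

import "fmt"

// availableMiB computes available memory per the revised formula:
// inactive page cache is reclaimable, so it is credited back against
// current usage before measuring headroom under the cgroup limit.
// All inputs are in bytes; the result is in MiB.
func availableMiB(limitBytes, currentBytes, inactiveFileBytes uint64) uint64 {
	used := uint64(0)
	if currentBytes > inactiveFileBytes {
		used = currentBytes - inactiveFileBytes
	}
	if used > limitBytes {
		return 0
	}
	return (limitBytes - used) / (1024 * 1024)
}

func main() {
	// 8 GiB limit, 6 GiB current usage of which 3 GiB is inactive
	// page cache: 8 - (6 - 3) = 5 GiB available.
	fmt.Println(availableMiB(8<<30, 6<<30, 3<<30)) // 5120
}
```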

@dustymabe dustymabe enabled auto-merge (rebase) March 19, 2026 15:04
Contributor

@prestist prestist left a comment
LGTM

@dustymabe dustymabe disabled auto-merge March 19, 2026 17:14
@dustymabe dustymabe merged commit 64453dc into coreos:main Mar 19, 2026
4 checks passed
@dustymabe dustymabe deleted the dusty-page-cache branch March 19, 2026 17:14