-
-
Notifications
You must be signed in to change notification settings - Fork 14.4k
Handle transient ENOMEM in statx probing #151641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
r? @ibraheemdev rustbot has assigned @ibraheemdev. Use |
54e20c5 to
d9404ad
Compare
| // | ||
| // See: https://github.com/rust-lang/rust/issues/65662 | ||
| // | ||
| // FIXME what about transient conditions like `ENOMEM`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure there are no other transient coditions (other than ENOMEM) that we should worry about here?
As the comment says "like" which indicates to me as an uninformed reader, that there may be more transient conditions to consider other than ENOMEM.
If so, removing this comment is not correct IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya true, deleting the comment was premature...
One concern I have is that caching STATX_STATE::Unavailable feels risky if the failure is transient, since that would effectively disable statx for the lifetime of the process.
- Should Unavailable be cached only for ENOSYS, or also cases like EPERM (seccomp)?
- For other errors that aren’t clearly “missing” or transient, is it better to avoid caching and just fall back to legacy stat for that call?
- Any other transient errors you’d want handled explicitly?
If this needs a bit more discussion, I could open a GitHub issue first, please suggest, thanks mate! @terrarier2111 @ibraheemdev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to discuss here since this is where we would see the changes anyway.
One concern I have is that caching STATX_STATE::Unavailable feels risky if the failure is transient, since that would effectively disable statx for the lifetime of the process.
There is nothing "risky" about using the stat fallback.
- For other errors that aren’t clearly “missing” or transient, is it better to avoid caching and just fall back to legacy stat for that call?
- Any other transient errors you’d want handled explicitly?
It's your PR, could you propose a reasonable solution here? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ENOMEM: don't cache anything, just fall back tostatfor this call and retry the probe next time- All other errors: Cache as
Unavailablesince we can't be sure (seccomp could return anything) EAGAINcomes to mind but I think it's unlikely from statx.
The updated code is in the latest commit (64c3e3f)
This reverts commit 7d472aa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's the worst thing to allow a retry on ENOMEM, but this code changes more than just the ENOMEM handling.
As an alternative we could keep a retry counter and fall back if it fails a number of times, but things are going rather poorly if you get a repeated kernel OOM from an empty statx. I don't really think it's worth it.
@tbu- it looks like you've touched this code most recently, do you have any thoughts?
|
Reminder, once the PR becomes ready for a review, use |
|
I think it's a good idea to not cache that For handling this, maybe for that situation an error should be returned directly? I'm not too sure about the counter idea since it's not absolute that |
60f62ba to
64c3e3f
Compare
current approach doesn't cache anything on falling back to stat seems safer since it at least gives the user some result - curious what @tgross35 and @tbu- think! |
|
@rustbot ready |
|
I think I only fixed up comments around the If I were to refactor it, I'd probably make sure we only test for the presence of |
Fixes a FIXME in
sys/fs/unix.rs.Currently, if the
statxavailability probe fails withENOMEM,statxis permanently disabled for the process. SinceENOMEMis transient, this change allows falling back tostatfor the current call without updatingSTATX_SAVED_STATEtoUnavailable, permittingstatxto be retried later.