Skip to content

fix: drop unsafe host env vars when envp is NULL#661

Open
ShivanshVij wants to merge 1 commit intocontainers:mainfrom
loopholelabs:shiv/fix-null-envp
Open

fix: drop unsafe host env vars when envp is NULL#661
ShivanshVij wants to merge 1 commit intocontainers:mainfrom
loopholelabs:shiv/fix-null-envp

Conversation

@ShivanshVij
Copy link
Copy Markdown

@ShivanshVij ShivanshVij commented Apr 30, 2026

Summary

This PR fixes a silent boot hang when krun_set_exec/krun_set_env is called with envp = NULL:

const char *const envp[] = { NULL };
krun_set_exec(ctx, exec, argv, envp);   // explicit empty array - works
krun_set_exec(ctx, exec, argv, NULL);   // documented as "inherit host env" - silently hangs the boot

Every in-tree example, demo, and test happens to use the first form, so the bug never fired in CI even though the second form is documented behavior that external callers are supposed to be able to use.

The host-env-inheritance fallback (which triggers when envp = NULL) currently dumps environment variables onto the kernel cmdline with naive serialization (literal quotes, no whitespace or length filtering), which silently corrupt tokenization or pushes past the guest kernel's COMMAND_LINE_SIZE. KRUN_INIT gets truncated, init falls back to /bin/sh, boot hangs.

Fixes

  • New filter restricts inherited vars to printable ASCII without whitespace and drops individual vars that wouldn't fit a 1024-byte budget. The bogus literal quotes around values are also gone (init's getenv("HOME") no longer returns "/root").
  • Adds an exec-null-envp integration test as a regression guard, plus unit tests for the filter. Runner now counts TestOutcome::Timeout as a failure so a future hangs of this shape will cause CI to fail.

For Reviewers

I think that this PR has very few (if any) unintended side effects, except for this change: https://github.com/containers/libkrun/pull/661/changes#diff-30fb34448238a2b61c86dc96c1e58a7afa4af2fee6085b829d28464ad1e263ebR378

We're modifying the testing harness to fail on timeouts (it wasn't previously doing this). While I believe this change is correct (regardless of just for this PR), I did want to point it out as it may have unintended side effects. I've made sure all the tests pass locally.

Signed-off-by: Shivansh Vij <shivanshvij@loopholelabs.io>
@ShivanshVij ShivanshVij changed the title fix(libkrun): drop unsafe host env vars when envp is NULL fix: drop unsafe host env vars when envp is NULL Apr 30, 2026
@mtjhrc
Copy link
Copy Markdown
Collaborator

mtjhrc commented May 4, 2026

Thanks for the fix!

We're modifying the testing harness to fail on timeouts (it wasn't previously doing this). While I believe this change is correct (regardless of just for this PR), I did want to point it out as it may have unintended side effects. I've made sure all the tests pass locally.

Yes that is a bug I noticed recently too, thanks for fixing it!

Please split the test change (especially the test framework changes) into separate commit(s). This makes git log way more clear what happened and helps if we want to cherry pick changes into another branch. Also please write a proper commit message (something like the PR description). Also note we usually prefix the commit title with the subsystem/area it changes e.g. tests: count timed-out tests as failed

Btw we will likely be removing the env/arg passing mechanism via kernel cmdline for libkrun 2.0 (main branch now). We should fix this in the 1.18.x release (keep the PR open for main branch though, we will cherry pick it into 1.18).

The fix itself looks good!

@dorindabassey
Copy link
Copy Markdown
Collaborator

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refines how host environment variables are inherited when envp is NULL in krun_set_exec and krun_set_env. It introduces a 1024-byte limit and strict filtering (ASCII-only, no whitespace) to ensure the resulting string is compatible with the Linux kernel command line. Additionally, the test runner now correctly categorizes timeouts as failures, and a new test case for NULL environment pointers has been added. Feedback suggests using env::vars_os() to avoid potential panics on invalid Unicode and pre-allocating the environment string buffer to improve performance.

Comment thread src/libkrun/src/lib.rs
Comment on lines +1319 to +1321
fn inherit_host_env() -> String {
serialize_env(env::vars())
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using std::env::vars() can cause a panic if any environment variable contains invalid Unicode. In a library like libkrun, it is safer to use std::env::vars_os() and skip variables that cannot be converted to UTF-8 strings, ensuring the host process remains stable even if the environment is malformed.

fn inherit_host_env() -> String {
    serialize_env(env::vars_os().filter_map(|(k, v)| {
        match (k.into_string(), v.into_string()) {
            (Ok(k), Ok(v)) => Some((k, v)),
            _ => None,
        }
    }))
}

Comment thread src/libkrun/src/lib.rs
}

fn serialize_env<I: IntoIterator<Item = (String, String)>>(vars: I) -> String {
let mut buf = String::new();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since the maximum size of the environment string is capped at MAX_INHERITED_ENV_BYTES, initializing the buffer with this capacity avoids unnecessary reallocations as variables are appended.

Suggested change
let mut buf = String::new();
let mut buf = String::with_capacity(MAX_INHERITED_ENV_BYTES);

Copy link
Copy Markdown
Collaborator

@dorindabassey dorindabassey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR, just a few comments.

Comment thread src/libkrun/src/lib.rs
/// the caller passes `envp = NULL` to `krun_set_exec`/`krun_set_env`. Linux
/// genreally accepts only the first 2048 bytes so setting this value to 1024 leaves
/// room for default options (KRUN_INIT, KRUN_RLIMITS, etc.).
const MAX_INHERITED_ENV_BYTES: usize = 1024;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

about this constant, cmdline size limits vary for different archs, for example riscv64 max size is 1024 bytes, aarch64 is 2048 and so on, should MAX_INHERITED_ENV_BYTES be architecture-aware? Something like this if you're trying to leave room for default options:

Suggested change
const MAX_INHERITED_ENV_BYTES: usize = 1024;
#[cfg(target_arch = "riscv64")]
const MAX_INHERITED_ENV_BYTES: usize = 256;
#[cfg(target_arch = "aarch64")]
const MAX_INHERITED_ENV_BYTES: usize = 1024;
#[cfg(target_arch = "x86_64")]
const MAX_INHERITED_ENV_BYTES: usize = 8192;

Or is the 1024 limit intentional?

nit: s/genreally/generally

Copy link
Copy Markdown
Collaborator

@mtjhrc mtjhrc May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dorindabassey Keep in mind the MAX_INHERITED_ENV_BYTES has to be smaller than the kernel cmdline size though (because there are also other arguments to the kernel unrelated to this...). Where did you get these values btw?

Also TBH, I wouldn't worry too much about this, we can never do this fully correctly (as the comment this PR added mentions it's best effort). In the long term we should stop using this to pass the arguments. That said its possible to "optimize" this to pick a better value. I'm just not sure It's good use of time to worry about this, this argument passing mechanism is inherently broken.

Comment thread src/libkrun/src/lib.rs
Comment on lines +1307 to +1308
/// the caller passes `envp = NULL` to `krun_set_exec`/`krun_set_env`. Linux
/// genreally accepts only the first 2048 bytes so setting this value to 1024 leaves
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here it's not Linux that has a 2048 limit - the limit depends on the architecture

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that is misleading, even if we keep a single limit (not platform specific), this comments needs to be updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants