Implement head builtin command by thieman · Pull Request #54 · DataDog/rshell

thieman · 2026-03-11T20:12:27Z

Summary

Adds head as a safe builtin command with full GNU head compatibility
Supports -n N/--lines, -c N/--bytes, -q/--quiet/--silent, -v/--verbose, multiple files with ==> filename <== headers, and stdin via - or implicit
Memory-safe chunked I/O; large N values clamped to prevent allocations; context cancellation checked per iteration
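
As a rough illustration of the chunked byte-mode read described above, here is a minimal sketch. It is not the code in this PR: `copyFirstBytes`, `firstBytes`, `chunkSize`, and `maxCount` are hypothetical names, and the 32 KiB chunk size and `1<<31-1` clamp are taken from the review comments later in this thread.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
)

// Hypothetical constants: the review thread mentions 32 KB chunk reads and a
// count clamp of 1<<31-1; the names here are illustrative only.
const (
	chunkSize = 32 * 1024
	maxCount  = 1<<31 - 1
)

// copyFirstBytes writes at most count bytes from r to w through a fixed-size
// buffer, so memory use does not grow with the user-supplied count.
func copyFirstBytes(ctx context.Context, w io.Writer, r io.Reader, count int64) (int64, error) {
	if count > maxCount {
		count = maxCount // clamp oversized requests
	}
	buf := make([]byte, chunkSize)
	var written int64
	for written < count {
		// Check for cancellation once per iteration.
		if err := ctx.Err(); err != nil {
			return written, err
		}
		want := count - written
		if want > chunkSize {
			want = chunkSize
		}
		n, err := r.Read(buf[:want])
		if n > 0 {
			if _, werr := w.Write(buf[:n]); werr != nil {
				return written, werr
			}
			written += int64(n)
		}
		if err == io.EOF {
			return written, nil
		}
		if err != nil {
			return written, err
		}
	}
	return written, nil
}

// firstBytes is a small demonstration wrapper around copyFirstBytes.
func firstBytes(s string, count int64) string {
	var sb strings.Builder
	if _, err := copyFirstBytes(context.Background(), &sb, strings.NewReader(s), count); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

func main() {
	fmt.Printf("%q\n", firstBytes("hello, world\n", 5)) // prints "hello"
}
```

The buffer is allocated once at a fixed size, so a request such as `-c 9999999999` costs the same memory as `-c 5`.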

Test plan

35 YAML scenario tests across lines/, bytes/, headers/, errors/, stdin/, and hardening/ subdirectories
All scenarios validated byte-for-byte against GNU bash + coreutils in Docker (RSHELL_BASH_TEST=1 go test ./tests/ -run TestShellScenariosAgainstBash)
Go unit tests, GNU compat tests, and pentest hardening tests in interp/builtins/head/
go test ./interp/... ./tests/... passes locally

🤖 Generated with Claude Code

Adds `head` as a safe builtin with support for:

- `-n N` / `--lines=N`: output first N lines (default 10)
- `-c N` / `--bytes=N`: output first N bytes
- `-q` / `--quiet` / `--silent`: suppress file headers
- `-v` / `--verbose`: always print file headers
- Multiple files with `==> filename <==` headers
- stdin via `-` or implicit when no files given
- Memory-safe chunked I/O; large N values clamped to prevent allocations
- 35 YAML scenario tests covering lines, bytes, headers, errors, stdin, and hardening

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

thieman · 2026-03-11T20:14:05Z

@DataDog @codex make a comprehensive code and security review

datadog-datadog-prod-us1 · 2026-03-11T20:14:08Z

I can only run on private repositories.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 001f4f1019

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

interp/builtins/head/head.go

thieman

Review Summary

Reviewed the head builtin implementation. The code is well-structured, memory-safe, passes through the sandbox correctly (callCtx.OpenFile throughout, never os.Open), handles resource exhaustion properly (1 MiB line cap, 32 KB chunk reads, count clamping), and has excellent test coverage for the happy path.

Overall assessment: needs fixes. Two P2 issues found, plus one missing test scenario:

| # | File | Finding |
|---|------|---------|
| 1 | `interp/builtins/head/head.go:189` | Spurious leading `\n` separator when first file fails to open but a later file succeeds |
| 2 | `SHELL_FEATURES.md:16` | `-z` listed in synopsis but implementation rejects it (unlike `tail` which actually supports `-z`) |
| 3 | `tests/scenarios/cmd/head/` | Missing scenario test for `head missing.txt good.txt` (first-file-fails-second-succeeds with headers) |

Positive observations

callCtx.OpenFile is used exclusively — no direct os.Open calls, sandbox is intact ✅
Count clamping (MaxCount = 1<<31-1) prevents allocation attacks ✅
Per-line 1 MiB cap in scanner mode prevents line-length DoS ✅
Context cancellation checked at every loop iteration ✅
bufio.Scanner initial buffer (4 KiB) doesn't allocate proportionally to user input ✅
All imports are in the symbol allowlist; no new allowlist entries needed ✅
scanLinesPreservingNewline correctly preserves exact bytes (CRLF, missing final newline, null bytes) ✅
last-flag-wins logic via modeSeq counter is elegant and avoids raw arg scanning ✅
Comprehensive test suite: Go unit tests, GNU compat tests, pentest tests, Unix/Windows platform tests, and YAML scenarios ✅
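
For readers unfamiliar with custom split functions, the following is a minimal sketch of how a newline-preserving `bufio.SplitFunc` can work. It is an assumption about the general technique, not the PR's `scanLinesPreservingNewline`; `scanLinesKeepNewline` and `firstLines` are hypothetical names, and the 4 KiB initial buffer and 1 MiB cap mirror the figures quoted above.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// scanLinesKeepNewline is a bufio.SplitFunc that returns each line together
// with its terminating '\n' (if any), so CRLF, NUL bytes, and a missing final
// newline all pass through unchanged.
func scanLinesKeepNewline(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if atEOF && len(data) == 0 {
		return 0, nil, nil
	}
	if i := bytes.IndexByte(data, '\n'); i >= 0 {
		return i + 1, data[:i+1], nil
	}
	if atEOF {
		// Final line without a trailing newline: emit it as-is.
		return len(data), data, nil
	}
	return 0, nil, nil // request more data
}

// firstLines returns the first n lines of s, byte for byte.
func firstLines(s string, n int) string {
	sc := bufio.NewScanner(strings.NewReader(s))
	sc.Buffer(make([]byte, 4096), 1<<20) // small initial buffer, 1 MiB per-line cap
	sc.Split(scanLinesKeepNewline)
	var out strings.Builder
	for i := 0; i < n && sc.Scan(); i++ {
		out.Write(sc.Bytes())
	}
	return out.String()
}

func main() {
	fmt.Printf("%q\n", firstLines("a\r\nb\x00c\nlast", 2))
}
```

Unlike the default `bufio.ScanLines`, this split function never strips `\r` or `\n`, which is what byte-for-byte comparison against GNU head requires.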

interp/builtins/head/head.go

SHELL_FEATURES.md

…-z docs

Three issues fixed:

1. Spurious leading \n separator when first file fails to open (P2 bash compat): Replace idx-based separator logic with prevHeaderPrinted bool. The separator is now only printed when a previous file was successfully processed, matching GNU head behavior (head missing.txt good.txt no longer produces a leading \n).
2. -q/-v last-flag-wins semantics (P1 bash compat, raised by @codex review): Replace *bool quiet/verbose with boolSeqFlag that shares a sequence counter. The flag with the highest pos wins, so "head -q -v file" prints headers (-v wins) and "head -v -q file" suppresses them (-q wins), matching GNU head. --silent shares the same counter as --quiet and acts as an alias. Uses pflag's NoOptDefVal="true" so the flags remain no-argument booleans.
3. SHELL_FEATURES.md incorrectly listed -z in the head synopsis (P2 docs): Remove -z from the synopsis and add a note that -z/--zero-terminated and --follow are rejected (unlike tail which actually implements -z).

Add test scenarios: first_fails_second_succeeds, last_flag_wins_verbose, last_flag_wins_quiet.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
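
The last-flag-wins idea in item 2 can be sketched without pflag. The snippet below is a simplified stand-in, not the PR's `boolSeqFlag`: `seqBool` and `showHeaders` are hypothetical names, and the hand-rolled argument loop replaces real flag parsing purely for illustration.

```go
package main

import "fmt"

// seqBool is a hypothetical stand-in for the PR's boolSeqFlag: each Set call
// records the position of the flag's latest occurrence on the command line.
type seqBool struct {
	seq *int // counter shared by all competing flags
	pos int  // position of the last occurrence; 0 means never set
}

func (f *seqBool) Set(string) error {
	*f.seq++
	f.pos = *f.seq
	return nil
}

// showHeaders decides whether headers are printed, given the flags in
// command-line order and the number of file operands.
func showHeaders(args []string, nfiles int) bool {
	var seq int
	quiet := &seqBool{seq: &seq}
	verbose := &seqBool{seq: &seq}
	for _, a := range args {
		switch a {
		case "-q", "--quiet", "--silent": // --silent is an alias of --quiet
			quiet.Set("true")
		case "-v", "--verbose":
			verbose.Set("true")
		}
	}
	switch {
	case quiet.pos > verbose.pos:
		return false
	case verbose.pos > quiet.pos:
		return true
	default:
		return nfiles > 1 // neither flag given: headers only for multiple files
	}
}

func main() {
	fmt.Println(showHeaders([]string{"-q", "-v"}, 1)) // prints true
	fmt.Println(showHeaders([]string{"-v", "-q"}, 2)) // prints false
}
```

Because both flags increment the same counter, comparing `pos` values is enough to tell which one appeared last, with no need to rescan the raw arguments.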

thieman · 2026-03-11T20:33:18Z

@DataDog @codex make a comprehensive code and security review

datadog-prod-us1-3 · 2026-03-11T20:33:23Z

I can only run on private repositories.

thieman

Code Review — `head` builtin (PR #54)

Reviewed by: Claude Sonnet 4.6 (self-review iteration 2)

Overall Assessment: ✅ Safe to merge

This PR implements the head builtin command for the restricted shell interpreter. The implementation is well-architected, security-conscious, and thoroughly tested. All critical invariants are upheld:

Sandbox integrity: All file access goes through callCtx.OpenFile() — no direct os.Open calls.
Memory safety: Line scanner capped at 1 MiB per line; byte mode reads in fixed 32 KB chunks; no allocation proportional to user-supplied count.
Context cancellation: All read loops check ctx.Err() at each iteration.
Last-flag-wins semantics: Correctly implemented via shared sequence counters for both -n/-c and -q/-v flags.
Boolean flag parsing: NoOptDefVal = "true" correctly tells pflag not to consume the next argument as the flag's value.
Intentional divergences from bash (-n -N elide-tail, -c -N trailing bytes, --follow, -z) are properly marked with skip_assert_against_bash: true.

Only P3 (nice-to-have) findings:

| # | Priority | File | Finding |
|---|----------|------|---------|
| 1 | P3 | `interp/builtins/head/head.go:338` | `IsBoolFlag()` is dead code; pflag uses `NoOptDefVal` for custom Var flags |
| 2 | P3 | `tests/scenarios/cmd/head/errors/` | No YAML scenario for `head -c -N` intentional divergence |

interp/builtins/head/head.go

tests/scenarios/cmd/head/errors/negative_count.yaml

- Remove boolSeqFlag.IsBoolFlag() which pflag never calls for VarP/VarPF flags. Add a comment explaining that NoOptDefVal = "true" is the actual mechanism. This prevents future readers from thinking IsBoolFlag() is the active mechanism and accidentally removing the NoOptDefVal lines.
- Add tests/scenarios/cmd/head/errors/negative_bytes_count.yaml to document the intentional rejection of head -c -N (elide-tail mode for bytes). Marked with skip_assert_against_bash: true since bash supports head -c -N but we intentionally do not implement it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4d1bb312f

chatgpt-codex-connector · 2026-03-11T20:40:03Z

interp/builtins/head/head.go

+	sc := bufio.NewScanner(r)
+	buf := make([]byte, 4096)
+	sc.Buffer(buf, MaxLineBytes)
+	sc.Split(scanLinesPreservingNewline)


Avoid dropping stdin data across repeated '-' operands

In line mode, readLines creates a new bufio.Scanner for each file argument and stops after count tokens. A Scanner can prefetch beyond the lines it emits, so when stdin is reused via multiple - operands (e.g. head -q -n 1 - - < file), the first pass can consume buffered bytes that the second - should read. The second operand then sees EOF and data is lost. GNU head outputs the next line in this regular-file-stdin case (alpha\nbeta\n), so this breaks bash compatibility and can truncate output in multi-operand scripts.
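
The prefetch effect is easy to reproduce outside the interpreter. The sketch below uses only the standard library; `lossy` and `shared` are hypothetical helpers, and sharing one `bufio.Reader` across operands is just one possible remedy, not necessarily the fix this PR should adopt.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// firstLineFreshScanner mimics the reported pattern: a brand-new Scanner per
// operand over the same underlying reader.
func firstLineFreshScanner(r io.Reader) string {
	sc := bufio.NewScanner(r)
	if sc.Scan() {
		return sc.Text() + "\n"
	}
	return ""
}

// firstLineShared reads one line from a buffered reader that is shared
// between operands, so prefetched bytes stay available to the next caller.
func firstLineShared(br *bufio.Reader) string {
	line, err := br.ReadString('\n')
	if err != nil && err != io.EOF {
		return ""
	}
	return line
}

// lossy models "head -q -n 1 - -" with a fresh Scanner per "-" operand.
func lossy(input string) string {
	r := strings.NewReader(input)
	return firstLineFreshScanner(r) + firstLineFreshScanner(r)
}

// shared models the same command with one bufio.Reader for both operands.
func shared(input string) string {
	br := bufio.NewReader(strings.NewReader(input))
	return firstLineShared(br) + firstLineShared(br)
}

func main() {
	in := "alpha\nbeta\ngamma\n"
	fmt.Printf("%q\n", lossy(in))  // prints "alpha\n": the first Scanner swallowed the rest
	fmt.Printf("%q\n", shared(in)) // prints "alpha\nbeta\n"
}
```

The first Scanner reads the whole 17-byte input into its buffer in a single `Read`, so the second Scanner starts at EOF even though only one line was emitted.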


chatgpt-codex-connector · 2026-03-11T20:40:03Z

interp/builtins/head/head.go

+			} else if printHeaders {
+				// A header was successfully printed for this file; subsequent
+				// files should emit a blank-line separator before their header.
+				printedHeader = true


Preserve header separators after per-file read errors

printedHeader is only set when processFile returns success, but headers are printed before reading starts. If a file opens, prints its header, and then fails during read (for example a line-mode scanner error on an over-cap line), printedHeader remains false and the next file header is emitted without the required blank-line separator. This produces malformed multi-file output in error paths even though a prior header was already written.
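
One way to address this is to record the flag at the moment the header is written rather than after the read succeeds. The following is a minimal sketch under that assumption; the `file` type, its `openErr`/`readErr` fields, and `headAll` are hypothetical and only simulate the open and read failures discussed here.

```go
package main

import (
	"fmt"
	"strings"
)

// file is a hypothetical in-memory operand. openErr simulates a failed open;
// readErr simulates a failure that occurs after the header was printed.
type file struct {
	name, body       string
	openErr, readErr bool
}

// headAll emits headers and bodies for all operands, tracking whether any
// header has already been written to decide on the blank-line separator.
func headAll(files []file) string {
	var out strings.Builder
	headerPrinted := false
	for _, f := range files {
		if f.openErr {
			continue // nothing reaches stdout for this operand
		}
		if headerPrinted {
			out.WriteString("\n") // separator only after an earlier header
		}
		fmt.Fprintf(&out, "==> %s <==\n", f.name)
		headerPrinted = true // set when the header is written, not on read success
		if f.readErr {
			continue
		}
		out.WriteString(f.body)
	}
	return out.String()
}

func main() {
	fmt.Print(headAll([]file{
		{name: "a.txt", readErr: true},
		{name: "b.txt", body: "y\n"},
	}))
}
```

This keeps the earlier fix intact (no leading separator when the first file fails to open) while still separating headers when a file fails mid-read.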

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-11T20:40:03Z

interp/builtins/head/head.go

+func (f *boolSeqFlag) Set(_ string) error {
+	*f.seq++
+	f.pos = *f.seq
+	return nil


Reject boolean arguments for --quiet/--silent/--verbose

boolSeqFlag.Set ignores the provided value string, so forms like --quiet=false or --verbose=false are accepted and still participate in last-flag-wins precedence instead of being rejected. GNU head treats these as invalid (option '--quiet' doesn't allow an argument), so this silently changes behavior for scripts that pass explicit boolean values and expect either an error or no effect.
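
A possible way to reject explicit values is to have `Set` accept only a sentinel that a bare flag would deliver. The sketch below assumes pflag passes `NoOptDefVal` to `Set` for a bare `--quiet` and the literal text for `--quiet=value`; `strictSeqFlag`, `bareSentinel`, and `trySet` are hypothetical names, and a non-"true" sentinel may show up in generated help text, so treat this as an idea to evaluate rather than a drop-in fix.

```go
package main

import "fmt"

// bareSentinel is a hypothetical NoOptDefVal value. It contains a NUL byte,
// which cannot appear in a Unix command-line argument, so a user cannot
// supply it explicitly.
const bareSentinel = "\x00bare"

// strictSeqFlag is a hypothetical variant of boolSeqFlag whose Set rejects
// any explicit value such as --quiet=false.
type strictSeqFlag struct {
	name string
	seq  *int // counter shared with the competing flag
	pos  int  // position of the last occurrence; 0 means never set
}

func (f *strictSeqFlag) Set(v string) error {
	if v != bareSentinel {
		// Message modeled on the GNU wording quoted in the review comment.
		return fmt.Errorf("option '--%s' doesn't allow an argument", f.name)
	}
	*f.seq++
	f.pos = *f.seq
	return nil
}

// trySet reports the outcome of a single Set call on a fresh flag.
func trySet(v string) string {
	var seq int
	f := &strictSeqFlag{name: "quiet", seq: &seq}
	if err := f.Set(v); err != nil {
		return err.Error()
	}
	return "ok"
}

func main() {
	fmt.Println(trySet(bareSentinel)) // prints ok
	fmt.Println(trySet("false"))      // prints option '--quiet' doesn't allow an argument
}
```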


thieman requested review from AlexandreYang, astuyve, julesmcrt and matt-dz as code owners March 11, 2026 20:12
