Security/harden shell cicd#211
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRemoves bundled-shell fallbacks for memory-targeting commands, tightens command-safety checks, returns retry/blocking guidance instead of executing host shells, pins npm global updates to fetched versions, switches CI jobs to npm ci, and updates tests accordingly. ChangesSecurity Hardening, Infrastructure Optimization, and Update Flow
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/hooks/memory-path-utils.ts`:
- Around line 8-25: The isSafe() check relying on SAFE_BUILTINS is bypassable
because it only checks the first token and allows shell wrappers/control syntax
to dispatch child commands; update the validation in isSafe() (and any related
code that iterates SAFE_BUILTINS) to perform strict token-level checks: parse
the entire command line into tokens and ensure the first token exactly matches
an allowed binary name and that no control/operator tokens (e.g., ; && || | `$(
)` backticks, $(), `exec`, `if`, `then`, `else`, `for`, `while`, `find -exec`,
`timeout`, meta-words like `env`) or redirections are present anywhere in the
token list, or alternatively remove wrapper/keyword names from SAFE_BUILTINS so
only literal safe binaries remain; in short, replace the current first-token
substring check with exact-token matching plus a reject-list of shell control
keywords to prevent wrappers from invoking child commands (update SAFE_BUILTINS
usage and the isSafe() implementation accordingly).
In `@src/hooks/pre-tool-use.ts`:
- Around line 284-286: The early return of null when shellCmd and config are
present lets dangerous real-shell commands run; instead, inside the pre-tool-use
flow (the branch after getShellCommand() returns a shellCmd and config exists)
detect memory-touching/virtual-memory paths and do not return null — either
reject the tool invocation or return the existing retry guidance response used
elsewhere; update the logic around getShellCommand(), shellCmd and config to
call the same denial/retry handler used for other memory-touching cases (reuse
the retry guidance path) so commands like sort /etc/passwd ~/.deeplake/... are
blocked rather than forwarded to the real Bash tool.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: cb3c19f8-f016-465b-a950-65a8f41ec869
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (8)
.github/workflows/ci.yaml.github/workflows/release.yamlsrc/cli/update.tssrc/hooks/memory-path-utils.tssrc/hooks/pre-tool-use.tssrc/notifications/sources/backend.tstests/claude-code/pre-tool-use-branches.test.tstests/cli/cli-update.test.ts
| expect(code).toBe(0); | ||
| expect(spawn).toHaveBeenCalledTimes(2); | ||
| expect(spawn.mock.calls[0]).toEqual(["npm", ["install", "-g", "@deeplake/hivemind@latest"]]); | ||
| expect(spawn.mock.calls[0]).toEqual(["npm", ["install", "-g", "@deeplake/hivemind@1.3.0"]]); |
There was a problem hiding this comment.
Is setting 1.3.0 correct?
There was a problem hiding this comment.
yes, this is set to ensure no race conditions during the test
…ough, tighten isSafe allowlist, drop codex just-bash fallback
Coverage ReportScope: files changed in this PR. Enforced threshold: 90% per metric (per file via
File Coverage — 4 files changed
Generated for commit 9714fa6. |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/hooks/codex/pre-tool-use.ts`:
- Line 337: The fallback guidance produced by buildUnsupportedGuidance() still
suggests commands/forms that are blocked by isSafe, causing users to retry with
disallowed actions; update buildUnsupportedGuidance() to consult the same safety
allowlist or accept an allowlist parameter and remove references to disallowed
tools (e.g., sed, awk, unrestricted find) so the unroutable path only suggests
operations that isSafe will permit, ensuring text aligns with the safety rules
used elsewhere.
In `@src/hooks/pre-tool-use.ts`:
- Around line 102-107: Update the MEMORY_RETRY_GUIDANCE string to match the
actual allowed builtins in SAFE_BUILTINS (remove "sed" and "awk" from the
guidance) so users are not instructed to use commands that were removed for
security; locate the constant MEMORY_RETRY_GUIDANCE and edit its list of example
tools to exactly reflect the SAFE_BUILTINS set (or reword to say "common
text-processing tools like jq, grep, head, tail, sort, wc, find" without
mentioning sed/awk).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: d6fbbc61-4cb8-4cb5-b6d9-47164cf83982
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (10)
.github/workflows/ci.yaml.github/workflows/release.yamlsrc/cli/update.tssrc/hooks/codex/pre-tool-use.tssrc/hooks/memory-path-utils.tssrc/hooks/pre-tool-use.tstests/claude-code/pre-tool-use-branches.test.tstests/claude-code/pre-tool-use.test.tstests/cli/cli-update.test.tstests/codex/codex-pre-tool-use-branches.test.ts
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
.github/workflows/release.yaml (1)
33-35:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStop persisting the checkout token across the release jobs.
This PR narrows secret exposure, but both checkout steps still leave credentials in
.git/config.publishnever needs that token at all, andreleaseshould scope auth to thegit pushstep instead of keeping it available for the whole job. Otherwise a compromised install/build step can still reuse it.Suggested hardening shape
- uses: actions/checkout@v6.0.2 with: token: ${{ secrets.GITHUB_TOKEN }} + persist-credentials: false- uses: actions/checkout@v6.0.2 with: ref: ${{ needs.release.outputs.sha }} fetch-depth: 1 + persist-credentials: falseThen add auth only where
git pushruns.Based on learnings: For every GitHub Actions workflow in
.github/workflows, ensure allactions/checkoutsteps setpersist-credentials: falseexplicitly.Also applies to: 225-233
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release.yaml around lines 33 - 35, The checkout steps are persisting the GITHUB_TOKEN in .git/config; update every actions/checkout@v6.0.2 invocation (including the ones in the publish and release jobs) to include persist-credentials: false so credentials are not left in the repo config, and then scope authentication only to the step that actually runs git push by injecting the token (e.g., via env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} or a separate git setup step) immediately before the git push command in the release job..github/workflows/ci.yaml (1)
24-24:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDisable credential persistence on the remaining checkout steps.
These three jobs still let
actions/checkoutpersist the repo token into.git/config. They only build/test, so that token is unnecessary and remains readable to later steps if one is compromised. Please addpersist-credentials: falsehere, the same waywindows-smokealready does.Based on learnings: For every GitHub Actions workflow in
.github/workflows, ensure allactions/checkoutsteps setpersist-credentials: falseexplicitly.Also applies to: 92-97, 359-359
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/ci.yaml at line 24, The checkout steps using actions/checkout@v6.0.2 are leaving the repo token in .git/config; update each actions/checkout step (including the one at the current diff and the other occurrences flagged) to include persist-credentials: false so the GITHUB_TOKEN is not persisted to the checked-out repo, mirroring the existing windows-smoke job’s configuration; ensure every actions/checkout usage in the workflow files explicitly sets persist-credentials: false.src/cli/update.ts (1)
341-344:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winKeep the manual fallback commands pinned to the fetched version.
The
npm-globalpath now avoids re-resolving@latest, but thenpxandunknownbranches still tell users to runnpm install -g ...@latest. That reintroduces the same drift window for the manual paths this patch is trying to remove. Reuselatestin those messages too.Minimal fix
- log(` npm install -g ${PKG_NAME}`@latest``); + log(` npm install -g ${PKG_NAME}@${latest}`); ... - warn(`Update manually: npm install -g ${PKG_NAME}`@latest``); + warn(`Update manually: npm install -g ${PKG_NAME}@${latest}`);Also applies to: 364-365
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cli/update.ts` around lines 341 - 344, The manual fallback messages currently hardcode "`@latest`" which can reintroduce version drift; update the log lines that construct the manual install hints (the log calls that use PKG_NAME + "`@latest`" in src/cli/update.ts) to use the resolved version variable (e.g., latest) instead of the literal "`@latest`" so they show "@${latest}" (apply the same change to the occurrences around the npx/unknown branches and the other messages at the later block around the lines mentioned). Ensure you reference PKG_NAME and the resolved latest variable when formatting those log strings so all manual commands are pinned to the fetched version.src/hooks/pre-tool-use.ts (1)
509-520:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAnchor the supported
findshape instead of matching a safe prefix.This regex accepts
find /... -name 'x'as soon as that prefix matches, so commands likefind ~/.deeplake/memory -name '*.md' -deleteor... -maxdepth 1get treated as successful even though the extra semantics are silently ignored. That is safer than falling through to the host shell, but it is still a correctness break: the agent sees a result for a command that was not actually honored. Restrict this branch to the exact supported shapes (find … -name '…'and optionally| wc -l) and send everything else to the retry-guidance path.🔧 Tighten the matcher
- const findMatch = shellCmd.match(/^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'/); + const findMatch = shellCmd.match( + /^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'\s*(\|\s*wc\s+-l)?\s*$/, + ); if (findMatch) { const dir = findMatch[1].replace(/\/+$/, "") || "/"; const namePattern = sqlLike(findMatch[2]).replace(/\*/g, "%").replace(/\?/g, "_"); logFn(`direct find: ${dir} -name '${findMatch[2]}'`); const paths = await findVirtualPathsFn(api, table, sessionsTable, dir, namePattern); let result = paths.join("\n") || ""; - if (/\|\s*wc\s+-l\s*$/.test(shellCmd)) result = String(paths.length); + if (findMatch[3]) result = String(paths.length); const capped = capOutputForClaude(result || "(no matches)", { kind: "find" }); return buildAllowDecision(`echo ${JSON.stringify(capped)}`, `[DeepLake direct] find ${dir}`); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 509 - 520, The current find matcher is too permissive and must be tightened to only accept the exact supported shapes (find <dir> -name '<pattern>' with an optional trailing "| wc -l") so extra flags like -delete or -maxdepth are rejected and fall through to the retry-guidance path. Replace the regex on shellCmd with a full-string anchored pattern such as ^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'\s*(?:\|\s*wc\s+-l\s*)?$ (or equivalent) and remove the separate /\|\s*wc\s+-l\s*$/ check by reading the optional group; keep using sqlLike, findVirtualPathsFn, capOutputForClaude and buildAllowDecision unchanged but only return an allow decision when the anchored regex matches. Ensure unmatched commands do not return here so they go to the retry guidance.src/hooks/codex/pre-tool-use.ts (1)
121-126:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winUnsafe memory commands still use
guide(exit 0), which undermines the new host-execution block.
Line 327-333correctly documents thatguideallows host execution, butLine 121-126still returnsaction: "guide"whenisSafe()fails. This leaves a bypass for unsafe memory-targeting commands.Proposed fix
if (!isSafe(rewritten)) { const guidance = buildUnsupportedGuidance(); - logFn(`unsupported command, returning guidance: ${rewritten}`); + logFn(`unsupported command, blocking with guidance: ${rewritten}`); return { - action: "guide", + action: "block", output: guidance, rewrittenCommand: rewritten, }; }Also applies to: 327-333
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/hooks/pre-tool-use.ts`:
- Around line 114-119: The current buildRetryGuidanceDecision returns a
{command, description} Bash-shaped decision but is being reused for Read calls
and causes shape-mismatch; for Read intercepts either (a) create a new decision
that materializes MEMORY_RETRY_GUIDANCE into a cache file and returns a
Read-shaped decision (use or mirror buildReadDecision so it emits
updatedInput.file_path pointing to that cache file), or (b) explicitly deny Read
retries by returning a Read-deny decision via
buildReadDecision/buildDenyDecision; update all call sites that previously used
buildRetryGuidanceDecision (including the additional occurrences noted) to use
this new read-specific path so Read responses include updatedInput.file_path
instead of a Bash command shape (refer to MEMORY_RETRY_GUIDANCE,
buildRetryGuidanceDecision, and buildReadDecision to locate code).
---
Outside diff comments:
In @.github/workflows/ci.yaml:
- Line 24: The checkout steps using actions/checkout@v6.0.2 are leaving the repo
token in .git/config; update each actions/checkout step (including the one at
the current diff and the other occurrences flagged) to include
persist-credentials: false so the GITHUB_TOKEN is not persisted to the
checked-out repo, mirroring the existing windows-smoke job’s configuration;
ensure every actions/checkout usage in the workflow files explicitly sets
persist-credentials: false.
In @.github/workflows/release.yaml:
- Around line 33-35: The checkout steps are persisting the GITHUB_TOKEN in
.git/config; update every actions/checkout@v6.0.2 invocation (including the ones
in the publish and release jobs) to include persist-credentials: false so
credentials are not left in the repo config, and then scope authentication only
to the step that actually runs git push by injecting the token (e.g., via env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} or a separate git setup step)
immediately before the git push command in the release job.
In `@src/cli/update.ts`:
- Around line 341-344: The manual fallback messages currently hardcode "`@latest`"
which can reintroduce version drift; update the log lines that construct the
manual install hints (the log calls that use PKG_NAME + "`@latest`" in
src/cli/update.ts) to use the resolved version variable (e.g., latest) instead
of the literal "`@latest`" so they show "@${latest}" (apply the same change to the
occurrences around the npx/unknown branches and the other messages at the later
block around the lines mentioned). Ensure you reference PKG_NAME and the
resolved latest variable when formatting those log strings so all manual
commands are pinned to the fetched version.
In `@src/hooks/pre-tool-use.ts`:
- Around line 509-520: The current find matcher is too permissive and must be
tightened to only accept the exact supported shapes (find <dir> -name
'<pattern>' with an optional trailing "| wc -l") so extra flags like -delete or
-maxdepth are rejected and fall through to the retry-guidance path. Replace the
regex on shellCmd with a full-string anchored pattern such as
^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'\s*(?:\|\s*wc\s+-l\s*)?$ (or
equivalent) and remove the separate /\|\s*wc\s+-l\s*$/ check by reading the
optional group; keep using sqlLike, findVirtualPathsFn, capOutputForClaude and
buildAllowDecision unchanged but only return an allow decision when the anchored
regex matches. Ensure unmatched commands do not return here so they go to the
retry guidance.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 42ee0fa1-dc2f-47ca-ab3b-1e5f9e1b9cc5
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (10)
.github/workflows/ci.yaml.github/workflows/release.yamlsrc/cli/update.tssrc/hooks/codex/pre-tool-use.tssrc/hooks/memory-path-utils.tssrc/hooks/pre-tool-use.tstests/claude-code/pre-tool-use-branches.test.tstests/claude-code/pre-tool-use.test.tstests/cli/cli-update.test.tstests/codex/codex-pre-tool-use-branches.test.ts
…s file_path, not command)
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
.github/workflows/release.yaml (2)
33-35:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftStop persisting the release job’s write token for the entire job.
This checkout leaves a write-capable
GITHUB_TOKENin.git/configall the way throughnpm ci,node, and build steps. In this job that is especially risky because the token is only needed for the finalgit push. Please switch checkout topersist-credentials: falseand authenticate the push step explicitly instead of keeping repo-write creds live for the whole release pipeline.🔒 Proposed direction
- uses: actions/checkout@v6.0.2 with: token: ${{ secrets.GITHUB_TOKEN }} + persist-credentials: false- name: Commit 2 — untrack bundles + point marketplace at release sha if: steps.check_bump.outputs.skip == 'false' + env: + GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} run: | # Remove the 7 bundle dirs from the index but keep the files in the # working tree (publish job in the same workflow run still needs # them on disk for npm publish). @@ - git push + git push "https://x-access-token:${GH_TOKEN}`@github.com/`${GITHUB_REPOSITORY}.git" HEAD:mainBased on learnings, every
actions/checkoutstep in.github/workflowsshould setpersist-credentials: falseexplicitly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release.yaml around lines 33 - 35, The checkout step currently uses "uses: actions/checkout@v6.0.2" and leaves a write-capable GITHUB_TOKEN persisted; change that checkout invocation to include persist-credentials: false so the token is not stored in .git/config, and then modify the final push step (the step that runs git push) to explicitly authenticate only at push time by injecting the token (e.g., via an environment variable or setting http.extraheader) using the job's GITHUB_TOKEN; ensure no other steps keep repo-write credentials and apply the same persist-credentials: false pattern to any other actions/checkout uses in the workflow.
225-233:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHarden the publish checkout the same way.
This job never pushes back to the repository, so there is no reason to persist checkout credentials in
.git/config. Keeping the default token around for the later install/build/publish steps only widens the blast radius if one of those steps is compromised.🔒 Proposed fix
- uses: actions/checkout@v6.0.2 with: # Pin to the bump commit captured by the release job. Pinning # to a SHA (not `main`) is required: the `production` # environment gate can pause for hours, during which `main` # may advance — checking out a moving ref would publish a # commit the reviewer never approved. ref: ${{ needs.release.outputs.sha }} fetch-depth: 1 + persist-credentials: falseBased on learnings, every
actions/checkoutstep in.github/workflowsshould setpersist-credentials: falseexplicitly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release.yaml around lines 225 - 233, The checkout step using actions/checkout@v6.0.2 (the one with ref: ${{ needs.release.outputs.sha }} and fetch-depth: 1) should explicitly set persist-credentials: false since this job does not push to the repo; update that checkout invocation to include persist-credentials: false to avoid leaving the default token in .git/config for subsequent install/build/publish steps..github/workflows/ci.yaml (1)
24-24:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSet
persist-credentials: falseon the remaining checkout steps.These three jobs only read the repo, so leaving the default checkout credentials in
.git/configunnecessarily exposesGITHUB_TOKENto every later step in the job.windows-smokealready applies the hardened pattern;duplication,test, andcross-node-installshould match it.🔒 Proposed fix
- - uses: actions/checkout@v6.0.2 + - uses: actions/checkout@v6.0.2 + with: + persist-credentials: false- uses: actions/checkout@v6.0.2 with: + persist-credentials: false # Full history so the "Build PR coverage comment" step can do # `git diff origin/<base>...HEAD` to detect touched src/ files. # Default shallow checkout (depth=1) produces "no merge base". fetch-depth: 0- - uses: actions/checkout@v6.0.2 + - uses: actions/checkout@v6.0.2 + with: + persist-credentials: falseBased on learnings, every
actions/checkoutstep in.github/workflowsshould setpersist-credentials: falseexplicitly.Also applies to: 92-97, 359-359
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/ci.yaml at line 24, Update the remaining actions/checkout steps to set persist-credentials: false so the checkout does not leave GITHUB_TOKEN in .git/config; specifically, in the checkout usages (e.g., the occurrences of "uses: actions/checkout@v6.0.2" used by the duplication, test, and cross-node-install jobs) add the persist-credentials: false input to match the hardened pattern already used by windows-smoke.src/hooks/pre-tool-use.ts (2)
482-513:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDirectory
Readrequests still return the wrong decision shape.
lsDiralways returnsbuildAllowDecision(...). ForRead,main()will therefore emit{ command, description }instead of{ file_path }, which triggers the same harness error this change is avoiding in the retry/graph branches. Please mirror the/graphhandling here: materialize the listing withwriteReadCacheFileFn(...)and returnbuildReadDecision(...)forRead.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 482 - 513, The direct ls branch currently always returns buildAllowDecision which causes Read to emit the wrong shape; modify the lsDir handling (the block using listVirtualPathRowsFn, entries Map, capOutputForClaude) to, when op === "Read", write the materialized listing to the read-cache via writeReadCacheFileFn (use the lsOutput string produced by capOutputForClaude) and return buildReadDecision({ file_path: <cachePath> }, `[DeepLake direct] ls ${dir}`) instead of buildAllowDecision; for non-Read cases keep the existing buildAllowDecision behavior and preserve the same lsOutput/content used for caching.
273-293:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winDon't rewrite interpreter reads into a host-shell
cat.This branch bypasses the VFS entirely by returning an
allowdecision with a real shell command. For example,python3 ~/.deeplake/memory/../../etc/passwdis normalized intocat '/../../etc/passwd', and Claude will execute that on the host filesystem instead of the virtual backend. Please route this shape through the same direct-read path as normal memory reads (or return retry guidance when it can't be served) rather than handing a rewritten shell command back to the tool.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 273 - 293, This branch currently converts interpreter read patterns (isReadLike) into a host-shell `cat` via buildAllowDecision, which bypasses the VFS; instead, when you detect a cleanPath (from rewritePaths + pathMatch) you must not return buildAllowDecision with a shell command—route the request through the same direct-memory-read path used for touchesMemory-based reads (i.e., call the existing memory-read handler / decision builder used for normal memory reads) so the VFS services the file, and only if that handler cannot serve the path return a buildRetryDecision (or equivalent retry guidance) explaining the agent should retry via the VFS; update the logic around isReadLike/hasShellMeta/cleanPath and remove the direct `cat` return, using logFn for diagnostics and referencing touchesMemory, rewritePaths, cleanPath, and buildAllowDecision in your changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cli/update.ts`:
- Around line 300-307: The fallback/manual instructions still reference the
literal "`@latest`" and must be pinned to the resolved version; update any
messages and fallback command strings in the npx/unknown branches and in the
install failure catch to use the resolved latest variable (e.g., replace
occurrences of "`@deeplake/hivemind`@latest" or hard-coded "`@latest`" with
`${PKG_NAME}@${latest}`), touching the spawn call, the warn(...) calls, and any
npx/help text so all user-facing install hints consistently reference PKG_NAME
and latest.
In `@src/hooks/codex/pre-tool-use.ts`:
- Around line 327-339: The !isSafe(rewritten) branch currently returns action:
"guide" which exits 0 and still allows Codex to execute the original command;
change that branch to mirror the unroutable path by logging the unsafe command
(use logFn) and returning action: "block" with output:
buildUnsupportedGuidance() and include rewrittenCommand: rewritten so the unsafe
command is never handed to the host; update the branch that checks
isSafe(rewritten) (and any early returns that set action:"guide") to use
action:"block" instead.
---
Outside diff comments:
In @.github/workflows/ci.yaml:
- Line 24: Update the remaining actions/checkout steps to set
persist-credentials: false so the checkout does not leave GITHUB_TOKEN in
.git/config; specifically, in the checkout usages (e.g., the occurrences of
"uses: actions/checkout@v6.0.2" used by the duplication, test, and
cross-node-install jobs) add the persist-credentials: false input to match the
hardened pattern already used by windows-smoke.
In @.github/workflows/release.yaml:
- Around line 33-35: The checkout step currently uses "uses:
actions/checkout@v6.0.2" and leaves a write-capable GITHUB_TOKEN persisted;
change that checkout invocation to include persist-credentials: false so the
token is not stored in .git/config, and then modify the final push step (the
step that runs git push) to explicitly authenticate only at push time by
injecting the token (e.g., via an environment variable or setting
http.extraheader) using the job's GITHUB_TOKEN; ensure no other steps keep
repo-write credentials and apply the same persist-credentials: false pattern to
any other actions/checkout uses in the workflow.
- Around line 225-233: The checkout step using actions/checkout@v6.0.2 (the one
with ref: ${{ needs.release.outputs.sha }} and fetch-depth: 1) should explicitly
set persist-credentials: false since this job does not push to the repo; update
that checkout invocation to include persist-credentials: false to avoid leaving
the default token in .git/config for subsequent install/build/publish steps.
In `@src/hooks/pre-tool-use.ts`:
- Around line 482-513: The direct ls branch currently always returns
buildAllowDecision which causes Read to emit the wrong shape; modify the lsDir
handling (the block using listVirtualPathRowsFn, entries Map,
capOutputForClaude) to, when op === "Read", write the materialized listing to
the read-cache via writeReadCacheFileFn (use the lsOutput string produced by
capOutputForClaude) and return buildReadDecision({ file_path: <cachePath> },
`[DeepLake direct] ls ${dir}`) instead of buildAllowDecision; for non-Read cases
keep the existing buildAllowDecision behavior and preserve the same
lsOutput/content used for caching.
- Around line 273-293: This branch currently converts interpreter read patterns
(isReadLike) into a host-shell `cat` via buildAllowDecision, which bypasses the
VFS; instead, when you detect a cleanPath (from rewritePaths + pathMatch) you
must not return buildAllowDecision with a shell command—route the request
through the same direct-memory-read path used for touchesMemory-based reads
(i.e., call the existing memory-read handler / decision builder used for normal
memory reads) so the VFS services the file, and only if that handler cannot
serve the path return a buildRetryDecision (or equivalent retry guidance)
explaining the agent should retry via the VFS; update the logic around
isReadLike/hasShellMeta/cleanPath and remove the direct `cat` return, using
logFn for diagnostics and referencing touchesMemory, rewritePaths, cleanPath,
and buildAllowDecision in your changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 69fda5b3-2465-42a2-8b05-ca2546ff4937
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (10)
.github/workflows/ci.yaml.github/workflows/release.yamlsrc/cli/update.tssrc/hooks/codex/pre-tool-use.tssrc/hooks/memory-path-utils.tssrc/hooks/pre-tool-use.tstests/claude-code/pre-tool-use-branches.test.tstests/claude-code/pre-tool-use.test.tstests/cli/cli-update.test.tstests/codex/codex-pre-tool-use-branches.test.ts
…and lets them run on the host
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
.github/workflows/release.yaml (1)
225-233:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThe publish job still persists
GITHUB_TOKENfor the whole job.Scoping
CLAWHUB_TOKENto Line 328 is a good hardening step, butactions/checkoutin the same job still writesGITHUB_TOKENinto.git/config. This job only reads a pinned SHA and never pushes, so it should also setpersist-credentials: false.🔒 Proposed fix
- uses: actions/checkout@v6.0.2 with: ref: ${{ needs.release.outputs.sha }} fetch-depth: 1 + persist-credentials: falseBased on learnings: For every GitHub Actions workflow in
.github/workflows(e.g., ci, release, publish smoke tests), ensure allactions/checkoutsteps setpersist-credentials: falseexplicitly (do not rely on defaults).Also applies to: 248-255, 322-330
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release.yaml around lines 225 - 233, The checkout step using actions/checkout@v6.0.2 that pins to ref: ${{ needs.release.outputs.sha }} and uses fetch-depth: 1 should explicitly set persist-credentials: false to avoid writing GITHUB_TOKEN into .git/config for this read-only job; update that actions/checkout step (and the other similar checkout steps referenced) to include persist-credentials: false so no credentials are persisted during checkout..github/workflows/ci.yaml (1)
24-32:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDisable persisted checkout credentials in the remaining read-only jobs.
These jobs only read the repo, but their
actions/checkoutsteps still leaveGITHUB_TOKENin.git/config. Sincewindows-smokealready follows the hardened pattern,duplication,test, andcross-node-installshould do the same.🔒 Proposed fix
duplication: steps: - uses: actions/checkout@v6.0.2 + with: + persist-credentials: false @@ test: steps: - uses: actions/checkout@v6.0.2 with: + persist-credentials: false fetch-depth: 0 @@ cross-node-install: steps: - uses: actions/checkout@v6.0.2 + with: + persist-credentials: falseBased on learnings: For every GitHub Actions workflow in
.github/workflows(e.g., ci, release, publish smoke tests), ensure allactions/checkoutsteps setpersist-credentials: falseexplicitly (do not rely on defaults).Also applies to: 92-105, 359-372
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/ci.yaml around lines 24 - 32, Update the remaining read-only jobs' checkout steps to explicitly disable persisted credentials: locate each actions/checkout@... invocation (e.g., the one shown alongside "Setup Node.js" and "Install dependencies" as well as the other occurrences around the ranges mentioned) and add persist-credentials: false to their with: block so the GITHUB_TOKEN is not written into .git/config for the duplication, test, cross-node-install and other read-only jobs.tests/cli/cli-update.test.ts (1)
155-168:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert the pinned manual fallback in the npm-install failure test.
Line 167 only checks the error prefix. A regression back to
@latestin thewarn("Try running it manually...")branch would still pass, even though that branch is part of this hardening change.🧪 Proposed fix
expect(code).toBe(1); expect(spawn).toHaveBeenCalledTimes(1); expect(stderrText()).toContain("npm install failed: ENOENT"); + expect(stderrText()).toContain("Try running it manually: npm install -g `@deeplake/hivemind`@1.3.0");🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/cli/cli-update.test.ts` around lines 155 - 168, Update the test "npm-global: returns 1 if `npm install` itself fails (does NOT attempt the refresh)" to additionally assert the manual fallback is pinned to the correct version (use the latestVersionOverride "1.3.0") instead of "`@latest`"; after calling runUpdate and checking the error prefix, add an assertion on stderrText() that the warning message contains the pinned version token (e.g. "`@1.3.0`") so the warning produced by the code path exercised by runUpdate includes the explicit version rather than "`@latest`".src/hooks/codex/pre-tool-use.ts (1)
302-314:⚠️ Potential issue | 🟠 Major | ⚡ Quick winTighten the Codex
findmatcher to the supported grammar.This has the same prefix-match problem as the Claude hook:
find ~/.deeplake/memory -name '*.md' -deleteor... -o -name '*.json'is accepted as a plainfind -nameread and the unsupported suffix is ignored. Please anchor this to the exact shapes the VFS serves (optionally| wc -l) and block everything else.Suggested fix
- const findMatch = rewritten.match(/^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'/); + const findMatch = rewritten.match( + /^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'\s*(?:\|\s*wc\s+-l)?\s*$/, + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/codex/pre-tool-use.ts` around lines 302 - 314, The current regex in pre-tool-use.ts that sets findMatch from rewritten allows extra unsupported suffixes; tighten the matcher so rewritten only matches the exact supported grammars (find <dir> [-type <type>] -name '<pattern>' optionally followed by a pipe to wc -l) and reject anything else: update the pattern used to create findMatch (and its subsequent handling of dir, namePattern, and the wc -l check) so any extra tokens (e.g., -delete, -o, additional -name clauses, or other commands) cause no match/blocked path; use the same helper functions (sqlLike, findVirtualPathsFn, logFn) and return the existing block response when the input does not strictly match the anchored grammar.src/hooks/pre-tool-use.ts (2)
279-292:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winOnly rewrite the actual memory-path operand.
This branch currently rewrites the first absolute-looking token after whitespace, not the memory path that triggered interception. A command like
python3 script.py /etc/hosts ~/.deeplake/memory/index.mdwill be turned intocat '/etc/hosts', which leaks a host file through the rewritten Bash command. Match the~/.deeplake/memory,$HOME/.deeplake/memory, or absolute-memory argument first, then normalize just that token.Suggested fix
- const normalized = rewritePaths(cmd) + " " + rewritePaths(toolPath); - const pathMatch = normalized.match(/\s(\/[\w./_-]+)/); - const cleanPath = pathMatch ? pathMatch[1] : ""; + const argv = cmd.trim().split(/\s+/).slice(1); + const memoryArg = argv.find((arg) => touchesMemory(arg)); + const cleanPath = memoryArg ? rewritePaths(memoryArg) : "";🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 279 - 292, The current branch in pre-tool-use.ts identifies the first absolute-looking token after whitespace (pathMatch) and rewrites it, which can rewrite the wrong operand; instead locate the memory-related operand by matching tokens against memory path patterns (e.g., /^~\/\.deeplake\/memory\b/, /^\$HOME\/\.deeplake\/memory\b/, or absolute memory path like /.../\.deeplake\/memory), normalize only that matched token via rewritePaths, and use that normalized token as cleanPath; update the logic around isReadLike, rewritePaths, pathMatch/cleanPath, and the buildAllowDecision call so you only convert the actual memory path to cat (preserve other operands) and keep proper single-quote escaping when constructing the replacement.
516-527:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject unsupported
findsuffixes instead of prefix-matching them.This regex accepts any command that starts with
find ... -name '...', sofind ~/.deeplake/memory -name '*.md' -deleteor... -o -name '*.json'gets routed as a plain single-pattern read and the trailing semantics are silently dropped. Since this VFS only emulatesfind -name(optionally with| wc -l), anchor the matcher to that exact shape and send everything else to guidance. The same bug exists insrc/hooks/codex/pre-tool-use.ts.Suggested fix
- const findMatch = shellCmd.match(/^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'/); + const findMatch = shellCmd.match( + /^find\s+(\S+)\s+(?:-type\s+\S+\s+)?-name\s+'([^']+)'\s*(?:\|\s*wc\s+-l)?\s*$/, + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 516 - 527, The current regex in pre-tool-use.ts (used around variables shellCmd and findMatch and later calling findVirtualPathsFn) permissively matches any command that starts with "find ... -name '...'", which lets unsupported suffixes like "-delete" or "-o -name ..." slip through; change the matcher to strictly accept only the exact supported shapes: allow an optional "-type <char>" immediately before -name, optionally followed only by a pipe to "wc -l" (with possible whitespace) and then end-of-string; anchor the pattern (use ^...$) so trailing operators or other flags won't match, and return guidance for those cases; apply the identical stricter regex change in the corresponding matcher in src/hooks/codex/pre-tool-use.ts so both code paths reject unsupported find suffixes instead of prefix-matching them.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In @.github/workflows/ci.yaml:
- Around line 24-32: Update the remaining read-only jobs' checkout steps to
explicitly disable persisted credentials: locate each actions/checkout@...
invocation (e.g., the one shown alongside "Setup Node.js" and "Install
dependencies" as well as the other occurrences around the ranges mentioned) and
add persist-credentials: false to their with: block so the GITHUB_TOKEN is not
written into .git/config for the duplication, test, cross-node-install and other
read-only jobs.
In @.github/workflows/release.yaml:
- Around line 225-233: The checkout step using actions/checkout@v6.0.2 that pins
to ref: ${{ needs.release.outputs.sha }} and uses fetch-depth: 1 should
explicitly set persist-credentials: false to avoid writing GITHUB_TOKEN into
.git/config for this read-only job; update that actions/checkout step (and the
other similar checkout steps referenced) to include persist-credentials: false
so no credentials are persisted during checkout.
In `@src/hooks/codex/pre-tool-use.ts`:
- Around line 302-314: The current regex in pre-tool-use.ts that sets findMatch
from rewritten allows extra unsupported suffixes; tighten the matcher so
rewritten only matches the exact supported grammars (find <dir> [-type <type>]
-name '<pattern>' optionally followed by a pipe to wc -l) and reject anything
else: update the pattern used to create findMatch (and its subsequent handling
of dir, namePattern, and the wc -l check) so any extra tokens (e.g., -delete,
-o, additional -name clauses, or other commands) cause no match/blocked path;
use the same helper functions (sqlLike, findVirtualPathsFn, logFn) and return
the existing block response when the input does not strictly match the anchored
grammar.
In `@src/hooks/pre-tool-use.ts`:
- Around line 279-292: The current branch in pre-tool-use.ts identifies the
first absolute-looking token after whitespace (pathMatch) and rewrites it, which
can rewrite the wrong operand; instead locate the memory-related operand by
matching tokens against memory path patterns (e.g., /^~\/\.deeplake\/memory\b/,
/^\$HOME\/\.deeplake\/memory\b/, or absolute memory path like
/.../\.deeplake\/memory), normalize only that matched token via rewritePaths,
and use that normalized token as cleanPath; update the logic around isReadLike,
rewritePaths, pathMatch/cleanPath, and the buildAllowDecision call so you only
convert the actual memory path to cat (preserve other operands) and keep proper
single-quote escaping when constructing the replacement.
- Around line 516-527: The current regex in pre-tool-use.ts (used around
variables shellCmd and findMatch and later calling findVirtualPathsFn)
permissively matches any command that starts with "find ... -name '...'", which
lets unsupported suffixes like "-delete" or "-o -name ..." slip through; change
the matcher to strictly accept only the exact supported shapes: allow an
optional "-type <char>" immediately before -name, optionally followed only by a
pipe to "wc -l" (with possible whitespace) and then end-of-string; anchor the
pattern (use ^...$) so trailing operators or other flags won't match, and return
guidance for those cases; apply the identical stricter regex change in the
corresponding matcher in src/hooks/codex/pre-tool-use.ts so both code paths
reject unsupported find suffixes instead of prefix-matching them.
In `@tests/cli/cli-update.test.ts`:
- Around line 155-168: Update the test "npm-global: returns 1 if `npm install`
itself fails (does NOT attempt the refresh)" to additionally assert the manual
fallback is pinned to the correct version (use the latestVersionOverride
"1.3.0") instead of "`@latest`"; after calling runUpdate and checking the error
prefix, add an assertion on stderrText() that the warning message contains the
pinned version token (e.g. "`@1.3.0`") so the warning produced by the code path
exercised by runUpdate includes the explicit version rather than "`@latest`".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 96695df6-2bff-463e-8365-ad3e967b71a3
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (11)
.github/workflows/ci.yaml.github/workflows/release.yamlsrc/cli/update.tssrc/hooks/codex/pre-tool-use.tssrc/hooks/memory-path-utils.tssrc/hooks/pre-tool-use.tstests/claude-code/pre-tool-use-branches.test.tstests/claude-code/pre-tool-use.test.tstests/cli/cli-update.test.tstests/codex/codex-integration.test.tstests/codex/codex-pre-tool-use-branches.test.ts
…ot the first absolute token (was leaking host files like /etc/hosts)
…s like -delete/-o fall through to guidance
…KEN isn't left in .git/config
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
.github/workflows/release.yaml (1)
322-333:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPin
clawhubnpm version before usingCLAWHUB_TOKEN(see.github/workflows/release.yaml~lines 322-333)
npm install -g clawhubinstalls a floating/latest CLI immediately beforeclawhub login --token "$CLAWHUB_TOKEN". Pin the CLI package to an explicit published version (npm supportsnpm install -g clawhub@<version>—current latest is0.18.0) to avoid supply-chain risk from unexpected CLI updates that could read/exfiltrate the token or behave differently.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release.yaml around lines 322 - 333, Change the "Install ClawHub CLI" step to install a pinned published version of the clawhub package instead of the floating latest (replace the current npm install -g clawhub invocation with an explicit version, e.g., clawhub@0.18.0), so the CLI installed just before the "Authenticate ClawHub CLI" step cannot be unexpectedly updated and potentially exfiltrate CLAWHUB_TOKEN; keep the "Authenticate ClawHub CLI" step and its use of the CLAWHUB_TOKEN env var and clawhub login invocation unchanged except for relying on the pinned CLI.src/hooks/pre-tool-use.ts (2)
281-293:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winAvoid host-shell
caton virtual paths.This early return turns
python3 ~/.deeplake/memory/data.jsonintocat '/data.json', and Claude then executes that on the real host shell. The read comes from the host filesystem root, not the VFS. Route this through the existing direct-read path, or fall back to retry guidance when the backend is unavailable.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 281 - 293, The current early-return in the isReadLike && !hasShellMeta branch rewrites virtual VFS paths (e.g., ~/.deeplake/memory/...) into a host-shell cat using buildAllowDecision, causing reads to hit the real host filesystem; change this logic so that when memoryArg is a virtual/VFS path (detected via touchesMemory and rewritePaths), do NOT return the host-shell cat; instead route the request through the existing direct-read handling path (the direct-read handler used elsewhere in this module) or, if the backend is unavailable, return the retry/guidance decision used for backend misses; update the block around touchesMemory, rewritePaths, logFn and buildAllowDecision to call the direct-read flow or the retry guidance rather than issuing a host-shell cat.
482-513:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEmit
file_pathfor Read directory listings.When a
Readtarget resolves to a directory, this branch still returnsbuildAllowDecision(...).main()then emits{ command, description }, which reproduces the same Read shape mismatch the file already documents. Materialize the listing to a cache file and returnbuildReadDecision(...)here instead; use a synthetic leaf filename rather than the directory path itself so later child reads can still create subdirectories.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 482 - 513, The current lsDir branch builds an inline echo response and returns buildAllowDecision (using lsDir, listVirtualPathRowsFn, capOutputForClaude, buildAllowDecision), which later causes a Read shape mismatch in main(); instead, materialize the directory listing to a cache file (e.g., create a synthetic leaf filename under the agent cache like `${dir.replace(/\/+$/,'') || '_' }_listing.json`), write the lsOutput there, and return buildReadDecision pointing at that cache file (with an appropriate description string similar to `[DeepLake direct] ls ${dir}`) so subsequent Read operations target an actual file; keep the same listing generation logic and ensure the synthetic filename ends with a non-directory leaf name so child reads can create subdirectories.src/hooks/memory-path-utils.ts (1)
61-70:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMatch the memory mount on boundaries, not substrings.
touchesMemory()andrewritePaths()currently fire on any string that merely contains the mount prefix. That also intercepts non-memory inputs likegrep "~/.deeplake/memory" README.mdorcat ~/.deeplake/memory-backup/x, then rewrites them into bogus virtual paths. Tighten both helpers to require the mount root or one of its descendants before intercepting/replacing.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/memory-path-utils.ts` around lines 61 - 70, touchesMemory() and rewritePaths() are matching MEMORY_PATH/TILDE_PATH/HOME_VAR_PATH as substrings and thus false-positive on names like "~/.deeplake/memory-backup"; update both helpers to only match the mount root or its descendants by requiring a boundary after the mount prefix (either end-of-string or a slash). In touchesMemory(), check that MEMORY_PATH/TILDE_PATH/HOME_VAR_PATH is followed by ($|/) rather than using includes; in rewritePaths(), tighten each regex (the MEMORY_PATH escape, /~\/.deeplake\/memory\/?/, /\$HOME\/.deeplake\/memory\/?/, and the quoted "$HOME" variant) to require end-of-string or a trailing slash after "memory" so only the mount root and its children are rewritten. Ensure you still escape MEMORY_PATH when building the RegExp.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In @.github/workflows/release.yaml:
- Around line 322-333: Change the "Install ClawHub CLI" step to install a pinned
published version of the clawhub package instead of the floating latest (replace
the current npm install -g clawhub invocation with an explicit version, e.g.,
clawhub@0.18.0), so the CLI installed just before the "Authenticate ClawHub CLI"
step cannot be unexpectedly updated and potentially exfiltrate CLAWHUB_TOKEN;
keep the "Authenticate ClawHub CLI" step and its use of the CLAWHUB_TOKEN env
var and clawhub login invocation unchanged except for relying on the pinned CLI.
In `@src/hooks/memory-path-utils.ts`:
- Around line 61-70: touchesMemory() and rewritePaths() are matching
MEMORY_PATH/TILDE_PATH/HOME_VAR_PATH as substrings and thus false-positive on
names like "~/.deeplake/memory-backup"; update both helpers to only match the
mount root or its descendants by requiring a boundary after the mount prefix
(either end-of-string or a slash). In touchesMemory(), check that
MEMORY_PATH/TILDE_PATH/HOME_VAR_PATH is followed by ($|/) rather than using
includes; in rewritePaths(), tighten each regex (the MEMORY_PATH escape,
/~\/.deeplake\/memory\/?/, /\$HOME\/.deeplake\/memory\/?/, and the quoted
"$HOME" variant) to require end-of-string or a trailing slash after "memory" so
only the mount root and its children are rewritten. Ensure you still escape
MEMORY_PATH when building the RegExp.
In `@src/hooks/pre-tool-use.ts`:
- Around line 281-293: The current early-return in the isReadLike &&
!hasShellMeta branch rewrites virtual VFS paths (e.g., ~/.deeplake/memory/...)
into a host-shell cat using buildAllowDecision, causing reads to hit the real
host filesystem; change this logic so that when memoryArg is a virtual/VFS path
(detected via touchesMemory and rewritePaths), do NOT return the host-shell cat;
instead route the request through the existing direct-read handling path (the
direct-read handler used elsewhere in this module) or, if the backend is
unavailable, return the retry/guidance decision used for backend misses; update
the block around touchesMemory, rewritePaths, logFn and buildAllowDecision to
call the direct-read flow or the retry guidance rather than issuing a host-shell
cat.
- Around line 482-513: The current lsDir branch builds an inline echo response
and returns buildAllowDecision (using lsDir, listVirtualPathRowsFn,
capOutputForClaude, buildAllowDecision), which later causes a Read shape
mismatch in main(); instead, materialize the directory listing to a cache file
(e.g., create a synthetic leaf filename under the agent cache like
`${dir.replace(/\/+$/,'') || '_' }_listing.json`), write the lsOutput there, and
return buildReadDecision pointing at that cache file (with an appropriate
description string similar to `[DeepLake direct] ls ${dir}`) so subsequent Read
operations target an actual file; keep the same listing generation logic and
ensure the synthetic filename ends with a non-directory leaf name so child reads
can create subdirectories.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 7c2f6343-dff4-4209-87db-3f12bf7e2dea
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (11)
.github/workflows/ci.yaml.github/workflows/release.yamlsrc/cli/update.tssrc/hooks/codex/pre-tool-use.tssrc/hooks/memory-path-utils.tssrc/hooks/pre-tool-use.tstests/claude-code/pre-tool-use-branches.test.tstests/claude-code/pre-tool-use.test.tstests/cli/cli-update.test.tstests/codex/codex-integration.test.tstests/codex/codex-pre-tool-use-branches.test.ts
…kup aren't intercepted
… path traversal) and emit file_path for Read directory listings
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/hooks/pre-tool-use.ts (2)
420-452:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMissing reads are being misreported as “retry required.”
Once
Read/cat/head/tail/wc -lhas already been normalized into a concretevirtualPath,content === nullmeans “path not found”, not “unsupported command”. Falling through to the generic[RETRY REQUIRED]branch makes normal existence probes loop forever, because the agent is told to rewrite a shape that's already supported. Return a shape-safe not-found response from this branch instead of the final retry guidance.Also applies to: 529-536
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 420 - 452, When content === null after calling readVirtualPathContentFn (and similarly in the block around lines 529-536), treat this as "path not found" instead of falling through to the generic retry branch: return a shape-safe not-found decision (e.g., a buildAllowDecision or buildReadDecision that communicates a not-found result consistent with the existing decision shapes) so the agent gets a concrete missing-file response rather than a "[RETRY REQUIRED]" rewrite; update the handling in the virtualPath branch (the code around readCachedIndexContentFn, readVirtualPathContentFn, writeCachedIndexContentFn, buildReadDecision and buildAllowDecision) and replicate the same not-found return behavior in the other similar block.
411-416:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse a synthetic leaf for
/graphlistings too.
writeReadCacheFileFn(input.session_id, "/graph", body)materializes the listing as<cache>/.../read/graph. A later Read of/graph/index.mdor/graph/show/...then tries to create children under a path that is already a file and fails. The generic directory-listing branch already avoids this with/_listing.txt; this branch needs the same treatment.Suggested fix
- const file_path = writeReadCacheFileFn(input.session_id, "/graph", body); + const file_path = writeReadCacheFileFn(input.session_id, "/graph/_listing.txt", body);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/pre-tool-use.ts` around lines 411 - 416, The /graph special-case writes a cache file at "/graph" which makes that path a file and later causes children creation failures; change the branch handling lsDir === "/graph" in pre-tool-use.ts to write the listing to a synthetic leaf (e.g., "/graph/_listing.txt") via writeReadCacheFileFn and then call buildReadDecision with that synthetic leaf path (preserving input.session_id and the same body), mirroring the generic directory-listing behavior so subsequent reads of /graph/index.md or /graph/show/... treat /graph as a directory.src/hooks/codex/pre-tool-use.ts (1)
226-255:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't collapse missing virtual files into generic unsupported guidance.
After
cat/head/tail/wc -lhas already been parsed intovirtualPath, anullread is a normal “No such file or directory” case. Falling through to the final block+guidance return turnscat ~/.deeplake/memory/missing.mdinto “This command is not supported”, which is misleading and prevents the model from learning that the file simply doesn't exist. Emit a not-found block result here instead.Also applies to: 333-345
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hooks/codex/pre-tool-use.ts` around lines 226 - 255, When a read of a parsed virtual path returns null (after calling readCachedIndexContentFn/readVirtualPathContentFn and the index-building fallback via buildIndexContent), treat that as "not found" instead of falling through to the generic unsupported guidance; immediately return a block result indicating the file does not exist (e.g. { action: "block", output: `No such file or directory: ${virtualPath}`, rewrittenCommand: rewritten }) where content === null, taking into account virtualPath === "/index.md" behavior and caching via writeCachedIndexContentFn, and apply the same fix to the analogous logic around the read at the other location noted (the block around lines 333-345) so head/tail/wc -l all report not-found consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/release.yaml:
- Around line 323-326: Update the smoke-test workflow step that currently runs
"npm install -g clawhub" (the step just before clawhub login) to pin the CLI to
the same version used in release.yaml by changing the install to "npm install -g
clawhub@0.18.0"; locate the install step in publish-smoke-test.yaml (the run
command immediately before the clawhub login --token invocation) and replace the
floating install with the fixed version to ensure both workflows use the same
pinned CLI.
- Around line 234-236: The release job's actions/checkout step currently leaves
secrets.GITHUB_TOKEN persisted in .git/config during the entire npm
ci/build/commit sequence; update the release job's actions/checkout invocation
to include persist-credentials: false (same hardening as the publish checkout)
and ensure every other actions/checkout in this workflow also sets
persist-credentials: false; then add a dedicated step right before the final git
push that temporarily injects auth (e.g., set up auth or re-run actions/checkout
with credentials or configure git remote using the token) so only the push step
has write access.
In `@src/hooks/memory-path-utils.ts`:
- Around line 65-86: The current detector and replacers match
MEMORY_PATH/TILDE_PATH/HOME_VAR_PATH only with a trailing boundary
(MEMORY_BOUNDARY) so quoted or literal strings like `"~/.deeplake/memory"` are
falsely detected and rewritten; update MEMORY_PREFIX_RE used by touchesMemory to
require a leading boundary (e.g., start-of-string or non-path char) before the
alternation (using escapeRe and MEMORY_BOUNDARY) and update rewritePaths to use
the same leading-boundary in each RegExp builder (for the three patterns using
MEMORY_PATH, TILDE_PATH, HOME_VAR_PATH and escapeRe) so only tokens that are
standalone or path components (not inside quoted literals or sibling names) are
matched and replaced.
---
Outside diff comments:
In `@src/hooks/codex/pre-tool-use.ts`:
- Around line 226-255: When a read of a parsed virtual path returns null (after
calling readCachedIndexContentFn/readVirtualPathContentFn and the index-building
fallback via buildIndexContent), treat that as "not found" instead of falling
through to the generic unsupported guidance; immediately return a block result
indicating the file does not exist (e.g. { action: "block", output: `No such
file or directory: ${virtualPath}`, rewrittenCommand: rewritten }) where content
=== null, taking into account virtualPath === "/index.md" behavior and caching
via writeCachedIndexContentFn, and apply the same fix to the analogous logic
around the read at the other location noted (the block around lines 333-345) so
head/tail/wc -l all report not-found consistently.
In `@src/hooks/pre-tool-use.ts`:
- Around line 420-452: When content === null after calling
readVirtualPathContentFn (and similarly in the block around lines 529-536),
treat this as "path not found" instead of falling through to the generic retry
branch: return a shape-safe not-found decision (e.g., a buildAllowDecision or
buildReadDecision that communicates a not-found result consistent with the
existing decision shapes) so the agent gets a concrete missing-file response
rather than a "[RETRY REQUIRED]" rewrite; update the handling in the virtualPath
branch (the code around readCachedIndexContentFn, readVirtualPathContentFn,
writeCachedIndexContentFn, buildReadDecision and buildAllowDecision) and
replicate the same not-found return behavior in the other similar block.
- Around line 411-416: The /graph special-case writes a cache file at "/graph"
which makes that path a file and later causes children creation failures; change
the branch handling lsDir === "/graph" in pre-tool-use.ts to write the listing
to a synthetic leaf (e.g., "/graph/_listing.txt") via writeReadCacheFileFn and
then call buildReadDecision with that synthetic leaf path (preserving
input.session_id and the same body), mirroring the generic directory-listing
behavior so subsequent reads of /graph/index.md or /graph/show/... treat /graph
as a directory.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: ecdf8618-0e6c-4b1a-b162-5e58003eaf59
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (11)
.github/workflows/ci.yaml.github/workflows/release.yamlsrc/cli/update.tssrc/hooks/codex/pre-tool-use.tssrc/hooks/memory-path-utils.tssrc/hooks/pre-tool-use.tstests/claude-code/pre-tool-use-branches.test.tstests/claude-code/pre-tool-use.test.tstests/cli/cli-update.test.tstests/codex/codex-integration.test.tstests/codex/codex-pre-tool-use-branches.test.ts
| # Publish reads a pinned SHA and never pushes; don't leave | ||
| # GITHUB_TOKEN in .git/config. | ||
| persist-credentials: false |
There was a problem hiding this comment.
Release checkout still persists the write-scoped token.
This hardens the publish checkout, but the release job's checkout at Lines 33-36 still leaves secrets.GITHUB_TOKEN in .git/config for the whole npm ci/build/commit sequence. That keeps the highest-privilege checkout in this workflow outside the hardening and widens exposure if any install/build step is compromised. Please set persist-credentials: false there too and inject auth only for the final git push.
Based on learnings, every actions/checkout step in .github/workflows should set persist-credentials: false explicitly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/release.yaml around lines 234 - 236, The release job's
actions/checkout step currently leaves secrets.GITHUB_TOKEN persisted in
.git/config during the entire npm ci/build/commit sequence; update the release
job's actions/checkout invocation to include persist-credentials: false (same
hardening as the publish checkout) and ensure every other actions/checkout in
this workflow also sets persist-credentials: false; then add a dedicated step
right before the final git push that temporarily injects auth (e.g., set up auth
or re-run actions/checkout with credentials or configure git remote using the
token) so only the push step has write access.
| # Pin the CLI version: it runs immediately before `clawhub login --token` | ||
| # with the real CLAWHUB_TOKEN in env, so a floating `@latest` would let an | ||
| # unexpected publish read/exfiltrate the token. Bump deliberately. | ||
| run: npm install -g clawhub@0.18.0 |
There was a problem hiding this comment.
Mirror the ClawHub pin into the smoke-test workflow too.
This step is pinned now, but .github/workflows/publish-smoke-test.yaml:77-84 still does npm install -g clawhub immediately before clawhub login --token "$CLAWHUB_TOKEN". That leaves the same pre-auth supply-chain window open in the dry-run canary and makes the two publish paths diverge. Please pin the same clawhub@0.18.0 there as well.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/release.yaml around lines 323 - 326, Update the smoke-test
workflow step that currently runs "npm install -g clawhub" (the step just before
clawhub login) to pin the CLI to the same version used in release.yaml by
changing the install to "npm install -g clawhub@0.18.0"; locate the install step
in publish-smoke-test.yaml (the run command immediately before the clawhub login
--token invocation) and replace the floating install with the fixed version to
ensure both workflows use the same pinned CLI.
| // A mount prefix only counts when it is the mount root or a descendant — i.e. | ||
| // followed by `/`, end-of-string, or a non-path character. Matching it as a | ||
| // bare substring false-positives on siblings like `~/.deeplake/memory-backup/x` | ||
| // and on literals like `grep "~/.deeplake/memory" README.md`. | ||
| const MEMORY_BOUNDARY = "(?![A-Za-z0-9._-])"; | ||
| const MEMORY_PREFIX_RE = new RegExp( | ||
| "(?:" + [MEMORY_PATH, TILDE_PATH, HOME_VAR_PATH].map(escapeRe).join("|") + ")" + MEMORY_BOUNDARY, | ||
| ); | ||
|
|
||
| export function touchesMemory(p: string): boolean { | ||
| return p.includes(MEMORY_PATH) || p.includes(TILDE_PATH) || p.includes(HOME_VAR_PATH); | ||
| return MEMORY_PREFIX_RE.test(p); | ||
| } | ||
|
|
||
| export function rewritePaths(cmd: string): string { | ||
| // Consume a trailing slash if present, otherwise require a boundary so a | ||
| // sibling like `memory-backup` is left untouched. | ||
| const tail = "(?:\\/|" + MEMORY_BOUNDARY + ")"; | ||
| return cmd | ||
| .replace(new RegExp(MEMORY_PATH.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") + "/?", "g"), "/") | ||
| .replace(/~\/.deeplake\/memory\/?/g, "/") | ||
| .replace(/\$HOME\/.deeplake\/memory\/?/g, "/") | ||
| .replace(/"\$HOME\/.deeplake\/memory\/?"/g, '"/"'); | ||
| .replace(new RegExp(escapeRe(MEMORY_PATH) + tail, "g"), "/") | ||
| .replace(new RegExp(escapeRe(TILDE_PATH) + tail, "g"), "/") | ||
| .replace(new RegExp('"' + escapeRe(HOME_VAR_PATH) + tail + '"', "g"), '"/"') | ||
| .replace(new RegExp(escapeRe(HOME_VAR_PATH) + tail, "g"), "/"); |
There was a problem hiding this comment.
Prefix matching still rewrites plain string literals.
These regexes only enforce a trailing boundary. A command like grep "~/.deeplake/memory" README.md still makes touchesMemory() return true, and rewritePaths() mutates the quoted pattern to /, so an unrelated host-file command gets intercepted as a memory operation. Add a leading boundary/capture to both the detector and the replacers.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/hooks/memory-path-utils.ts` around lines 65 - 86, The current detector
and replacers match MEMORY_PATH/TILDE_PATH/HOME_VAR_PATH only with a trailing
boundary (MEMORY_BOUNDARY) so quoted or literal strings like
`"~/.deeplake/memory"` are falsely detected and rewritten; update
MEMORY_PREFIX_RE used by touchesMemory to require a leading boundary (e.g.,
start-of-string or non-path char) before the alternation (using escapeRe and
MEMORY_BOUNDARY) and update rewritePaths to use the same leading-boundary in
each RegExp builder (for the three patterns using MEMORY_PATH, TILDE_PATH,
HOME_VAR_PATH and escapeRe) so only tokens that are standalone or path
components (not inside quoted literals or sibling names) are matched and
replaced.
Summary
Version Bump
Test plan
npm test)package.json, or no release needed for this changeSummary by CodeRabbit