Skip to content

feat: adding Storage, FileService, and FileServiceResolver - BED-8314#2801

Open
mykeelium wants to merge 5 commits into
mainfrom
mcuomo/BED-8314
Open

feat: adding Storage, FileService, and FileServiceResolver - BED-8314#2801
mykeelium wants to merge 5 commits into
mainfrom
mcuomo/BED-8314

Conversation

@mykeelium
Copy link
Copy Markdown
Contributor

@mykeelium mykeelium commented May 20, 2026

Description

This change sets up and builds out the implementation of a centralized file service and storage mechanism that can be used in future updated.

Motivation and Context

Resolves BED-8314

How Has This Been Tested?

This change does not add any functionality to current workflows. Tests were added and previous tests pass.

Screenshots (optional):

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

Summary by CodeRabbit

  • New Features

    • Added a local filesystem-backed storage backend with sandboxing, atomic writes, safe directory handling, listing, copy/move semantics, and content-type detection.
    • Added backend-agnostic file service abstractions and a resolver for managing named file services.
  • Bug Fixes

    • Validate presence of upload validation callback and return an error if missing to avoid runtime panics.
  • Tests

    • Added extensive unit/integration tests and mocks covering storage, file services, path-safety, error and cancellation cases.
  • Chores

    • Bumped AWS SDK-related dependency patch versions.

Review Change Stack

@mykeelium mykeelium self-assigned this May 20, 2026
@mykeelium mykeelium added enhancement New feature or request api A pull request containing changes affecting the API code. labels May 20, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2f3ae45c-f803-472d-bc7d-798d7ab11ab8

📥 Commits

Reviewing files that changed from the base of the PR and between 92891ed and 5f21b2d.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • cmd/api/src/services/storage/localstore.go
  • cmd/api/src/services/storage/localstore_test.go
  • cmd/api/src/services/storage/mocks/fs.go
  • cmd/api/src/services/storage/storage.go
  • cmd/api/src/services/storage/storage_test.go
  • cmd/api/src/services/upload/upload.go
  • go.mod
✅ Files skipped from review due to trivial changes (1)
  • cmd/api/src/services/storage/mocks/fs.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • cmd/api/src/services/upload/upload.go
  • go.mod
  • cmd/api/src/services/storage/localstore.go
  • cmd/api/src/services/storage/localstore_test.go

📝 Walkthrough

Walkthrough

Adds backend-agnostic storage interfaces and a sandboxed LocalStore filesystem implementation with atomic operations, path-safety, service wrappers/resolver, generated mocks, comprehensive tests, dependency bumps, and a nil-check in upload validation.

Changes

Storage Abstraction and LocalStore Implementation

Layer / File(s) Summary
Storage abstraction interfaces and data contracts
cmd/api/src/services/storage/storage.go
Exports Storage and FileService interfaces, FileInfo, WriteOptions, ListOptions, FileServiceName constants, MoveFileBetweenServices, resolver and default service wiring.
LocalStore sandboxed filesystem backend
cmd/api/src/services/storage/localstore.go
Implements LocalStore with os.Root sandboxing, context-aware reads, writeAtomic (temp file → Sync → Link/Rename), Open/Stat/Get/Put/Exists/Delete/List/Copy/Move, ErrIsDirectory, and content-type detection.
LocalStore operation and safety testing
cmd/api/src/services/storage/localstore_test.go
Extensive tests for LocalStore initialization, atomic writes (including cancellation and reader errors), reads/stats/exists/delete/list semantics, Copy/Move behavior, path safety (absolute/parent traversal/symlink), and temp-file cleanup assertions.
GoMock test doubles for storage interfaces
cmd/api/src/services/storage/mocks/fs.go
Generated mock implementations for Storage, FileService, and FileServiceResolver with recorder helpers for unit tests.
StorageFileService and resolver tests
cmd/api/src/services/storage/storage_test.go
Tests for StorageFileService methods (GetFile/ReadFile/Write/Delete/Temp/Move/List), MoveFileBetweenServices orchestration and error joining, FileServiceResolver validation and immutability, and NewDefaultFileServices initialization and failure cases.
Dependency updates and upload validation
go.mod, cmd/api/src/services/upload/upload.go
Indirect AWS SDK v2 and smithy-go dependency patch bumps; added nil-check to WriteAndValidateFileWithPrefix.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • cweidenkeller
  • wes-mil
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.91% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding Storage, FileService, and FileServiceResolver, with the associated Jira ticket BED-8314.
Description check ✅ Passed The PR description follows the template structure with all major sections completed: Description, Motivation and Context (with ticket reference), How Has This Been Tested, Types of changes, and Checklist with items marked as complete.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mcuomo/BED-8314

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mykeelium mykeelium changed the base branch from main to mcuomo/BED-8313 May 20, 2026 18:06
Comment thread cmd/api/src/services/storage/storage.go Outdated
Base automatically changed from mcuomo/BED-8313 to main May 27, 2026 18:20
@mykeelium mykeelium marked this pull request as ready for review May 27, 2026 18:25
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/api/src/services/upload/upload.go (1)

96-103: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return an error when temp file close fails.

Line 96 only logs close failures, but Line 117 can still return success. That can surface partially written files as valid.

Proposed fix
-	if closeErr := tempFile.Close(); closeErr != nil {
-		slog.Error(
-			"Error closing temp file",
-			slog.String("file", tempFileName),
-			attr.Error(closeErr),
-		)
-	}
+	closeErr := tempFile.Close()
+	if closeErr != nil {
+		slog.Error(
+			"Error closing temp file",
+			slog.String("file", tempFileName),
+			attr.Error(closeErr),
+		)
+	}
@@
 	if validationErr != nil {
 		if removeErr := os.Remove(tempFileName); removeErr != nil {
@@
 		}
-		return "", validationErr
+		if closeErr != nil {
+			return "", errors.Join(validationErr, fmt.Errorf("error closing temp file: %w", closeErr))
+		}
+		return "", validationErr
 	}
+
+	if closeErr != nil {
+		if removeErr := os.Remove(tempFileName); removeErr != nil {
+			slog.Error(
+				"Error deleting temp file after close failure",
+				slog.String("file", tempFileName),
+				attr.Error(removeErr),
+			)
+		}
+		return "", fmt.Errorf("error closing temp file: %w", closeErr)
+	}

Also applies to: 116-117

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/api/src/services/upload/upload.go` around lines 96 - 103, The temp file
close error is only logged but the function can still return success; update the
upload flow so that after calling tempFile.Close() (check the closeErr from
tempFile.Close) you return or propagate that error instead of continuing; locate
the tempFile.Close() call and the subsequent success return around tempFileName
(and where the code later returns success near lines 116-117) and make the
function return an error (or wrap and return closeErr) when closeErr != nil so
partially written temp files are not treated as successful uploads.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/api/src/services/graphify/tasks.go`:
- Around line 51-52: The current early return in tasks.go drops any successful
per-file results from ExtractIngestFiles and leaves temp files behind; change
the logic around the call to ExtractIngestFiles so that when it returns
(fileData, err) you do not discard fileData: either return fileData along with
the error (i.e., return fileData, err) so the caller can handle successes, or,
if you must fail-fast, iterate the returned IngestFileData slice and perform
explicit cleanup of their temp paths before returning the error; update the
calling code that expects []IngestFileData so it handles non-empty slices with
non-nil errors.

In `@cmd/api/src/services/storage/storage.go`:
- Around line 151-159: In StorageFileService.ReadFile, preserve and return any
rc.Close() error instead of dropping it: after calling s.Storage.Get and
deferring/avoiding an unhandled defer, read bytes with io.ReadAll(rc) into
variables (data, readErr), then call closeErr := rc.Close(); if both readErr and
closeErr are non-nil return a wrapped/joined error (e.g., wrap readErr and
include closeErr), otherwise return whichever non-nil error (or data,nil).
Update the function around s.Storage.Get, io.ReadAll and rc.Close to implement
this error-joining logic.
- Around line 277-299: In NewDefaultFileServices, if any NewLocalStore call
fails you must close previously opened stores (workStore, ingestStore,
retainStore, collectorsStore) to avoid descriptor leaks; update the error paths
so that when, for example, NewLocalStore(cfg.TempDirectory()) returns an error
you call Close() on workStore before returning, and similarly close
workStore+ingestStore if retainStore creation fails, etc.; implement this by
checking err after each NewLocalStore and calling Close() on all non-nil
previously created stores (or use a small helper/cleanup slice to iterate Close
on failure) before returning the error.

In `@cmd/api/src/services/upload/upload.go`:
- Around line 75-93: The function WriteAndValidateFileWithPrefix currently calls
validationFunc without checking for nil; add a nil guard at the top of the
function (before invoking validationFunc) that returns a clear error if
validationFunc == nil, and ensure any opened tempFile is properly closed and
removed when returning due to this error (use tempFile.Close() and
os.Remove(tempFileName) if the temp file was created). Update references to
validationFunc, tempFile and tempFileName to ensure no nil dereference or
resource leak occurs when returning early.

---

Outside diff comments:
In `@cmd/api/src/services/upload/upload.go`:
- Around line 96-103: The temp file close error is only logged but the function
can still return success; update the upload flow so that after calling
tempFile.Close() (check the closeErr from tempFile.Close) you return or
propagate that error instead of continuing; locate the tempFile.Close() call and
the subsequent success return around tempFileName (and where the code later
returns success near lines 116-117) and make the function return an error (or
wrap and return closeErr) when closeErr != nil so partially written temp files
are not treated as successful uploads.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2531ec7b-1941-4c05-8647-aa849b70a559

📥 Commits

Reviewing files that changed from the base of the PR and between ccb3ed5 and f54015d.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • cmd/api/src/services/graphify/ingest_storage.go
  • cmd/api/src/services/graphify/ingest_storage_test.go
  • cmd/api/src/services/graphify/tasks.go
  • cmd/api/src/services/storage/localstore.go
  • cmd/api/src/services/storage/localstore_test.go
  • cmd/api/src/services/storage/mocks/fs.go
  • cmd/api/src/services/storage/storage.go
  • cmd/api/src/services/storage/storage_test.go
  • cmd/api/src/services/upload/upload.go
  • cmd/api/src/services/upload/upload_test.go
  • go.mod

Comment on lines +51 to 52
if fileData, err := ExtractIngestFiles(s.ctx, s.cfg, task.StoredFileName, task.OriginalFileName, task.FileType, "bh"); err != nil {
return []IngestFileData{}, err
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve partial extraction results on helper errors.

ExtractIngestFiles can return successful fileData entries alongside a non-nil aggregated error. Returning an empty slice here drops those per-file statuses and leaves any already-extracted temp files unprocessed and unremoved. Keep the partial results visible to the caller, and either continue ingesting the successful entries or explicitly clean up their paths before returning.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/api/src/services/graphify/tasks.go` around lines 51 - 52, The current
early return in tasks.go drops any successful per-file results from
ExtractIngestFiles and leaves temp files behind; change the logic around the
call to ExtractIngestFiles so that when it returns (fileData, err) you do not
discard fileData: either return fileData along with the error (i.e., return
fileData, err) so the caller can handle successes, or, if you must fail-fast,
iterate the returned IngestFileData slice and perform explicit cleanup of their
temp paths before returning the error; update the calling code that expects
[]IngestFileData so it handles non-empty slices with non-nil errors.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not in this changeset. plus was existing functionality on a refactor PR. If desired I can add a change for this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mykeelium, that's a fair point — this behavior predates the refactor and the PR is intentionally not changing semantics. Would you like me to open a GitHub issue to track the follow-up improvement (returning partial fileData alongside errors and cleaning up temp files on failure)?


🧠 Learnings used
Learnt from: mvlipka
Repo: SpecterOps/BloodHound PR: 2382
File: cmd/api/src/api/v2/ad_related_entity.go:44-57
Timestamp: 2026-02-23T17:06:23.991Z
Learning: In BloodHound, AD nodes are represented by a base kind (ad.Entity) plus specific kinds (ad.User, ad.Computer, ad.Group, etc.), and Azure nodes by az.Entity. The query.KindIn function matches if a node has any of the specified kinds, so using the base kind (ad.Entity) will correctly include all AD nodes regardless of their specific subtype. During reviews, ensure code that filters or queries graph nodes via KindIn uses the base kind to capture all inherited subkinds for AD nodes, and similarly recognizes base kinds for Azure nodes. This design pattern should be consistently respected across files that model or query graph schema kinds.

Comment thread cmd/api/src/services/storage/storage.go
Comment thread cmd/api/src/services/storage/storage.go
Comment thread cmd/api/src/services/upload/upload.go
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api A pull request containing changes affecting the API code. enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants