# Folder Support for Storage Buckets

- Implementation Owner: @loks0n
- Start Date: 06-10-2025
- Target Date: TBD
- Appwrite Issue: TBD

## Summary

[summary]: #summary

Add folder support to Appwrite Storage for organizing files. Folders are lightweight organizational containers - they don't have independent permissions, don't support nesting initially, and exist purely to help users sort and filter files within buckets.

## Problem Statement (Step 1)

[problem-statement]: #problem-statement

**What problem are you trying to solve?**

Large buckets become unmanageable. Users with hundreds or thousands of files have no way to organize them. Every file appears in a flat list. Finding specific files requires searching by name or using complex filtering.

## Design proposal (Step 2)

[design-proposal]: #design-proposal

### Design decisions

1. **Folders are organizational** - No independent permissions, compression, or encryption settings
2. **Flat structure initially** - No nested folders (folders can't contain folders)
3. **Backward compatible** - Existing files stay at root level
4. **Minimal overhead** - Folders are lightweight documents, not filesystem constructs

### API Endpoints

**POST /v1/storage/buckets/:bucketId/folders**
Create a folder.
```
bucketId: string (required)
folderId: string (required) - Custom ID or ID.unique()
name: string (required) - Folder name (max 255 chars)
permissions: array (optional) - Inherits bucket permissions if null
```

**GET /v1/storage/buckets/:bucketId/folders**
List folders in a bucket.
```
bucketId: string (required)
queries: array (optional) - Standard query support
search: string (optional)
```

**GET /v1/storage/buckets/:bucketId/folders/:folderId**
Get folder by ID.

**PUT /v1/storage/buckets/:bucketId/folders/:folderId**
Update folder name/permissions.

**DELETE /v1/storage/buckets/:bucketId/folders/:folderId**
Delete folder. Fails if folder contains files unless `force=true`.

**Modified: POST /v1/storage/buckets/:bucketId/files**
Add `folderId` parameter (optional, defaults to null = root level).

**Member comment:** Should this really be a folder ID? Maybe a path for better DX? We can generate the hierarchy on creation or, simpler, throw 404 if the path doesn't exist and point the caller to create a folder.


**Modified: GET /v1/storage/buckets/:bucketId/files**
Add `folderId` query parameter to filter files by folder.
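
To make the surface concrete, here is a hypothetical PHP snippet against these endpoints. `createFolder()` and the `folderId` argument are proposed, not shipped; the client setup, `ID::unique()`, and `InputFile::fromPath()` mirror the existing Appwrite PHP SDK.

```php
<?php
// Hypothetical sketch: createFolder() and the folderId argument do not exist
// in the current SDK; the other calls follow the existing Appwrite PHP SDK.

use Appwrite\Client;
use Appwrite\ID;
use Appwrite\InputFile;
use Appwrite\Services\Storage;

$client = (new Client())
    ->setEndpoint('https://cloud.appwrite.io/v1')
    ->setProject('<PROJECT_ID>')
    ->setKey('<API_KEY>');

$storage = new Storage($client);

// POST /v1/storage/buckets/:bucketId/folders (proposed)
$folder = $storage->createFolder('invoices', ID::unique(), 'Invoices');

// POST /v1/storage/buckets/:bucketId/files with the proposed folderId parameter
$file = $storage->createFile(
    'invoices',
    ID::unique(),
    InputFile::fromPath(__DIR__ . '/invoice-001.pdf'),
    folderId: $folder['$id'] // proposed optional parameter; omit/null = root level
);
```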

### Data Structure

**New: Folders collection**
Stored in the same `bucket_{sequence}` collection as files, differentiated by a `type` field.

```php
// Folder document
[
    '$id' => 'unique_folder_id',
    'type' => 'folder', // discriminator field
    'bucketId' => 'bucket_id',
    'bucketInternalId' => 123,
    'name' => 'Invoices',
    '$permissions' => [...], // same as bucket unless overridden
    'search' => 'folder_id Invoices',
    'filesCount' => 0, // cached count
]

// Modified file document - add single field
[
    // ... all existing fields ...
    'folderId' => 'folder_id_or_null', // null = root level
]
```

⚠️ Potential issue | 🟠 Major

Cached filesCount field lacks consistency guarantees for force delete.

Line 82 introduces a denormalized `filesCount` field on folder documents. However, the delete logic on lines 120–125 does not mention updating `filesCount` when files are moved to root during a force delete. Additionally, there is no strategy described for keeping `filesCount` in sync:

- When files are uploaded to a folder, who increments the count?
- When files are deleted, who decrements it?
- What happens if a count becomes stale due to errors?

Recommendation: Clarify in the Implementation Details section how `filesCount` is maintained during all file operations (create, delete, move between folders). Consider whether denormalization is worth the complexity, or rely on a count query instead for Phase 1.
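
One possible maintenance strategy for the cached count, responding to the comment above, is a sketch only: it assumes atomic counter helpers along the lines of Utopia Database's `increaseDocumentAttribute`/`decreaseDocumentAttribute`, whose exact signatures would need verifying.

```php
<?php
// Sketch: keep filesCount in sync on the file write paths via atomic,
// server-side increments rather than read-modify-write. Helper names follow
// utopia-php/database; treat them as assumptions.

function onFileCreated($db, string $collection, ?string $folderId): void
{
    if ($folderId !== null) {
        $db->increaseDocumentAttribute($collection, $folderId, 'filesCount', 1);
    }
}

function onFileDeleted($db, string $collection, ?string $folderId): void
{
    if ($folderId !== null) {
        $db->decreaseDocumentAttribute($collection, $folderId, 'filesCount', 1, 0); // floor at 0
    }
}

function onFileMoved($db, string $collection, ?string $from, ?string $to): void
{
    onFileDeleted($db, $collection, $from); // decrement source folder
    onFileCreated($db, $collection, $to);   // increment destination folder
}
```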

**Required indexes:**
- `(bucketInternalId, type, name)` - unique constraint for folder names
- `(bucketInternalId, folderId)` - list files in folder
- `(bucketInternalId, type)` - list all folders
Comment on lines +93 to +95

⚠️ Potential issue | 🔴 Critical

Unique index will break existing file uploads

The proposed unique index (bucketInternalId, type, name) applies to all documents in the bucket collection. Once the migration sets type = 'file' on every file, this index will forbid two files in the same bucket sharing the same name—something Appwrite currently allows. That’s a backward-incompatible change that will cause uploads (and even the migration itself) to fail in any bucket that already contains duplicate filenames. Please limit the uniqueness to folders only (e.g., partial index type = 'folder') or pick another discriminator so files remain unaffected.



Comment on lines +93 to +96

⚠️ Potential issue | 🔴 Critical

Critical index design will break migrations on buckets with duplicate filenames.

The unique index (bucketInternalId, type, name) on lines 93-95 combined with the migration logic on lines 97-101 creates an unsolvable constraint conflict:

- Existing files receive type = 'file' during migration
- The unique constraint on (bucketInternalId, type, name) forbids two files with identical names in the same bucket
- Appwrite currently permits duplicate file names (by design)
- Result: migration fails on any production bucket with duplicate filenames, and future uploads are blocked

Fix: implement a partial/filtered unique index applying only to folders (type = 'folder'), or separate folder names into a distinct namespace. Update the migration to avoid setting type on existing files, applying it only to new folder documents.

This concern was flagged previously and remains unaddressed.
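
For reference, a raw-SQL sketch of the suggested fix (wrapped in PHP to show intent; table and column names are illustrative, and Appwrite would express this through its database layer rather than PDO):

```php
<?php
// Illustrative only. $pdo is assumed to be a connected PDO handle.

// PostgreSQL: a partial unique index that constrains folder names only.
$pdo->exec(<<<'SQL'
    CREATE UNIQUE INDEX unique_folder_name
    ON bucket_123 (bucketInternalId, name)
    WHERE type = 'folder';
SQL);

// MariaDB/MySQL (no partial indexes): a stored generated column that is NULL
// for non-folders; unique indexes ignore NULLs, so files stay unconstrained.
$pdo->exec(<<<'SQL'
    ALTER TABLE bucket_123
      ADD COLUMN folderName VARCHAR(255)
        GENERATED ALWAYS AS (IF(type = 'folder', name, NULL)) STORED,
      ADD UNIQUE KEY unique_folder_name (bucketInternalId, folderName);
SQL);
```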


**Migration:**
1. Add `folderId` field to all existing file documents (default null)
2. Add `type` field to all existing file documents (default 'file')
3. Create indexes
4. Deploy

**Member comment:** We should get the DB team (specifically Shmuel) to help with this; we have a few migrations upcoming, so this could save time.

Comment on lines +97 to +101

⚠️ Potential issue | 🟠 Major

Migration strategy lacks validation and rollback plan.

The migration on lines 97–101 is too brief:

- No validation that all files will have folderId = null after migration
- No check for existing duplicate folder names (the unique index depends on this)
- No rollback procedure if the migration fails mid-deployment
- No data consistency checks post-migration

Recommendation: Expand the migration plan to include:

  1. Pre-migration validation (check for duplicate folder names, if any exist pre-migration)
  2. Atomic application of folderId and type fields
  3. Index creation with constraint validation
  4. Post-migration verification (sample check that all files have folderId and type)
  5. Rollback steps if indexes fail to create


### Implementation Details

**Creating a folder:**
1. Validate bucket exists and user has CREATE permission
2. Check for duplicate folder name in bucket (unique constraint)
3. Generate ID if `ID.unique()` passed
4. Inherit bucket permissions if none specified
5. Create folder document with type discriminator
6. Return folder
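
A rough controller-level sketch of this flow. Assumptions: a Utopia-style `$db` handle, `getSequence()` for the bucket's internal ID, and simplified validation and error handling; none of this is final API.

```php
<?php
// Rough sketch of the create-folder flow above; helper names are assumptions.

use Utopia\Database\Document;
use Utopia\Database\Helpers\ID;

function createFolder($db, Document $bucket, string $folderId, string $name, ?array $permissions): Document
{
    // Step 3: resolve ID.unique() into a generated ID
    $folderId = $folderId === 'unique()' ? ID::unique() : $folderId;

    // Step 4: inherit bucket permissions when none were specified
    $permissions ??= $bucket->getPermissions();

    // Steps 2 and 5: the unique folder-name index rejects duplicates, so
    // createDocument is expected to throw a duplicate exception in that case.
    return $db->createDocument('bucket_' . $bucket->getSequence(), new Document([
        '$id' => $folderId,
        '$permissions' => $permissions,
        'type' => 'folder', // discriminator
        'bucketId' => $bucket->getId(),
        'bucketInternalId' => $bucket->getSequence(),
        'name' => $name,
        'search' => implode(' ', [$folderId, $name]),
        'filesCount' => 0,
    ]));
}
```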

**Listing files with folder filter:**
1. If `folderId` param present:
   - `folderId='root'` → query where `folderId IS NULL`
   - Otherwise → verify folder exists, query where `folderId = X`
2. Apply standard queries/pagination
3. Return filtered files
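
A sketch of the `folderId` mapping. `Query::isNull` and `Query::equal` exist in utopia-php/database, though the wiring here is illustrative:

```php
<?php
// Map the folderId parameter onto list-files queries.

use Utopia\Database\Query;

function folderFilterQueries(?string $folderId): array
{
    if ($folderId === null) {
        return []; // no filter: list every file in the bucket
    }
    if ($folderId === 'root') {
        return [Query::isNull('folderId')]; // root-level files only
    }
    // Caller must first verify the folder exists (404 otherwise).
    return [Query::equal('folderId', [$folderId])];
}
```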

**Deleting folder:**
1. Check folder exists and user has DELETE permission
2. Count files in folder
3. If files exist and force=false → error FOLDER_NOT_EMPTY
4. If force=true → set all child files' folderId to null (move to root)
5. Delete folder document
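
A sketch of the delete flow; batching, transactions, event emission, and the `filesCount` reset flagged in the review above are omitted, and helper names follow utopia-php/database as an assumption.

```php
<?php
// Sketch of the delete-folder flow above; not a final implementation.

use Utopia\Database\Query;

function deleteFolder($db, string $collection, string $folderId, bool $force): void
{
    // Steps 2-3: refuse to delete a non-empty folder unless forced
    $files = $db->find($collection, [
        Query::equal('folderId', [$folderId]),
        Query::limit(100), // a real implementation must page through all files
    ]);

    if (!empty($files) && !$force) {
        throw new \Exception('FOLDER_NOT_EMPTY');
    }

    // Step 4: move child files to root level
    foreach ($files as $file) {
        $db->updateDocument($collection, $file->getId(), $file->setAttribute('folderId', null));
    }

    // Step 5: remove the folder document itself
    $db->deleteDocument($collection, $folderId);
}
```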

### Supporting Libraries

No new libraries required. Uses existing:
- Utopia Database for folder documents
- Existing validation/authorization stack
- Current storage device layer (untouched)

### Breaking Changes

**None.** Fully backward compatible:
- Migration ensures existing files keep `folderId = null` (root level)
- New optional parameters don't affect existing API calls

### Reliability (Tests & Benchmarks)

#### Benchmarks

Measure:
- Folder creation latency
- File listing with/without folder filter
- Moving files between folders
- Deleting folder with 1000+ files

#### Tests (UI, Unit, E2E)

**Unit tests:**
- Create folder with/without custom ID
- Duplicate folder name prevention
- List folders with queries/search
- Delete empty folder
- Delete folder with force flag
- File upload to folder
- List files filtered by folder
- Move file between folders

**E2E tests:**
- Create bucket → create folder → upload file → list by folder
- Delete folder with files (should fail)
- Delete folder with force (files move to root)
- Permission inheritance from bucket
- Search across folders

**Console UI:**
- Folder view in bucket files list
- Create folder button
- Drag-drop files into folders
- Breadcrumb navigation
- Folder delete confirmation

### Documentation & Content

**Docs needed:**
1. API reference for new folder endpoints
2. SDK examples for folder operations

### Prior art

[prior-art]: #prior-art

**AWS S3:** Uses prefixes, not real folders. Files with `/` in name create virtual hierarchy. Works but confusing - users expect folders to be entities they can rename/list.

**Google Drive API:** Folders are files with special MIME type. Can nest infinitely. Complex permission inheritance. Over-engineered for basic use cases.

**Dropbox API:** Clear folder vs file distinction. Simple listing. Nesting supported but not required. Our pattern is similar.

**Firebase Storage:** Prefix-based like S3. No folder metadata. Awkward for apps that need folder operations.

### Unresolved questions

[unresolved-questions]: #unresolved-questions

1. **Folder permissions:** Initially inherit from bucket. Future: independent folder permissions?
2. **Nested folders:** Defer or include v1? Adds complexity (validation, infinite loops, path resolution).
3. **Moving folders:** If we add nesting later, do we support moving folders between folders?
4. **Folder metadata:** Do folders need custom metadata like files? Size limits?

### Future possibilities

[future-possibilities]: #future-possibilities

**Phase 2 - Nested folders:**
- Add `parentFolderId` to folder documents
- Validate against circular references
- Path reconstruction for breadcrumbs
- Max depth limit (e.g., 10 levels)
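
If nesting lands, the circular-reference validation could be as simple as walking the ancestor chain with a depth cap. A sketch, where `fetchParentId` is a hypothetical lookup returning a folder's `parentFolderId` (or null at root level):

```php
<?php
// Sketch: reject a parent assignment that would create a cycle or exceed the
// depth cap; fetchParentId() is a hypothetical callable.

function wouldCreateCycle(callable $fetchParentId, string $folderId, string $newParentId, int $maxDepth = 10): bool
{
    $current = $newParentId;
    for ($depth = 0; $current !== null; $depth++) {
        if ($depth >= $maxDepth || $current === $folderId) {
            return true; // too deep, or walked back to the folder being moved
        }
        $current = $fetchParentId($current);
    }
    return false;
}
```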

**Phase 3 - Advanced features:**
- Folder-level permissions (override bucket)
- Bulk operations (move all files in folder)
- Folder templates (create with predefined structure)
- Shared folders (special permission model)

**Member comment:** In theory this would just be a feature of the permission model, no? Just adding a team ID or a role in permissions.

- Folder hooks/events for automation

**Member comment:** Would this be similar to our current events? Can't we already expose them in phase 1?


**Phase 4 - Performance:**
- Materialized paths for fast hierarchy queries
- Denormalized file counts (already included)
- Folder statistics (total size, last modified)

**Member comment:** This would actually be nice in step 1; it should be simple to implement and will save a data migration later.