
Conversation

@zhangzuo21

A unified storage backend interface has been implemented. It is used by the FileStorage class, which manages upper-level put and load activity. Currently, two types of storage backend inherit from the interface (a rough sketch of the resulting contract follows the list):

  • BucketStorageBackend: groups multiple key-value pairs into a single file (a "bucket").
  • FilePerKeyStorageBackend: the original storage backend, which stores each key-value pair in a separate file. An adaptor is implemented to adapt its interface to the unified storage backend interface.
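
The sketch below is illustrative only, with method names inferred from the highlights and review comments (BatchOffload, BatchLoad, IsExist, ScanMeta, IsEnableOffloading); the actual header in this PR may differ.

    // Hypothetical sketch of the unified backend contract (names inferred from the review).
    class StorageBackendInterface {
       public:
        virtual ~StorageBackendInterface() = default;

        // Persist a batch of key-value objects to the backend.
        virtual tl::expected<void, ErrorCode> BatchOffload(
            const std::unordered_map<std::string, std::string>& batched_objects) = 0;

        // Load previously offloaded objects into caller-provided slices.
        virtual tl::expected<void, ErrorCode> BatchLoad(
            const std::unordered_map<std::string, Slice>& batched_slices) = 0;

        // Check whether a key has been offloaded.
        virtual tl::expected<bool, ErrorCode> IsExist(const std::string& key) = 0;

        // Enumerate persisted metadata, e.g. to re-sync with the master after a restart.
        virtual tl::expected<void, ErrorCode> ScanMeta(
            const FileStorageConfig& cfg,
            const std::function<ErrorCode(
                const std::vector<std::string>& keys,
                const std::vector<StorageObjectMetadata>& metadatas)>& on_batch) = 0;

        // Whether this backend currently accepts offload requests.
        virtual bool IsEnableOffloading() = 0;
    };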

zhuxinjie-nz and others added 30 commits November 7, 2025 01:53
…le of key-value data, including creation, reading, and synchronizing metadata with the master.
# Conflicts:
#	mooncake-store/include/master_service.h
#	mooncake-store/include/storage_backend.h
#	mooncake-store/src/master_client.cpp
#	mooncake-store/src/master_service.cpp
#	mooncake-store/src/storage_backend.cpp
# Conflicts:
#	mooncake-store/include/real_client.h
#	mooncake-store/src/real_client.cpp
#	mooncake-store/src/real_client_main.cpp
@gemini-code-assist
Contributor

Summary of Changes

Hello @zhangzuo21, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the storage layer by introducing a unified interface for storage backends. This architectural change allows the FileStorage component to be more flexible and extensible, supporting both bucket-based and file-per-key storage strategies through a common API. The refactoring involves creating a new interface, adapting existing storage logic to it, and updating the main FileStorage class to leverage this new design, ultimately improving the system's maintainability and adaptability to different storage needs.

Highlights

  • Unified Storage Interface: A new StorageBackendInterface has been introduced, providing a common contract for different storage backend implementations. This promotes modularity and simplifies future extensions to the storage system.
  • Backend Implementations and Adaptor: The existing BucketStorageBackend now inherits from the new interface. Additionally, a StorageBackendAdaptor has been created to wrap the original file-per-key storage logic, allowing it to conform to the StorageBackendInterface.
  • FileStorage Refactoring: The FileStorage class has been updated to depend on the StorageBackendInterface, enabling it to dynamically select and utilize different storage backends (bucket-based or file-per-key) based on configuration.
  • Enhanced Configuration: The FileStorageConfig struct has been moved to storage_backend.h and extended with new fields like storage_backend_desciptor, fsdir, and enable_eviction to support the new unified architecture and specific backend requirements.
  • Updated Initialization and Offloading Logic: The FileStorage::Init and FileStorage::IsEnableOffloading methods now delegate their responsibilities to the StorageBackendInterface implementations, streamlining the core logic and removing the standalone BucketIterator class.
  • Comprehensive Testing: New test cases have been added for the StorageBackendAdaptor to ensure correct functionality for batch offload, batch load, existence checks, metadata scanning, and offloading enablement, including scenarios involving persistence across restarts.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ykwd ykwd requested a review from zhuxinjie-nz December 9, 2025 07:47
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new StorageBackendInterface to abstract different storage implementations, specifically adding a StorageBackendAdaptor for a file-per-key storage mechanism alongside the existing BucketStorageBackend. The FileStorageConfig struct was moved to storage_backend.h and extended with backend-specific parameters like storage_backend_desciptor, fsdir, and enable_eviction, allowing FileStorage to dynamically select and initialize the appropriate backend. The BucketIterator logic was refactored and integrated directly into BucketStorageBackend's ScanMeta, HandleNext, and HasNext methods.

Review comments primarily focused on improving code robustness and maintainability: ensuring FileStorageConfig::Validate() correctly propagates validation failures, making StorageBackendAdaptor::ScanMeta()'s directory traversal more flexible, adding missing override keywords to virtual function implementations in StorageBackendAdaptor and BucketStorageBackend, and addressing concerns about FileStorageConfig containing backend-specific fields. Additionally, comments noted that StorageBackendInterface's type_descriptor and config_ members should be private, StorageBackendAdaptor::ScanMeta's total_size calculation might be inconsistent with StorageObjectMetadata, and FileStorageConfig parameters in ScanMeta methods should be const references if not modified.

}
if(storage_backend_desciptor == "FilePerKeyBackend") {
auto full_path = std::filesystem::path(storage_filepath) / fsdir;
ValidatePath(full_path.string());
Contributor

high

The return value of ValidatePath(full_path.string()) is not checked here. If ValidatePath returns false, the Validate() function will still proceed and return true, which could lead to an invalid configuration being accepted. It should return false if ValidatePath fails.

        if(!ValidatePath(full_path.string())) {
            return false;
        }

Comment on lines 1063 to 1064
for (char d1 = 'a'; d1 <= 'p'; ++d1) {
for (char d2 = 'a'; d2 <= 'p'; ++d2) {
Contributor

high

The loops for (char d1 = 'a'; d1 <= 'p'; ++d1) and for (char d2 = 'a'; d2 <= 'p'; ++d2) hardcode the directory structure based on the ResolvePath logic. This creates a tight coupling between ScanMeta and ResolvePath. If the hashing scheme or character range in ResolvePath ever changes, this scanning logic will break. It would be more robust to derive the iteration range dynamically or use a more abstract way to traverse the directory structure if possible.
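
One possible shape for this (a sketch only; kHashDirFirst and kHashDirLast are hypothetical names not present in the PR) is to put the character range behind shared constants used by both ResolvePath and ScanMeta:

    // Hypothetical shared constants so ResolvePath and ScanMeta cannot drift apart.
    inline constexpr char kHashDirFirst = 'a';
    inline constexpr char kHashDirLast = 'p';

    for (char d1 = kHashDirFirst; d1 <= kHashDirLast; ++d1) {
        for (char d2 = kHashDirFirst; d2 <= kHashDirLast; ++d2) {
            // ... visit the sub-directory named {d1, d2} ...
        }
    }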

Comment on lines +488 to +489
tl::expected<void, ErrorCode> BatchLoad(
const std::unordered_map<std::string, Slice>& batched_slices);
Contributor

medium

The BatchLoad method in StorageBackendAdaptor is declared without override. Explicitly using override improves code clarity and helps prevent subtle bugs.

    tl::expected<void, ErrorCode> BatchLoad(
        const std::unordered_map<std::string, Slice>& batched_slices) override;

tl::expected<void, ErrorCode> BatchLoad(
const std::unordered_map<std::string, Slice>& batched_slices);

tl::expected<bool, ErrorCode> IsExist(const std::string& key);
Contributor

medium

The IsExist method in StorageBackendAdaptor is declared without override. It's recommended to use override for virtual function implementations.

    tl::expected<bool, ErrorCode> IsExist(const std::string& key) override;


// Limits for scanning and iteration operations, required by bucket backend only
int64_t bucket_iterator_keys_limit =
20000; // Max number of keys returned per Scan call, required by bucket backend only
Contributor

medium

This field is specific to the bucket backend. Consider if FileStorageConfig should contain only truly general parameters, with backend-specific settings nested or defined elsewhere.

Comment on lines 971 to 974
metadatas.emplace_back(StorageObjectMetadata{
-1, 0, static_cast<int64_t>(kv.key.size()),
static_cast<int64_t>(kv.value.size())
});
Contributor

medium

The bucket_id is set to -1 for StorageObjectMetadata. While this might be an acceptable placeholder for the StorageBackendAdaptor (which doesn't use buckets), it's a magic number. Consider defining a named constant (e.g., kNoBucketId) to improve readability and indicate its specific meaning.

        metadatas.emplace_back(StorageObjectMetadata{
            /*bucket_id=*/-1, 0, static_cast<int64_t>(kv.key.size()), 
            static_cast<int64_t>(kv.value.size())
        });
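
For instance (a sketch; kNoBucketId is the hypothetical constant suggested above):

    // Hypothetical sentinel for backends that do not use buckets.
    inline constexpr int64_t kNoBucketId = -1;

    metadatas.emplace_back(StorageObjectMetadata{
        kNoBucketId, 0, static_cast<int64_t>(kv.key.size()),
        static_cast<int64_t>(kv.value.size())});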

}

tl::expected<void, ErrorCode> StorageBackendAdaptor::ScanMeta(
FileStorageConfig& cfg,
Contributor

medium

The FileStorageConfig& cfg parameter is passed by non-const reference to ScanMeta. This implies that the function might modify the configuration object. However, scanning metadata is typically a read-only operation. If cfg is not intended to be modified, it should be passed as const FileStorageConfig& cfg to ensure immutability and improve safety.

tl::expected<void, ErrorCode> StorageBackendAdaptor::ScanMeta(
    const FileStorageConfig& cfg,

// Path where data files are stored on disk
std::string storage_filepath = "/data/file_storage";

// Subdirectory name, required by file per key backend only
Contributor

medium

Similar to the previous comment, this field is specific to the file-per-key backend. A truly unified configuration might abstract these details or provide clear separation.

struct_pb::from_pb(kv, buf);

total_keys++;
total_size += buf.size();
Contributor

medium

The total_size += buf.size(); line adds the size of the serialized KV object (buf) to total_size. However, the StorageObjectMetadata (created on line 1089) stores kv.value.size(). This discrepancy means total_size will reflect the serialized size (including key and protobuf overhead) rather than just the value size, which might not align with the intended metric for total_size if it's meant to represent only the data payload.
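
If total_size is meant to track only the data payload, a consistent accounting would be (a sketch; it relies on the kv object deserialized in the quoted code):

    total_keys++;
    // Count only the value payload so total_size matches the value-size
    // field recorded in StorageObjectMetadata.
    total_size += static_cast<int64_t>(kv.value.size());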

return enable_offloading;
}

tl::expected<void, ErrorCode> BucketStorageBackend::ScanMeta(FileStorageConfig& config_, const std::function<ErrorCode(const std::vector<std::string>& keys,
Contributor

medium

The FileStorageConfig& config_ parameter is passed by non-const reference to ScanMeta. Similar to the StorageBackendAdaptor's ScanMeta, if the configuration is not intended to be modified during the scan, it should be passed as const FileStorageConfig& config_ to enforce read-only access.

tl::expected<void, ErrorCode> BucketStorageBackend::ScanMeta(const FileStorageConfig& config_, const std::function<ErrorCode(const std::vector<std::string>& keys,

@ykwd
Collaborator

ykwd commented Dec 9, 2025

@maheshrbapatu Could you help take a look? Thanks

bool enable_eviction = true;

// Size of the local client-side buffer (used for caching or batching)
int64_t local_buffer_size = 1280 * 1024 * 1024; // ~1.2 GB
Contributor

Update the constants here if possible,
where kKB = 1024,
kMB = kKB * 1024,
kGB = kMB * 1024.

And all of these can be made constexpr.
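
A sketch of that suggestion (constant names as proposed above; where they live is up to the authors):

    // Size units as constexpr constants.
    constexpr int64_t kKB = 1024;
    constexpr int64_t kMB = 1024 * kKB;
    constexpr int64_t kGB = 1024 * kMB;

    // Size of the local client-side buffer (used for caching or batching)
    int64_t local_buffer_size = 1280 * kMB;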

enum class FileMode { Read, Write };

struct FileStorageConfig {
// type of the storage backend
Contributor

Correct me if I am wrong, just a suggestion:

Maybe we can break this big structure into child structures.

struct FilePerKeyConfig {
    std::string fsdir = "file_per_key_dir";
    bool enable_eviction = true;

    bool Validate() const {
        return !fsdir.empty();
    }
};

struct BucketBackendConfig {
    int64_t bucket_iterator_keys_limit = 20'000;    // tunable
    int64_t bucket_size_limit          = 256 * kMB; // tunable but bounded

    bool Validate() const {
        if (bucket_iterator_keys_limit <= 0) return false;
        if (bucket_size_limit <= 0 || bucket_size_limit > kMaxBucketSize) return false;
        return true;
    }
};


struct FileStorageConfig {
    // Type: "BucketBackend" or "FilePerKeyBackend"
    std::string storage_backend_descriptor = "BucketBackend";

    // Root storage path
    std::string storage_filepath = "/data/file_storage";

    // Optional backend-specific configs
    std::optional<FilePerKeyConfig> file_per_key;
    std::optional<BucketBackendConfig> bucket_backend;

    int64_t local_buffer_size = 1280 * kMB;  // ~1.2GB, tunable

    int64_t total_keys_limit  = kMaxTotalKeys;
    int64_t total_size_limit  = kMaxTotalStorageSize;

    uint32_t heartbeat_interval_seconds = 10;

    bool Validate();
    bool ValidatePath(const std::string& path);

    static FileStorageConfig FromEnvironment();

    static std::string GetEnvStringOr(const char* name,
                                      const std::string& default_value);

    template <typename T>
    static T GetEnvOr(const char* name, T default_value);
};


bool FileStorageConfig::Validate() const {
if (storage_filepath.empty()) {
bool FileStorageConfig::ValidatePath(std::string path) const {
Contributor

Also, how about providing a default implementation for this in the interface?
Do we expect different implementations of ValidatePath in the BucketStorageBackend and FilePerObject backends?
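
A sketch of what a shared default could look like (purely illustrative; the real interface may expose this check elsewhere):

    // Hypothetical default on the interface; a backend overrides it only if it
    // needs stricter checks than "path exists and is a directory".
    virtual bool ValidatePath(const std::string& path) const {
        std::error_code ec;
        auto status = std::filesystem::status(path, ec);
        return !ec && std::filesystem::is_directory(status);
    }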

}

bool FileStorageConfig::Validate() const {
if (storage_backend_descriptor != "FilePerKeyBackend" &&
Contributor

enum class StorageBackendType {
    kFilePerKey,
    kBucket
};

An enum might be a good idea here, and we can keep extending it as needed for new backends.
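
If the string descriptor is kept as the configuration input, a small hypothetical helper could map it onto the enum (sketch only; ParseBackendType is not part of the PR):

    // Hypothetical helper: map the config string onto the enum.
    inline std::optional<StorageBackendType> ParseBackendType(const std::string& descriptor) {
        if (descriptor == "FilePerKeyBackend") return StorageBackendType::kFilePerKey;
        if (descriptor == "BucketBackend") return StorageBackendType::kBucket;
        return std::nullopt;
    }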

if (!ValidatePath(storage_filepath)) {
return false;
}
if (storage_backend_descriptor == "FilePerKeyBackend") {
Contributor

Also, a switch-case might be better over here:

bool FileStorageConfig::Validate() const {
  if (!ValidatePath(storage_filepath)) {
      LOG(ERROR) << "Invalid storage_filepath: " << storage_filepath;
      return false;
  }

  if (total_keys_limit <= 0) {
      LOG(ERROR) << "total_keys_limit must be > 0";
      return false;
  }

  if (local_buffer_size <= 0) {
      LOG(ERROR) << "local_buffer_size must be > 0";
      return false;
  }

  switch (backend_type) {
      case StorageBackendType::kFilePerKey: {
          auto full_path =
              std::filesystem::path(storage_filepath) / fsdir;

          if (!ValidatePath(full_path.string())) {
              LOG(ERROR) << "Invalid FilePerKey full path: " << full_path;
              return false;
          }
          break;
      }

      case StorageBackendType::kBucket: {
          if (bucket_keys_limit <= 0) {
              LOG(ERROR) << "bucket_keys_limit must be > 0";
              return false;
          }

          if (bucket_size_limit <= 0) {
              LOG(ERROR) << "bucket_size_limit must be > 0";
              return false;
          }
          break;
      }

      default:
          LOG(ERROR) << "Unknown storage backend type";
          return false;
  }

  return true;
}

if (!config.Validate()) {
throw std::invalid_argument("Invalid FileStorage configuration");
}
if (config.storage_backend_descriptor == "BucketBackend") {
Contributor

Can we create a central factory pattern for creating this storage backend?

std::shared_ptr<IStorageBackend>
CreateStorageBackend(const FileStorageConfig& config) {
    switch (config.backend_type) {
        case StorageBackendType::kBucket:
            return std::make_shared<BucketStorageBackend>(config);

        case StorageBackendType::kFilePerKey:
            return std::make_shared<StorageBackendAdaptor>(config);

        default:
            LOG(FATAL) << "Unsupported backend type";
    }
}

And we can initialize it using the factory pattern rather than using strings.

FileStorage::FileStorage(std::shared_ptr<Client> client,
                         const std::string& local_rpc_addr,
                         const FileStorageConfig& config)
    : client_(client),
      local_rpc_addr_(local_rpc_addr),
      config_(config),
      client_buffer_allocator_(
          ClientBufferAllocator::create(config.local_buffer_size, "")) {

    if (!config_.Validate()) {
        throw std::invalid_argument("Invalid FileStorage configuration");
    }

    storage_backend_ = CreateStorageBackend(config_);
}


out.append(reinterpret_cast<const char*>(s.ptr), s.size);
}
return out;
}
Contributor

add an empty line here


int64_t total_size GUARDED_BY(mutex_);

struct KV {
Contributor

Can we add a constructor for this?

Contributor

Also KV seems like an ambiguous name for production code.

struct KVEntry {
    std::string key;      // K tensor or its storage identifier
    std::string value;    // V tensor or its storage block

    KVEntry() = default;

    KVEntry(std::string k, std::string v)
        : key(std::move(k)), value(std::move(v)) {}

    YLT_REFL(KVEntry, key, value);
};

metadatas.reserve(batch_object.size());
keys.reserve(batch_object.size());
MutexLocker lock(&mutex_);
for (auto object : batch_object) {
Contributor

There is currently an unnecessary copy of object here; iterate by const reference instead:
for (const auto& object : batch_object)


std::string kv_buf;
struct_pb::to_pb(kv, kv_buf);
auto store_result = storage_backend_->StoreObject(path, kv_buf);
Contributor

Do we really want to acquire and hold the mutex for the entire duration of the IO operation?
What is the exact critical section here?

If the critical section is only the counters (total_keys, total_size), we can do this; keeping the lock for the entire duration of the IO will cause scalability issues:

    {
        MutexLocker lock(&mutex_);
        total_keys++;
        total_size += kv_buf.size();
    }
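
A fuller sketch of the shape being suggested (illustrative only; StoreObject, path, kv, and the member names are taken from the quoted code above):

    // Serialize and perform the IO without holding the mutex.
    std::string kv_buf;
    struct_pb::to_pb(kv, kv_buf);
    auto store_result = storage_backend_->StoreObject(path, kv_buf);
    if (!store_result) {
        return tl::make_unexpected(store_result.error());
    }

    // Keep the critical section limited to the shared counters.
    {
        MutexLocker lock(&mutex_);
        total_keys++;
        total_size += kv_buf.size();
    }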
