feat: Enhance filecache by Yundi339 · Pull Request #823 · ithewei/libhv

Yundi339 · 2026-03-29T09:13:27Z

代码方面，我提供了技术方向讨论和指导，实现由Claude辅助完成，PR仅供参考，若不合适可随时close。

背景

我加入了 FileCacheEx 作为 FileCache 的增强版。
最初发现的问题，是 HTTP 响应头超过 1KB（比如多个 Set-Cookie、Content-Security-Policy、CORS 头等）时， FileCache 会静默截断，于是我加入了 FileCacheEx 用来正确处理和错误报错提示。
虽然可以修改源码来增加 HEADER 长度，但是需要重新编译 libhv，不方便自动化引入。

FileCacheEx vs FileCache 对比

差异	FileCache	FileCacheEx
header 预留空间	硬编码 1KB (`HTTP_HEADER_MAX_LENGTH 1024`)	可配置，默认 4KB
`prepend_header()`	无返回值，溢出时静默丢弃	返回 `bool`，溢出返回 `false`
`is_modified()`	调用 `stat()` 与 POSIX `stat` 函数名冲突	使用 `::stat()` 修复命名冲突
最大缓存文件	硬编码 4MB	运行时可配 `SetMaxFileSize()`
最大缓存数量	硬编码 100	运行时可配（构造函数参数）
header 指标	无	暴露 `get_header_used()` / `get_header_remaining()`
线程安全	无 mutex	每个 entry 有 `std::mutex`，`prepend_header()` 和指标读取加锁
`resize_buf()`	无法调整 header 预留	支持自定义 `reserved` 参数

解决可能存在的问题

1：`stat()` 函数名冲突

// file_cache_s::is_modified()
bool is_modified() {
    time_t mtime = st.st_mtime;
    stat(filepath.c_str(), &st);   // ← 成员变量 st 遮蔽了 POSIX stat()
    return mtime != st.st_mtime;
}

struct stat st 是成员变量，而 stat() 是 POSIX 函数。在某些编译器下非限定调用 stat() 可能会解析到成员自身，可能导致编译错误或非预期行为。

处理方式：使用 ::stat() 显式指定全局作用域。

2：`read()` 短读未处理

// FileCache::Open()
int nread = read(fd, fc->filebuf.base, fc->filebuf.len);
if (nread != fc->filebuf.len) {   // ← 只调一次 read，可能短读

POSIX read() 不保证一次读完所有数据（信号中断 EINTR 等都可能导致短读）。大多数情况下单次调用没问题，但特定条件下可能返回不完整数据。

处理方式：循环读取直到全部读完：

while (remaining > 0) {
    int nread = read(fd, dst, remaining);
    if (nread <= 0) { /* 错误处理 */ }
    dst += nread;
    remaining -= nread;
}

3：缓存条目在初始化完成前就加入 LRU

// FileCache::Open()
fc = std::make_shared<file_cache_t>();
put(filepath, fc);   // ← 立即放入缓存
// ... 后面才读文件内容、设置 ETag 等

在多线程场景下，如果另一个线程在 put() 之后、文件读取完成之前调用 Get()，可能会取到一个内容为空的缓存条目。

处理方式：延迟 put() 到条目完全初始化之后（文件读取 + ETag + Content-Type 全部设好再入缓存）。

4：`resize_buf()` 后 `httpbuf` 可能悬空

// file_cache_s::resize_buf()
void resize_buf(int filesize) {
    buf.resize(HTTP_HEADER_MAX_LENGTH + filesize);
    filebuf.base = buf.base + HTTP_HEADER_MAX_LENGTH;
    filebuf.len = filesize;
    // httpbuf.base 仍然指向旧的（已释放的）内存！
}

HBuf::resize() 可能重新分配内存，如果 httpbuf.base 之前被设置过，可能变成悬空指针。

处理方式：resize_buf() 之后显式置空 httpbuf：

httpbuf.base = NULL;
httpbuf.len = 0;
header_used = 0;

5：`prepend_header()` 静默失败

void prepend_header(const char* header, int len) {
    if (len > HTTP_HEADER_MAX_LENGTH) return;  // ← 返回 void，调用方不知道失败了

header 超过预留空间时静默丢弃，调用方无法感知。在响应头较多的场景下（多个 Set-Cookie、CSP、CORS 等），可能导致响应发送不完整。

处理方式：返回 bool，失败返回 false，调用方可以回退到非缓冲发送。

6：所有参数编译期固定，无法运行时调整

#define HTTP_HEADER_MAX_LENGTH  1024       // 生产环境常常不够用
#define FILE_CACHE_MAX_SIZE     (1 << 22)  // 4MB，不可调

1024 字节的 HTTP 头预留空间，在添加 CSP、CORS、缓存控制、Cookie 等响应头后可能不够用
最大缓存文件大小无法根据服务器内存状况调整

处理方式：运行时可配置：

filecache.SetMaxHeaderLength(8192);   // 8K 头部预留
filecache.SetMaxFileSize(1 << 24);    // 16MB 最大缓存文件

默认头部预留从 1024 提升到 4096 字节。

…length - New FileCacheEx module alongside original FileCache (non-invasive) - Configurable max_header_length (was hardcoded HTTP_HEADER_MAX_LENGTH) - Configurable max_file_size, max_cache_num at runtime - prepend_header() returns bool to report success/failure - Expose header metrics: get_header_reserve(), get_header_used(), header_fits() - Fix stat() name collision in is_modified() using ::stat() - Add FileCacheEx.h to exported headers in cmake/vars.cmake

- Add per-entry mutex to file_cache_ex_s for thread-safe mutations - Hold fc->mutex during is_modified() / stat check to prevent data race - Hold fc->mutex during resize_buf() / read to prevent use-after-free - Make prepend_header() thread-safe with lock_guard - Defer put() into LRU cache until entry is fully initialized (prevents stale/incomplete cache entries on ERR_OVER_LIMIT) - Use read() loop to handle partial reads (EINTR) - Invalidate httpbuf pointers after resize_buf() reallocation - Thread-safe get_header_used() / get_header_remaining() accessors

- Revert _read/_close to plain read/close (POSIX compat via <io.h>) - Keep is_modified() using ::stat() (only called on non-Windows path) - Consistent with original FileCache.cpp Windows handling patterns

- Add docs/FileCacheEx.md (English API documentation) - Add docs/cn/FileCacheEx.md (Chinese API documentation) - Fix include comments to follow libhv convention (add 'import' keyword)

Prevent integer overflow when reserved < 0 is passed to resize_buf(), which would cause (size_t)reserved to become a very large value.

…erver

Copilot

Pull request overview

This PR introduces FileCacheEx as an enhanced replacement for the HTTP server’s existing FileCache, aiming to avoid silent HTTP header truncation, improve thread-safety, and make cache limits configurable at runtime.

Changes:

Replace HTTP server usage of FileCache with the new FileCacheEx implementation.
Add FileCacheEx module with configurable header reserve/max file size, safer buffer resizing, deferred cache insertion, and partial-read handling.
Add English/Chinese documentation and install the new header via CMake.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
http/server/HttpServer.cpp	Switch server privdata cache type to `FileCacheEx` and adjust timer callback casts.
http/server/HttpHandler.h	Replace `FileCache` pointers/types with `FileCacheEx` equivalents.
http/server/HttpHandler.cpp	Use `FileCacheEx::OpenParam`; adjust max-file-size eviction logic to runtime getter.
http/server/FileCacheEx.h	New enhanced cache entry + cache class API (mutex per entry, header reserve/metrics).
http/server/FileCacheEx.cpp	New cache implementation (deferred `put`, read loop, configurable reserve/limits).
docs/cn/FileCacheEx.md	Add Chinese documentation for `FileCacheEx`.
docs/FileCacheEx.md	Add English documentation for `FileCacheEx`.
cmake/vars.cmake	Add `FileCacheEx.h` to installed HTTP server headers list.

Comments suppressed due to low confidence (1)

http/server/HttpHandler.cpp:806

fc->prepend_header(...) now returns bool, but the return value is ignored. If the generated response header exceeds the reserved space, prepend_header returns false and leaves httpbuf unchanged (potentially stale/NULL), so this code can send an incorrect header or invalid buffer. Please handle the failure path (e.g., fall back to sending header and file content separately, or rebuild a contiguous buffer) instead of unconditionally using fc->httpbuf.

                header = pResp->Dump(true, false);
                fc->prepend_header(header.c_str(), header.size());
                *data = fc->httpbuf.base;
                *len = fc->httpbuf.len;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-29T09:17:49Z

+                // Loop to handle partial reads (EINTR, etc.)
+                char* dst = fc->filebuf.base;
+                size_t remaining = fc->filebuf.len;
+                while (remaining > 0) {
+                    int nread = read(fd, dst, remaining);
+                    if (nread <= 0) {
+                        hloge("Failed to read file: %s", filepath);
+                        param->error = ERR_READ_FILE;
+                        return NULL;
+                    }


The file read loop uses int nread = read(...) and treats any nread <= 0 as fatal. For POSIX read, the return type is ssize_t, and -1 with errno == EINTR should be retried rather than failing the request. Please switch to ssize_t and explicitly handle EINTR (and any other retryable errors you want) inside the loop.

Copilot · 2026-03-29T09:17:50Z

+        fd = open(filepath, flags);
+#endif
+        if (fd < 0) {
+            param->error = ERR_OPEN_FILE;
+            return NULL;
+        }
+        defer(if (fd > 0) { close(fd); })
+        if (fc == NULL) {


defer(if (fd > 0) { close(fd); }) can leak a valid file descriptor if open() returns 0 (possible if stdin is closed). Since this is new code, it would be better to close any fd >= 0 that actually represents an opened file, and avoid using 0 as a directory sentinel (or use a separate boolean) to keep the close logic correct.

Copilot · 2026-03-29T09:17:50Z

+void FileCacheEx::RemoveExpiredFileCache() {
+    time_t now = time(NULL);
+    remove_if([this, now](const std::string& filepath, const file_cache_ex_ptr& fc) {
+        return (now - fc->stat_time > expired_time);


RemoveExpiredFileCache reads fc->stat_time without taking the per-entry mutex. Since Open() updates stat_time under fc->mutex, this is a data race and also contradicts the thread-safety guarantees described in the docs. Please lock fc->mutex (or make stat_time atomic) when evaluating expiry in the remove_if predicate.

Suggested change

return (now - fc->stat_time > expired_time);

std::lock_guard<std::mutex> lock(fc->mutex);

time_t stat_time = fc->stat_time;

return (now - stat_time > expired_time);

Copilot · 2026-03-29T09:17:50Z

+    int  get_header_reserve()  const { return header_reserve; }
+    int  get_header_used()     const { std::lock_guard<std::mutex> lock(mutex); return header_used; }
+    int  get_header_remaining() const { std::lock_guard<std::mutex> lock(mutex); return header_reserve - header_used; }
+    bool header_fits(int len)  const { return len > 0 && len <= header_reserve; }


These are labeled as "thread-safe accessors", but get_header_reserve() and header_fits() read header_reserve without locking, even though resize_buf() can modify header_reserve under the mutex. This introduces a data race and makes the thread-safety claim inaccurate. Please either take the mutex in these accessors as well, or document them as requiring the caller to hold mutex.

Suggested change

int get_header_reserve() const { return header_reserve; }

int get_header_used() const { std::lock_guard<std::mutex> lock(mutex); return header_used; }

int get_header_remaining() const { std::lock_guard<std::mutex> lock(mutex); return header_reserve - header_used; }

bool header_fits(int len) const { return len > 0 && len <= header_reserve; }

int get_header_reserve() const { std::lock_guard<std::mutex> lock(mutex); return header_reserve; }

int get_header_used() const { std::lock_guard<std::mutex> lock(mutex); return header_used; }

int get_header_remaining() const { std::lock_guard<std::mutex> lock(mutex); return header_reserve - header_used; }

bool header_fits(int len) const { std::lock_guard<std::mutex> lock(mutex); return len > 0 && len <= header_reserve; }

Copilot · 2026-03-29T09:17:50Z

+ *   1. Configurable max_header_length (no more hardcoded 4096)
+ *   2. prepend_header() returns bool to report success/failure
+ *   3. Exposes header/buffer metrics via accessors
+ *   4. Fixes stat() name collision in is_modified()
+ *   5. max_cache_num / max_file_size configurable at runtime
+ *   6. Reserved header space can be tuned per-instance
+ *   7. Fully backward-compatible struct layout


The header comment says "Configurable max_header_length (no more hardcoded 4096)" and "Fully backward-compatible struct layout", but FILECACHE_EX_DEFAULT_HEADER_LENGTH is still 4096 by default and the struct layout is not compatible with file_cache_t (added mutex + new fields). Please clarify this comment to avoid misleading readers about ABI/layout compatibility and defaults.

Suggested change

* 1. Configurable max_header_length (no more hardcoded 4096)

* 2. prepend_header() returns bool to report success/failure

* 3. Exposes header/buffer metrics via accessors

* 4. Fixes stat() name collision in is_modified()

* 5. max_cache_num / max_file_size configurable at runtime

* 6. Reserved header space can be tuned per-instance

* 7. Fully backward-compatible struct layout

* 1. Configurable max_header_length (default 4096, tunable per-instance)

* 2. prepend_header() returns bool to report success/failure

* 3. Exposes header/buffer metrics via accessors

* 4. Fixes stat() name collision in is_modified()

* 5. max_cache_num / max_file_size configurable at runtime

* 6. Reserved header space can be tuned per-instance

* 7. Source-level API compatible; struct layout differs from file_cache_t (no ABI/layout compatibility)

Copilot · 2026-03-29T09:17:51Z

 set(HTTP_SERVER_HEADERS
    http/server/HttpServer.h
    http/server/HttpService.h
    http/server/HttpContext.h
    http/server/HttpResponseWriter.h
    http/server/WebSocketServer.h
+    http/server/FileCacheEx.h
 )


FileCacheEx.h is added to the CMake installed header list, but the repo also has a Makefile-based install flow (Makefile.vars) with its own HTTP_SERVER_HEADERS list. To keep both build systems consistent, please add http/server/FileCacheEx.h to Makefile.vars as well (otherwise make install won’t install the new public header).

ithewei · 2026-03-30T10:17:51Z

直接删除原来的FileCache，然后重命名FileCacheEx为FileCache吧，没必要还保留原来的实现了

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

http/server/FileCache.cpp:76

In the slow path you stat() into local st, but for already-cached entries you never assign fc->st = st under fc->mutex before later using fc->st (size/type) when param->need_read is true. If the file changes within stat_interval (so modified stays false), a follow-up GET can re-read using a stale fc->st.st_size and mis-handle reads/ETag. Consider updating fc->st from the freshly obtained st whenever you take this path (especially when re-reading content).

    if (fc == NULL || modified || param->need_read) {
        struct stat st;
        int flags = O_RDONLY;
#ifdef O_BINARY
        flags |= O_BINARY;
#endif
        int fd = -1;
#ifdef OS_WIN
        if (wfilepath.empty()) wfilepath = hv::utf8_to_wchar(filepath);
        if (_wstat(wfilepath.c_str(), (struct _stat*)&st) != 0) {
            param->error = ERR_OPEN_FILE;
            return NULL;
        }
        if (S_ISREG(st.st_mode)) {
            fd = _wopen(wfilepath.c_str(), flags);
        } else if (S_ISDIR(st.st_mode)) {
            fd = 0;
        }
#else
        if (::stat(filepath, &st) != 0) {
            param->error = ERR_OPEN_FILE;
            return NULL;
        }
        fd = open(filepath, flags);
#endif

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-05T16:20:47Z

+    bool prepend_header(const char* header, int len) {
+        std::lock_guard<std::mutex> lock(mutex);
+        if (len <= 0 || len > header_reserve) return false;
        httpbuf.base = filebuf.base - len;
-        httpbuf.len = len + filebuf.len;
+        httpbuf.len = (size_t)len + filebuf.len;


file_cache_t::prepend_header now returns false when the header doesn’t fit, but it leaves httpbuf unchanged (and resize_buf() explicitly invalidates httpbuf). Callers that still read fc->httpbuf unconditionally can end up sending an empty/invalid response. Consider either (1) always setting httpbuf to a safe fallback (e.g., point at filebuf / clear and document) on failure, and/or (2) requiring/updating callers to check the return value and fall back to non-cached header+body sending.

Copilot · 2026-04-05T16:20:47Z

+    void SetMaxHeaderLength(int len)    { max_header_length = len; }
+    void SetMaxFileSize(int size)       { max_file_size = size; }


SetMaxHeaderLength / SetMaxFileSize accept negative values, which can lead to surprising behavior (e.g., reserved header space clamped to 0 at resize time, or max_file_size < 0). It would be safer to validate/clamp in the setters (e.g., minimum 0/1) so misconfiguration fails fast and predictably.

Suggested change

void SetMaxHeaderLength(int len) { max_header_length = len; }

void SetMaxFileSize(int size) { max_file_size = size; }

void SetMaxHeaderLength(int len) { max_header_length = len < 0 ? 0 : len; }

void SetMaxFileSize(int size) { max_file_size = size < 1 ? 1 : size; }

Copilot · 2026-04-05T16:20:47Z

        gmtime_fmt(fc->st.st_mtime, fc->last_modified);
-        snprintf(fc->etag, sizeof(fc->etag), ETAG_FMT, (size_t)fc->st.st_mtime, (size_t)fc->st.st_size);
+        snprintf(fc->etag, sizeof(fc->etag), ETAG_FMT,
+                 (size_t)fc->st.st_mtime, (size_t)fc->st.st_size);
+        // Cache the fully initialized entry
+        put(filepath, fc);


Potential deadlock: Open() holds fc->mutex for the remainder of initialization and then calls put(filepath, fc), which acquires the LRUCache internal mutex. Meanwhile RemoveExpiredFileCache() acquires the LRUCache mutex (via remove_if) and then locks fc->mutex. This lock-order inversion can deadlock under concurrent traffic. Avoid calling put/remove/contains/remove_if while holding fc->mutex, or change the expiration logic to avoid locking fc->mutex while LRUCache is locked (e.g., snapshot fields needed for eviction without taking fc->mutex, or collect keys to remove and erase them after releasing the cache lock).

Copilot · 2026-04-05T16:20:48Z

 void FileCache::RemoveExpiredFileCache() {
    time_t now = time(NULL);
    remove_if([this, now](const std::string& filepath, const file_cache_ptr& fc) {
+        std::lock_guard<std::mutex> lock(fc->mutex);


remove_if runs the predicate while holding the LRUCache internal mutex; locking fc->mutex inside the predicate introduces a lock-order inversion with Open() (which locks fc->mutex and later calls put()). This can deadlock. Consider avoiding per-entry locking inside remove_if predicates (e.g., store stat_time as an atomic, or do a two-phase approach: gather expired keys under cache lock without locking entries, then remove them).

Suggested change

std::lock_guard<std::mutex> lock(fc->mutex);

std::unique_lock<std::mutex> lock(fc->mutex, std::try_to_lock);

if (!lock.owns_lock()) {

return false;

}

Copilot · 2026-04-05T16:20:48Z

+#ifdef OS_WIN
+        defer(if (fd > 0) { close(fd); })  // fd=0 is Windows directory sentinel
+#else


On Windows, using fd=0 as a directory sentinel combined with defer(if (fd > 0) close(fd);) can leak a real file descriptor if _wopen() happens to return 0 (possible if stdin is closed). Consider using a non-valid sentinel (e.g., -1/-2) for directories or tracking is_dir separately, and close any real fd with fd >= 0 when appropriate.

Copilot · 2026-04-05T16:20:48Z

+            // NOTE: remove file cache if > max_file_size
+            if (fc && fc->filebuf.len > files->GetMaxFileSize()) {
                files->Close(fc->filepath.c_str());
            }


files->GetMaxFileSize() is used here to decide whether to evict the cached entry, but FileCache::max_file_size is never wired up to HttpService::max_file_cache_size (which controls OpenParam.max_read). If max_file_cache_size is increased, entries may still be evicted immediately due to the default 4MB FileCache::max_file_size. Consider initializing FileCache::max_file_size from service->max_file_cache_size when the server starts (same place stat_interval/expired_time are configured) so caching behavior is consistent.

Suggested change

// NOTE: remove file cache if > max_file_size

if (fc && fc->filebuf.len > files->GetMaxFileSize()) {

files->Close(fc->filepath.c_str());

}

// Avoid immediately evicting the just-served cached file based on a

// potentially stale FileCache max_file_size setting. Cache size policy

// should be enforced where the cache is configured/populated so it stays

// consistent with the service-level max_file_cache_size.

…add max_file_size setter

…r handling

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-05T17:05:43Z

                // FileCache
                // NOTE: no copy filebuf, more efficient
                header = pResp->Dump(true, false);
-                fc->prepend_header(header.c_str(), header.size());
-                *data = fc->httpbuf.base;
-                *len = fc->httpbuf.len;
-                state = SEND_DONE;
-                return *len;
+                if (fc->prepend_header(header.c_str(), header.size())) {
+                    *data = fc->httpbuf.base;
+                    *len = fc->httpbuf.len;
+                    state = SEND_DONE;
+                    return *len;
+                }


fc->prepend_header() writes the response header into the shared cache buffer (fc->buf) and then GetSendData() returns a pointer into that shared buffer. Since FileCache is shared across worker threads and multiple requests can serve the same file_cache_t concurrently, another thread can call prepend_header() and overwrite the header region while the current thread is still inside hio_write() copying/issuing the write, leading to a data race and potentially corrupted responses. To make this safe, avoid mutating shared cache memory for per-request headers (e.g., send header separately and then send fc->filebuf), or refactor so the entry mutex is held across the entire write/copy window (would require changing the GetSendData/SendHttpResponse interface).

Copilot · 2026-04-05T17:05:44Z

+        std::lock_guard<std::mutex> lock(mutex);
+        if (len <= 0 || len > header_reserve) {
+            // Safe fallback: point httpbuf at filebuf so callers always get valid data
+            httpbuf = filebuf;


On prepend_header() failure (len <= 0 || len > header_reserve), header_used is not reset. If a previous call succeeded, metrics returned by get_header_used() / get_header_remaining() will be stale/misleading after a failed prepend. Consider setting header_used = 0 in the failure path as well.

Suggested change

httpbuf = filebuf;

httpbuf = filebuf;

header_used = 0;

Copilot · 2026-04-05T17:05:44Z

+    // --- configurable parameters (were hardcoded macros before) ---
+    int stat_interval;      // seconds between stat() checks
+    int expired_time;       // seconds before cache entry expires
+    int max_header_length;  // reserved header bytes per entry
+    int max_file_size;      // max cached file size (larger = large-file path)

-    FileCache(size_t capacity = FILE_CACHE_MAX_NUM);
+    explicit FileCache(size_t capacity = FILE_CACHE_DEFAULT_MAX_NUM);

    struct OpenParam {
-        bool need_read;
-        int  max_read;
-        const char* path;
-        size_t  filesize;
-        int  error;
+        bool    need_read;
+        int     max_read;       // per-request override for max file size
+        const char* path;       // URL path (for directory listing)
+        size_t  filesize;       // [out] actual file size
+        int     error;          // [out] error code if Open returns NULL


max_file_size / OpenParam::max_read are declared as int, but they represent a byte size and are compared against st.st_size (typically off_t, potentially >2GB). Using int can overflow/truncate on large files and makes it harder to configure sizes beyond INT_MAX. Consider switching these fields and related APIs to size_t (or uint64_t) so large-file thresholds work correctly on 64-bit platforms.

Copilot · 2026-04-05T17:05:44Z

    struct OpenParam {
-        bool need_read;
-        int  max_read;
-        const char* path;
-        size_t  filesize;
-        int  error;
+        bool    need_read;
+        int     max_read;       // per-request override for max file size
+        const char* path;       // URL path (for directory listing)
+        size_t  filesize;       // [out] actual file size
+        int     error;          // [out] error code if Open returns NULL

        OpenParam() {
            need_read = true;
-            max_read = FILE_CACHE_MAX_SIZE;
+            max_read = FILE_CACHE_DEFAULT_MAX_FILE_SIZE;
            path = "/";
            filesize = 0;
            error = 0;
        }


OpenParam::max_read defaults to FILE_CACHE_DEFAULT_MAX_FILE_SIZE, while FileCache now has a runtime-configurable max_file_size (and HttpServer sets it from service->max_file_cache_size). Callers that don't explicitly set param.max_read will silently ignore the instance’s configured max_file_size, which can lead to inconsistent caching behavior across call sites. Consider defaulting OpenParam::max_read from the owning FileCache instance (e.g., in Open() when param->max_read is unset/0), or remove one of these knobs to keep a single source of truth.

Copilot · 2026-04-05T17:05:45Z

        FileCache* filecache = &privdata->filecache;
        filecache->stat_interval = service->file_cache_stat_interval;
        filecache->expired_time = service->file_cache_expired_time;
+        filecache->max_file_size = service->max_file_cache_size;


This assigns filecache->max_file_size directly, bypassing the new SetMaxFileSize() clamping/validation logic. If service->max_file_cache_size can be 0 or negative (e.g., misconfig to disable caching), this can cause surprising behavior in later comparisons/removals. Prefer calling filecache->SetMaxFileSize(service->max_file_cache_size) here to keep invariants consistent.

Suggested change

filecache->max_file_size = service->max_file_cache_size;

filecache->SetMaxFileSize(service->max_file_cache_size);

…use setter for max_file_size

Yundi339 · 2026-04-05T17:14:31Z

基本问题已修复
遗留问题：
1、prepend_header 多线程竞争写共享 buffer
原理正确，但这是 libhv 原始设计的固有问题。两个线程同时服务同一个缓存文件，都调用 prepend_header，第二个会覆盖第一个正在发送的 header 区域。解决需要重构整个 GetSendData/SendHttpResponse 接口（复制到 per-request buffer 或持锁跨越整个写操作）。超出本 PR 范围，不修。

2、int 溢出（>2GB 文件）
理论正确，但 HttpService::max_file_cache_size 本身就是 int，缓存 >2GB 文件不合理。改 size_t 需改 HttpService 公共 API。不修。

3、OpenParam::max_read 默认值与实例配置不一致
理论正确，但主要调用者都已显式设置 param.max_read = service->max_file_cache_size。error page 路径默认 4MB 也足够。不修。

ithewei · 2026-04-07T03:34:32Z

    http/server/HttpContext.h
    http/server/HttpResponseWriter.h
    http/server/WebSocketServer.h
+    http/server/FileCache.h


这个头文件不需要暴露出去吧

ithewei · 2026-04-07T03:36:53Z

+                    char* dst = fc->filebuf.base;
+                    size_t remaining = fc->filebuf.len;
+                    while (remaining > 0) {
+                        ssize_t nread = read(fd, dst, remaining);


windows下ssize_t未定义，编译报错

…n Open method

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-07T05:43:38Z

+                    while (remaining > 0) {
+                        int nread = read(fd, dst, remaining);
+                        if (nread < 0) {
+                            if (errno == EINTR) continue;
+                            hloge("Failed to read file: %s", filepath);
+                            param->error = ERR_READ_FILE;
+                            return NULL;
+                        }
+                        if (nread == 0) {
+                            hloge("Unexpected EOF reading file: %s", filepath);
+                            param->error = ERR_READ_FILE;
+                            return NULL;
+                        }
+                        dst += nread;
+                        remaining -= nread;
+                    }


read(2) returns ssize_t, but the loop stores it in an int. If max_read is configured above INT_MAX (or on platforms where ssize_t is wider), this can truncate and break the loop logic. Use ssize_t for nread (and pass a size_t count that is capped to SSIZE_MAX if needed).

暂不考虑超过INT_MAX的情况

Copilot · 2026-04-07T05:43:38Z

+    // Thread-safe: prepend header into reserved space.
+    // Returns true on success, false if header exceeds reserved space.
+    // On failure, httpbuf falls back to filebuf (body only, no header).
+    bool prepend_header(const char* header, int len) {
+        std::lock_guard<std::mutex> lock(mutex);
+        if (len <= 0 || len > header_reserve) {
+            // Safe fallback: point httpbuf at filebuf so callers always get valid data
+            httpbuf = filebuf;
+            header_used = 0;
+            return false;
+        }
        httpbuf.base = filebuf.base - len;
-        httpbuf.len = len + filebuf.len;
+        httpbuf.len = (size_t)len + filebuf.len;
        memcpy(httpbuf.base, header, len);
+        header_used = len;
+        return true;
    }


prepend_header() writes into the shared reserved space of the cached entry and updates httpbuf to point at that region. Even with the mutex, callers read/use fc->httpbuf after the lock is released, so concurrent requests can observe httpbuf changing underneath them or have their header bytes overwritten. Consider redesigning to keep the cache entry immutable for serving (store only the body) and construct headers per request without modifying the shared entry.

暂不处理

Yundi339 · 2026-04-10T13:03:08Z

心情复杂，虽然我本地使用的是opus4.6，但是这个AI审查感觉 bug 越修越多，
我听说 rabbit 可以提高比较高质量的代码审查，后面我试试看

… with file body Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T13:28:39Z

                fc = std::make_shared<file_cache_t>();
                fc->filepath = filepath;
                fc->st = st;
+                fc->header_reserve = max_header_length;
                time(&fc->open_time);
                fc->stat_time = fc->open_time;
                fc->stat_cnt = 1;
-                put(filepath, fc);
-            }
-            else {
+                // NOTE: do NOT put() into cache yet — defer until fully initialized
+            } else {


Deferring put() until after initialization avoids exposing partially-filled entries, but it also allows concurrent cache misses for the same filepath to start duplicate reads/allocations and then race to put() the final entry. Consider adding an in-flight load coordination mechanism (eg, insert a placeholder entry early plus a ‘loading’ state/condvar, or a per-key mutex) so only one thread loads a given file at a time while others wait or reuse the same entry.

Copilot · 2026-04-10T13:28:40Z

        OpenParam() {
            need_read = true;
-            max_read = FILE_CACHE_MAX_SIZE;
+            max_read = FILE_CACHE_DEFAULT_MAX_FILE_SIZE;
            path = "/";


OpenParam::max_read defaults to FILE_CACHE_DEFAULT_MAX_FILE_SIZE, so calling FileCache::SetMaxFileSize() does not affect Open() for callers that rely on the default OpenParam (they’ll still use 4MB unless they manually override max_read). Consider making Open() apply max_file_size as the default/upper bound when max_read is unset, or initialize OpenParam::max_read from the owning FileCache’s max_file_size to keep the API behavior consistent.

Copilot · 2026-04-10T13:28:40Z

-            // NOTE: remove file cache if > FILE_CACHE_MAX_SIZE
-            if (fc && fc->filebuf.len > FILE_CACHE_MAX_SIZE) {
+            // NOTE: remove file cache if > max_file_size
+            if (fc && fc->filebuf.len > files->GetMaxFileSize()) {


This compares size_t (fc->filebuf.len) with int (GetMaxFileSize()). To avoid signed/unsigned warnings and edge cases when values exceed INT_MAX, consider making GetMaxFileSize() return size_t (or cast the return value to size_t at the call site).

Suggested change

if (fc && fc->filebuf.len > files->GetMaxFileSize()) {

if (fc && fc->filebuf.len > static_cast<size_t>(files->GetMaxFileSize())) {

…NFINISHED (ithewei#828) * Initial plan * fix: prevent postprocessor/errorHandler from overriding HTTP_STATUS_UNFINISHED When a handler returns HTTP_STATUS_UNFINISHED (e.g., sendLargeFile), the postprocessor's return value was overwriting status_code, causing the connection to be treated as complete. This resulted in a premature empty response being sent and the connection being reset/closed. The fix uses local variables for the postprocessor and errorHandler return values, only checking for HTTP_STATUS_CLOSE without modifying status_code. This preserves the UNFINISHED status while still supporting the HTTP_STATUS_CLOSE feature added in commit 62cd137. Agent-Logs-Url: https://github.com/ithewei/libhv/sessions/aa478496-9e54-4c62-b0f7-810817b3867f Co-authored-by: ithewei <26049660+ithewei@users.noreply.github.com> * Refactor postprocessor variable name for clarity --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ithewei <26049660+ithewei@users.noreply.github.com> Co-authored-by: ithewei <ithewei@163.com>

* feat: optimize http router using trie and fix wildcard match (ithewei#833) * docs: update PLAN.md * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix: apply suggestions for base dir from ai code review * fix: apply suggestions for cpputil dir from ai code review * fix: apply suggestions for ssl dir from ai code review * fix: apply suggestions for event dir from ai code review

Yundi339 added 6 commits March 26, 2026 13:18

refactor: align Windows compat with original libhv patterns

f14db65

- Revert _read/_close to plain read/close (POSIX compat via <io.h>) - Keep is_modified() using ::stat() (only called on non-Windows path) - Consistent with original FileCache.cpp Windows handling patterns

docs: add FileCacheEx documentation (en/cn) and fix comment style

f9bae97

- Add docs/FileCacheEx.md (English API documentation) - Add docs/cn/FileCacheEx.md (Chinese API documentation) - Fix include comments to follow libhv convention (add 'import' keyword)

fix: add bounds check for negative reserved in resize_buf

f5e9a67

Prevent integer overflow when reserved < 0 is passed to resize_buf(), which would cause (size_t)reserved to become a very large value.

refactor: replace FileCache with FileCacheEx in HttpHandler and HttpS…

7c7d63e

…erver

Copilot AI review requested due to automatic review settings March 29, 2026 09:13

Copilot started reviewing on behalf of Yundi339 March 29, 2026 09:13 View session

Copilot AI reviewed Mar 29, 2026

View reviewed changes

Yundi339 added 2 commits April 5, 2026 23:46

Merge branch 'ithewei:master' into feature/enhanced-filecache

1f83fce

Refactor FileCache and remove FileCacheEx

a9790f6

Copilot AI review requested due to automatic review settings April 5, 2026 16:15

Copilot started reviewing on behalf of Yundi339 April 5, 2026 16:16 View session

Copilot AI reviewed Apr 5, 2026

View reviewed changes

Yundi339 added 2 commits April 6, 2026 00:40

refactor: enhance FileCache with safe fallback in prepend_header and …

16c24e7

…add max_file_size setter

refactor: improve FileCache Open method and enhance HttpHandler heade…

4fd86b4

…r handling

Yundi339 requested a review from Copilot April 5, 2026 16:59

Copilot started reviewing on behalf of Yundi339 April 5, 2026 16:59 View session

Copilot AI reviewed Apr 5, 2026

View reviewed changes

refactor: update FileCache to reset header_used on safe fallback and …

f1850b9

…use setter for max_file_size

ithewei reviewed Apr 7, 2026

View reviewed changes

refactor: update FileCache to change nread type from ssize_t to int i…

8cd32ab

…n Open method

Copilot AI review requested due to automatic review settings April 7, 2026 05:35

Copilot AI reviewed Apr 7, 2026

View reviewed changes

rm deprecated comment

4c1d30c

Header too large for reserved space: send header first, then continue…

731a0a8

… with file body Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 10, 2026 13:17

Copilot started reviewing on behalf of ithewei April 10, 2026 13:17 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

Copilot AI and others added 7 commits May 24, 2026 02:26

fix: thread unsafe call to localtime on linux (ithewei#835)

15809f4

chore: add AGENTS.md for Coding Agents (ithewei#826)

19f3e28

Apply suggestions from ai code review (ithewei#837)

9c13599

* fix: apply suggestions for base dir from ai code review * fix: apply suggestions for cpputil dir from ai code review * fix: apply suggestions for ssl dir from ai code review * fix: apply suggestions for event dir from ai code review

Apply suggestions for evpp dir from ai code review (ithewei#838)

33ce9fc

Apply suggestions for http dir from ai code review (ithewei#839)

51089c6

		void SetMaxHeaderLength(int len) { max_header_length = len; }
		void SetMaxFileSize(int size) { max_file_size = size; }

-            // NOTE: remove file cache if > max_file_size
-            if (fc && fc->filebuf.len > files->GetMaxFileSize()) {
-                files->Close(fc->filepath.c_str());
-            }
+            // Avoid immediately evicting the just-served cached file based on a
+            // potentially stale FileCache max_file_size setting. Cache size policy
+            // should be enforced where the cache is configured/populated so it stays
+            // consistent with the service-level max_file_cache_size.

	filecache->max_file_size = service->max_file_cache_size;
	filecache->SetMaxFileSize(service->max_file_cache_size);

	if (fc && fc->filebuf.len > files->GetMaxFileSize()) {
	if (fc && fc->filebuf.len > static_cast<size_t>(files->GetMaxFileSize())) {

Conversation

Yundi339 commented Mar 29, 2026

背景

FileCacheEx vs FileCache 对比

解决可能存在的问题

1：stat() 函数名冲突

2：read() 短读未处理

3：缓存条目在初始化完成前就加入 LRU

4：resize_buf() 后 httpbuf 可能悬空

5：prepend_header() 静默失败

6：所有参数编译期固定，无法运行时调整

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

ithewei commented Mar 30, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Yundi339 commented Apr 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1：`stat()` 函数名冲突

2：`read()` 短读未处理

4：`resize_buf()` 后 `httpbuf` 可能悬空

5：`prepend_header()` 静默失败

Yundi339 commented Apr 5, 2026 •

edited

Loading