Optimize bench server for performance benchmarking #61
Refactors the bench_server to minimize allocations and output, using static/pre-serialized responses and minimal logging for accurate HTTP benchmarking. Updates dependencies to enable simd-json and disables tracing and swagger-ui by default for the benchmark. Improves server internals for better parallelism and disables request logging unless the tracing feature is enabled. Adds new workspace members and custom release/bench profiles for maximum performance.
Pull request overview
This pull request introduces performance optimizations for benchmarking by creating a minimal benchmark server, refactoring server internals for higher throughput, and configuring aggressive build profiles. The changes focus on reducing allocations, disabling optional features, and optimizing connection handling for maximum performance during load testing.
Changes:
- Added benchmark server with minimal feature set (no tracing, no swagger-ui) and simd-json for faster parsing
- Refactored server connection handling to use independent spawned tasks instead of JoinSet and added biased select for prioritizing accepts over shutdown
- Configured aggressive build profiles (release, bench, release-with-debug) with LTO, codegen-units=1, and panic=abort
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| Cargo.toml | Added benchmark workspace members, configured aggressive release profiles, and set default-features=false for internal crates |
| crates/rustapi-rs/Cargo.toml | Added tracing and simd-json as optional features, made tracing a default feature |
| benches/bench_server/Cargo.toml | Configured minimal feature set with simd-json enabled and default features disabled |
| benches/bench_server/src/main.rs | Optimized handlers to use static string references and added unused pre-serialized byte constants |
| crates/rustapi-core/src/server.rs | Removed JoinSet for graceful shutdown, added biased select, enabled keep-alive, and conditionally disabled request logging |

                _ = &mut signal => {
                    info!("Shutdown signal received, draining connections...");
                    info!("Shutdown signal received");
                    break;
                }
            }
        }

        // Wait for all connections to finish
        while (connections.join_next().await).is_some() {}
        info!("Server shutdown complete");

        Ok(())
Copilot AI, Jan 25, 2026
Removing the JoinSet and the shutdown draining logic eliminates graceful shutdown capability. When a shutdown signal is received, the server immediately exits without waiting for in-flight connections to complete. This can lead to abrupt connection terminations and incomplete request processing. For a production-ready server, consider using a mechanism to track active connections (such as a JoinSet or a counter with graceful timeout) to allow them to finish processing before shutdown completes. While this may be acceptable for a benchmark server focused on raw throughput, it's a significant behavioral change that could affect production deployments.
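One way to keep a drain step while still spawning independent tasks is a shared connection counter with a bounded wait. The sketch below illustrates the idea with the standard library only (threads and a Condvar). ConnectionTracker, ConnectionGuard, track, and drain are hypothetical names, not part of rustapi-core, and an async server would more likely use a tokio primitive in the same role.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Hypothetical tracker: counts in-flight connections (not part of rustapi-core).
#[derive(Clone, Default)]
struct ConnectionTracker {
    inner: Arc<(Mutex<usize>, Condvar)>,
}

/// Held by a connection for as long as it is being served.
struct ConnectionGuard {
    inner: Arc<(Mutex<usize>, Condvar)>,
}

impl ConnectionTracker {
    /// Registers one in-flight connection and returns its guard.
    fn track(&self) -> ConnectionGuard {
        *self.inner.0.lock().unwrap() += 1;
        ConnectionGuard { inner: Arc::clone(&self.inner) }
    }

    /// Waits until every guard is dropped or `timeout` elapses.
    /// Returns true if all connections finished in time.
    fn drain(&self, timeout: Duration) -> bool {
        let (lock, cvar) = &*self.inner;
        let active = lock.lock().unwrap();
        let (active, _timed_out) = cvar
            .wait_timeout_while(active, timeout, |n| *n > 0)
            .unwrap();
        *active == 0
    }
}

impl Drop for ConnectionGuard {
    fn drop(&mut self) {
        let (lock, cvar) = &*self.inner;
        *lock.lock().unwrap() -= 1;
        // Wake a pending drain() so it can re-check the counter.
        cvar.notify_all();
    }
}
```

In this sketch the accept loop would call track() once per connection and move the guard into the spawned task. After the shutdown signal, drain(timeout) returns true once all guards are dropped, or false if the timeout elapses first.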
benches/bench_server/src/main.rs
Outdated
    // Static JSON bytes - avoid allocation on every request
    static JSON_HELLO_BYTES: &[u8] = br#"{"message":"Hello, World!"}"#;
    static USER_RESPONSE_BYTES: &[u8] = br#"{"id":1,"name":"Test User","email":"test@example.com","created_at":"2024-01-01T00:00:00Z","is_active":true}"#;
Copilot AI, Jan 25, 2026
These static pre-serialized byte arrays are defined but never used in the code. The handlers still perform JSON serialization on every request via the Json wrapper. To actually use these pre-computed responses and achieve the claimed performance benefits, the handlers would need to return raw bytes directly (e.g., using a custom response type that returns these static slices). Consider either removing these unused constants or refactoring the handlers to actually use them.
Suggested change (remove these lines):
    - // Static JSON bytes - avoid allocation on every request
    - static JSON_HELLO_BYTES: &[u8] = br#"{"message":"Hello, World!"}"#;
    - static USER_RESPONSE_BYTES: &[u8] = br#"{"id":1,"name":"Test User","email":"test@example.com","created_at":"2024-01-01T00:00:00Z","is_active":true}"#;
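If the constants are kept, the alternative mentioned above could look roughly like the following. This is a minimal sketch: StaticJson and its methods are hypothetical and do not reflect the framework's actual response traits, which a real handler would have to implement so that the body is written without passing through the Json wrapper.

```rust
// Same constant as in the diff above.
static JSON_HELLO_BYTES: &[u8] = br#"{"message":"Hello, World!"}"#;

/// Hypothetical response type that carries a pre-serialized body.
/// Nothing is allocated or serialized per request.
struct StaticJson(&'static [u8]);

impl StaticJson {
    /// Content type the server would send with this body.
    fn content_type(&self) -> &'static str {
        "application/json"
    }

    /// Borrowed view of the static body.
    fn body(&self) -> &'static [u8] {
        self.0
    }
}

/// Handler that hands back the static slice instead of serializing a struct.
fn hello() -> StaticJson {
    StaticJson(JSON_HELLO_BYTES)
}
```

Every call to hello() returns the same slice, so the per-request cost is limited to copying the bytes into the connection's write buffer.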
    loop {
        tokio::select! {
            biased; // Prioritize accept over shutdown for better throughput
Copilot AI, Jan 25, 2026
Using biased select prioritizes accepting new connections over processing the shutdown signal. Under sustained high load, this could delay shutdown indefinitely as the accept branch will always be ready. While this improves throughput during benchmarking, it may prevent timely shutdown in production scenarios. Consider documenting this tradeoff or providing a configuration option to disable biased selection for production use cases where responsive shutdown is important.
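The effect of branch order can be modelled without tokio. With biased;, tokio::select! polls its branches in the order they are written, so the first ready branch always wins. The sketch below is a standard-library model of that rule, not tokio itself; Branch, biased_select, and accepts_before_shutdown are hypothetical names used only for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Branch {
    Accept,
    Shutdown,
}

/// Models one pass of a biased select: branches are checked in the
/// listed order and the first ready one is chosen.
fn biased_select(order: &[Branch], accept_ready: bool, shutdown_ready: bool) -> Option<Branch> {
    order.iter().copied().find(|branch| match branch {
        Branch::Accept => accept_ready,
        Branch::Shutdown => shutdown_ready,
    })
}

/// Counts how many connections are accepted before shutdown is observed,
/// given `backlog` pending connections and a shutdown signal that is
/// already raised when the loop starts.
fn accepts_before_shutdown(order: &[Branch], mut backlog: usize) -> usize {
    let mut accepted = 0;
    loop {
        match biased_select(order, backlog > 0, true) {
            Some(Branch::Accept) => {
                backlog -= 1;
                accepted += 1;
            }
            Some(Branch::Shutdown) => return accepted,
            None => unreachable!("shutdown is always ready in this model"),
        }
    }
}
```

With the order [Accept, Shutdown] the model works through the whole backlog before it sees the signal. With [Shutdown, Accept] it stops immediately, which suggests that listing the shutdown branch first is one way to keep biased selection while preserving responsive shutdown.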
    # Internal crates
    rustapi-rs = { path = "crates/rustapi-rs", version = "0.1.188", default-features = false }
    rustapi-core = { path = "crates/rustapi-core", version = "0.1.188", default-features = false }
Copilot AI, Jan 25, 2026
Setting default-features = false for rustapi-core in the workspace dependencies may break other workspace members that depend on rustapi-core and expect default features (swagger-ui and tracing) to be enabled. Workspace members like rustapi-extras, rustapi-testing, rustapi-toon, rustapi-view, and rustapi-ws use "rustapi-core = { workspace = true }" and will inherit this setting. These crates may fail to compile if they use tracing macros or expect swagger-ui features. Each affected workspace member should either explicitly enable required features or handle the absence of these features gracefully.
Suggested change:
    - rustapi-core = { path = "crates/rustapi-core", version = "0.1.188", default-features = false }
    + rustapi-core = { path = "crates/rustapi-core", version = "0.1.188" }
    use std::sync::Arc;
    use tokio::net::TcpListener;
    use tokio::task::JoinSet;
    use tracing::{error, info};
Copilot AI, Jan 25, 2026
The tracing import is not feature-gated, but tracing is now an optional feature. This will cause compilation errors when building with default-features = false. The import should be conditionally compiled using #[cfg(feature = "tracing")], and the code should use the existing trace_info! and trace_error! macros from tracing_macros.rs instead of directly calling info! and error! macros.
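For reference, feature-gated logging macros of this kind usually come in pairs: one definition that forwards to tracing and one that expands to nothing. The following is a hypothetical sketch of that pattern; the real trace_info! and trace_error! in tracing_macros.rs may be defined differently, and announce is an illustrative function, not code from server.rs.

```rust
// Hypothetical stand-ins for the macros in tracing_macros.rs.
// With the `tracing` feature on, the calls forward to the tracing crate.
#[cfg(feature = "tracing")]
macro_rules! trace_info {
    ($($arg:tt)*) => { tracing::info!($($arg)*) };
}

#[cfg(feature = "tracing")]
macro_rules! trace_error {
    ($($arg:tt)*) => { tracing::error!($($arg)*) };
}

// With the feature off, the calls expand to an empty block, so the
// tracing crate is never referenced and the arguments are not evaluated.
#[cfg(not(feature = "tracing"))]
macro_rules! trace_info {
    ($($arg:tt)*) => {{}};
}

#[cfg(not(feature = "tracing"))]
macro_rules! trace_error {
    ($($arg:tt)*) => {{}};
}

/// Illustrative caller: compiles with or without the feature because it
/// never names `tracing` directly.
fn announce(addr: &str) -> usize {
    trace_info!("RustAPI server running on http://{}", addr);
    if addr.is_empty() {
        trace_error!("empty bind address");
    }
    addr.len()
}
```

Because call sites use only the wrapper macros, no `use tracing::{error, info};` import is needed in the file, which removes the compile error in builds with default-features = false.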
    let addr: SocketAddr = addr.parse()?;
    let listener = TcpListener::bind(addr).await?;

    info!("🚀 RustAPI server running on http://{}", addr);
Copilot AI, Jan 25, 2026
The info! macro call is not feature-gated but tracing is now optional. This will cause compilation errors when building without the tracing feature. Use the trace_info! macro from tracing_macros.rs instead, which provides a no-op implementation when the tracing feature is disabled.
    error!("Connection error: {}", err);
    // Only log actual errors, not client disconnects
    if !err.is_incomplete_message() {
        error!("Connection error: {}", err);
Copilot AI, Jan 25, 2026
The error! macro calls within this accept loop are not feature-gated but tracing is now optional. This will cause compilation errors when building without the tracing feature. Use the trace_error! macro from tracing_macros.rs instead, which provides a no-op implementation when the tracing feature is disabled.
    lto = "fat"
    codegen-units = 1
Copilot AI, Jan 25, 2026
The bench profile has redundant settings. It inherits from release and then re-specifies lto = "fat" and codegen-units = 1, which are already set in the release profile. These redundant settings can be removed since they're inherited anyway.
Suggested change (remove these lines):
    - lto = "fat"
    - codegen-units = 1
    }
    _ = &mut signal => {
        info!("Shutdown signal received, draining connections...");
        info!("Shutdown signal received");
Copilot AI, Jan 25, 2026
The info! macro call is not feature-gated but tracing is now optional. This will cause compilation errors when building without the tracing feature. Use the trace_info! macro from tracing_macros.rs instead, which provides a no-op implementation when the tracing feature is disabled.
Refactored server.rs to reduce per-request allocations by introducing a connection-level service and custom future, and enabled TCP_NODELAY for lower latency. Improved JSON serialization in json.rs to use simd-json when available, and updated dependencies to include socket2. Updated README performance numbers and added a test JSON body for benchmarks.
This pull request introduces significant performance optimizations and benchmarking improvements to the RustAPI project. The main focus is on creating a minimal, high-performance benchmark server, refining server internals for throughput, and tuning build profiles for optimized releases. Additionally, the default feature set and dependency management have been updated to better support benchmarking and performance testing.
Benchmark Server and Performance Optimizations
- Added two new benchmark servers (benches/bench_server and benches/actix_bench_server) to the workspace, and created a minimal bench_server implementation that uses pre-allocated static responses and disables unnecessary features for maximum throughput. (Cargo.toml, benches/bench_server/Cargo.toml, benches/bench_server/src/main.rs)
- Refactored the benchmark server handlers to use static string references and pre-serialized byte slices, reducing allocations and improving response efficiency. (benches/bench_server/src/main.rs)

Server Core Improvements
- Simplified the server's connection handling by removing the JoinSet, spawning independent tasks for each connection, and using tokio::select! with biased; to prioritize accepting new connections. This improves parallelism and throughput under heavy load. (crates/rustapi-core/src/server.rs)
- Enabled HTTP keep-alive for connections and improved error logging to suppress benign disconnect messages, further optimizing the server for benchmarks. (crates/rustapi-core/src/server.rs R96-L108)
- Disabled request logging in release builds unless the tracing feature is enabled, reducing overhead during performance testing. (crates/rustapi-core/src/server.rs)

Build and Feature Configuration
- Added new release, bench, and release-with-debug profiles to Cargo.toml for maximum performance, including aggressive optimizations such as lto = "fat", opt-level = 3, and disabling debug symbols for release. (Cargo.toml R124-R146)
- Updated internal crate dependencies and features, including a new simd-json feature for faster JSON parsing and making tracing a default feature for rustapi-rs. (Cargo.toml, crates/rustapi-rs/Cargo.toml)

Feature and Dependency Management
- [...] rustapi-rs to allow for minimal builds (disabling tracing and swagger-ui) and to enable performance features like simd-json and tracing as needed. (crates/rustapi-rs/Cargo.toml L42-R48)