Skip to content

Commit ae11f10

Browse files
Add result server rate and upload limits
Signed-off-by: Yoshifumi Nakamura <nakamura@riken.jp>
1 parent 0a28cd5 commit ae11f10

11 files changed

Lines changed: 492 additions & 5 deletions

File tree

docs/cx/BENCHKIT_GAP_ANALYSIS.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -58,7 +58,7 @@ Continuous estimation has now moved beyond a mere entry point: a common estimati
5858
However, estimation is still not yet broadly deployed across multiple applications, and AI-driven optimization integration remains mostly at the integration-point stage.
5959

6060
As of the current repository survey, BenchKit has six benchmark applications with `build.sh`/`run.sh`, but only `qws` has an `estimate.sh`.
61-
The result portal also already has a meaningful test base (`result_server/tests`: 30 `test_*.py` modules), and the repository now has a repo-local Python dependency manifest, a standard portal test entrypoint under `result_server/tests`, and a lightweight GitHub Actions verification path for portal-oriented changes.
61+
The result portal also already has a meaningful test base (`result_server/tests`: 32 `test_*.py` modules), and the repository now has a repo-local Python dependency manifest, a standard portal test entrypoint under `result_server/tests`, and a lightweight GitHub Actions verification path for portal-oriented changes.
6262
The main GitLab pipeline still intentionally skips heavy benchmark execution when a direct or manually triggered GitLab pipeline sees changes limited to `result_server/**/*` or portal display metadata such as `config/system_info.csv`. Protected-branch synchronization itself uses `ci.skip`, so the dedicated lightweight GitHub Actions path should continue to be kept in sync as portal-side files evolve.
6363

6464
## 2.1 現時点で明示しておく設計負債 / Explicit Design Debts to Keep Visible
@@ -296,7 +296,7 @@ Once the estimation specification is clarified, many other design decisions beco
296296

297297
今回のコードベース調査では、性能推定に次ぐ実務上の詰まりどころとして、`result_server` の検証導線が見えた。
298298

299-
- `result_server/tests` には 30 個の `test_*.py` モジュールがあり、portal 側はすでに「検証すべき対象」になっている
299+
- `result_server/tests` には 32 個の `test_*.py` モジュールがあり、portal 側はすでに「検証すべき対象」になっている
300300
- repo-local な依存関係定義として `requirements-result-server.txt` があり、`result_server/tests/run_result_server_tests.py` が標準 test entrypoint として使える
301301
- portal-oriented 変更向けの lightweight GitHub Actions として `.github/workflows/result-server-tests.yml` が用意されている
302302
- `.gitlab-ci.yml` は直接または手動起動されたGitLab pipelineで `result_server/**/*``config/system_info.csv` 変更時に重い benchmark pipeline を skip する。保護ブランチ同期自体は `ci.skip` を使うため、GitHub Actions 側の path filter を portal 周辺の実ファイルに追従させ続ける必要がある

docs/deploy/hardening-guide.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
# Result Portal Hardening Guide
2+
3+
This checklist covers production-facing `result_server` deployments.
4+
5+
## Request Limits
6+
7+
The portal enforces an application-level request body limit:
8+
9+
```text
10+
RESULT_SERVER_MAX_UPLOAD_MB=512
11+
```
12+
13+
Large estimation input archives are also checked per member:
14+
15+
```text
16+
RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB=1024
17+
```
18+
19+
Set these values to match the largest expected PA Data or estimation input
20+
archive. Keep the reverse proxy body limit at or below the Flask limit so that
21+
oversized uploads are rejected before they consume worker memory.
22+
23+
## Rate Limits
24+
25+
API ingest/query routes and admin write routes use Redis-backed fixed-window
26+
rate limits. Production deployments must keep Redis monitored and available;
27+
when Redis is required but unavailable, protected operations fail closed with a
28+
503 response.
29+
30+
Default limits:
31+
32+
- API ingest: 120 requests per runner per minute
33+
- API query: 60 requests per runner per minute
34+
- Admin write actions: 20 requests per admin user per minute
35+
36+
## Reverse Proxy
37+
38+
Run the Flask app behind a reverse proxy that terminates TLS and forwards only
39+
loopback traffic to the app. Keep `/admin/` and `/auth/` protected by portal
40+
authentication; `robots.txt` only reduces crawler noise and is not an access
41+
control mechanism.
42+
43+
## Gunicorn
44+
45+
Use the repository `gunicorn.conf.py` as the baseline process manager
46+
configuration. It binds to `127.0.0.1:8800` by default, sets worker timeouts,
47+
and enables `max_requests` recycling to reduce long-running worker risk.

docs/guides/developer-reference.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -220,6 +220,7 @@ For production portal deployments:
220220
- The legacy `RESULT_SERVER_KEY` variable is still accepted as runner `default` for compatibility, but should be rotated to `RESULT_SERVER_KEYS`.
221221
- See `docs/deploy/key-management.md` for generation and rotation guidance.
222222
- `REDIS_URL` must point to a monitored Redis instance; production authentication refuses login when Redis is unavailable.
223+
- API ingest and query endpoints use Redis-backed rate limits by default; set `RESULT_SERVER_MAX_UPLOAD_MB` and `RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB` when deployment-specific upload limits are needed.
223224
- `app_dev.py` is localhost-only, uses ephemeral development secrets when none are provided, and enables the Werkzeug debugger only with `RESULT_SERVER_DEV_DEBUG=1`.
224225

225226
### Result Quality Visibility

gunicorn.conf.py

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
"""Reference Gunicorn configuration for result_server deployments."""
2+
3+
import multiprocessing
4+
import os
5+
6+
7+
bind = os.environ.get("RESULT_SERVER_BIND", "127.0.0.1:8800")
8+
workers = int(
9+
os.environ.get("RESULT_SERVER_WORKERS", str(multiprocessing.cpu_count() * 2 + 1))
10+
)
11+
worker_class = "sync"
12+
timeout = int(os.environ.get("RESULT_SERVER_TIMEOUT", "60"))
13+
graceful_timeout = int(os.environ.get("RESULT_SERVER_GRACEFUL_TIMEOUT", "30"))
14+
keepalive = int(os.environ.get("RESULT_SERVER_KEEPALIVE", "5"))
15+
max_requests = int(os.environ.get("RESULT_SERVER_MAX_REQUESTS", "1000"))
16+
max_requests_jitter = int(os.environ.get("RESULT_SERVER_MAX_REQUESTS_JITTER", "50"))
17+
limit_request_line = int(os.environ.get("RESULT_SERVER_LIMIT_REQUEST_LINE", "8190"))
18+
limit_request_field_size = int(
19+
os.environ.get("RESULT_SERVER_LIMIT_REQUEST_FIELD_SIZE", "8190")
20+
)
21+
accesslog = "-"
22+
errorlog = "-"
23+
loglevel = os.environ.get("RESULT_SERVER_LOG_LEVEL", "info")

result_server/app.py

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
import sys
33
from datetime import timedelta
44

5-
from flask import Flask, render_template
5+
from flask import Flask, jsonify, render_template
66
from flask_session import Session
77

88
from routes.api import api_bp
@@ -14,6 +14,8 @@
1414
from utils.preflight import validate_production_config
1515

1616

17+
DEFAULT_MAX_UPLOAD_MB = 512
18+
DEFAULT_MAX_ARCHIVE_MEMBER_MB = 1024
1719
INGEST_KEYS = parse_ingest_keys()
1820
PREFLIGHT_ERRORS = validate_production_config(os.environ, INGEST_KEYS)
1921

@@ -75,6 +77,23 @@ def _configure_result_directories(app, base_dir):
7577
app.config.update(dir_map)
7678

7779

80+
def _configure_upload_limits(app):
81+
"""Configure request and archive size limits for ingest endpoints."""
82+
max_upload_mb = int(os.environ.get("RESULT_SERVER_MAX_UPLOAD_MB", DEFAULT_MAX_UPLOAD_MB))
83+
max_member_mb = int(
84+
os.environ.get("RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB", DEFAULT_MAX_ARCHIVE_MEMBER_MB)
85+
)
86+
app.config["MAX_CONTENT_LENGTH"] = max_upload_mb * 1024 * 1024
87+
app.config["MAX_ARCHIVE_MEMBER_SIZE"] = max_member_mb * 1024 * 1024
88+
89+
@app.errorhandler(413)
90+
def payload_too_large(_error):
91+
return jsonify(
92+
error="Payload too large",
93+
limit_mb=app.config["MAX_CONTENT_LENGTH"] // 1024 // 1024,
94+
), 413
95+
96+
7897
def _register_portal_blueprints(app, prefix):
7998
"""Register all portal blueprints using the given URL prefix."""
8099
from routes.admin import admin_bp
@@ -107,6 +126,7 @@ def create_app(prefix="", base_dir=None):
107126
_configure_user_store(app)
108127
_configure_totp_issuer(app, prefix)
109128
_configure_result_directories(app, base_dir)
129+
_configure_upload_limits(app)
110130
init_csrf(app, exempt_blueprints=(api_bp,))
111131

112132
register_home_routes(app, prefix=prefix)

result_server/app_dev.py

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,8 @@
2323
from datetime import datetime, timedelta
2424

2525
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}
26+
DEFAULT_MAX_UPLOAD_MB = 512
27+
DEFAULT_MAX_ARCHIVE_MEMBER_MB = 1024
2628

2729

2830
def setup_dev_environment(base_dir):
@@ -155,7 +157,7 @@ def create_dev_app(base_dir):
155157
sys.modules["redis"] = types.ModuleType("redis")
156158
sys.modules["utils.totp_manager"] = _create_stub_totp_manager()
157159

158-
from flask import Flask, render_template
160+
from flask import Flask, jsonify, render_template
159161
from flask_session import Session
160162

161163
from routes.home import register_home_routes
@@ -173,9 +175,26 @@ def create_dev_app(base_dir):
173175
SESSION_PERMANENT=False,
174176
AUTH_REQUIRES_REDIS=False,
175177
INGEST_KEYS=parse_ingest_keys(),
178+
MAX_CONTENT_LENGTH=(
179+
int(os.environ.get("RESULT_SERVER_MAX_UPLOAD_MB", DEFAULT_MAX_UPLOAD_MB))
180+
* 1024
181+
* 1024
182+
),
183+
MAX_ARCHIVE_MEMBER_SIZE=(
184+
int(os.environ.get("RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB", DEFAULT_MAX_ARCHIVE_MEMBER_MB))
185+
* 1024
186+
* 1024
187+
),
176188
)
177189
Session(app)
178190

191+
@app.errorhandler(413)
192+
def payload_too_large(_error):
193+
return jsonify(
194+
error="Payload too large",
195+
limit_mb=app.config["MAX_CONTENT_LENGTH"] // 1024 // 1024,
196+
), 413
197+
179198
# Register a default local admin user.
180199
stub_store = _StubUserStore()
181200
stub_store.create_user("admin@localhost", "DEVDEVDEVDEVDEVDEV", ["dev", "admin"])

result_server/routes/admin.py

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@
1515
)
1616

1717
from utils.user_store import get_user_store
18+
from utils.rate_limit import rate_limited
1819

1920
admin_bp = Blueprint("admin", __name__, url_prefix="/admin")
2021

@@ -33,6 +34,11 @@ def _render_users_page(invitation_url=None):
3334
return render_template("admin_users.html", users=all_users, invitation_url=invitation_url)
3435

3536

37+
def _admin_rate_key(_request):
38+
"""Return the session-scoped admin rate-limit key."""
39+
return f"admin:{session.get('user_email', 'anon')}"
40+
41+
3642
def admin_required(f):
3743
"""Allow access only to authenticated users with the admin affiliation."""
3844

@@ -57,6 +63,7 @@ def users():
5763

5864
@admin_bp.route("/users/add", methods=["POST"])
5965
@admin_required
66+
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
6067
def add_user():
6168
"""Create a user invitation and show the generated invitation URL."""
6269
store = get_user_store()
@@ -81,6 +88,7 @@ def add_user():
8188

8289
@admin_bp.route("/users/<path:email>/delete", methods=["POST"])
8390
@admin_required
91+
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
8492
def delete_user(email):
8593
"""Delete a user unless the current admin targets their own account."""
8694
if email == session.get("user_email"):
@@ -94,6 +102,7 @@ def delete_user(email):
94102

95103
@admin_bp.route("/users/<path:email>/affiliations", methods=["POST"])
96104
@admin_required
105+
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
97106
def update_affiliations(email):
98107
"""Update the affiliations stored for a user."""
99108
store = get_user_store()
@@ -106,6 +115,7 @@ def update_affiliations(email):
106115

107116
@admin_bp.route("/users/<path:email>/reinvite", methods=["POST"])
108117
@admin_required
118+
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
109119
def reinvite_user(email):
110120
"""Generate a new invitation link after clearing the current TOTP secret."""
111121
store = get_user_store()

result_server/routes/api.py

Lines changed: 22 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,11 @@
1212
from datetime import datetime
1313

1414
from utils.auth import verify_ingest_key
15+
from utils.rate_limit import rate_limited
1516

1617
api_bp = Blueprint("api", __name__)
1718
_TIMESTAMP_RE = re.compile(r"^\d{8}_\d{6}$")
19+
DEFAULT_MAX_ARCHIVE_MEMBER_SIZE = 1024 * 1024 * 1024
1820

1921

2022
# ==========================================
@@ -37,6 +39,12 @@ def require_api_key():
3739
return runner_id
3840

3941

42+
def _api_rate_key(req):
43+
"""Return the runner-scoped API rate-limit key for a request."""
44+
runner_id = verify_ingest_key(req.headers.get("X-API-Key", "")) or "unknown"
45+
return f"runner:{runner_id}"
46+
47+
4048
def save_json_file(data, prefix, out_dir, given_uuid=None):
4149
"""Persist a JSON payload using atomic file replacement."""
4250
if given_uuid is not None and not is_valid_uuid(given_uuid):
@@ -173,8 +181,14 @@ def _find_result_file_by_uuid(received_dir, uuid_value):
173181
def _safe_extract_tar_bytes(file_storage, target_dir):
174182
"""Extract uploaded tar bytes with path and member-type checks."""
175183
os.makedirs(target_dir, exist_ok=True)
184+
max_member_size = current_app.config.get(
185+
"MAX_ARCHIVE_MEMBER_SIZE",
186+
DEFAULT_MAX_ARCHIVE_MEMBER_SIZE,
187+
)
176188
with tarfile.open(fileobj=file_storage.stream, mode="r:*") as tar:
177189
for member in tar.getmembers():
190+
if member.size > max_member_size:
191+
abort(400, description="Archive member too large")
178192
normalized = os.path.normpath(member.name)
179193
drive, _ = os.path.splitdrive(normalized)
180194
if (
@@ -233,6 +247,7 @@ def _replace_directory_after_success(source_dir, target_dir):
233247
# ==========================================
234248

235249
@api_bp.route("/api/ingest/result", methods=["POST"])
250+
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
236251
def ingest_result():
237252
"""Receive and persist a collected result JSON payload."""
238253
require_api_key()
@@ -245,6 +260,7 @@ def ingest_result():
245260

246261

247262
@api_bp.route("/api/ingest/estimate", methods=["POST"])
263+
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
248264
def ingest_estimate():
249265
"""Receive and persist an estimated-result JSON payload."""
250266
require_api_key()
@@ -262,6 +278,7 @@ def ingest_estimate():
262278

263279

264280
@api_bp.route("/api/ingest/padata", methods=["POST"])
281+
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
265282
def ingest_padata():
266283
"""Receive and store a PA Data archive."""
267284
require_api_key()
@@ -296,7 +313,7 @@ def ingest_padata():
296313

297314
tmp_path = save_path + ".tmp"
298315
with open(tmp_path, "wb") as f:
299-
f.write(uploaded_file.read())
316+
shutil.copyfileobj(uploaded_file.stream, f, length=1024 * 1024)
300317
f.flush()
301318
os.fsync(f.fileno())
302319
os.rename(tmp_path, save_path)
@@ -312,6 +329,7 @@ def ingest_padata():
312329

313330

314331
@api_bp.route("/api/ingest/estimation-inputs", methods=["POST"])
332+
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
315333
def ingest_estimation_inputs():
316334
"""Estimation input archive (tgz) upload and expansion."""
317335
require_api_key()
@@ -356,6 +374,7 @@ def ingest_estimation_inputs():
356374
# ==========================================
357375

358376
@api_bp.route("/api/query/result", methods=["GET"])
377+
@rate_limited(max_per_minute=60, key_fn=_api_rate_key, scope="api_query")
359378
def query_result():
360379
"""Search results by uuid or by system/code/exp and return one result.
361380
@@ -433,6 +452,7 @@ def query_result():
433452

434453

435454
@api_bp.route("/api/query/estimation-inputs", methods=["GET"])
455+
@rate_limited(max_per_minute=60, key_fn=_api_rate_key, scope="api_query")
436456
def query_estimation_inputs():
437457
"""Return estimation input artifacts for a result UUID as a tar.gz archive."""
438458
require_api_key()
@@ -472,6 +492,7 @@ def query_estimation_inputs():
472492

473493

474494
@api_bp.route("/api/query/estimate", methods=["GET"])
495+
@rate_limited(max_per_minute=60, key_fn=_api_rate_key, scope="api_query")
475496
def query_estimate():
476497
"""Return one estimate JSON document identified by UUID."""
477498
require_api_key()

0 commit comments

Comments
 (0)