Merged
4 changes: 2 additions & 2 deletions docs/cx/BENCHKIT_GAP_ANALYSIS.md
@@ -58,7 +58,7 @@ Continuous estimation has now moved beyond a mere entry point: a common estimati
However, estimation is not yet broadly deployed across multiple applications, and AI-driven optimization integration remains mostly at the integration-point stage.

As of the current repository survey, BenchKit has six benchmark applications with `build.sh`/`run.sh`, but only `qws` has an `estimate.sh`.
The result portal also already has a meaningful test base (`result_server/tests`: 30 `test_*.py` modules), and the repository now has a repo-local Python dependency manifest, a standard portal test entrypoint under `result_server/tests`, and a lightweight GitHub Actions verification path for portal-oriented changes.
The result portal also already has a meaningful test base (`result_server/tests`: 32 `test_*.py` modules), and the repository now has a repo-local Python dependency manifest, a standard portal test entrypoint under `result_server/tests`, and a lightweight GitHub Actions verification path for portal-oriented changes.
The main GitLab pipeline still intentionally skips heavy benchmark execution when a direct or manually triggered GitLab pipeline sees changes limited to `result_server/**/*` or portal display metadata such as `config/system_info.csv`. Protected-branch synchronization itself uses `ci.skip`, so the dedicated lightweight GitHub Actions path should continue to be kept in sync as portal-side files evolve.

## 2.1 Explicit Design Debts to Keep Visible
@@ -296,7 +296,7 @@ Once the estimation specification is clarified, many other design decisions beco

In this codebase survey, the verification path for `result_server` emerged as the next practical bottleneck after performance estimation.

- `result_server/tests` contains 30 `test_*.py` modules, so the portal side is already something that needs to be verified
- `result_server/tests` contains 32 `test_*.py` modules, so the portal side is already something that needs to be verified
- `requirements-result-server.txt` provides a repo-local dependency definition, and `result_server/tests/run_result_server_tests.py` can be used as the standard test entrypoint
- A lightweight GitHub Actions workflow for portal-oriented changes is provided as `.github/workflows/result-server-tests.yml`
- `.gitlab-ci.yml` skips the heavy benchmark pipeline in directly or manually triggered GitLab pipelines when changes touch only `result_server/**/*` or `config/system_info.csv`. Because protected-branch synchronization itself uses `ci.skip`, the GitHub Actions path filter must be kept in sync with the actual portal-related files
47 changes: 47 additions & 0 deletions docs/deploy/hardening-guide.md
@@ -0,0 +1,47 @@
# Result Portal Hardening Guide

This checklist covers production-facing `result_server` deployments.

## Request Limits

The portal enforces an application-level request body limit:

```text
RESULT_SERVER_MAX_UPLOAD_MB=512
```

Estimation input archives are also checked member by member against a per-member size limit:

```text
RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB=1024
```

Set these values to match the largest expected PA Data or estimation input
archive. Keep the reverse proxy body limit at or below the Flask limit so that
oversized uploads are rejected before they consume worker memory.
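
The per-member check can be illustrated with a minimal sketch. It assumes the limit is compared against the size declared in each tar header; the helper names `oversized_members` and `build_archive` are hypothetical and are not part of `result_server`.

```python
import io
import tarfile

MB = 1024 * 1024


def oversized_members(archive_bytes, max_member_mb):
    """Return names of tar members whose declared size exceeds the limit."""
    limit = max_member_mb * MB
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        return [member.name for member in tar.getmembers() if member.size > limit]


def build_archive(files):
    """Build an in-memory tar.gz from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# One small member and one 2 MB member, checked against a 1 MB limit.
archive = build_archive({"small.txt": b"x" * 10, "big.bin": b"\0" * (2 * MB)})
print(oversized_members(archive, max_member_mb=1))
```

Checking the declared size before extraction lets an oversized member be rejected without writing any of its bytes to disk.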

## Rate Limits

API ingest/query routes and admin write routes use Redis-backed fixed-window
rate limits. Production deployments must keep Redis monitored and available;
when Redis is required but unavailable, protected operations fail closed with a
503 response.

Default limits:

- API ingest: 120 requests per runner per minute
- API query: 60 requests per runner per minute
- Admin write actions: 20 requests per admin user per minute
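
As a rough illustration of fixed-window limiting with fail-closed behavior, the sketch below uses an in-memory counter in place of Redis. Everything in it is an assumption made for illustration: the class and function names are hypothetical, the key layout is invented, and the 429 status for over-limit requests is assumed. Only the 503 on an unavailable store comes from the text above.

```python
import time


class CounterStoreUnavailable(Exception):
    """Raised when the counter store cannot be reached (stands in for a Redis outage)."""


class InMemoryCounterStore:
    """Hypothetical stand-in for Redis that counts hits per key."""

    def __init__(self, available=True):
        self.available = available
        self._counts = {}

    def incr(self, key):
        if not self.available:
            raise CounterStoreUnavailable(key)
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


def check_rate_limit(store, scope, subject, max_per_minute, now=None):
    """Return 200 if allowed, 429 if over the limit, 503 if the store is down."""
    now = time.time() if now is None else now
    window = int(now // 60)  # fixed window: one counter per clock minute
    key = f"{scope}:{subject}:{window}"
    try:
        count = store.incr(key)
    except CounterStoreUnavailable:
        return 503  # fail closed instead of letting the request through
    return 200 if count <= max_per_minute else 429


store = InMemoryCounterStore()
statuses = [
    check_rate_limit(store, "admin_write", "admin@localhost", 20, now=0)
    for _ in range(21)
]
print(statuses.count(200), statuses[-1])
```

A fixed window is simple to reason about, but it allows short bursts of up to twice the limit across a window boundary, which is worth keeping in mind when tuning the defaults.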

## Reverse Proxy

Run the Flask app behind a reverse proxy that terminates TLS and forwards
requests to the app over the loopback interface only. Keep `/admin/` and
`/auth/` protected by portal authentication; `robots.txt` only reduces crawler
noise and is not an access control mechanism.

## Gunicorn

Use the repository `gunicorn.conf.py` as the baseline process manager
configuration. It binds to `127.0.0.1:8800` by default, sets worker timeouts,
and enables `max_requests` recycling so that workers are restarted periodically
instead of running indefinitely.
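
The sketch below shows how a few of these settings resolve from the environment, mirroring the defaults in the `gunicorn.conf.py` added by this change. The helper `resolve_gunicorn_settings` is hypothetical and exists only to make the override behavior easy to inspect; the real configuration reads `os.environ` directly at import time.

```python
def resolve_gunicorn_settings(environ, cpu_count):
    """Mirror a few gunicorn.conf.py defaults for a given environment mapping."""
    return {
        "bind": environ.get("RESULT_SERVER_BIND", "127.0.0.1:8800"),
        "workers": int(environ.get("RESULT_SERVER_WORKERS", str(cpu_count * 2 + 1))),
        "timeout": int(environ.get("RESULT_SERVER_TIMEOUT", "60")),
        "max_requests": int(environ.get("RESULT_SERVER_MAX_REQUESTS", "1000")),
    }


# With no overrides, a 4-core host resolves to 4 * 2 + 1 workers.
defaults = resolve_gunicorn_settings({}, cpu_count=4)
tuned = resolve_gunicorn_settings(
    {"RESULT_SERVER_WORKERS": "3", "RESULT_SERVER_TIMEOUT": "120"}, cpu_count=4
)
print(defaults["workers"], tuned["workers"], tuned["timeout"])
```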
1 change: 1 addition & 0 deletions docs/guides/developer-reference.md
@@ -220,6 +220,7 @@ For production portal deployments:
- The legacy `RESULT_SERVER_KEY` variable is still accepted as runner `default` for compatibility, but should be rotated to `RESULT_SERVER_KEYS`.
- See `docs/deploy/key-management.md` for generation and rotation guidance.
- `REDIS_URL` must point to a monitored Redis instance; production authentication refuses login when Redis is unavailable.
- API ingest and query endpoints use Redis-backed rate limits by default; set `RESULT_SERVER_MAX_UPLOAD_MB` and `RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB` when deployment-specific upload limits are needed.
- `app_dev.py` is localhost-only, uses ephemeral development secrets when none are provided, and enables the Werkzeug debugger only with `RESULT_SERVER_DEV_DEBUG=1`.
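
For the upload-limit variables mentioned in the list above, the following sketch shows how megabyte values from the environment translate into byte limits. It assumes the defaults of 512 and 1024 MB used elsewhere in this change; the function name `resolve_upload_limits` is hypothetical.

```python
MB = 1024 * 1024


def resolve_upload_limits(environ):
    """Convert the MB-valued environment settings into byte limits."""
    max_upload_mb = int(environ.get("RESULT_SERVER_MAX_UPLOAD_MB", 512))
    max_member_mb = int(environ.get("RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB", 1024))
    return {
        "MAX_CONTENT_LENGTH": max_upload_mb * MB,
        "MAX_ARCHIVE_MEMBER_SIZE": max_member_mb * MB,
    }


default_limits = resolve_upload_limits({})
custom_limits = resolve_upload_limits({"RESULT_SERVER_MAX_UPLOAD_MB": "64"})
print(default_limits["MAX_CONTENT_LENGTH"], custom_limits["MAX_CONTENT_LENGTH"])
```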

### Result Quality Visibility
23 changes: 23 additions & 0 deletions gunicorn.conf.py
@@ -0,0 +1,23 @@
"""Reference Gunicorn configuration for result_server deployments."""

import multiprocessing
import os


bind = os.environ.get("RESULT_SERVER_BIND", "127.0.0.1:8800")
workers = int(
    os.environ.get("RESULT_SERVER_WORKERS", str(multiprocessing.cpu_count() * 2 + 1))
)
worker_class = "sync"
timeout = int(os.environ.get("RESULT_SERVER_TIMEOUT", "60"))
graceful_timeout = int(os.environ.get("RESULT_SERVER_GRACEFUL_TIMEOUT", "30"))
keepalive = int(os.environ.get("RESULT_SERVER_KEEPALIVE", "5"))
max_requests = int(os.environ.get("RESULT_SERVER_MAX_REQUESTS", "1000"))
max_requests_jitter = int(os.environ.get("RESULT_SERVER_MAX_REQUESTS_JITTER", "50"))
limit_request_line = int(os.environ.get("RESULT_SERVER_LIMIT_REQUEST_LINE", "8190"))
limit_request_field_size = int(
    os.environ.get("RESULT_SERVER_LIMIT_REQUEST_FIELD_SIZE", "8190")
)
accesslog = "-"
errorlog = "-"
loglevel = os.environ.get("RESULT_SERVER_LOG_LEVEL", "info")
22 changes: 21 additions & 1 deletion result_server/app.py
@@ -2,7 +2,7 @@
import sys
from datetime import timedelta

from flask import Flask, render_template
from flask import Flask, jsonify, render_template
from flask_session import Session

from routes.api import api_bp
@@ -14,6 +14,8 @@
from utils.preflight import validate_production_config


DEFAULT_MAX_UPLOAD_MB = 512
DEFAULT_MAX_ARCHIVE_MEMBER_MB = 1024
INGEST_KEYS = parse_ingest_keys()
PREFLIGHT_ERRORS = validate_production_config(os.environ, INGEST_KEYS)

@@ -75,6 +77,23 @@ def _configure_result_directories(app, base_dir):
    app.config.update(dir_map)


def _configure_upload_limits(app):
    """Configure request and archive size limits for ingest endpoints."""
    max_upload_mb = int(os.environ.get("RESULT_SERVER_MAX_UPLOAD_MB", DEFAULT_MAX_UPLOAD_MB))
    max_member_mb = int(
        os.environ.get("RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB", DEFAULT_MAX_ARCHIVE_MEMBER_MB)
    )
    app.config["MAX_CONTENT_LENGTH"] = max_upload_mb * 1024 * 1024
    app.config["MAX_ARCHIVE_MEMBER_SIZE"] = max_member_mb * 1024 * 1024

    @app.errorhandler(413)
    def payload_too_large(_error):
        return jsonify(
            error="Payload too large",
            limit_mb=app.config["MAX_CONTENT_LENGTH"] // 1024 // 1024,
        ), 413


def _register_portal_blueprints(app, prefix):
    """Register all portal blueprints using the given URL prefix."""
    from routes.admin import admin_bp
@@ -107,6 +126,7 @@ def create_app(prefix="", base_dir=None):
    _configure_user_store(app)
    _configure_totp_issuer(app, prefix)
    _configure_result_directories(app, base_dir)
    _configure_upload_limits(app)
    init_csrf(app, exempt_blueprints=(api_bp,))

    register_home_routes(app, prefix=prefix)
21 changes: 20 additions & 1 deletion result_server/app_dev.py
@@ -23,6 +23,8 @@
from datetime import datetime, timedelta

LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}
DEFAULT_MAX_UPLOAD_MB = 512
DEFAULT_MAX_ARCHIVE_MEMBER_MB = 1024


def setup_dev_environment(base_dir):
@@ -155,7 +157,7 @@ def create_dev_app(base_dir):
    sys.modules["redis"] = types.ModuleType("redis")
    sys.modules["utils.totp_manager"] = _create_stub_totp_manager()

    from flask import Flask, render_template
    from flask import Flask, jsonify, render_template
    from flask_session import Session

    from routes.home import register_home_routes
@@ -173,9 +175,26 @@ def create_dev_app(base_dir):
        SESSION_PERMANENT=False,
        AUTH_REQUIRES_REDIS=False,
        INGEST_KEYS=parse_ingest_keys(),
        MAX_CONTENT_LENGTH=(
            int(os.environ.get("RESULT_SERVER_MAX_UPLOAD_MB", DEFAULT_MAX_UPLOAD_MB))
            * 1024
            * 1024
        ),
        MAX_ARCHIVE_MEMBER_SIZE=(
            int(os.environ.get("RESULT_SERVER_MAX_ARCHIVE_MEMBER_MB", DEFAULT_MAX_ARCHIVE_MEMBER_MB))
            * 1024
            * 1024
        ),
    )
    Session(app)

    @app.errorhandler(413)
    def payload_too_large(_error):
        return jsonify(
            error="Payload too large",
            limit_mb=app.config["MAX_CONTENT_LENGTH"] // 1024 // 1024,
        ), 413

    # Register a default local admin user.
    stub_store = _StubUserStore()
    stub_store.create_user("admin@localhost", "DEVDEVDEVDEVDEVDEV", ["dev", "admin"])
10 changes: 10 additions & 0 deletions result_server/routes/admin.py
@@ -15,6 +15,7 @@
)

from utils.user_store import get_user_store
from utils.rate_limit import rate_limited

admin_bp = Blueprint("admin", __name__, url_prefix="/admin")

@@ -33,6 +34,11 @@ def _render_users_page(invitation_url=None):
    return render_template("admin_users.html", users=all_users, invitation_url=invitation_url)


def _admin_rate_key(_request):
    """Return the session-scoped admin rate-limit key."""
    return f"admin:{session.get('user_email', 'anon')}"


def admin_required(f):
    """Allow access only to authenticated users with the admin affiliation."""

@@ -57,6 +63,7 @@ def users():

@admin_bp.route("/users/add", methods=["POST"])
@admin_required
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
def add_user():
    """Create a user invitation and show the generated invitation URL."""
    store = get_user_store()
@@ -81,6 +88,7 @@ def add_user():

@admin_bp.route("/users/<path:email>/delete", methods=["POST"])
@admin_required
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
def delete_user(email):
    """Delete a user unless the current admin targets their own account."""
    if email == session.get("user_email"):
@@ -94,6 +102,7 @@ def delete_user(email):

@admin_bp.route("/users/<path:email>/affiliations", methods=["POST"])
@admin_required
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
def update_affiliations(email):
    """Update the affiliations stored for a user."""
    store = get_user_store()
@@ -106,6 +115,7 @@ def update_affiliations(email):

@admin_bp.route("/users/<path:email>/reinvite", methods=["POST"])
@admin_required
@rate_limited(max_per_minute=20, key_fn=_admin_rate_key, scope="admin_write")
def reinvite_user(email):
    """Generate a new invitation link after clearing the current TOTP secret."""
    store = get_user_store()
23 changes: 22 additions & 1 deletion result_server/routes/api.py
@@ -12,9 +12,11 @@
from datetime import datetime

from utils.auth import verify_ingest_key
from utils.rate_limit import rate_limited

api_bp = Blueprint("api", __name__)
_TIMESTAMP_RE = re.compile(r"^\d{8}_\d{6}$")
DEFAULT_MAX_ARCHIVE_MEMBER_SIZE = 1024 * 1024 * 1024


# ==========================================
@@ -37,6 +39,12 @@ def require_api_key():
    return runner_id


def _api_rate_key(req):
    """Return the runner-scoped API rate-limit key for a request."""
    runner_id = verify_ingest_key(req.headers.get("X-API-Key", "")) or "unknown"
    return f"runner:{runner_id}"


def save_json_file(data, prefix, out_dir, given_uuid=None):
    """Persist a JSON payload using atomic file replacement."""
    if given_uuid is not None and not is_valid_uuid(given_uuid):
@@ -173,8 +181,14 @@ def _find_result_file_by_uuid(received_dir, uuid_value):
def _safe_extract_tar_bytes(file_storage, target_dir):
    """Extract uploaded tar bytes with path and member-type checks."""
    os.makedirs(target_dir, exist_ok=True)
    max_member_size = current_app.config.get(
        "MAX_ARCHIVE_MEMBER_SIZE",
        DEFAULT_MAX_ARCHIVE_MEMBER_SIZE,
    )
    with tarfile.open(fileobj=file_storage.stream, mode="r:*") as tar:
        for member in tar.getmembers():
            if member.size > max_member_size:
                abort(400, description="Archive member too large")
            normalized = os.path.normpath(member.name)
            drive, _ = os.path.splitdrive(normalized)
            if (
@@ -233,6 +247,7 @@ def _replace_directory_after_success(source_dir, target_dir):
# ==========================================

@api_bp.route("/api/ingest/result", methods=["POST"])
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
def ingest_result():
    """Receive and persist a collected result JSON payload."""
    require_api_key()
@@ -245,6 +260,7 @@ def ingest_result():


@api_bp.route("/api/ingest/estimate", methods=["POST"])
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
def ingest_estimate():
    """Receive and persist an estimated-result JSON payload."""
    require_api_key()
@@ -262,6 +278,7 @@ def ingest_estimate():


@api_bp.route("/api/ingest/padata", methods=["POST"])
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
def ingest_padata():
    """Receive and store a PA Data archive."""
    require_api_key()
@@ -296,7 +313,7 @@ def ingest_padata():

    tmp_path = save_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(uploaded_file.read())
        shutil.copyfileobj(uploaded_file.stream, f, length=1024 * 1024)
        f.flush()
        os.fsync(f.fileno())
    os.rename(tmp_path, save_path)
@@ -312,6 +329,7 @@ def ingest_padata():


@api_bp.route("/api/ingest/estimation-inputs", methods=["POST"])
@rate_limited(max_per_minute=120, key_fn=_api_rate_key, scope="api_ingest")
def ingest_estimation_inputs():
    """Estimation input archive (tgz) upload and expansion."""
    require_api_key()
@@ -356,6 +374,7 @@ def ingest_estimation_inputs():
# ==========================================

@api_bp.route("/api/query/result", methods=["GET"])
@rate_limited(max_per_minute=60, key_fn=_api_rate_key, scope="api_query")
def query_result():
    """Search results by uuid or by system/code/exp and return one result.

@@ -433,6 +452,7 @@ def query_result():


@api_bp.route("/api/query/estimation-inputs", methods=["GET"])
@rate_limited(max_per_minute=60, key_fn=_api_rate_key, scope="api_query")
def query_estimation_inputs():
    """Return estimation input artifacts for a result UUID as a tar.gz archive."""
    require_api_key()
@@ -472,6 +492,7 @@ def query_estimation_inputs():


@api_bp.route("/api/query/estimate", methods=["GET"])
@rate_limited(max_per_minute=60, key_fn=_api_rate_key, scope="api_query")
def query_estimate():
    """Return one estimate JSON document identified by UUID."""
    require_api_key()