ffengc/edge-faas-cpp


Edge FaaS — Cold-Start Mitigation for Edge Serverless

中文 | English

CSCI 599: Network Systems for Cloud Computing — Spring 2026
Advisor: Prof. Ramesh Govindan · USC

📄 Paper (PDF)


Edge inference nodes cannot afford MicroVMs such as Firecracker; process pools backed by Copy-on-Write fork are the more realistic isolation choice. This project implements a lightweight FaaS gateway in C++ and Python that cuts cold starts by 31% relative to a reactive baseline while keeping predictor inference latency at 191 μs (p99).

Headline results

| Metric | Value (n=5, 95% CI) |
| --- | --- |
| Cold-start reduction (vs Reactive baseline) | −31% (48 ± 19 vs 70 ± 1) |
| Cold-start reduction (vs ARIMA baseline) | −26% (48 ± 19 vs 64 ± 12) |
| Worker spin-up speed-up (CoW template) | ~900 ms → ~100 ms (about 9×) |
| Predictor inference latency, p99 | 191 μs (more than two orders of magnitude below ARIMA's 72 ms) |
| Inter-burst sweet spot | W = 20–60 s (both fixed and adaptive predictors reach near-zero cold starts) |

Two core ideas

1. Copy-on-Write template process

At gateway startup we fork a single Python "template" process that imports heavyweight dependencies (Pillow, OpenCV, …) once. Every subsequent worker is created via os.fork() from the template, inheriting its address space through Linux Copy-on-Write — the fork itself is nearly free, and only pages that are actually written get copied.

→ No container runtime, no image registry, no VM snapshot.
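The fork-from-template idea can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: `load_heavy_dependencies`, `spawn_worker`, and `encode` are hypothetical names, `json` stands in for a heavyweight import such as Pillow, and the worker is a one-shot child rather than a member of a long-lived pool.

```python
import os
import pickle


def load_heavy_dependencies():
    """Stand-in for the expensive one-time imports (Pillow, OpenCV, ...)."""
    import json  # hypothetical placeholder for a heavyweight module
    return {"codec": json}


def spawn_worker(handler, payload):
    """Fork a worker from the current (template) process and return its result.

    The child inherits the template's already-imported modules through
    Copy-on-Write, so it never pays the import cost again.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: run the handler, ship the result back, exit
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(handler(payload), out)
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    os.close(write_fd)  # parent: read the result, then reap the child
    with os.fdopen(read_fd, "rb") as src:
        result = pickle.load(src)
    os.waitpid(pid, 0)
    return result


DEPS = load_heavy_dependencies()  # paid once, in the template process


def encode(payload):
    # Uses DEPS inherited from the template; nothing is re-imported here.
    return DEPS["codec"].dumps(payload)
```

Calling `spawn_worker(encode, {"a": 1})` runs `encode` in a forked child that sees `DEPS` without importing anything. Only the pages the child writes to are copied; everything else stays shared with the template.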

2. EWMA + CUSUM + Little's Law predictive control plane

  • EWMA tracks the periodic baseline rate (α = 0.2, τ ≈ 10 s)
  • CUSUM detects sustained deviation: it fires during the pre-spike traffic ramp instead of waiting for the spike to land
  • Little's Law (N = ⌈λ × T⌉ + 1) maps the predicted arrival rate to a target worker count

The full predictor is O(1) in time and space, with a p99 inference latency of 191 μs.
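The three pieces combine into a constant-state update rule. The sketch below is written in Python for brevity and rests on several assumptions: only α = 0.2 and the formula N = ⌈λ × T⌉ + 1 come from this README. The class name `Predictor`, the CUSUM slack and threshold, the mean service time T = 0.5 s, and the choice to provision for the latest observed rate once the alarm trips are all illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Predictor:
    alpha: float = 0.2         # EWMA smoothing factor (value from the README)
    slack: float = 1.0         # CUSUM allowance in req/s (assumed)
    threshold: float = 5.0     # CUSUM alarm level (assumed)
    service_time: float = 0.5  # mean seconds per request, T (assumed)
    baseline: float = 0.0      # EWMA estimate of the arrival rate
    cusum: float = 0.0         # one-sided upward CUSUM statistic
    primed: bool = False       # becomes True after the first sample

    def update(self, rate):
        """Consume one observed arrival rate (req/s); return target workers."""
        if not self.primed:
            self.baseline, self.primed = float(rate), True
        # Accumulate only the excess above baseline + slack, floored at zero.
        self.cusum = max(0.0, self.cusum + rate - self.baseline - self.slack)
        alarm = self.cusum > self.threshold
        # The EWMA baseline follows the observed rate slowly.
        self.baseline += self.alpha * (rate - self.baseline)
        # Assumed policy: on alarm, trust the latest rate over the baseline.
        predicted = max(rate, self.baseline) if alarm else self.baseline
        # Little's Law plus one spare worker: N = ceil(lambda * T) + 1.
        return math.ceil(predicted * self.service_time) + 1
```

Feeding it the hypothetical rates 10, 10, 14, 18 req/s yields targets of 6, 6, 7, and 10 workers. The alarm trips on the fourth tick, while the ramp is still climbing. The state is three scalars, so each update does a fixed amount of work regardless of history length.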

System architecture

Experimental results

The four figures below come from the paper experiments (n=5 trials, 95% CI). See the Final Report (PDF) for the full analysis.


Main result: Fixed CUSUM averages 48 ± 19 cold starts vs 70 ± 1 for Reactive (−31%). Adaptive reaches 0 cold starts at the W = 35 s sweet spot but breaks at short W (see lower-left).

CoW × Predictor ablation: disabling CoW costs Reactive +69 cold starts (CoW saves the day). Fixed CUSUM scales conservatively, so the CoW effect on it is statistically inconclusive.

Warmup sweep: Adaptive collapses at W ≤ 10 s (τ_σ cliff, 195 cold starts at W = 5 s); W = 20–60 s is the sweet spot.

Pareto trade-off: Fixed CUSUM lies on the PSS / cold-start frontier. We use PSS (not RSS) to avoid double-counting CoW-shared pages, which inflates the total by ~2.3×.
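PSS (proportional set size) charges each shared page to a process in proportion to how many processes map it, whereas RSS charges the full page to every one of them. That is why summing RSS across forked workers overstates the footprint. On Linux both numbers can be read from `/proc/<pid>/smaps_rollup`. The helper names below are hypothetical and this is not necessarily how the repository collects its measurements. psutil, which is in the dependency list, exposes the same value as `Process.memory_full_info().pss`.

```python
def parse_smaps_rollup(text):
    """Return {'Rss': kB, 'Pss': kB} parsed from smaps_rollup-formatted text."""
    wanted = {"Rss", "Pss"}
    sizes = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:  # exact match, so Pss_Anon and friends are skipped
            sizes[key] = int(rest.split()[0])  # the kernel reports kB
    return sizes


def read_memory_kb(pid="self"):
    """Read RSS and PSS for one process (Linux 4.14 or newer)."""
    with open(f"/proc/{pid}/smaps_rollup") as f:
        return parse_smaps_rollup(f.read())
```

Summing `read_memory_kb(pid)["Pss"]` over the template and its workers gives a total in which every shared page is counted once.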

Build & Run

Dependencies:

sudo apt install build-essential
pip install Pillow statsmodels psutil scipy

Build:

make clean && make

Launch the server (pick one of five modes):

./server ewma            # EWMA + Fixed CUSUM — main method (default)
./server ewma_adaptive   # Standardized CUSUM — z-score variant
./server reactive        # Reactive baseline (no prediction)
./server static 15       # Static-15 baseline (15 pinned workers)
./server arima           # ARIMA(2,1,2) baseline

# Disable the CoW template (for ablation experiments)
./server ewma --no-cow

Generate load (defaults to a 4-cycle Bursty-Ramp workload):

python3 load_tester.py

Optional flags:

python3 load_tester.py --warmup-c234 60      # tune inter-burst interval
python3 load_tester.py --spike-rps 100       # tune spike RPS magnitude
python3 load_tester.py --no-ramp             # step workload (no ramp signal)

Reproducing the paper experiments

# (1) Main experiment: 5 modes × n=5 trials × CoW={ON,OFF} = 50 runs, ~3 h
./run_multi_trial.sh 5 --ablation

# (2) Warmup sweep: 6 W × 2 modes × n=5 = 60 runs, ~3 h
./run_sweep.sh --trials 5

# (3) Step workload (no ramp): 5 modes × n=3 = 15 runs, ~50 min
./run_multi_trial.sh 3 --no-ramp --tag step

# (4) Aggregate with 95% CI (emits markdown table + summary.csv + per_cycle.csv)
python3 analyze_trials.py logs/<campaign_dir>/

# (5) Render the paper figures (reads summary.csv)
python3 figures/plot_main_n5.py    logs/<main_campaign>/
python3 figures/plot_ablation.py    logs/<main_campaign>/
python3 figures/plot_pareto_pss.py  logs/<main_campaign>/
python3 figures/plot_sweep_n5.py    logs/<sweep_campaign>/
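
Step (4) reports each metric as mean ± 95% CI over a handful of trials. The README does not say which interval `analyze_trials.py` computes, so the sketch below assumes a Student-t interval, the usual choice for small n. The function name `mean_ci95` and the sample values in the usage note are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def mean_ci95(samples):
    """Return (mean, half_width) of a 95% t-interval for 2 to 10 samples."""
    n = len(samples)
    if n - 1 not in T_CRIT_95:
        raise ValueError("this sketch only tabulates n = 2..10")
    mean = statistics.mean(samples)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, T_CRIT_95[n - 1] * sem
```

For example, `mean_ci95([10, 12, 14, 16, 18])` returns a mean of 14 and a half-width of roughly 3.93. With n = 5 the critical value is 2.776 rather than the large-sample 1.96, which is one reason small-n intervals such as 48 ± 19 come out wide.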

Project journey

For the full development trace — proposal, Check-in #1, Check-in #2, class presentation — including design decisions and lessons learned:

About

USC CS599/656 | Edge FaaS — Cold-Start Mitigation for Edge Serverless🚀🚀🚀