<h2 id="benchmarking-prefilldecode-ratios-fixed-vs-dynamic"><a class="toclink" href="../benchmarking-pd-ratios/">Benchmarking Prefill–Decode ratios: fixed vs dynamic</a></h2>
<p>This benchmark investigates whether the Prefill–Decode worker ratio needs to be managed dynamically at runtime, or if a fixed split can deliver the same performance with simpler orchestration.<br/>
We evaluate different ratios across workload profiles and concurrency levels to measure their impact on TTFT, ITL, and throughput, and to see whether fixing the ratio in advance is a practical alternative to dynamic adjustment.</p>
<h2 id="benchmarking-prefilldecode-ratios-fixed-vs-dynamic"><a class="toclink" href="benchmarking-pd-ratios/">Benchmarking Prefill–Decode ratios: fixed vs dynamic</a></h2>
<p>This benchmark investigates whether the Prefill–Decode worker ratio needs to be managed dynamically at runtime, or if a fixed split can deliver the same performance with simpler orchestration.<br/>
We evaluate different ratios across workload profiles and concurrency levels to measure their impact on TTFT, ITL, and throughput, and to see whether fixing the ratio in advance is a practical alternative to dynamic adjustment.</p>
<h2 id="how-ea-uses-dstack-to-fast-track-ai-development"><a class="toclink" href="ea-gtc25/">How EA uses dstack to fast-track AI development</a></h2>
<p>At NVIDIA GTC 2025, Electronic Arts <a href="https://www.nvidia.com/en-us/on-demand/session/gtc25-s73667/" target="_blank">shared <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a> how they’re scaling AI development and managing infrastructure across teams. They highlighted using tools like <code>dstack</code> to provision GPUs quickly, flexibly, and cost-efficiently. This case study summarizes key insights from their talk.</p>
<p>EA has more than 100 AI projects running, and the number keeps growing. Many teams have AI needs (game developers, ML engineers, AI researchers, and platform teams), all supported by a central tech team. Some need full MLOps support; others have in-house expertise but need flexible tooling and infrastructure.</p>