
Commit 5545b15

Deploying to gh-pages from @ dstackai/dstack@d622241 🚀
1 parent 789b7bf commit 5545b15

11 files changed: +4616 -326 lines changed

blog/benchmarking-pd-ratios/index.html

Lines changed: 4141 additions & 0 deletions
Large diffs are not rendered by default.

blog/benchmarks/index.html

Lines changed: 77 additions & 0 deletions
@@ -3529,6 +3529,17 @@
 </label>
 <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>

+<li class="md-nav__item">
+<a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+<span class="md-ellipsis">
+
+Benchmarking Prefill–Decode ratios: fixed vs dynamic
+
+</span>
+</a>
+
+</li>
+
 <li class="md-nav__item">
 <a href="#benchmarking-amd-gpus-bare-metal-vms" class="md-nav__link">
 <span class="md-ellipsis">
@@ -3713,6 +3724,17 @@
 </label>
 <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>

+<li class="md-nav__item">
+<a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+<span class="md-ellipsis">
+
+Benchmarking Prefill–Decode ratios: fixed vs dynamic
+
+</span>
+</a>
+
+</li>
+
 <li class="md-nav__item">
 <a href="#benchmarking-amd-gpus-bare-metal-vms" class="md-nav__link">
 <span class="md-ellipsis">
@@ -3802,6 +3824,17 @@
 </label>
 <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>

+<li class="md-nav__item">
+<a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+<span class="md-ellipsis">
+
+Benchmarking Prefill–Decode ratios: fixed vs dynamic
+
+</span>
+</a>
+
+</li>
+
 <li class="md-nav__item">
 <a href="#benchmarking-amd-gpus-bare-metal-vms" class="md-nav__link">
 <span class="md-ellipsis">
@@ -3875,6 +3908,50 @@ <h1 id="benchmarks">Benchmarks<a class="headerlink" href="#benchmarks" title="Pe
 <article class="md-post md-post--excerpt">
 <header class="md-post__header">

+<div class="md-post__meta md-meta">
+<ul class="md-meta__list">
+<li class="md-meta__item">
+<time datetime="2025-09-25 00:00:00+00:00">September 25, 2025</time></li>
+
+<li class="md-meta__item">
+in
+
+<a href="./" class="md-meta__link">Benchmarks</a></li>
+
+
+
+<li class="md-meta__item">
+
+6 min read
+
+</li>
+
+
+</ul>
+
+</div>
+</header>
+<div class="md-post__content md-typeset">
+<h2 id="benchmarking-prefilldecode-ratios-fixed-vs-dynamic"><a class="toclink" href="../benchmarking-pd-ratios/">Benchmarking Prefill–Decode ratios: fixed vs dynamic</a></h2>
+<p>This benchmark investigates whether the Prefill–Decode worker ratio needs to be managed dynamically at runtime, or if a fixed split can deliver the same performance with simpler orchestration.<br />
+We evaluate different ratios across workload profiles and concurrency levels to measure their impact on TTFT, ITL, and throughput, and to see whether fixing the ratio in advance is a practical alternative to dynamic adjustment.</p>
+<p><img src="https://dstack.ai/static-assets/static-assets/images/benchmarking-pd-ratios.png" width="630" /></p>
+
+
+<nav class="md-post__action">
+<a href="../benchmarking-pd-ratios/">
+<span>Continue reading</span>
+<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
+</a>
+</nav>
+
+
+</div>
+</article>
+
+<article class="md-post md-post--excerpt">
+<header class="md-post__header">
+
 <div class="md-post__meta md-meta">
 <ul class="md-meta__list">
 <li class="md-meta__item">

blog/index.html

Lines changed: 66 additions & 66 deletions
@@ -3374,6 +3374,17 @@
 </label>
 <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>

+<li class="md-nav__item">
+<a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+<span class="md-ellipsis">
+
+Benchmarking Prefill–Decode ratios: fixed vs dynamic
+
+</span>
+</a>
+
+</li>
+
 <li class="md-nav__item">
 <a href="#nebius-joins-dstack-sky-gpu-marketplace-with-production-ready-gpu-clusters" class="md-nav__link">
 <span class="md-ellipsis">
@@ -3471,17 +3482,6 @@
 </span>
 </a>

-</li>
-
-<li class="md-nav__item">
-<a href="#how-ea-uses-dstack-to-fast-track-ai-development" class="md-nav__link">
-<span class="md-ellipsis">
-
-How EA uses dstack to fast-track AI development
-
-</span>
-</a>
-
 </li>

 </ul>
@@ -3747,6 +3747,17 @@
 </label>
 <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>

+<li class="md-nav__item">
+<a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+<span class="md-ellipsis">
+
+Benchmarking Prefill–Decode ratios: fixed vs dynamic
+
+</span>
+</a>
+
+</li>
+
 <li class="md-nav__item">
 <a href="#nebius-joins-dstack-sky-gpu-marketplace-with-production-ready-gpu-clusters" class="md-nav__link">
 <span class="md-ellipsis">
@@ -3844,17 +3855,6 @@
 </span>
 </a>

-</li>
-
-<li class="md-nav__item">
-<a href="#how-ea-uses-dstack-to-fast-track-ai-development" class="md-nav__link">
-<span class="md-ellipsis">
-
-How EA uses dstack to fast-track AI development
-
-</span>
-</a>
-
 </li>

 </ul>
@@ -3881,6 +3881,50 @@ <h1 id="blog">Blog<a class="headerlink" href="#blog" title="Permanent link">&par
 <article class="md-post md-post--excerpt">
 <header class="md-post__header">

+<div class="md-post__meta md-meta">
+<ul class="md-meta__list">
+<li class="md-meta__item">
+<time datetime="2025-09-25 00:00:00+00:00">September 25, 2025</time></li>
+
+<li class="md-meta__item">
+in
+
+<a href="benchmarks/" class="md-meta__link">Benchmarks</a></li>
+
+
+
+<li class="md-meta__item">
+
+6 min read
+
+</li>
+
+
+</ul>
+
+</div>
+</header>
+<div class="md-post__content md-typeset">
+<h2 id="benchmarking-prefilldecode-ratios-fixed-vs-dynamic"><a class="toclink" href="benchmarking-pd-ratios/">Benchmarking Prefill–Decode ratios: fixed vs dynamic</a></h2>
+<p>This benchmark investigates whether the Prefill–Decode worker ratio needs to be managed dynamically at runtime, or if a fixed split can deliver the same performance with simpler orchestration.<br />
+We evaluate different ratios across workload profiles and concurrency levels to measure their impact on TTFT, ITL, and throughput, and to see whether fixing the ratio in advance is a practical alternative to dynamic adjustment.</p>
+<p><img src="https://dstack.ai/static-assets/static-assets/images/benchmarking-pd-ratios.png" width="630" /></p>
+
+
+<nav class="md-post__action">
+<a href="benchmarking-pd-ratios/">
+<span>Continue reading</span>
+<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
+</a>
+</nav>
+
+
+</div>
+</article>
+
+<article class="md-post md-post--excerpt">
+<header class="md-post__header">
+
 <div class="md-post__meta md-meta">
 <ul class="md-meta__list">
 <li class="md-meta__item">
@@ -4290,50 +4334,6 @@ <h5 id="rolling-deployments"><a class="toclink" href="changelog-07-25/#rolling-d
 </div>
 </article>

-<article class="md-post md-post--excerpt">
-<header class="md-post__header">
-
-<div class="md-post__meta md-meta">
-<ul class="md-meta__list">
-<li class="md-meta__item">
-<time datetime="2025-05-22 00:00:00+00:00">May 22, 2025</time></li>
-
-<li class="md-meta__item">
-in
-
-<a href="case-studies/" class="md-meta__link">Case studies</a></li>
-
-
-
-<li class="md-meta__item">
-
-3 min read
-
-</li>
-
-
-</ul>
-
-</div>
-</header>
-<div class="md-post__content md-typeset">
-<h2 id="how-ea-uses-dstack-to-fast-track-ai-development"><a class="toclink" href="ea-gtc25/">How EA uses dstack to fast-track AI development</a></h2>
-<p>At NVIDIA GTC 2025, Electronic Arts <a href="https://www.nvidia.com/en-us/on-demand/session/gtc25-s73667/" target="_blank">shared <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a> how they’re scaling AI development and managing infrastructure across teams. They highlighted using tools like <code>dstack</code> to provision GPUs quickly, flexibly, and cost-efficiently. This case study summarizes key insights from their talk.</p>
-<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-ea-slide-1.png" width="630" /></p>
-<p>EA has over 100+ AI projects running, and the number keeps growing. There are many teams with AI needs—game dev, ML engineers, AI researchers, and platform teams—supported by a central tech team. Some need full MLOps support; others have in-house expertise but need flexible tooling and infrastructure.</p>
-
-
-<nav class="md-post__action">
-<a href="ea-gtc25/">
-<span>Continue reading</span>
-<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
-</a>
-</nav>
-
-
-</div>
-</article>
-

blog/nebius-in-dstack-sky/index.html

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@
 <link rel="prev" href="../state-of-cloud-gpu-2025/">


+<link rel="next" href="../benchmarking-pd-ratios/">
+
