dstackai
diff --git a/assets/images/social/blog/benchmarking-pd-ratios.png b/assets/images/social/blog/benchmarking-pd-ratios.png
39.2 KB
diff --git a/blog/benchmarking-pd-ratios/index.html b/blog/benchmarking-pd-ratios/index.html
Lines changed: 4141 additions & 0 deletions
diff --git a/blog/benchmarks/index.html b/blog/benchmarks/index.html
Lines changed: 77 additions & 0 deletions
diff --git a/blog/index.html b/blog/index.html
Lines changed: 66 additions & 66 deletions
diff --git a/blog/nebius-in-dstack-sky/index.html b/blog/nebius-in-dstack-sky/index.html
Lines changed: 2 additions & 0 deletions
@@ -3529,6 +3529,17 @@
     </label>
     <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
 
+        <li class="md-nav__item">
+  <a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+    <span class="md-ellipsis">
+      
+        Benchmarking Prefill–Decode ratios: fixed vs dynamic
+      
+    </span>
+  </a>
+  
+</li>
+      
         <li class="md-nav__item">
   <a href="#benchmarking-amd-gpus-bare-metal-vms" class="md-nav__link">
     <span class="md-ellipsis">
@@ -3713,6 +3724,17 @@
     </label>
     <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
 
+        <li class="md-nav__item">
+  <a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+    <span class="md-ellipsis">
+      
+        Benchmarking Prefill–Decode ratios: fixed vs dynamic
+      
+    </span>
+  </a>
+  
+</li>
+      
         <li class="md-nav__item">
   <a href="#benchmarking-amd-gpus-bare-metal-vms" class="md-nav__link">
     <span class="md-ellipsis">
@@ -3802,6 +3824,17 @@
     </label>
     <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
 
+        <li class="md-nav__item">
+  <a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+    <span class="md-ellipsis">
+      
+        Benchmarking Prefill–Decode ratios: fixed vs dynamic
+      
+    </span>
+  </a>
+  
+</li>
+      
         <li class="md-nav__item">
   <a href="#benchmarking-amd-gpus-bare-metal-vms" class="md-nav__link">
     <span class="md-ellipsis">
@@ -3875,6 +3908,50 @@ <h1 id="benchmarks">Benchmarks<a class="headerlink" href="#benchmarks" title="Pe
         <article class="md-post md-post--excerpt">
   <header class="md-post__header">
 
+    <div class="md-post__meta md-meta">
+      <ul class="md-meta__list">
+        <li class="md-meta__item">
+          <time datetime="2025-09-25 00:00:00+00:00">September 25, 2025</time></li>
+        
+          <li class="md-meta__item">
+            in
+            
+              <a href="./" class="md-meta__link">Benchmarks</a></li>
+        
+        
+          
+          <li class="md-meta__item">
+            
+              6 min read
+            
+          </li>
+        
+        
+      </ul>
+      
+    </div>
+  </header>
+  <div class="md-post__content md-typeset">
+    <h2 id="benchmarking-prefilldecode-ratios-fixed-vs-dynamic"><a class="toclink" href="../benchmarking-pd-ratios/">Benchmarking Prefill–Decode ratios: fixed vs dynamic</a></h2>
+<p>This benchmark investigates whether the Prefill–Decode worker ratio needs to be managed dynamically at runtime, or if a fixed split can deliver the same performance with simpler orchestration.<br />
+We evaluate different ratios across workload profiles and concurrency levels to measure their impact on TTFT, ITL, and throughput, and to see whether fixing the ratio in advance is a practical alternative to dynamic adjustment.</p>
+<p><img src="https://dstack.ai/static-assets/static-assets/images/benchmarking-pd-ratios.png" width="630" /></p>
+
+    
+      <nav class="md-post__action">
+        <a href="../benchmarking-pd-ratios/">
+            <span>Continue reading</span>
+            <span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
+        </a>
+      </nav>
+    
+    
+  </div>
+</article>
+      
+        <article class="md-post md-post--excerpt">
+  <header class="md-post__header">
+    
     <div class="md-post__meta md-meta">
       <ul class="md-meta__list">
         <li class="md-meta__item">
 
@@ -3374,6 +3374,17 @@
     </label>
     <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
 
+        <li class="md-nav__item">
+  <a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+    <span class="md-ellipsis">
+      
+        Benchmarking Prefill–Decode ratios: fixed vs dynamic
+      
+    </span>
+  </a>
+  
+</li>
+      
         <li class="md-nav__item">
   <a href="#nebius-joins-dstack-sky-gpu-marketplace-with-production-ready-gpu-clusters" class="md-nav__link">
     <span class="md-ellipsis">
@@ -3471,17 +3482,6 @@
     </span>
   </a>
 
-</li>
-      
-        <li class="md-nav__item">
-  <a href="#how-ea-uses-dstack-to-fast-track-ai-development" class="md-nav__link">
-    <span class="md-ellipsis">
-      
-        How EA uses dstack to fast-track AI development
-      
-    </span>
-  </a>
-  
 </li>
 
     </ul>
@@ -3747,6 +3747,17 @@
     </label>
     <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
 
+        <li class="md-nav__item">
+  <a href="#benchmarking-prefilldecode-ratios-fixed-vs-dynamic" class="md-nav__link">
+    <span class="md-ellipsis">
+      
+        Benchmarking Prefill–Decode ratios: fixed vs dynamic
+      
+    </span>
+  </a>
+  
+</li>
+      
         <li class="md-nav__item">
   <a href="#nebius-joins-dstack-sky-gpu-marketplace-with-production-ready-gpu-clusters" class="md-nav__link">
     <span class="md-ellipsis">
@@ -3844,17 +3855,6 @@
     </span>
   </a>
 
-</li>
-      
-        <li class="md-nav__item">
-  <a href="#how-ea-uses-dstack-to-fast-track-ai-development" class="md-nav__link">
-    <span class="md-ellipsis">
-      
-        How EA uses dstack to fast-track AI development
-      
-    </span>
-  </a>
-  
 </li>
 
     </ul>
@@ -3881,6 +3881,50 @@ <h1 id="blog">Blog<a class="headerlink" href="#blog" title="Permanent link">&par
         <article class="md-post md-post--excerpt">
   <header class="md-post__header">
 
+    <div class="md-post__meta md-meta">
+      <ul class="md-meta__list">
+        <li class="md-meta__item">
+          <time datetime="2025-09-25 00:00:00+00:00">September 25, 2025</time></li>
+        
+          <li class="md-meta__item">
+            in
+            
+              <a href="benchmarks/" class="md-meta__link">Benchmarks</a></li>
+        
+        
+          
+          <li class="md-meta__item">
+            
+              6 min read
+            
+          </li>
+        
+        
+      </ul>
+      
+    </div>
+  </header>
+  <div class="md-post__content md-typeset">
+    <h2 id="benchmarking-prefilldecode-ratios-fixed-vs-dynamic"><a class="toclink" href="benchmarking-pd-ratios/">Benchmarking Prefill–Decode ratios: fixed vs dynamic</a></h2>
+<p>This benchmark investigates whether the Prefill–Decode worker ratio needs to be managed dynamically at runtime, or if a fixed split can deliver the same performance with simpler orchestration.<br />
+We evaluate different ratios across workload profiles and concurrency levels to measure their impact on TTFT, ITL, and throughput, and to see whether fixing the ratio in advance is a practical alternative to dynamic adjustment.</p>
+<p><img src="https://dstack.ai/static-assets/static-assets/images/benchmarking-pd-ratios.png" width="630" /></p>
+
+    
+      <nav class="md-post__action">
+        <a href="benchmarking-pd-ratios/">
+            <span>Continue reading</span>
+            <span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
+        </a>
+      </nav>
+    
+    
+  </div>
+</article>
+      
+        <article class="md-post md-post--excerpt">
+  <header class="md-post__header">
+    
     <div class="md-post__meta md-meta">
       <ul class="md-meta__list">
         <li class="md-meta__item">
@@ -4290,50 +4334,6 @@ <h5 id="rolling-deployments"><a class="toclink" href="changelog-07-25/#rolling-d
   </div>
 </article>
 
-        <article class="md-post md-post--excerpt">
-  <header class="md-post__header">
-    
-    <div class="md-post__meta md-meta">
-      <ul class="md-meta__list">
-        <li class="md-meta__item">
-          <time datetime="2025-05-22 00:00:00+00:00">May 22, 2025</time></li>
-        
-          <li class="md-meta__item">
-            in
-            
-              <a href="case-studies/" class="md-meta__link">Case studies</a></li>
-        
-        
-          
-          <li class="md-meta__item">
-            
-              3 min read
-            
-          </li>
-        
-        
-      </ul>
-      
-    </div>
-  </header>
-  <div class="md-post__content md-typeset">
-    <h2 id="how-ea-uses-dstack-to-fast-track-ai-development"><a class="toclink" href="ea-gtc25/">How EA uses dstack to fast-track AI development</a></h2>
-<p>At NVIDIA GTC 2025, Electronic Arts <a href="https://www.nvidia.com/en-us/on-demand/session/gtc25-s73667/" target="_blank">shared <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a> how they’re scaling AI development and managing infrastructure across teams. They highlighted using tools like <code>dstack</code> to provision GPUs quickly, flexibly, and cost-efficiently. This case study summarizes key insights from their talk.</p>
-<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-ea-slide-1.png" width="630" /></p>
-<p>EA has over 100+ AI projects running, and the number keeps growing. There are many teams with AI needs—game dev, ML engineers, AI researchers, and platform teams—supported by a central tech team. Some need full MLOps support; others have in-house expertise but need flexible tooling and infrastructure.</p>
-
-    
-      <nav class="md-post__action">
-        <a href="ea-gtc25/">
-            <span>Continue reading</span>
-            <span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
-        </a>
-      </nav>
-    
-    
-  </div>
-</article>
-      
 
 
 
 
@@ -18,6 +18,8 @@
         <link rel="prev" href="../state-of-cloud-gpu-2025/">
 
 
+        <link rel="next" href="../benchmarking-pd-ratios/">
+