Commit 9b85cf9

Comfy Aimdo 0.2.5 + Fix offload performance in DynamicVram (Comfy-Org#12754)
* ops: don't unpin nothing. This was calling into aimdo in the None case (an offloaded weight). Worse, aimdo syncs when unpinning an offloaded weight, since that is the corner case of a weight getting evicted by its own use, which does require a sync. But this was happening for every offloaded weight, causing a slowdown.

* mp: fix the get_free_memory policy. ModelPatcherDynamic.get_free_memory was deducting the model size to try to estimate the conceptual free memory without doing any offloading. This is roughly what the old memory_memory_required was estimating in the ModelPatcher load logic; in practice, between over-estimates and padding, the loader usually underloaded models enough that sampling could send CFG +/- conds through together even when partially loaded. So don't regress from the status quo; instead, go all in on the idea that offloading is less of an issue than debatching, and tell the sampler it can use everything.
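The policy in the second bullet can be sketched as follows. This is a minimal, dependency-free sketch: `dynamic_free_memory`, `torch_free`, and `aimdo_reclaimable` are hypothetical stand-ins for the real calls to `comfy.model_management.get_free_memory` and `comfy_aimdo.model_vbar.vbars_analyze`.

```python
def dynamic_free_memory(torch_free: int, aimdo_reclaimable: int, aimdo_enabled: bool) -> int:
    """Report free memory to the sampler, favoring batching over residency.

    A little offloading on a giant model usually costs less than debatching
    CFG/conds, so advertise everything both torch and Aimdo could give us.
    """
    free = torch_free
    if aimdo_enabled:
        # Aimdo-managed weights can be evicted on demand, so their VRAM
        # counts as usable for inference batching.
        free += aimdo_reclaimable
    return free

# With 8 GiB free in torch and 4 GiB reclaimable via Aimdo, the sampler
# is told it may use 12 GiB.
print(dynamic_free_memory(8 * 1024**3, 4 * 1024**3, True))
```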
1 parent d531e3f commit 9b85cf9

3 files changed

Lines changed: 10 additions & 10 deletions

File tree

comfy/model_patcher.py

Lines changed: 7 additions & 7 deletions
```diff
@@ -307,7 +307,13 @@ def lowvram_patch_counter(self):
         return self.model.lowvram_patch_counter
 
     def get_free_memory(self, device):
-        return comfy.model_management.get_free_memory(device)
+        #Prioritize batching (incl. CFG/conds etc) over keeping the model resident. In
+        #the vast majority of setups a little bit of offloading on the giant model more
+        #than pays for CFG. So return everything both torch and Aimdo could give us
+        aimdo_mem = 0
+        if comfy.memory_management.aimdo_enabled:
+            aimdo_mem = comfy_aimdo.model_vbar.vbars_analyze()
+        return comfy.model_management.get_free_memory(device) + aimdo_mem
 
     def get_clone_model_override(self):
         return self.model, (self.backup, self.backup_buffers, self.object_patches_backup, self.pinned)
@@ -1465,12 +1471,6 @@ def loaded_size(self):
         vbar = self._vbar_get()
         return (vbar.loaded_size() if vbar is not None else 0) + self.model.model_loaded_weight_memory
 
-    def get_free_memory(self, device):
-        #NOTE: on high condition / batch counts, estimate should have already vacated
-        #all non-dynamic models so this is safe even if its not 100% true that this
-        #would all be avaiable for inference use.
-        return comfy.model_management.get_total_memory(device) - self.model_size()
-
     #Pinning is deferred to ops time. Assert against this API to avoid pin leaks.
 
     def pin_weight_to_device(self, key):
```

comfy/ops.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -269,8 +269,8 @@ def uncast_bias_weight(s, weight, bias, offload_stream):
         return
     os, weight_a, bias_a = offload_stream
     device=None
-    #FIXME: This is not good RTTI
-    if not isinstance(weight_a, torch.Tensor):
+    #FIXME: This is really bad RTTI
+    if weight_a is not None and not isinstance(weight_a, torch.Tensor):
         comfy_aimdo.model_vbar.vbar_unpin(s._v)
         device = weight_a
     if os is None:
```
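The guard added above skips the unpin path entirely when the weight is offloaded (`weight_a is None`). A dependency-free sketch of the pattern; `Tensor`, `FakeVbar`, and `maybe_unpin` are hypothetical stand-ins for `torch.Tensor` and the aimdo vbar API:

```python
class Tensor:
    """Stand-in for torch.Tensor so the sketch runs without torch."""

class FakeVbar:
    """Stand-in for the aimdo vbar; counts potentially-syncing unpin calls."""
    def __init__(self):
        self.unpin_calls = 0

    def vbar_unpin(self):
        # In aimdo, unpinning an offloaded weight syncs (the eviction-by-own-use
        # corner case), so calling this for every offloaded weight is costly.
        self.unpin_calls += 1

def maybe_unpin(weight_a, vbar):
    """Return the target device only when weight_a is a device marker.

    None means the weight is offloaded: skip the unpin entirely.
    """
    if weight_a is not None and not isinstance(weight_a, Tensor):
        vbar.vbar_unpin()
        return weight_a
    return None

vbar = FakeVbar()
maybe_unpin(None, vbar)      # offloaded weight: no unpin, no sync
maybe_unpin(Tensor(), vbar)  # real tensor: no unpin
maybe_unpin("cuda:0", vbar)  # device marker: one unpin
print(vbar.unpin_calls)      # 1
```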

requirements.txt

Lines changed: 1 addition & 1 deletion
```diff
@@ -22,7 +22,7 @@ alembic
 SQLAlchemy
 av>=14.2.0
 comfy-kitchen>=0.2.7
-comfy-aimdo>=0.2.4
+comfy-aimdo>=0.2.5
 requests
 
 #non essential dependencies:
```
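The version floor bump above can be checked locally before upgrading. A small shell sketch; the `version_ge` helper is hypothetical, comparing dotted versions via `sort -V`:

```shell
# Check a comfy-aimdo version against the new floor from requirements.txt.
version_ge() {
    # True when $1 >= $2: the smallest of the pair (head of sort -V) must be $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if version_ge 0.2.5 0.2.5; then
    echo "comfy-aimdo is new enough"
else
    echo "upgrade: pip install --upgrade 'comfy-aimdo>=0.2.5'"
fi
```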
