Flux
Calamdor edited this page Nov 15, 2025 · 17 revisions
This page is a work in progress as information is learned about Flux.
Flux is a DiT (Diffusion Transformer) flow-matching model with high learning potential, but it is very large and slow to work with.
- Like SD3, a Hugging Face key or a local copy of the diffusers model is needed. OneTrainer now has an option to enter your HF token in the GUI.
- AIO (All in One) safetensor models will work.
- Tested with helheimFlux_v10FP8AIO.safetensors and helheimFlux_v10FP16AIO.safetensors
- NF4 AIO models will not work (not supported by diffusers)
- Turbo model safetensors will likely not work, and are likely not trainable with OneTrainer (OT) regardless, because of how a Turbo model is made.
- HuggingFace links of finetuned models can work, but the repo must be in diffusers format.
- Tested with: AlekseyCalvin/Colossus_2.1_dedistilled_by_AfroMan4peace (https://huggingface.co/AlekseyCalvin/Colossus_2.1_dedistilled_by_AfroMan4peace)
- Completely De-distilled Flux will not work, as the OneTrainer workflow expects the guidance variable.
- The standard Flux safetensors file on the Black Forest Labs Hugging Face repo is just the transformer and the VAE, and does not include the text encoders.
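A quick way to see whether a local folder looks like a Flux model in diffusers format is to check for the component folders. The sketch below is only an illustration: the folder names are assumed from the layout of the official Flux diffusers repos, `missing_diffusers_parts` is a hypothetical helper (not part of OneTrainer), and passing the check does not guarantee that OneTrainer will load the model.

```python
from pathlib import Path

# Assumed component folders of a Flux repo in diffusers format.
EXPECTED_SUBFOLDERS = (
    "scheduler",
    "text_encoder",    # CLIP text encoder
    "text_encoder_2",  # T5 text encoder
    "tokenizer",
    "tokenizer_2",
    "transformer",
    "vae",
)


def missing_diffusers_parts(model_dir):
    """Return the names of expected diffusers-format parts missing from model_dir."""
    root = Path(model_dir)
    missing = []
    if not (root / "model_index.json").is_file():
        missing.append("model_index.json")
    missing += [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]
    return missing
```

An empty list means every expected part was found. A single safetensors file containing only the transformer would report all of the parts as missing.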
- BFL (Black Forest Labs) has not published all of the details of the model, so some aspects are still a black box. However, it is important to note that higher resolutions require a timestep shift. OneTrainer has a Flux-specific option to automatically shift the timestep based on the training resolution.
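To give a feel for what a resolution-dependent timestep shift does, here is a minimal sketch modeled on the dynamic shifting used for Flux inference in diffusers. The constants (256 and 4096 image tokens, shifts of 0.5 and 1.15) and the function names are assumptions for illustration; OneTrainer's own implementation may differ.

```python
import math


def flux_shift_mu(width, height, base_seq_len=256, max_seq_len=4096,
                  base_shift=0.5, max_shift=1.15):
    """Linearly interpolate the shift parameter mu from the image token count.

    Assumes one image token per 16x16 pixel patch (8x VAE downscale, 2x2 patches).
    """
    seq_len = (width // 16) * (height // 16)
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    return base_shift + slope * (seq_len - base_seq_len)


def shift_timestep(t, mu):
    """Shift a timestep t in (0, 1] toward the noisy end (t = 1 is pure noise)."""
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0))


for size in (512, 768, 1024):
    mu = flux_shift_mu(size, size)
    print(f"{size}x{size}: mu={mu:.3f}, t=0.5 becomes {shift_timestep(0.5, mu):.3f}")
```

Under these assumptions, a larger training resolution gives a larger mu, so the same sampled timestep is pushed further toward the high-noise end.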
- Flux now supports up to 512 tokens for training (originally 77), but you have to decide for yourself how many tokens to use. Unfortunately, there is no known correct value, but here is what we know:
- The model was likely trained at 512 tokens, but we don't know for sure. It might have been a mix.
- The default training token limit is still 77, which may not be the best choice.
- ComfyUI/Swarm samples Flux at 256 tokens. If you train for use in ComfyUI/Swarm, 256 is probably a good choice.
- Where to set the token limit:

- Embeddings likely do not work, due to the nature of T5.
- Update: Nero has refactored embeddings, and output embeddings will now work with Flux.
- These output embeddings are however only usable in OneTrainer. Use in other software will likely require the OMI format to be finalized first.
- Some LoRA formats (full DoRA) will not work in all generation software. Forge is known to produce purple output with a full DoRA.
- FLEX is not currently supported, as it is not quite a Flux.1 dev based model.
- Flux Schnell is not supported, nor are there any plans to support it. It is better to use a dedicated Flux trainer (Flux Gym, AI Toolkit) if you want to work with this model.
- LoRA is currently the only recommended training method. A finetune will require you to eventually train through the distillation of the model.
- FP8 is the minimum recommended precision to not have artifacts.
- NF4 precision allows Flux to be used with lower VRAM cards, but it should be noted that a grid pattern can be very visible at this precision level.
- The OneTrainer LoRA can be used in ComfyUI with the standard LoRA loader.
- Flux has a robust architecture. It is possible to train a LoRA at 512 or 768 and generate at 1024 with minimal loss in quality.
- It is possible to train a Flux LoRA on a GPU with 12GB of VRAM.
- 8GB cards could work with the addition of GGUF support. As basic Windows features can take 1GB or more of VRAM, it can be beneficial to use an integrated GPU (iGPU) to run the Windows desktop if using an 8GB graphics card.
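The link between precision and VRAM in the notes above can be made concrete with some rough arithmetic. The sketch below assumes the Flux transformer has roughly 12 billion parameters and counts weights only. It ignores the text encoders, the VAE, activations, the LoRA weights, optimizer state, quantization overhead, and any offloading, so real usage will differ.

```python
# Approximate parameter count of the Flux.1 dev transformer (assumption: ~12B).
FLUX_TRANSFORMER_PARAMS = 12e9

# Bits stored per weight at each precision (NF4 block overhead is ignored).
BITS_PER_WEIGHT = {"FP16/BF16": 16, "FP8": 8, "NF4": 4}


def weight_gb(num_params, bits):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weight_gb(FLUX_TRANSFORMER_PARAMS, bits):.0f} GB of weights")
```

This is why lower precisions matter so much for consumer cards: halving the bits per weight halves the memory taken by the transformer weights.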
- Features such as Torch Compile (now available) help speed up LoRA training and are extremely beneficial for consumer GPUs.
- Black Forest Labs has released the model weights for Flux Krea.
- To use this model, ensure you have access to the Hugging Face repo, as the model is gated.
- In OneTrainer, ensure your HF token is in the field, and use black-forest-labs/FLUX.1-Krea-dev as the model.
- Original Flux LoRAs will work with Krea, to a degree, but they lose some quality.
- As Krea is designed as a drop-in replacement for Flux.Dev, your training settings from original Flux should apply.