Flux

This page is a work in progress and is updated as more is learned about Flux.

Flux is a DiT (diffusion transformer) flow-matching model with high learning potential, but it is very large and slow to work with.

Model Details:

Like SD3, a Hugging Face token or a local copy of the diffusers model is needed. OneTrainer now has an option to enter your HF token in the GUI.
- AIO (All in One) safetensor models will work.
  - Tested with helheimFlux_v10FP8AIO.safetensors and helheimFlux_v10FP16AIO.safetensors
  - NF4 AIO models will not work (not supported by diffusers)
  - Turbo model safetensors will likely not work, and are likely not trainable with OT regardless, because of how a Turbo model is made
- HuggingFace links of finetuned models can work, but the repo must be in diffusers format.
  - Tested with: AlekseyCalvin/Colossus_2.1_dedistilled_by_AfroMan4peace
  - Web site link: https://huggingface.co/AlekseyCalvin/Colossus_2.1_dedistilled_by_AfroMan4peace
Completely De-distilled Flux will not work, as the OneTrainer workflow expects the guidance variable.
The standard Flux safetensors file on the Black Forest Labs Hugging Face repo is just the transformer and the VAE, and does not include the text encoders.
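If you are unsure whether a given safetensors file is an AIO model or only part of one, one way to check is to read the file header and group the tensor names by their leading prefix. The sketch below relies only on the safetensors container layout (an 8-byte little-endian header length followed by a JSON header). The function name `tensor_name_prefixes` is hypothetical, and which prefixes indicate a transformer, VAE, or text encoder depends on how the checkpoint was exported, so treat the output as a hint rather than a definitive answer.

```python
import json
import struct
from collections import Counter


def tensor_name_prefixes(path, depth=1):
    """Count tensor names in a .safetensors file, grouped by leading name segments."""
    with open(path, "rb") as f:
        # The file starts with an unsigned little-endian 64-bit header length,
        # followed by that many bytes of UTF-8 JSON describing every tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return Counter(".".join(name.split(".")[:depth]) for name in header)
```

Calling `tensor_name_prefixes("model.safetensors")` on a local file returns a counter of prefixes. A file with only one or two groups is probably not an all-in-one checkpoint.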
BFL (Black Forest Labs) has not published all of the details of the model, so some aspects are still a black box. However, it is important to note that higher resolutions require a time shift. OneTrainer has a Flux-specific option to automatically shift the timestep based on the training resolution.
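To give a feel for what a resolution-dependent time shift does, here is a minimal sketch modeled on the shift used by the publicly available Flux inference code. All constants (one image token per 16x16 pixels, reference sequence lengths of 256 and 4096, shift values of 0.5 and 1.15) and all function names are assumptions for illustration. OneTrainer's own implementation may differ.

```python
import math


def image_seq_len(width, height):
    # Assumption: 8x VAE downscale followed by 2x2 latent patches,
    # i.e. one transformer token per 16x16 pixel block.
    return (width // 16) * (height // 16)


def calculate_mu(seq_len, base_seq_len=256, max_seq_len=4096,
                 base_shift=0.5, max_shift=1.15):
    # Linear interpolation of the shift parameter between two reference lengths.
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    return base_shift + slope * (seq_len - base_seq_len)


def shift_timestep(t, mu):
    # Pushes t in (0, 1] towards the noisy end (t = 1) as mu grows;
    # mu = 0 leaves t unchanged.
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0))


for res in (512, 768, 1024):
    mu = calculate_mu(image_seq_len(res, res))
    print(res, round(mu, 3), round(shift_timestep(0.5, mu), 3))
```

Under these assumptions, larger training resolutions get a larger `mu`, so the same sampled timestep is moved further towards the high-noise end.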

Token limit:

Flux now supports up to 512 tokens for training (originally 77), but you have to decide for yourself how many tokens to use. Unfortunately there is no single correct value, but here is what we know:
The model was likely trained at 512 tokens, but we don't know for sure. It might have been a mix.
The default token limit for training is still 77, which may not be the best choice.
ComfyUI/Swarm samples Flux at 256 tokens. If you train for use in ComfyUI/Swarm, 256 is probably a good choice.
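The practical effect of the token limit can be sketched as follows: captions longer than the limit are cut off, and shorter ones are padded up to it. This is a simplified illustration, not OneTrainer's code. The function name `fit_to_token_limit` and the padding id `0` are assumptions, and a real tokenizer also handles special tokens such as end-of-sequence.

```python
def fit_to_token_limit(token_ids, limit, pad_id=0):
    """Truncate or pad a list of token ids to exactly `limit` entries."""
    kept = token_ids[:limit]  # tokens past the limit are dropped
    padding = limit - len(kept)
    mask = [1] * len(kept) + [0] * padding  # 1 = real token, 0 = padding
    return kept + [pad_id] * padding, mask


caption = list(range(1, 301))  # stand-in for a 300-token caption
for limit in (77, 256, 512):
    ids, mask = fit_to_token_limit(caption, limit)
    print(limit, "kept:", sum(mask), "dropped:", max(0, len(caption) - limit))
```

With long, detailed captions, a limit of 77 silently discards most of the text, which is one reason to consider a higher value.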
Where to set the token limit:

[Screenshot: TE2]

Limitations:

~~Embeddings do not likely work, due to the nature of T5.~~
- Update: Nero has refactored embeddings, and output embeddings now work with Flux.
- However, these output embeddings are only usable in OneTrainer. Use in other software will likely require the OMI format to be finalized first.
Some Lora formats (full Dora) will not work in all generation software. Forge is known to produce a purple output with a full Dora.
FLEX is not currently supported, as it is not quite a Flux.1 dev based model.
Flux Schnell is not supported, and there are no plans to support it. If you want to work with this model, it is better to use one of the dedicated Flux trainers (Flux Gym, AI Toolkit).

Current Information:

Lora is currently the only recommended training method. A finetune will eventually require you to train through the distillation of the model.
FP8 is the minimum recommended precision to not have artifacts.
NF4 precision allows Flux to be used with lower VRAM cards, but it should be noted that a grid pattern can be very visible at this precision level.
The OneTrainer Lora can be used in Comfy in the standard Lora Loader.
Flux has a robust architecture. It is possible to train a Lora at 512 or 768 and generate at 1024 with minimal loss in quality.
It is possible to train a Flux Lora on a GPU with 12GB of VRAM.
8GB cards could work with the addition of GGUF support. As basic Windows features can take 1GB or more of VRAM, it can be beneficial to use an integrated GPU (iGPU) to run the Windows desktop if using an 8GB graphics card.
Features such as Torch Compile (now available) help speed up Lora training and are extremely beneficial for consumer GPUs.
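The precision notes above are easier to weigh with a rough estimate of how much memory the transformer weights alone occupy. The sketch below assumes roughly 12 billion parameters for the Flux.1 dev transformer and idealized storage sizes per parameter. It ignores quantization overhead, the text encoders, the VAE, activations, and the Lora weights and optimizer state, so real usage will be higher.

```python
# Idealized bytes per parameter for each precision (assumption: no overhead).
BYTES_PER_PARAM = {"bf16/fp16": 2.0, "fp8": 1.0, "nf4": 0.5}

FLUX_PARAMS = 12e9  # approximate parameter count of the transformer


def weight_gib(num_params, bytes_per_param):
    """Memory needed to hold the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


for name, size in BYTES_PER_PARAM.items():
    print(f"{name}: {weight_gib(FLUX_PARAMS, size):.1f} GiB")
```

This is why lower precisions and offloading matter so much on 8GB and 12GB cards: at 16-bit precision the weights alone exceed the VRAM of most consumer GPUs.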

Flux Krea:

Black Forest Labs has released the model weights for Flux Krea.
To use this model, ensure you have access to the Hugging Face repo, as the model is gated.
In OneTrainer, ensure your HF token is entered in the token field, and use black-forest-labs/FLUX.1-Krea-dev as the model.
Original Flux Loras will work with Krea to a degree, but they lose some quality.
As Krea is designed as a drop-in replacement for Flux.1 dev, your training settings from original Flux should apply.
