Flux Already Knows – Activating Subject-Driven Image Generation without Training
Hao Kang*, Stathi Fotiadis*, Liming Jiang, Qing Yan, Yumin Jia, Zichuan Liu, Min Jin Chong, and Xin Lu
Bytedance Intelligent Creation
Abstract
We propose a simple yet effective zero-shot framework for subject-driven image generation using a vanilla Flux model. By framing the task as grid-based image completion and simply replicating the subject image(s) in a mosaic layout, we activate strong identity-preserving capabilities without any additional data, training, or inference-time fine-tuning. This “free lunch” approach is further strengthened by a novel cascade attention design and meta prompting technique, boosting fidelity and versatility. Experimental results show that our method outperforms baselines across multiple key metrics in benchmarks and human preference studies, with trade-offs in certain aspects. Additionally, it supports diverse edits, including logo insertion, virtual try-on, and subject replacement or insertion. These results demonstrate that a pre-trained foundational text-to-image model can enable high-quality, resource-efficient subject-driven generation, opening new possibilities for lightweight customization in downstream applications.
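To make the core idea concrete, the following is a minimal, illustrative sketch of the mosaic-as-completion framing described above: the subject image is replicated into all but one cell of a grid, and the remaining blank cell (marked by a mask) is what the model is asked to complete. The 2x2 layout, cell resolution, blank-cell position, and the helper name `build_mosaic` are arbitrary choices for illustration, not the repository's actual implementation (see `run_latent_unfold.py` for that).

```python
# Illustrative sketch only: build a 2x2 mosaic where three cells hold the
# reference subject and the fourth is left blank for the model to complete.
from PIL import Image

def build_mosaic(subject: Image.Image, cell: int = 512) -> tuple[Image.Image, Image.Image]:
    """Return (mosaic, mask); the mask is white where the model should generate."""
    subject = subject.convert("RGB").resize((cell, cell))
    mosaic = Image.new("RGB", (2 * cell, 2 * cell), "white")
    mask = Image.new("L", (2 * cell, 2 * cell), 0)
    for idx, (x, y) in enumerate([(0, 0), (cell, 0), (0, cell), (cell, cell)]):
        if idx == 3:
            # Leave the bottom-right cell blank; it becomes the completion target.
            mask.paste(255, (x, y, x + cell, y + cell))
        else:
            mosaic.paste(subject, (x, y))
    return mosaic, mask

# Hypothetical usage:
# mosaic, mask = build_mosaic(Image.open("subject.png"))
# The mosaic and mask are then passed to a Flux completion/inpainting step,
# with a prompt describing the whole grid (the meta prompting idea above).
```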
- Environment setup (you may need to modify `bootstrap.sh` for your environment; an optional sanity check follows the run examples below):

```bash
source bootstrap.sh
```

- Run examples:

```bash
# Basic call
python3 run_latent_unfold.py

# Gradio demo
python3 app.py
```
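Optionally, a plain FLUX.1-dev generation through diffusers can serve as a quick sanity check that the environment and model access are set up correctly. This is not one of the repository's entry points; it assumes the environment created by `bootstrap.sh` provides `torch` and `diffusers`, and that access to the gated FLUX.1-dev weights has been granted on Hugging Face.

```python
# Optional sanity check: generate one image with vanilla FLUX.1-dev.
# Assumes torch + diffusers are installed and the gated model is accessible.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
image = pipe("a photo of a corgi wearing a red scarf", num_inference_steps=28).images[0]
image.save("flux_smoke_test.png")
```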
This repository is licensed under the Apache 2.0 License.
We would like to express our gratitude to the authors of the following repositories, from which we referenced code, models, or assets:
- https://github.com/huggingface/diffusers
- https://github.com/wooyeolbaek/attention-map-diffusers
- https://github.com/Yuanshi9815/OminiControl
- https://github.com/google/dreambooth
- https://huggingface.co/briaai/RMBG-2.0
- https://huggingface.co/black-forest-labs/FLUX.1-dev
If you find this work useful in your research, please consider citing:
```bibtex
@article{kang2025latentunfold,
  title={Flux Already Knows - Activating Subject-Driven Image Generation without Training},
  author={Kang, Hao and Fotiadis, Stathi and Jiang, Liming and Yan, Qing and Jia, Yumin and Liu, Zichuan and Chong, Min Jin and Lu, Xin},
  journal={arXiv preprint},
  volume={arXiv:2504.11478},
  year={2025},
}
```