TL;DR: Treat outputs as a set; formulate group inference as a Quadratic Integer Programming (QIP) problem; scale efficiently with progressive pruning.
Website | Paper | 🤗 Demo FLUX.1 Schnell | 🤗 Demo FLUX.1 Kontext
Quick start: Running Locally (CLI) | Running Locally (Gradio UI)
The standard practice is to sample from generative models independently. However, many applications require sampling a model multiple times, for example, to present a gallery of 4-8 outputs. This I.I.D. sampling approach often leads to redundant results, limiting creative exploration. In this repository, we introduce Scalable Group Inference, a method to generate diverse and high-quality sets of outputs.
Scaling Group Inference for Diverse and High-Quality Generation
Gaurav Parmar, Or Patashnik, Daniil Ostashev, Kuan-Chieh (Jackson) Wang, Kfir Aberman, Srinivasa Narasimhan, Jun-Yan Zhu
arXiv 2508.15773
CMU and Snap
Given a large set of M candidate noises, we gradually reduce the candidate set through iterative denoising and pruning. At each step, we use the diffusion model to denoise the candidates, compute a quality metric (unary term) and pairwise distances (binary term), and solve a quadratic integer programming (QIP) problem to prune the set. This ultimately yields a small final group of K diverse and high-quality outputs.
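The denoise-score-prune loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in this repository: `denoise_step`, `quality_fn`, `distance_fn`, the `schedule` of set sizes, and the weight `lam` are hypothetical names, the objective (sum of unary scores plus `lam` times the sum of pairwise distances) is one plausible way to write the selection problem, and exhaustive search stands in for a real QIP solver, so it is only practical for tiny candidate sets.

```python
import itertools


def select_subset(unary, pairwise, k, lam=1.0):
    """Return k indices maximizing sum(unary) + lam * sum(pairwise distances).

    Assumption: exhaustive search replaces a QIP solver (tiny inputs only).
    """
    best_score, best_subset = float("-inf"), ()
    for subset in itertools.combinations(range(len(unary)), k):
        quality = sum(unary[i] for i in subset)
        diversity = sum(pairwise[i][j] for i, j in itertools.combinations(subset, 2))
        score = quality + lam * diversity
        if score > best_score:
            best_score, best_subset = score, subset
    return list(best_subset)


def progressive_prune(candidates, denoise_step, quality_fn, distance_fn, schedule, lam=1.0):
    """For each target size in `schedule`: denoise once, score, then prune."""
    for keep in schedule:
        candidates = [denoise_step(c) for c in candidates]
        unary = [quality_fn(c) for c in candidates]
        pairwise = [[distance_fn(a, b) for b in candidates] for a in candidates]
        kept = select_subset(unary, pairwise, keep, lam)
        candidates = [candidates[i] for i in kept]
    return candidates


# Toy stand-ins (hypothetical): candidates are numbers, denoising is a no-op,
# every candidate has equal quality, and distance is the absolute difference.
final = progressive_prune(
    candidates=[0, 1, 2, 50, 51, 100],
    denoise_step=lambda x: x,
    quality_fn=lambda x: 0.0,
    distance_fn=lambda a, b: abs(a - b),
    schedule=[4, 2],  # M = 6 -> 4 -> K = 2
)
print(final)  # [0, 100]
```

With equal quality scores, the toy run keeps the two most distant candidates. In a real setting the callables would wrap a diffusion model step, an image quality score, and an image distance.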
Gallery of outputs. Outputs generated with our proposed group inference method and standard I.I.D. sampling. Top row shows results with FLUX.1 Schnell, the second row uses FLUX.1 Dev, and the last two rows use FLUX.1 Depth.
Using different score functions. Our method allows for targeted diversity by defining different pairwise objectives. The second and third rows show results where the unary quality term is identical but the pairwise binary term is varied. The middle row uses a color-based binary term, while the bottom row uses a DINO-based binary term to achieve semantic and structural diversity.
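To make the idea of swapping the pairwise term concrete, here is a small sketch of two interchangeable distance functions. Everything in it is an assumption for illustration: images are plain lists of `(r, g, b)` tuples, the color term compares mean colors, and the feature term is a cosine distance over vectors that would, in practice, come from an encoder such as DINO. The function names are hypothetical and are not taken from this repository.

```python
import math


def mean_color(pixels):
    """Average RGB value of an image given as a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def color_distance(pixels_a, pixels_b):
    """Color-based binary term: Euclidean distance between mean colors."""
    return math.dist(mean_color(pixels_a), mean_color(pixels_b))


def cosine_distance(feat_a, feat_b):
    """Feature-based binary term: 1 - cosine similarity of two feature vectors."""
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    norm = math.sqrt(sum(a * a for a in feat_a)) * math.sqrt(sum(b * b for b in feat_b))
    return 1.0 - dot / norm


def pairwise_matrix(items, distance_fn):
    """Full matrix of pairwise distances under the chosen binary term."""
    return [[distance_fn(a, b) for b in items] for a in items]


# Three tiny single-color "images" (hypothetical data).
red = [(255, 0, 0)] * 4
dark_red = [(200, 0, 0)] * 4
blue = [(0, 0, 255)] * 4

color_matrix = pairwise_matrix([red, dark_red, blue], color_distance)
print(color_matrix[0][1] < color_matrix[0][2])  # True: red is closer to dark red than to blue
```

Because both functions share the same two-argument signature, either one can be passed wherever a pairwise term is expected while the unary quality term stays unchanged.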
Environment Setup
- We provide a conda env file that contains all the required dependencies.
conda env create -f environment.yaml
- Following this, you can activate the conda environment with the command below.
conda activate group-inference
Running Locally (CLI)
The following commands will generate an output of 4 samples for a given prompt with the models flux-schnell and flux-dev.
python src/inference.py --prompt "a photo of a dog" --model_name "flux-schnell"
python src/inference.py --prompt "a photo of a playful dog" --model_name "flux-dev"
For a complete list of available arguments, see docs/arguments.md.
Example Outputs:
Input Caption | Output Group Size | Generated Group |
---|---|---|
A photo of a dog. | 4 | ![]() ![]() ![]() ![]() |
A painting of a dog in the style of van gogh. | 4 | ![]() ![]() ![]() ![]() |
The following command will generate an output of 4 samples for a given depth map with the model flux-depth.
python src/inference.py --prompt "a photo of a fruit" --model_name "flux-depth" --input_depth_map "assets/example_inputs/depth_fruit.png"
Example Outputs:
Input Caption | Input Depth Map | Output Group Size | Generated Group |
---|---|---|---|
A photo of a fruit. | ![]() | 4 | ![]() ![]() ![]() ![]() |
The following command will generate an output of 4 samples for a given canny edge map with the FLUX.1 Canny-dev model.
python src/inference.py --prompt "a photo of a robot" --model_name "flux-canny" --input_canny_edge_map "assets/example_inputs/robot_canny.png"
Example Outputs:
Input Caption | Input Canny Edge Map | Output Group Size | Generated Group |
---|---|---|---|
A photo of a robot. | ![]() | 4 | ![]() ![]() ![]() ![]() |
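The following command will generate an output of 4 edited samples for a given input image and editing caption with the model flux-kontext.
python src/inference.py --prompt "Cat is playing outside in nature." --model_name "flux-kontext" --input_image "assets/example_inputs/cat.png"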
Example Outputs:
Editing Caption | Input Image | Output Group Size | Generated Group |
---|---|---|---|
Cat is playing outside in nature. | ![]() | 4 | ![]() ![]() ![]() ![]() |
Cat is drinking milk. | ![]() | 4 | ![]() ![]() ![]() ![]() |
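The CLI examples above differ only in the model name and in which conditioning flag they pass. The sketch below collects that mapping in one helper so the commands can be scripted. The helper name `build_command` and the `CONDITION_FLAGS` table are hypothetical conveniences; only the flags and model names that appear in the commands above are used, and other arguments listed in docs/arguments.md are not covered.

```python
import shlex

# Conditioning flag per model, copied from the example commands above.
CONDITION_FLAGS = {
    "flux-schnell": None,
    "flux-dev": None,
    "flux-depth": "--input_depth_map",
    "flux-canny": "--input_canny_edge_map",
    "flux-kontext": "--input_image",
}


def build_command(prompt, model_name, condition_path=None):
    """Build the argv list for src/inference.py (hypothetical helper)."""
    if model_name not in CONDITION_FLAGS:
        raise ValueError(f"unknown model: {model_name}")
    flag = CONDITION_FLAGS[model_name]
    cmd = ["python", "src/inference.py", "--prompt", prompt, "--model_name", model_name]
    if flag is None:
        if condition_path is not None:
            raise ValueError(f"{model_name} does not take a conditioning input")
    else:
        if condition_path is None:
            raise ValueError(f"{model_name} requires {flag}")
        cmd += [flag, condition_path]
    return cmd


cmd = build_command(
    "a photo of a fruit", "flux-depth", "assets/example_inputs/depth_fruit.png"
)
print(shlex.join(cmd))  # shell-quoted command line, ready to copy and paste
```

To actually launch a run from Python, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.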
Running Locally (Gradio UI)
- Install the dependencies for the Gradio demo.
pip install gradio
- Run the Gradio demo with different models:
# Set the model via environment variable, then run gradio
export MODEL_NAME=flux-schnell && gradio src/gradio_demo.py
export MODEL_NAME=flux-dev && gradio src/gradio_demo.py
export MODEL_NAME=flux-depth && gradio src/gradio_demo.py
export MODEL_NAME=flux-canny && gradio src/gradio_demo.py
export MODEL_NAME=flux-kontext && gradio src/gradio_demo.py
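If you prefer to start the demo from a Python script rather than exporting the variable in a shell, the same setup can be sketched as follows. The helper `gradio_launch_spec` is hypothetical; it only prepares the `gradio src/gradio_demo.py` command and an environment with `MODEL_NAME` set, mirroring the shell lines above, and it assumes the demo reads the model from that variable as those lines suggest.

```python
import os

# Model names taken from the shell commands above.
SUPPORTED_MODELS = ("flux-schnell", "flux-dev", "flux-depth", "flux-canny", "flux-kontext")


def gradio_launch_spec(model_name, base_env=None):
    """Return (argv, env) for launching the demo with MODEL_NAME set."""
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"unknown model: {model_name}")
    # Copy the environment so the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MODEL_NAME"] = model_name
    return ["gradio", "src/gradio_demo.py"], env


argv, env = gradio_launch_spec("flux-schnell")
print(argv, env["MODEL_NAME"])
# To launch: import subprocess; subprocess.run(argv, env=env, check=True)
```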
If you find this repository useful for your research, please cite the following work.
@article{Parmar2025group,
title={Scaling Group Inference for Diverse and High-Quality Generation},
author={Gaurav Parmar and Or Patashnik and Daniil Ostashev and Kuan-Chieh (Jackson) Wang and Kfir Aberman and Srinivasa Narasimhan and Jun-Yan Zhu},
year={2025},
journal={arXiv preprint arXiv:2508.15773},
}