Scalable Group Inference

TL;DR: Treat outputs as a set; formulate group inference as a Quadratic Integer Programming (QIP) problem; scale efficiently with progressive pruning.

Website | Paper | 🤗 Demo FLUX.1 Schnell | 🤗 Demo FLUX.1 Kontext

Quick start: Running Locally (CLI) | Running Locally (Gradio UI)

The standard practice is to sample from generative models independently. However, many applications require sampling a model multiple times, for example, to present a gallery of 4-8 outputs. This I.I.D. sampling approach often leads to redundant results, limiting creative exploration. In this repository, we introduce Scalable Group Inference, a method to generate diverse and high-quality sets of outputs.

Paper

Scaling Group Inference for Diverse and High-Quality Generation
Gaurav Parmar, Or Patashnik, Daniil Ostashev, Kuan-Chieh (Jackson) Wang, Kfir Aberman, Srinivasa Narasimhan, Jun-Yan Zhu
arXiv 2508.15773
CMU and Snap

Method

Given a large set of M candidate noises, we gradually shrink the candidate set through iterative denoising and pruning. At each step, we use the diffusion model to denoise the candidates, compute a quality metric (unary term) and pairwise distances (binary term), and solve a quadratic integer programming (QIP) problem to prune the set. This ultimately yields a small final group of K diverse, high-quality outputs.
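
The denoise, score, and prune loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the repository's implementation: `group_objective`, `select_subset`, and `progressive_prune` are hypothetical names, the objective (sum of unary scores plus a weight `lam` times the sum of pairwise distances) is an assumed form, and the subset selection is brute-force enumeration standing in for a real QIP solver, which is only feasible for tiny candidate sets.

```python
import itertools


def group_objective(subset, unary, pairwise, lam=1.0):
    """Assumed objective: summed quality plus lam * summed pairwise distance."""
    quality = sum(unary[i] for i in subset)
    diversity = sum(pairwise[i][j] for i, j in itertools.combinations(subset, 2))
    return quality + lam * diversity


def select_subset(unary, pairwise, k, lam=1.0):
    """Brute-force stand-in for a QIP solver: try every size-k subset."""
    indices = range(len(unary))
    return max(
        itertools.combinations(indices, k),
        key=lambda s: group_objective(s, unary, pairwise, lam),
    )


def progressive_prune(candidates, denoise_step, quality, distance, schedule, lam=1.0):
    """Denoise, score, and prune once per entry in `schedule` (sizes to keep)."""
    for keep in schedule:
        candidates = [denoise_step(c) for c in candidates]
        unary = [quality(c) for c in candidates]
        pairwise = [[distance(a, b) for b in candidates] for a in candidates]
        chosen = select_subset(unary, pairwise, keep, lam)
        candidates = [candidates[i] for i in chosen]
    return candidates


# Toy stand-ins: candidates are numbers, "denoising" halves them,
# quality prefers values near zero, and distance is the absolute difference.
toy_candidates = [-8.0, -7.5, -1.0, 0.5, 1.0, 6.0, 6.5, 9.0]
final_group = progressive_prune(
    toy_candidates,
    denoise_step=lambda x: x / 2,
    quality=lambda x: -abs(x),
    distance=lambda a, b: abs(a - b),
    schedule=[6, 4],  # M=8 -> 6 -> K=4
)
print(final_group)
```

The `schedule` list gives the number of candidates kept after each round, so the set shrinks from M to K and later denoising steps are spent only on the survivors.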

Results

Gallery of outputs. Outputs generated with our proposed group inference method and standard I.I.D. sampling. Top row shows results with FLUX.1 Schnell, the second row uses FLUX.1 Dev, and the last two rows use FLUX.1 Depth.

Using different score functions. Our method allows for targeted diversity by defining different pairwise objectives. In the second and third rows, the unary quality term is identical but the pairwise binary term varies: the middle row uses a color-based binary term, while the bottom row uses a DINO-based binary term to achieve semantic and structural diversity.
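
To make the idea of swapping the binary term concrete, here is a minimal sketch of two pairwise distances. Both functions are hypothetical illustrations rather than the repository's score functions: `color_distance` compares the mean RGB color of two images given as lists of pixels, and `cosine_distance` compares two feature vectors, which in a setup like the one above would come from an image encoder such as DINO but are plain lists here.

```python
import math


def mean_color(pixels):
    """Average RGB color of an image given as a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def color_distance(pixels_a, pixels_b):
    """Hypothetical color-based binary term: Euclidean distance of mean colors."""
    return math.dist(mean_color(pixels_a), mean_color(pixels_b))


def cosine_distance(feat_a, feat_b):
    """Hypothetical feature-based binary term: 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    norm_a = math.sqrt(sum(a * a for a in feat_a))
    norm_b = math.sqrt(sum(b * b for b in feat_b))
    return 1.0 - dot / (norm_a * norm_b)


# Two tiny "images" with different dominant colors, and two feature vectors.
reddish = [(250, 10, 10), (240, 20, 20)]
bluish = [(10, 10, 250), (20, 20, 240)]
print(color_distance(reddish, bluish))
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

Because the selection step only sees a matrix of pairwise distances, either function could be plugged in without changing the unary quality term.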

Getting Started

Environment Setup

We provide a conda env file that contains all the required dependencies.
```
conda env create -f environment.yaml
```
Following this, you can activate the conda environment with the command below.
```
conda activate group-inference
```

Text-to-Image group inference

The following commands each generate a group of 4 samples for a given prompt, the first with flux-schnell and the second with flux-dev.

```
python src/inference.py --prompt "a photo of a dog" --model_name "flux-schnell"
python src/inference.py --prompt "a photo of a playful dog" --model_name "flux-dev"
```

For a complete list of available arguments, see docs/arguments.md.
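
If you want to run several prompts in a row, one option is a small Python wrapper around the CLI. The sketch below is an assumption-laden convenience script, not part of the repository: `build_command` is a hypothetical helper, and it uses only the `--prompt` and `--model_name` flags shown above. It prints the commands rather than executing them.

```python
import shlex
import sys


def build_command(prompt, model_name):
    """Build the argument list for one src/inference.py call."""
    return [
        sys.executable,
        "src/inference.py",
        "--prompt", prompt,
        "--model_name", model_name,
    ]


prompts = ["a photo of a dog", "a photo of a playful dog"]
commands = [build_command(p, "flux-schnell") for p in prompts]

for cmd in commands:
    # Print the shell-quoted command for inspection.
    # To execute it instead, use: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain spaces or quotes.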

Example Outputs:

Input Caption	Output Group Size	Generated Group
A photo of a dog.	4
A painting of a dog in the style of van gogh.	4

Depth-to-Image group inference

The following command will generate an output of 4 samples for a given depth map with the model flux-depth.

```
python src/inference.py --prompt "a photo of a fruit" --model_name "flux-depth" --input_depth_map "assets/example_inputs/depth_fruit.png"
```

Example Outputs:

Input Caption	Input Depth Map	Output Group Size	Generated Group
A photo of a fruit.		4

Canny Edge-to-Image group inference

The following command will generate an output of 4 samples for a given canny edge map with the FLUX.1 Canny-dev model.

```
python src/inference.py --prompt "a photo of a robot" --model_name "flux-canny" --input_canny_edge_map "assets/example_inputs/robot_canny.png"
```

Example Outputs:

Input Caption	Input Canny Edge Map	Output Group Size	Generated Group
A photo of a robot.		4

Image editing (FLUX.1 Kontext)

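The following command will generate an output of 4 samples for a given input image and editing prompt with the model flux-kontext.
```
python src/inference.py --prompt "Cat is playing outside in nature." --model_name "flux-kontext" --input_image "assets/example_inputs/cat.png"
```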

Example Outputs:

Editing Caption	Input Image	Output Group Size	Generated Group
Cat is playing outside in nature.		4
Cat is drinking milk.		4

Local Gradio Demo

Install the dependencies for the Gradio demo.

```
pip install gradio
```

Run the Gradio demo with different models:

```
# Set the model via environment variable, then run gradio
export MODEL_NAME=flux-schnell && gradio src/gradio_demo.py
export MODEL_NAME=flux-dev && gradio src/gradio_demo.py
export MODEL_NAME=flux-depth && gradio src/gradio_demo.py
export MODEL_NAME=flux-canny && gradio src/gradio_demo.py
export MODEL_NAME=flux-kontext && gradio src/gradio_demo.py
```

Bibtex

If you find this repository useful for your research, please cite the following work.

```
@article{Parmar2025group,
  title={Scaling Group Inference for Diverse and High-Quality Generation},
  author={Gaurav Parmar and Or Patashnik and Daniil Ostashev and Kuan-Chieh (Jackson) Wang and Kfir Aberman and Srinivasa Narasimhan and Jun-Yan Zhu},
  year={2025},
  journal={arXiv preprint arXiv:2508.15773},
}
```
