What's the best image generation model?

Find out which one is best for your purpose based on the leaderboard below. All results are directly based on feedback from real human raters. The details of the annotation process are described in our blog post.

| Rank | Model | Bradley-Terry | Elo | Win Rate | Wins | Matches |
|---|---|---|---|---|---|---|
| 1 | flux-1-pro | 1135.35 | 1046.46 | 0.54 | 761998 | 1407343 |
| 2 | flux-1.1-pro | 1084.49 | 1029.76 | 0.52 | 434537 | 833869 |
| 3 | imagen-3 | 1019.89 | 1007.39 | 0.50 | 381047 | 769477 |
| 4 | dalle-3 | 950.25 | 981.64 | 0.49 | 645786 | 1324919 |
| 5 | stable-diffusion-3 | 937.60 | 976.77 | 0.49 | 530016 | 1091069 |
| 6 | midjourney-5.2 | 890.46 | 957.98 | 0.47 | 577048 | 1234187 |

What is "Bradley-Terry"?

The Bradley-Terry ranking model is a probabilistic model used to predict outcomes in pairwise comparisons. It assigns a strength parameter (the reported score) to each item; the higher an item's strength relative to another's, the more likely it is to win a comparison between the two. See the Wikipedia article for mathematical details.
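As a rough illustration of how such strengths can be estimated from pairwise votes, here is a minimal sketch using the standard minorization-maximization (MM) update for Bradley-Terry. It is not the leaderboard's actual pipeline: the function names, the toy win counts, and the normalization to a mean strength of 1 are all assumptions made for the example, and the scaling that produces the reported scores is not described on this page.

```python
def fit_bradley_terry(wins, n_iter=200, tol=1e-10):
    """Estimate Bradley-Terry strengths with the MM update.

    wins[i][j] is the number of times item i beat item j
    (hypothetical input format chosen for this sketch).
    Returns a list of positive strengths normalized to mean 1.
    """
    n = len(wins)
    p = [1.0] * n  # start with equal strengths
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each opponent contributes (matches played) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom)
        # Strengths are only defined up to a common factor, so rescale.
        scale = n / sum(new_p)
        new_p = [x * scale for x in new_p]
        converged = max(abs(a - b) for a, b in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return p


def win_probability(p_i, p_j):
    """Bradley-Terry probability that item i beats item j."""
    return p_i / (p_i + p_j)


# Toy data (made up): three items, ten comparisons per pair.
toy_wins = [
    [0, 7, 8],
    [3, 0, 6],
    [2, 4, 0],
]
strengths = fit_bradley_terry(toy_wins)
print(strengths)
print(win_probability(strengths[0], strengths[2]))
```

Only the ratios between strengths matter for the predicted win probabilities, which is why any fixed rescaling (such as mapping onto an Elo-like range) leaves the ranking unchanged.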

What do we consider "Overall preference"?

Here we evaluate each model across all criteria and determine which one has the best overall performance.

Examples

Visual examples of the annotators’ preferences

Preference
Which image looks better overall?
Image pair: FLUX.1 [pro] (winner) vs. Midjourney
Coherence
Which image feels less weird or unnatural for its style when you look closely? I.e. fewer odd or strange-looking objects or elements
Image pair: FLUX.1 [pro] (winner) vs. Midjourney
Alignment
Which image is more aligned with and better adheres to the prompt:
A black and white picture of a white man singing a song
Image pair: FLUX.1 [pro] (winner) vs. Midjourney