AI Arena

Enter your prompt

Four models compete. One neutral judge scores. The best answer wins.

How it works

Three phases. One unbiased judge. The best answer wins.

01

Generate

Your prompt is dispatched to all four models simultaneously, each operating under a unique persona: analytical precision, step-by-step reasoning, creative insight, or philosophical depth.
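The fan-out in this phase could look roughly like the sketch below. It is not the arena's actual code: `call_model` is a hypothetical stub standing in for a real NVIDIA NIM or Groq request, and `generate_all` is an illustrative name. The model names and persona taglines are copied from the Full Roster.

```python
import asyncio

# Model names and persona taglines are copied from the Full Roster section.
COMPETITORS = {
    "GPT-OSS 120B": "Analytical precision",
    "DeepSeek R1": "Step-by-step reasoning",
    "Phi-3 Mini 128K": "Sharp & practical",
    "Llama 3.3 70B": "Nuanced & comprehensive",
}


async def call_model(model: str, persona: str, prompt: str) -> str:
    """Hypothetical stand-in for a real provider request (assumption)."""
    await asyncio.sleep(0)  # placeholder where the network call would happen
    return f"[{model} | {persona}] response to: {prompt}"


async def generate_all(prompt: str) -> dict[str, str]:
    """Send the prompt to every competitor concurrently and collect replies."""
    models = list(COMPETITORS)
    tasks = [call_model(m, COMPETITORS[m], prompt) for m in models]
    replies = await asyncio.gather(*tasks)  # results keep the order of tasks
    return dict(zip(models, replies))


if __name__ == "__main__":
    answers = asyncio.run(generate_all("Why is the sky blue?"))
    for model, reply in answers.items():
        print(model, "->", reply)
```

Because `asyncio.gather` awaits all four calls together, total latency is bounded by the slowest model rather than the sum of all four.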

02

Judge

A dedicated Gemma 7B model, running on a completely separate API key, scores every response on Accuracy, Clarity, Depth, and Creativity. It never competes.
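One way the judge's four scores might be turned into a ranking is sketched below. The page does not state the score scale or how the criteria are combined, so the equal-weight average here is an assumption, and `Scorecard` and `rank` are hypothetical names.

```python
from dataclasses import dataclass

# The four criteria named in the Judge phase.
CRITERIA = ("accuracy", "clarity", "depth", "creativity")


@dataclass(frozen=True)
class Scorecard:
    """Judge scores for one competitor's response (hypothetical structure)."""
    model: str
    accuracy: float
    clarity: float
    depth: float
    creativity: float

    @property
    def total(self) -> float:
        # Assumption: equal weighting. The page does not say how the
        # four criteria are combined into a final score.
        return sum(getattr(self, c) for c in CRITERIA) / len(CRITERIA)


def rank(cards: list[Scorecard]) -> list[Scorecard]:
    """Order scorecards from highest to lowest average score."""
    return sorted(cards, key=lambda card: card.total, reverse=True)
```

A weighted sum would fit in the same place if some criteria matter more than others; only the `total` property would change.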

03

Synthesize

The top-ranked model synthesizes all responses into one champion answer that combines the strongest insights from every competitor.
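As a rough illustration of this phase, the sketch below picks the top-ranked model and assembles the text it would be asked to synthesize. The function name, prompt wording, and section layout are all assumptions made for the example; the page does not show the real synthesis prompt.

```python
def build_synthesis_prompt(
    question: str,
    responses: dict[str, str],
    ranking: list[str],
) -> tuple[str, str]:
    """Return (synthesizer_model, prompt_text) for the final pass.

    `ranking` lists model names from best to worst, as decided by the judge.
    """
    if not ranking:
        raise ValueError("ranking must contain at least one model")

    synthesizer = ranking[0]  # the top-ranked competitor writes the final answer
    sections = [f"### {model}\n{responses[model]}" for model in ranking]
    text = (
        f"Question: {question}\n\n"
        "Candidate answers, best-ranked first:\n\n"
        + "\n\n".join(sections)
        + "\n\nCombine the strongest points from every answer into one final answer."
    )
    return synthesizer, text
```

The returned prompt would then be sent to the synthesizer model as one more generation request.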

Full Roster

◈

GPT-OSS 120B

NVIDIA NIM

COMPETITOR

Analytical precision

◈

DeepSeek R1

NVIDIA NIM

COMPETITOR

Step-by-step reasoning

◈

Phi-3 Mini 128K

NVIDIA NIM

COMPETITOR

Sharp & practical

◈

Llama 3.3 70B

Groq ⚡

COMPETITOR

Nuanced & comprehensive

⚖

Gemma 7B

NVIDIA NIM

JUDGE ONLY

Neutral · Never competes