A gateway moves your request; it doesn't know which model is right. Routing on price and availability is solved; routing on answer quality needs a measured signal. This is ours: the best model per use-case from the latest benchmark, plus how the market's routing tools actually work.
The cheapest model that still clears the quality bar for each task, from the latest benchmark. Updated 6/12/2026, 7:31:30 PM.
groq/compound-mini: ping, preprocessing, compaction, short_responses (544ms)
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B: tool_use, structured_output, agent_decisions (530ms)
gpt-oss-120b: long_generation, document_analysis, bulk_processing (245.3 tok/s)
groq/compound-mini: fallback, retry (544ms)
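The picks above can be read as a lookup table from use-case to model. A minimal sketch of that lookup follows; the model names and use-case labels are copied from the list, while `PICKS`, `MODEL_BY_USE_CASE`, `FALLBACK_MODEL`, and `pick_model` are hypothetical names chosen for illustration, not part of any gateway's API.

```python
# Picks transcribed from the list above. The data layout and all
# identifiers here are illustrative assumptions.
PICKS = [
    ("groq/compound-mini",
     ("ping", "preprocessing", "compaction", "short_responses")),
    ("nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B",
     ("tool_use", "structured_output", "agent_decisions")),
    ("gpt-oss-120b",
     ("long_generation", "document_analysis", "bulk_processing")),
    ("groq/compound-mini", ("fallback", "retry")),
]

# Flatten to one entry per use-case for constant-time lookup.
MODEL_BY_USE_CASE = {
    use_case: model
    for model, use_cases in PICKS
    for use_case in use_cases
}

# The list names the same model for both "fallback" and "retry".
FALLBACK_MODEL = MODEL_BY_USE_CASE["fallback"]


def pick_model(use_case: str) -> str:
    """Return the listed model for a use-case, or the fallback pick."""
    return MODEL_BY_USE_CASE.get(use_case, FALLBACK_MODEL)


if __name__ == "__main__":
    print(pick_model("tool_use"))
    print(pick_model("some_unlisted_task"))
```

Unlisted use-cases fall through to the fallback pick here; whether that is the right default depends on how your own tasks map onto the benchmark's categories.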
Every general gateway routes on availability, price, latency, or your declared order; none routes on whether the answer is correct. That is the gap our signal fills.
Read the “Quality-aware?” column: most gateways route on availability, price, latency, or order (No / Heuristic). A few route on quality, but through a proprietary, opaque model. Only a measured signal lets you see why each of your own tasks routes to a given model.
Pass-rate per task, cost per call, latency, and variance across repeated runs are the inputs a capability-aware router needs. We measure them so your routing decision is one you can defend with numbers — not a black box.
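As a rough illustration of how those inputs could drive a routing decision, the sketch below summarizes repeated runs into pass rate, cost per call, mean latency, and variance, then picks the cheapest model that clears a quality bar. It is not the benchmark's actual method: the `Run` and `Signal` types, the function names, the choice to compute variance over pass/fail outcomes, and the sample models and numbers are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pvariance
from typing import Optional


@dataclass
class Run:
    """One benchmark run of one model on one task (hypothetical shape)."""
    passed: bool
    cost_usd: float
    latency_ms: float


@dataclass
class Signal:
    """Per-model summary of repeated runs on a single task."""
    model: str
    pass_rate: float
    cost_per_call: float
    mean_latency_ms: float
    pass_variance: float  # assumption: variance of pass/fail outcomes


def summarize(model: str, runs: list[Run]) -> Signal:
    outcomes = [1.0 if r.passed else 0.0 for r in runs]
    return Signal(
        model=model,
        pass_rate=mean(outcomes),
        cost_per_call=mean([r.cost_usd for r in runs]),
        mean_latency_ms=mean([r.latency_ms for r in runs]),
        pass_variance=pvariance(outcomes),
    )


def cheapest_passing(signals: list[Signal], quality_bar: float) -> Optional[Signal]:
    """Cheapest model whose pass rate meets the bar, or None if none does."""
    eligible = [s for s in signals if s.pass_rate >= quality_bar]
    if not eligible:
        return None
    return min(eligible, key=lambda s: s.cost_per_call)


# Made-up sample data for three placeholder models.
signals = [
    summarize("model-a", [Run(True, 0.001, 500.0)] * 3 + [Run(False, 0.001, 520.0)]),
    summarize("model-b", [Run(True, 0.004, 900.0)] * 4),
    summarize("model-c", [Run(True, 0.0005, 300.0)] * 2 + [Run(False, 0.0005, 310.0)] * 2),
]

if __name__ == "__main__":
    best = cheapest_passing(signals, quality_bar=0.75)
    print(best.model if best else "no model clears the bar")
```

Because the decision is a filter followed by a minimum over measured values, every routing choice can be traced back to the numbers that produced it.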