We went to see what the model vendors are building in our lane

Two layers everyone is racing to own

Strip away the branding and the new vendor products fall into the same two buckets every serious AI app needs:

Model selection — given a prompt, which model should answer it? This is the unsolved-quality layer prompt routing is about. It is also exactly what our measured routing does.
Agent runtime — durable execution, sandboxing, approvals, connections, channels, observability. The plumbing agent frameworks package so you write what an agent does, not everything it needs to run.

Layer 1 — the vendors now route, but only over their own shelf

The newest first-party model-selection products are real and worth knowing — but they share two limits that matter for an honest recommendation.

Google · Vertex Model Optimizer

A single meta-endpoint that automatically picks the Gemini model — Pro, Flash, etc. — that best meets a cost/quality preference. Genuinely useful if you are an all-Gemini shop. But it routes only across Google’s own models, on an internal estimate you can’t inspect, and it is still labelled experimental.

OpenAI · AgentKit / Agent Builder

Uses a small classifier (GPT-4.1-mini, upgradeable to GPT-5) to route a request to the right sub-agent— intent routing, not cheapest-capable-model routing — and stays inside OpenAI’s models. Note the visual Agent Builder is being wound down (shutting Nov 30 2026); OpenAI points builders at the code-first Agents SDK instead.

Anthropic · Claude Agent SDK

Lets you override the model per role via environment variables (ANTHROPIC_DEFAULT_SONNET_MODEL / …OPUS_MODEL) and route everything through a gateway for cross-vendor reach — but there is no built-in quality routing. The tiering is manual: exactly the decision our config automates and measures.

OpenRouter · Auto Router

The cross-vendor option — and now confirmed to be powered by Not Diamondunder the hood. So OpenRouter Auto and Not Diamond are one opaque engine, not two: the decision is stochastic, you can’t audit why it chose, and you can’t reproduce the path.

The structural gap. Every provider-native router that shipped this year is either single-vendor (it can only ever recommend its own models) or opaque(you can’t see why, or check it on your own tasks). A vendor is structurally conflicted — it has no incentive to tell you a competitor’s model is the better, cheaper answer. That is the one thing a measured, cross-vendor, inspectable signal does that they can’t. See the full landscape in the Task Routing market table.

Layer 2 — the agent runtime has commoditized

On the runtime side the picture is the opposite of a gap: the whole field shipped the same six primitives we described in agent frameworks — durable execution, sandboxing, approvals, secure connections, multi-channel, tracing.

Cloudflare · Agents SDK (“Project Think”)

The most ambitious infrastructure play — now positioned as a runtime any agent framework can build on: durable fibers with crash recovery, sub-agents with isolated SQLite, sessions with forking and compaction, built-in observability — at the edge. Important enough that we give it its own write-up.

Google · ADK + Agent Engine

A code-first kit (7M+ downloads) plus a managed runtime, with self-healing tool use (auto-retries a failed tool differently), a built-in eval layer with a user simulator, and prompt-injection screening. The most enterprise-complete bundle.

Vercel · Eve

The directory-as-agent framework behind 100+ of Vercel’s production agents — the clean illustration of the pattern we already reviewed.

The takeaway flips the usual build-vs-buy instinct: durable execution, sandboxes and approvals are now off-the-shelf. Hand-rolling that plumbing is increasingly hard to justify. But notice what every one of these runtimes still hardcodes — the model. Eve’s agent.ts pins a string; ADK and the rest do the same. The runtime is solved; the choice of model inside it is not.

What we take from the scan

For model selection, the measured/cross-vendor angle held up. The field consolidated on opaque, single-vendor routing — which makes an inspectable, vendor-neutral signal more differentiated, not less.
For the runtime, adopt rather than rebuild. The honest move is to let a commodity runtime (Cloudflare, ADK) own the plumbing and wire a measured model-selection layer inside it — the input these frameworks lack.
For a client, the recommendation is conditional. An all-one-vendor shop is well served by that vendor’s native optimizer. The moment you want to weigh a competitor’s model, audit a cost/quality decision, or defend a routing choice with numbers — that is where a neutral, measured signal wins, because no vendor can offer it without conflict.

Cloudflare, in depth The routing market table Agent frameworks