Frontier Claude (Opus, Sonnet) holds the upper-right via Bedrock SG. It costs more and delivers better answers on complex multi-doc finance work, so we use it sparingly and cache aggressively.
GPT-5.5 sits at the same tier (priciest, top quality on numeric reasoning) but lives outside Bedrock; it is included for reference, not in the Phase-1 router.
Mid-tier specialists hold their own corners: Qwen3 Max on Alibaba for Mainland filings, Haiku 4.5 and Nova Pro on Bedrock for fast factual lookups and JSON-schema tool calls. GPT-5 mini slots near them on cost-quality but, again, sits outside Bedrock.
Llama 4 Scout sits in the lower-left: cheap, fast, 10M context. Gemma 4 27B sits below it, slightly weaker on multi-doc reasoning at similar cost via OpenRouter or hosted endpoints; a reference point, not a router lane today. DeepSeek V4 covers cheap CN-EN heavy reasoning via Alibaba.
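The lanes above reduce to a small lookup from task type to model endpoint. A minimal sketch of that Phase-1 routing table, assuming hypothetical task labels and model IDs (the real identifiers and lane names are not in this document):

```python
# Illustrative Phase-1 routing table. Task labels and model IDs are
# assumptions for the sketch, not the production config.
ROUTES = {
    "complex_multidoc_finance": "bedrock/claude-opus",    # frontier, cache hits expected
    "numeric_reasoning":        "bedrock/claude-sonnet",  # frontier tier
    "mainland_filings":         "alibaba/qwen3-max",      # Mainland-filings specialist
    "factual_lookup":           "bedrock/haiku-4.5",      # fast factual lookups
    "json_tool_call":           "bedrock/nova-pro",       # JSON-schema tool calls
    "cn_en_heavy_reasoning":    "alibaba/deepseek-v4",    # cheap CN-EN reasoning
}

# Cheap, fast, long-context default lane for everything unclassified.
DEFAULT = "bedrock/llama-4-scout"

def route(task: str) -> str:
    """Return the model lane for a task label, falling back to the cheap lane."""
    return ROUTES.get(task, DEFAULT)
```

The point of keeping it a flat table is that adding or retiring a lane (e.g. promoting Gemma 4 from reference point to lane) is a one-line change, not a code change.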
Self-host points land strictly south-east of their cloud equivalents: same weights at higher cost, with no quality gain to show for it.
Gemma 4 on a rented L40S is the most expensive way to be mid-tier.
That's the trade we'd be buying.
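The trade can be checked with back-of-envelope arithmetic: a rented GPU's effective per-token price is its hourly rate divided by sustained throughput and duty cycle. A sketch with purely illustrative numbers (the rental rate, throughput, and utilization below are assumptions, not measured figures):

```python
# Back-of-envelope cost of self-hosting on a rented GPU.
# All numeric inputs below are illustrative assumptions.
def dollars_per_million_tokens(gpu_rate_per_hr: float,
                               tokens_per_sec: float,
                               utilization: float) -> float:
    """Effective $/1M tokens for a rented GPU at a given sustained
    throughput and duty cycle (fraction of the hour doing useful work)."""
    tokens_per_hr = tokens_per_sec * 3600 * utilization
    return gpu_rate_per_hr / tokens_per_hr * 1_000_000

# Assumed: $1.00/hr L40S rental, 40 tok/s sustained, 50% utilization.
self_host = dollars_per_million_tokens(1.00, 40, 0.5)  # ≈ $13.9 per 1M tokens
```

Low utilization is what kills the economics: at 50% duty cycle the effective rate roughly doubles versus a fully loaded box, while a hosted endpoint bills only for tokens actually generated.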