Where the verdict flips
The recommendation against on-prem is conditional. Three precise conditions reverse it. The default reading of each is FAILS for an SME; the flip reading is the narrow case under which an 8× H200 stack earns its capital. If any one of the three fails, do not build.
Agentic multi-step, novel reasoning at the edge.
Five-step chains compound error: 80% per-step → 33% end-to-end. Terminal-Bench gap of 14.8 pt is felt.
Open-weight trails 5–15 ptBounded retrieval, summarisation, classification, modest code assist.
No high-fidelity agentic tool use, no edge reasoning. The open-weight gap to frontier collapses.
< 5 pt on relevant axesSovereignty is the operative need.
AU-resident weights and inference. Threat model does not demand physical air-gap.
Bedrock-Sydney sufficesAir-gap is the operative need.
Classified or regulated workload that mandates a physical break from public cloud — not a contractual one.
IRAP-PROTECTED+ bindingNotion, Gmail, Calendar, Word, Excel are core.
OAuth-gated SaaS routes data through US-domiciled cloud regardless of model residency.
Punctured · §VCloud SaaS is out of scope.
LLM restricted to on-prem data — file shares, local DBs, internal wikis, code repositories. Or full self-hosted stack.
Architecturally cleanBuild only if 01 and 02 and 03. Otherwise cloud-sovereign dominates on capability, cost, and coherence.
SourceOn-prem frontier LLM briefing, §Verdict and §Conditions under which the recommendation flips. Prevalence ratios are first-order estimates against the brief's framing — "most SMEs whose threat model does not demand air-gap" (compliance), "most SMEs want high-fidelity agentic work" (workload), and the named-connector default (estate). The flip is rare by design: the brief's centre of gravity is that the default verdict against on-prem is robust to all but a narrow case.