How does the context window affect a support assistant?
A context window is the maximum number of tokens a model accepts in one request (prompt + completion). Larger windows enable richer prompts, but per-call cost increases linearly with the tokens used. For a support assistant at 12.0B/mo, context window size ties to a compliant routing opportunity of up to 88% at a balanced quality floor.
Identical workloads at the same quality floor show up to a 638× spread between the most and least expensive compliant routes (o10 State of Inference Spend 2026).
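
The relationship between window size and per-call cost can be made concrete with a small sketch. It is only an illustration under stated assumptions: the Route type, the single blended price per million tokens, and every number in it are hypothetical and not taken from any real provider or from the report cited above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Route:
    """A hypothetical model route; all values used here are illustrative."""
    name: str
    context_window: int            # max tokens per request (prompt + completion)
    usd_per_million_tokens: float  # assumed single blended price


def fits(route: Route, prompt_tokens: int, completion_tokens: int) -> bool:
    """True if prompt + completion stays within the route's context window."""
    return prompt_tokens + completion_tokens <= route.context_window


def call_cost(route: Route, prompt_tokens: int, completion_tokens: int) -> float:
    """Per-call cost in USD; it grows linearly with the tokens used."""
    if not fits(route, prompt_tokens, completion_tokens):
        raise ValueError(f"request exceeds the context window of {route.name}")
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens * route.usd_per_million_tokens / 1_000_000


def cheapest_fitting_route(
    routes: Sequence[Route], prompt_tokens: int, completion_tokens: int
) -> Optional[Route]:
    """Cheapest route whose window can hold the request, or None if none can."""
    candidates = [r for r in routes if fits(r, prompt_tokens, completion_tokens)]
    return min(candidates, key=lambda r: r.usd_per_million_tokens, default=None)


if __name__ == "__main__":
    # Made-up routes: a small cheap window and a large expensive one.
    routes = [
        Route("small-window", 8_000, 2.0),
        Route("large-window", 128_000, 10.0),
    ]
    for prompt_tokens in (1_000, 20_000):
        route = cheapest_fitting_route(routes, prompt_tokens, 500)
        if route is not None:
            print(route.name, call_cost(route, prompt_tokens, 500))
```

In this sketch a short support ticket fits the cheaper route, while a long conversation history forces the request onto the larger, more expensive window. Real pricing usually distinguishes prompt and completion tokens, so the blended price is a simplification.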