o10 · Last updated 2026-06-09

Know Your Inference: A Framework for Governing the AI Supply Chain

KYI moves beyond per-token optimization to evaluate whether inference systems create durable value — across performance, economics, integration, strategy, and risk — with a composite score and enforcement levers in o10.

Whitepaper · Shen Pandi · Five-pillar governance framework

Spread observed: 638×
Routing modes: shadow → enforce
Framework: KYI

"Cheaper tokens miss the point. Up to 90% of an AI system's operational life is inference — where value, reliability, and risk are decided."

— Shen Pandi, Know Your Inference
Dashboards observe.
o10 enforces.

Cost dashboards tell you what you spent. o10 sits in the request path and changes what you spend — shadow first, then enforce.

Summary · Key takeaways

What you need to know

Short, self-contained answers with cited stats — read the sections below for full context.

What is the KYI whitepaper?

Know Your Inference: A Framework for Governing the AI Supply Chain — by Shen Pandi. Scores inference systems across five pillars with a composite score, confidence level, and board-signable recommendation.

KYI weights performance and economics at 25% each; integration and strategy at 20% each; risk at 10%.

Why KYI beyond per-token optimization?

Cheaper tokens miss the point if the system fails on integration, strategy, or risk. KYI evaluates whether inference creates durable value — not just lower $/1M.

Up to 90% of an AI system's operational life is inference — where value, reliability, and risk are decided.

Is KYI a one-off audit?

No. KYI runs continuously in the o10 control plane — every routed call and eval updates pillar scores and the composite recommendation.

The composite score floor is 65; a score below it triggers a cap, rightsizing, or kill criteria per governance policy.

01 · Deep dive

The five KYI pillars

Each pillar is scored 0–100. The weighted composite drives the recommendation and the enforcement levers.

Performance (25%): latency, accuracy, eval pass rate on representative traffic.

Economics (25%): fully loaded $/request, unit economics, forecast vs actual.

Integration (20%): venue coverage, developer ergonomics, migration risk.

Strategy (20%): alignment with product roadmap, vendor concentration, build vs buy.

Risk (10%): residency, retention, model approval, regulatory exposure.
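As a minimal sketch of how the weighted composite could be computed from the five pillar scores: the weights and the floor of 65 come from this whitepaper, while the function names and the example scores are hypothetical. The whitepaper does not confirm that o10 uses a plain weighted average, so treat that as an assumption.

```python
# Pillar weights as stated in the whitepaper.
WEIGHTS = {
    "performance": 0.25,
    "economics": 0.25,
    "integration": 0.20,
    "strategy": 0.20,
    "risk": 0.10,
}
COMPOSITE_FLOOR = 65  # default governance threshold stated in the whitepaper


def composite_score(pillars: dict) -> float:
    """Weighted average of the five pillar scores, each on a 0-100 scale.

    Assumption: a plain weighted sum; the real aggregation may differ.
    """
    if set(pillars) != set(WEIGHTS):
        raise ValueError(f"expected exactly these pillars: {sorted(WEIGHTS)}")
    for name, score in pillars.items():
        if not 0 <= score <= 100:
            raise ValueError(f"{name} score {score} is outside 0-100")
    return sum(WEIGHTS[name] * pillars[name] for name in WEIGHTS)


def breaches_floor(pillars: dict) -> bool:
    """True when the composite falls below the governance floor."""
    return composite_score(pillars) < COMPOSITE_FLOOR


# Hypothetical pillar scores, for illustration only.
example = {
    "performance": 80,
    "economics": 60,
    "integration": 70,
    "strategy": 55,
    "risk": 90,
}
print(round(composite_score(example), 1), breaches_floor(example))
```

With these example scores the composite is 20 + 15 + 14 + 11 + 9 = 69, which sits above the floor even though Strategy alone scores 55. An individual pillar breach would need its own policy check.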

KYI pillar weights

| Pillar | Weight | Example signals |
| --- | --- | --- |
| Performance | 25% | Eval pass rate, p95 latency |
| Economics | 25% | $/outcome, envelope breach |
| Integration | 20% | Gateway coverage, SDK friction |
| Strategy | 20% | Roadmap fit, lock-in |
| Risk | 10% | Residency, audit trail |

02 · Deep dive

Scoring methodology

Continuous scoring from live traffic — not annual spreadsheet exercises.

Eval suites replay production samples weekly. Pillar scores update when routes, models, or policies change.

Confidence level reflects sample size and eval stability. Low confidence triggers more shadow time before enforce.

  • 0–100 per pillar
  • 65 composite floor
  • Board-signable PDF export
  • Enforcement levers tied to score
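One way to picture the confidence rule is the sketch below, which derives a confidence level from sample size and week-to-week eval stability and keeps a route in shadow while confidence is low. The thresholds (`MIN_SAMPLES`, `MAX_PASS_RATE_STDEV`), the two-level high/low scale, and the function names are all hypothetical; the whitepaper does not publish how confidence is computed.

```python
from statistics import pstdev

# Hypothetical thresholds for illustration; not values from the whitepaper.
MIN_SAMPLES = 500
MAX_PASS_RATE_STDEV = 0.05


def confidence_level(sample_size: int, weekly_pass_rates: list) -> str:
    """Return 'high' or 'low' from sample size and eval stability.

    Stability is approximated as the population standard deviation of
    weekly eval pass rates (each between 0.0 and 1.0).
    """
    if len(weekly_pass_rates) < 2:
        return "low"  # too little history to judge stability
    stable = pstdev(weekly_pass_rates) <= MAX_PASS_RATE_STDEV
    enough = sample_size >= MIN_SAMPLES
    return "high" if stable and enough else "low"


def ready_to_enforce(confidence: str) -> bool:
    """Low confidence means more shadow time before enforce."""
    return confidence == "high"


print(confidence_level(1200, [0.92, 0.93, 0.91]))
print(confidence_level(1200, [0.95, 0.70, 0.88]))
```

The first call has plenty of samples and steady pass rates, so it qualifies for enforce; the second has the same sample size but unstable weekly results, so it stays in shadow.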
03 · Deep dive

Board and CFO reporting

KYI translates inference into the governance language that finance and directors already use.

Instead of token totals, boards see a recommendation (proceed, cap, rightsizing, or sunset) with evidence.

An immutable ledger backs every score change, recording model, venue, policy, jurisdiction, and cost per call.
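To make the ledger idea concrete, here is a minimal sketch of an append-only log in which each entry commits to the hash of the previous one, so later tampering is detectable. Hash chaining is an assumption chosen for illustration; the whitepaper does not say how o10 achieves immutability. The `Ledger` class and the record values are hypothetical, while the field names follow the list above.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON of the record."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


@dataclass
class Ledger:
    """Append-only log; each entry commits to the previous entry's hash."""
    entries: list = field(default_factory=list)

    def append(self, record: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        digest = _entry_hash(prev_hash, record)
        self.entries.append({"record": record, "prev": prev_hash, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks it."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


ledger = Ledger()
ledger.append({
    "model": "model-a",          # hypothetical values throughout
    "venue": "venue-1",
    "policy": "default",
    "jurisdiction": "EU",
    "cost_per_call": 0.0021,
})
print(ledger.verify())
```

An in-memory list is only tamper-evident, not tamper-proof. A production ledger would also need durable storage and some external anchoring of the latest hash.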

How-to · Operational steps

Running a KYI assessment

  1. Inventory use cases
     Map workloads, volumes, and current model defaults.

  2. Run pillar evals
     Performance and economics first; integration and risk follow.

  3. Shadow then score
     Mirror traffic; KYI updates from live routes and evals.

  4. Report and enforce
     Export board pack; tie levers to composite floor breaches.
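The difference between steps 3 and 4 can be sketched as a single switch: in shadow mode the policy's choice is recorded but the caller's original model still serves the request, and in enforce mode the policy's choice is applied. This is an assumed interpretation of shadow and enforce, not o10's documented behavior; `RouteDecision`, `route`, and the model names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteDecision:
    requested_model: str    # what the caller asked for
    recommended_model: str  # what policy would pick
    served_model: str       # what actually handles the call
    mode: str


def route(requested_model: str, recommended_model: str, mode: str) -> RouteDecision:
    """Apply the recommendation only in enforce mode; log it in both modes."""
    if mode not in ("shadow", "enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    served = recommended_model if mode == "enforce" else requested_model
    return RouteDecision(requested_model, recommended_model, served, mode)


shadow = route("model-large", "model-small", "shadow")
enforce = route("model-large", "model-small", "enforce")
print(shadow.served_model, enforce.served_model)
```

Because both modes produce the same decision record, the shadow period yields evidence about what enforcement would have changed before any traffic is affected.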

Source · Methodology

Know Your Inference framework by Shen Pandi. Live scoring in o10 control plane. State of Inference Spend 2026 benchmarks.

FAQ · Frequently asked questions

Common questions

What are the five KYI pillars?

Know Your Inference scores inference systems across Performance (25%), Economics (25%), Integration (20%), Strategy (20%), and Risk (10%). Each pillar is rated 0–100 from live evals, ledger data, and governance signals. The weighted composite produces a confidence level and a recommendation: proceed, cap, rightsizing, or sunset. A composite floor of 65 is the default governance threshold; a score below it triggers enforcement levers per policy. Pillars are not independent checkboxes: strong latency can mask weak economics, which is why KYI weights both equally.

Is KYI a one-off audit?

No. KYI runs continuously in the o10 control plane — every routed call, eval result, and policy decision updates pillar scores and the composite recommendation. One-off audits go stale the week prompts, models, or retries change. Continuous KYI gives boards and regulators current evidence: an immutable ledger backs each score change with model, venue, policy, jurisdiction, and cost per call.

Who wrote Know Your Inference?

Shen Pandi authored the Know Your Inference framework, published on o10.io with live scoring in the control plane. The thesis: cheaper tokens miss the point if the system fails on integration, strategy, or risk — up to 90% of an AI system's operational life is inference, where value, reliability, and risk are decided. The whitepaper defines pillars, weights, and board reporting rigor comparable to established IT governance frameworks.

What triggers enforcement?

A composite KYI score below 65, or an individual pillar breach defined by governance policy, triggers levers: spend cap, auto-rightsizing to a cheaper eval-passing model, workload sunset, or escalation to board review. Enforcement is tied to measured signals (eval pass rate collapse, envelope breach, residency violation), not subjective review alone. Policy defines which lever applies; o10 executes in the request path.
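A policy of this kind can be pictured as a lookup from measured signals to levers. The sketch below is hypothetical throughout: the signal and lever identifiers paraphrase the terms in the answer above, the particular signal-to-lever mapping is invented for illustration, and only the floor of 65 is taken from the whitepaper.

```python
COMPOSITE_FLOOR = 65  # default floor stated in the whitepaper

# Hypothetical governance policy: which lever each measured signal pulls.
POLICY = {
    "composite_below_floor": "escalate_to_board",
    "envelope_breach": "spend_cap",
    "eval_pass_rate_collapse": "auto_rightsize",
    "residency_violation": "sunset",
}


def triggered_levers(composite: float, signals: set, policy: dict = POLICY) -> list:
    """Return the sorted levers triggered by the score and active signals."""
    active = set(signals)
    if composite < COMPOSITE_FLOOR:
        active.add("composite_below_floor")
    # Signals the policy does not mention trigger nothing.
    return sorted({policy[s] for s in active if s in policy})


print(triggered_levers(72, set()))
print(triggered_levers(60, {"envelope_breach"}))
```

Keeping the mapping in data rather than code reflects the statement that policy defines which lever applies, so a governance team could change it without touching the enforcement path.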

How is KYI different from FinOps?

FinOps brings financial accountability to cloud spend and reports token totals, forecasts, and allocations — often a month late. KYI governs whether the inference system creates durable value across performance, economics, integration, strategy, and risk — with a board-signable recommendation. FinOps tells you what you spent; KYI tells you whether the workload should continue, scale, or stop — and o10 holds the levers.

Can KYI run without o10?

The KYI framework is portable — pillars, weights, and scoring logic can be applied in spreadsheets or GRC tools. Continuous scoring, enforce-mode levers, and per-call ledger evidence require a control plane in the inference path. Without live routing data, KYI becomes a periodic exercise that drifts from production reality within days of the next model or prompt change.

What evals feed performance?

Workload-specific eval suites replay production samples against candidate models: support QA, RAG faithfulness, code correctness, classification precision, clinical safety, and custom business metrics. The Performance pillar score reflects pass rates, latency percentiles, and drift detection, not vendor leaderboard rankings. When the pass rate slips below the floor, o10 stops routing to that model until revalidation.
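The stop-routing rule in the last sentence can be sketched as a small gate that blocks a model when its eval pass rate drops below a floor and unblocks it only after a passing revalidation run. The `ModelGate` class, the 0.90 pass-rate floor, and the model name are hypothetical; the whitepaper does not state the pass-rate floor or the revalidation procedure.

```python
def pass_rate(results: list) -> float:
    """Fraction of eval cases that passed; results is a list of booleans."""
    if not results:
        raise ValueError("no eval results")
    return sum(results) / len(results)


class ModelGate:
    """Blocks routing to models whose pass rate falls below the floor."""

    def __init__(self, floor: float = 0.90):  # hypothetical floor
        self.floor = floor
        self.blocked = set()

    def record_eval(self, model: str, results: list) -> float:
        rate = pass_rate(results)
        if rate < self.floor:
            self.blocked.add(model)
        return rate

    def revalidate(self, model: str, results: list) -> float:
        """Unblock only when a fresh eval run clears the floor."""
        rate = pass_rate(results)
        if rate >= self.floor:
            self.blocked.discard(model)
        return rate

    def is_routable(self, model: str) -> bool:
        return model not in self.blocked


gate = ModelGate()
gate.record_eval("model-a", [True] * 8 + [False] * 2)  # 80% pass rate
print(gate.is_routable("model-a"))
```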

How do boards consume KYI?

Boards receive a PDF export with composite score, pillar breakdown, confidence level, recommendation, and ledger evidence summary, in the language finance and directors already use for vendor and risk decisions. Instead of raw token charts, directors see proceed/cap/sunset with justification. An immutable audit trail supports regulator and internal audit questions without reconstructing spend from invoices.

What is the relationship to routing?

Routing executes economics and performance — selecting the cheapest eval-passing model per call. KYI scores whether the entire supply chain (venues, integrations, strategy, risk) is sound above those routes. Routing without KYI optimizes cost; KYI without routing is a report. Together they answer: are we spending wisely, and is the system defensible to the board?
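Selecting the cheapest eval-passing model per call reduces to a filter followed by a minimum. The sketch below assumes each candidate already carries a pass/fail eval verdict and a per-call cost; the `Candidate` type, the model names, and the prices are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    cost_per_call: float  # fully loaded cost in dollars (illustrative)
    passes_evals: bool


def cheapest_passing(candidates: list) -> Optional[Candidate]:
    """Return the lowest-cost candidate that passes evals, or None."""
    passing = [c for c in candidates if c.passes_evals]
    if not passing:
        return None  # nothing is safe to route to; the caller must decide
    return min(passing, key=lambda c: c.cost_per_call)


candidates = [
    Candidate("model-large", 0.0120, True),
    Candidate("model-medium", 0.0030, True),
    Candidate("model-small", 0.0004, False),  # cheapest, but fails evals
]
choice = cheapest_passing(candidates)
print(choice.name if choice else "no eligible model")
```

The cheapest model overall is skipped because it fails evals, which is the distinction between optimizing cost alone and optimizing cost subject to a quality gate.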

Where is the interactive scorecard?

The live KYI interactive scorecard is at o10.io/kyi — five pillar inputs, composite score, confidence level, and recommendation update as you adjust weights and signals. It demonstrates the framework with the same pillar structure documented in this whitepaper. Production deployments run KYI continuously from live traffic rather than manual scorecard entry.
