AI SaaS · Build in public · Cost control

Why we're building BurnCap

AI products are cheap to prototype and unpredictable in production. Provider dashboards log the burn — none of them stop it. The first note in an open build log.

Jun 16, 2026 · Daniel · 4 min read · BurnCap

[Image: BurnCap landing page, "Know which AI feature is burning money," over a live dashboard of spend by model and feature.]

There's a specific kind of message that shows up in every AI builder community now. Someone wakes up, checks their provider dashboard, and finds a number that doesn't make sense. A retry loop ran all night. An agent got stuck calling the same tool in a loop. A test script pointed at a paid model and nobody noticed for days.

None of those are edge cases. They're the ordinary ways an LLM bill gets away from you, and they share one nasty property: your provider dashboard reports the damage hours later, in aggregate, after the money is already gone. The tools we have are good at logging the burn. None of them stop it.

That gap is why we're building BurnCap.

[Image: BurnCap's alert rules (spend spike, runaway request loop, expensive model, dev/test spend, and unprofitable customer) above a delivery history of fired alerts.] The alerts we wish we'd had: spikes, runaway loops, expensive models, and unprofitable customers, caught as they happen.

The actual problem

AI products are cheap to prototype and unpredictable in production. The cost controls the providers give you are all post-hoc — they notify you after you've spent the money, and none of them can tell you the one thing you actually need to know when the bill spikes: which feature, which customer, or which model caused it.

If you're a big company, you solve this with a FinOps team and a Datadog contract. But the people getting hit hardest aren't big companies. They're indie founders shipping an AI wrapper, agencies building AI features for clients, and small teams of one to ten people. They have a real budget problem and zero appetite for an enterprise cost-management suite.

So the question we kept coming back to was simple: what would it take to catch the spike before the invoice does — for a team too small to build FinOps?

What BurnCap actually does

BurnCap ingests your LLM usage out of band — you don't route your traffic through it. You import a CSV, paste a read-only Admin key from OpenAI or Anthropic, or drop in a small SDK call.

[Image: BurnCap Integrations page. Connect OpenAI or Anthropic with a read-only Admin key, or import a usage CSV; the SDK lives under Developers.] Getting usage in, out of band: a CSV, a read-only provider key, or a few lines of SDK. Never a proxy in your request path.

From there it does four things:

Attributes cost to the dimension that matters — not just "you spent $4,000," but "the summarizer feature spent $4,000, mostly on one customer."
Forecasts where the month lands, so you find out on the 9th, not the 30th.
Alerts on spikes, runaway loops, expensive models, and dev/test spend.
Enforces budgets, but only through an endpoint your app calls, one that fails open if BurnCap is ever unreachable.
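To make "fails open" concrete, here is a minimal Python sketch of what a budget check on the app side could look like. It is an illustration under stated assumptions, not BurnCap's actual API: the URL, the request fields, and the `allowed` response key are all hypothetical. The only point is the shape of the error handling: the call is blocked only on an explicit "no," and every failure mode of the check itself resolves to "allow."

```python
import json
import urllib.request
from typing import Callable

# Hypothetical endpoint and response shape, for illustration only.
BUDGET_CHECK_URL = "https://burncap.example/v1/budget/check"


def fetch_decision(feature: str, customer_id: str, timeout_s: float = 0.5) -> dict:
    """POST the feature/customer pair and return the parsed JSON reply."""
    payload = json.dumps({"feature": feature, "customer_id": customer_id}).encode("utf-8")
    request = urllib.request.Request(
        BUDGET_CHECK_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def budget_allows(fetch: Callable[[], dict]) -> bool:
    """Return False only when the budget service explicitly denies the call."""
    try:
        decision = fetch()
    except Exception:
        # Fail open: timeouts, DNS errors, and bad JSON must never block the app.
        return True
    if not isinstance(decision, dict):
        # A malformed reply is treated the same as no reply.
        return True
    return decision.get("allowed", True) is not False
```

In an app this would wrap the model call, for example `if budget_allows(lambda: fetch_decision("summarizer", "cust_42")): ...`. The short timeout matters as much as the `except`: a budget check that hangs is just a slower outage.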

[Image: BurnCap dashboard with month-to-date spend, a month-end forecast, a burn-readiness checklist, and the daily spend trend.] Month-to-date spend, a month-end forecast, and a burn-readiness checklist that flags what's still unguarded, before the invoice does.
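A month-end forecast does not have to be sophisticated to be useful on the 9th. The sketch below is the simplest version we can think of, a flat run-rate projection in Python. It is not a description of how BurnCap forecasts; the function name is made up for this example, and it assumes spend continues at the average daily rate so far and that today counts as a full elapsed day.

```python
import calendar
from datetime import date


def forecast_month_end(spend_to_date: float, today: date) -> float:
    """Project month-end spend from the average daily rate so far."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    # Assumption: today is counted as a completed day of spend.
    daily_rate = spend_to_date / today.day
    return daily_rate * days_in_month


# $900 spent by June 9 projects to $3,000 for the 30-day month.
print(forecast_month_end(900.0, date(2026, 6, 9)))  # 3000.0
```

A run-rate projection overreacts to a single bad day early in the month and ignores weekly patterns, which is why any number it produces should carry a "forecasted" label rather than be shown as a fact.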

[Image: BurnCap architecture diagram. Usage enters from provider imports, a CSV, or the SDK; BurnCap observes it out of band to forecast, alert, and cap; your app calls the model provider directly. BurnCap never proxies traffic and never stores prompts.] The whole shape in one diagram: usage flows in out of band, BurnCap only ever observes, and your app still talks to the provider directly. Never a proxy, never your prompts.

Why we think it's worth building

There are good observability platforms out there. But most of them are built around traces — they're debugging tools, and they price like debugging tools (often per trace, which punishes exactly the chatty agents most likely to blow up your bill).

"Am I going to lose money this month?" isn't a debugging question. It's a finance question. We haven't found a tool that's budget-first for small teams, that attributes cost per customer, that can tell you "you charge this person $49 a month and they're costing you $63," and that does it without asking you to re-architect your traffic. That's the wedge.
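The "$49 a month and they're costing you $63" check is plain arithmetic once usage is tagged with a customer ID. Here is a small Python sketch of that calculation. The row fields (`customer_id`, `cost_usd`) and the function name are assumptions for illustration, not BurnCap's schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def customer_margins(
    usage_rows: Iterable[Mapping[str, object]],
    monthly_price_by_customer: Mapping[str, float],
) -> Dict[str, float]:
    """Return monthly price minus LLM cost for each paying customer."""
    cost_by_customer: Dict[str, float] = defaultdict(float)
    for row in usage_rows:
        # Sum every usage row's cost under the customer it was tagged with.
        cost_by_customer[str(row["customer_id"])] += float(row["cost_usd"])
    # Customers with no usage rows keep their full price as margin.
    return {
        customer: price - cost_by_customer.get(customer, 0.0)
        for customer, price in monthly_price_by_customer.items()
    }


rows = [
    {"customer_id": "cust_a", "feature": "summarizer", "cost_usd": 40.0},
    {"customer_id": "cust_a", "feature": "chat", "cost_usd": 23.0},
    {"customer_id": "cust_b", "feature": "chat", "cost_usd": 5.0},
]
margins = customer_margins(rows, {"cust_a": 49.0, "cust_b": 49.0})
print(margins)  # {'cust_a': -14.0, 'cust_b': 44.0}
```

A negative margin is the unprofitable customer from the example: $49 in, $63 out. Usage that arrives without a customer ID cannot be attributed this way, which is part of why tagging friction is one of our open questions.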

The math is forgiving, too. At a price in the $19–39/month range, BurnCap costs $57–117 a quarter, so it only has to prevent one surprise bill of that size or surface one unprofitable customer a quarter to have paid for itself. It doesn't need to be magic. It just needs to be there before the invoice.

What the first version intentionally won't do

We want to be honest about the cuts, because the cuts are the strategy:

No proxy. Ever. We're not going to put ourselves on your hot path. Out-of-band only.
No stored prompts. Ever. Token counts and IDs, nothing else.
No guessing. Every number is labeled — estimated, provider-billed, or forecasted. If we don't have a price for a model, we say "unpriced." We'd rather be honestly incomplete than confidently wrong, because the whole product rests on trusting the numbers.
Not every integration on day one. OpenAI and Anthropic imports, CSV, and an SDK ship first. Gemini, LiteLLM, and gateway importers come later.
No AI feature for the sake of an AI feature. The "insights" — cheaper-model suggestions, cache efficiency — are computed deterministically. There's no LLM inside BurnCap making things up about your spend. (That one surprised us too — we'll write a whole post about it.)
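The "no guessing" rule is easy to state in code. The Python sketch below shows one way a cost estimate could carry its label and refuse to invent a price. It is an illustration only: the type, the function, the model name, and the prices in the table are all made up for this example and are not real provider prices or BurnCap internals.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Illustrative prices in USD per million tokens. NOT real provider prices.
EXAMPLE_PRICES: Dict[str, Dict[str, float]] = {
    "example-small": {"input": 1.0, "output": 4.0},
}


@dataclass(frozen=True)
class LabeledCost:
    amount_usd: Optional[float]  # None when no price is known
    label: str  # "estimated", "provider-billed", "forecasted", or "unpriced"


def estimate_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    prices: Dict[str, Dict[str, float]] = EXAMPLE_PRICES,
) -> LabeledCost:
    """Price a request from token counts, or say "unpriced" instead of guessing."""
    price = prices.get(model)
    if price is None:
        # No fallback price and no nearest-model guess: the gap stays visible.
        return LabeledCost(None, "unpriced")
    amount = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
    return LabeledCost(amount, "estimated")


print(estimate_cost("example-small", 1_000_000, 500_000))
# LabeledCost(amount_usd=3.0, label='estimated')
print(estimate_cost("unknown-model", 1_000, 1_000))
# LabeledCost(amount_usd=None, label='unpriced')
```

Returning `None` rather than `0.0` for an unpriced model is the important choice here: a zero would silently pull totals down, while a missing amount forces every sum and chart to decide what to do with it.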

What we want to learn

We've built the thing. What we don't know yet is the stuff you can only learn by shipping: do people actually tag their usage with feature and customer IDs, or is that friction too high? Is daily-granularity import enough to be useful, or does the value only show up with per-request SDK data? Is the unit-economics view — "is this customer profitable?" — the hook we think it is, or a nice-to-have? Those are real open questions, and we'd rather find the answers in public than pretend we already have them.

How we're documenting this

We're building BurnCap under Eastbase Studio, and we're going to write the whole thing up — roughly fourteen posts covering the real decisions: the scope cuts, the stack, the database design, the "AI feature with no AI in it," rate limiting without Redis, the launch. Not a polished case study after the fact — the actual reasoning, including the parts we're unsure about.

If you've ever opened a provider dashboard with a sinking feeling, this series is for you.

Next up: the part everyone assumes is the AI — the feature that turned out to use no AI at all.