CostCap

Stop discovering your AI bill three weeks after the sprint.

Track every LLM call in real time. Set hard caps per environment. Get alerted before you hit the ceiling, not after.

View setup

$10/mo, cancel anytime

Built for

OpenAI · Anthropic · Google · Mistral · Cohere

[Dashboard preview: monthly AI spend at $428.19 (62% of cap, $690 projected), with a per-provider breakdown (OpenAI $212, Anthropic $129, Google $46, Mistral $41), per-environment caps (prod at its 80% alert, staging at $32.16), and the top alias (summarizer, 70% of bill). Alert: prod is at 80% of cap. Next notification at 100%.]
“We were flying blind on LLM costs for months. CostCap showed us our summarizer alias was eating 70% of our bill.”
— AI startup, 4-person team

“I set a $50/mo cap on my dev environment and stopped worrying about runaway test loops racking up bills.”
— Freelance dev, solo product builder

“Finally able to bill clients accurately. We track each project under its own alias and export the numbers directly.”
— AI agency, client AI work

Spend control that looks like a product, not a billing autopsy.

Real-time spend tracking

See every LLM call by model, provider, alias, environment, and project. Token counts, estimates, and duration are captured server-side.

Environment-level caps

Set monthly limits per environment. Dev, staging, and prod get separate budgets. Alerts fire at 80% and 100% of each cap.

Provider breakdown

See spend across OpenAI, Anthropic, Google, Mistral, and more in one view. Know which provider is costing you the most.

One-line SDK

Wrap any LLM call with vibe.track(). Fire-and-forget logging keeps your response path clean.

API key management

Generate and revoke keys. Each key gets its own usage history. Hash-stored and never readable after creation.

Server-side cost recomputation

We never trust client-sent estimates. Costs are recomputed from our pricing table for 40+ models across 5 providers.

Stop letting experiments, staging, and client work share one mystery bill.

Teams shipping AI features

Know which features are expensive before they go to prod. Track spend by alias so each team owns its costs.

per-feature spend · alias breakdown · model comparison

Budget-conscious projects

Set a monthly ceiling and get warned before you hit it. No more discovering you've blown the budget at month-end.

hard caps · 80% alerts · projected month-end spend

Multi-environment setups

Your dev team can't accidentally run up a prod-scale bill. Each environment gets its own independent budget.

dev cap · staging cap · prod cap

Client work

Bill clients accurately. Track spend by alias per client to match invoices to actual LLM usage.

per-client tracking · alias-based billing · export spend

Wrap the call. Track the spend.

Add CostCap around your LLM call, tag the environment, and see spend update instantly in your dashboard.

01

Install the SDK

Add the package and connect your API key.

02

Wrap your LLM call

Use vibe.track() without changing your response flow.

03

Track spend and caps

Monitor cost by model, provider, alias, project, and environment.

track-summarizer.ts
const result = await vibe.track(
  "summarizer",
  () => openai.responses.create({
    model: "gpt-4.1-mini",
    input: prompt,
  }),
  { environment: "prod", project: "app" }
);

Simple pricing. No surprises.

One plan for teams that need real visibility before the provider invoice lands.

$10 / month

Everything included. No usage tiers.

Pro plan
  • Up to 5 API keys
  • Unlimited LLM calls tracked
  • Real-time spend dashboard
  • Environment-level spend caps
  • Alerts at 80% and 100% of cap
  • Provider breakdown (OpenAI, Anthropic, etc.)
  • Spend by alias and environment
  • API key management

Cancel anytime. No contracts.

One line of code. Real-time visibility.

Add CostCap to your LLM calls and see exactly what your AI features cost by alias, environment, provider, and project.

View setup

Details people check before trusting a spend tracker.

What counts as a tracked call?

Any LLM call wrapped with vibe.track(). Supports OpenAI, Anthropic, Google, Mistral, Cohere, and more. Unknown models still log token counts and request counts — cost will show as $0.

Does tracking add latency?

No. CostCap uses a fire-and-forget pattern. Your LLM call completes and returns first. Usage is logged asynchronously — zero latency added to your response times.
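The idea behind fire-and-forget is that the wrapped call resolves first and logging runs in a detached promise. Here's a minimal sketch of how such a wrapper could work — the `track` signature and the injected `log` callback are illustrative assumptions, not CostCap's actual implementation:

```typescript
// Illustrative fire-and-forget wrapper (not CostCap's real code).
// The caller's promise resolves as soon as the LLM call does;
// the usage event is logged in the background without being awaited.
type Meta = { environment: string; project?: string };

async function track<T>(
  alias: string,
  call: () => Promise<T>,
  meta: Meta,
  log: (event: object) => Promise<void>, // e.g. a POST to a tracking endpoint
): Promise<T> {
  const started = Date.now();
  const result = await call(); // response path ends here: nothing else is awaited
  // Fire and forget: start logging, swallow any failure so it
  // can never reach the caller or delay the response.
  log({ alias, ...meta, durationMs: Date.now() - started, at: started })
    .catch(() => { /* logging errors are intentionally dropped */ });
  return result;
}
```

The key property: a tracking-backend outage degrades to lost usage events, never to a failed or slower LLM call.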

Can I set different caps for dev and prod?

Yes. Caps are per-environment. Set a low limit for dev and staging, a higher one for prod. Each environment tracks and alerts independently.

What triggers an alert?

Alerts fire at 80% and 100% of your monthly cap per environment. You can configure an alert email per cap. No alert email? The cap still enforces — you just won't get a notification.
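The threshold logic is easy to reason about: an alert should fire exactly once, the first time spend crosses each configured fraction of the cap. A hypothetical sketch of that check (not CostCap's code):

```typescript
// Illustrative: which alert thresholds does a new spend total cross?
// Each threshold fires once, when spend first reaches that fraction of the cap.
const THRESHOLDS = [0.8, 1.0];

function crossedThresholds(
  previousSpend: number,
  currentSpend: number,
  cap: number,
): number[] {
  return THRESHOLDS.filter(
    (t) => previousSpend < cap * t && currentSpend >= cap * t,
  );
}
```

Comparing against the previous total is what prevents duplicate alerts: once spend sits above a threshold, later calls no longer cross it.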

How are costs calculated?

Server-side, from our pricing table. We never trust client-sent estimates. Covers 40+ models across 5 providers. Pricing is updated as providers change their rates.
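Conceptually, recomputation is a lookup and a multiply: the server keeps per-million-token input and output rates and applies them to the logged token counts, ignoring any dollar figure the client sent. A sketch with made-up rates — the table, model names, and prices below are placeholders, not CostCap's actual pricing:

```typescript
// Illustrative server-side cost recomputation. Rates are USD per 1M tokens
// and are placeholders, not real provider pricing.
type Rate = { inputPerM: number; outputPerM: number };

const PRICING: Record<string, Rate> = {
  "example-small": { inputPerM: 0.4, outputPerM: 1.6 },
  "example-large": { inputPerM: 3.0, outputPerM: 15.0 },
};

function recomputeCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const rate = PRICING[model];
  if (!rate) return 0; // unknown model: tokens still logged, cost shows as $0
  return (
    (inputTokens / 1_000_000) * rate.inputPerM +
    (outputTokens / 1_000_000) * rate.outputPerM
  );
}
```

Because the pricing table lives server-side, a rate change is a single update there — no SDK release, and no opportunity for a buggy or malicious client to under-report cost.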

How do I deploy?

CostCap runs on your own infrastructure. Deploy to Vercel, Railway, or Coolify. Set your DATABASE_URL (Supabase or Postgres), Stripe keys, and BETTER_AUTH_SECRET, then run prisma migrate deploy.

CostCap — Real-time AI spend tracking and budget caps