CostCap

Stop discovering your AI bill three weeks after the sprint.

Track every LLM call in real time. Set hard caps per environment. Get alerted before you hit the ceiling, not after.

View setup

$10/mo, cancel anytime

Built for

OpenAI · Anthropic · Google · Mistral · Cohere

[Dashboard preview: monthly AI spend at $428.19 (62% of cap, $690 projected), with a per-provider breakdown (OpenAI $212, Anthropic $129, Google $46, Mistral $41), per-environment caps (prod at its 80% alert, staging at $32.16), and the top alias (summarizer, 70% of bill). Alert: prod is at 80% of cap. Next notification at 100%.]
“We were flying blind on LLM costs for months. CostCap showed us our summarizer alias was eating 70% of our bill.”
— AI startup, 4-person team

“I set a $50/mo cap on my dev environment and stopped worrying about runaway test loops racking up bills.”
— Freelance dev, solo product builder

“Finally able to bill clients accurately. We track each project under its own alias and export the numbers directly.”
— AI agency, client AI work

Spend control that looks like a product, not a billing autopsy.

Real-time spend tracking

See every LLM call by model, provider, alias, environment, and project. Token counts, estimates, and duration are captured server-side.

Environment-level caps

Set monthly limits per environment. Dev, staging, and prod get separate budgets. Alerts fire at 80% and 100% of each cap.

Provider breakdown

See spend across OpenAI, Anthropic, Google, Mistral, and more in one view. Know which provider is costing you the most.

One-line SDK

Wrap any LLM call with vibe.track(). Fire-and-forget logging keeps your response path clean.

API key management

Generate and revoke keys. Each key gets its own usage history. Hash-stored and never readable after creation.

Server-side cost recomputation

We never trust client-sent estimates. Costs are recomputed from our pricing table for 40+ models across 5 providers.

Stop letting experiments, staging, and client work share one mystery bill.

Teams shipping AI features

Know which features are expensive before they go to prod. Track spend by alias so each team owns its costs.

per-feature spend · alias breakdown · model comparison

Budget-conscious projects

Set a monthly ceiling and get warned before you hit it. No more discovering you've blown the budget at month-end.

hard caps · 80% alerts · projected month-end spend

Multi-environment setups

Your dev team can't accidentally run up a prod-scale bill. Each environment gets its own independent budget.

dev cap · staging cap · prod cap

Client work

Bill clients accurately. Track spend by alias per client to match invoices to actual LLM usage.

per-client tracking · alias-based billing · export spend

Wrap the call. Track the spend.

Add CostCap around your LLM call, tag the environment, and see spend update instantly in your dashboard.

01

Install the SDK

Add the package and connect your API key.

02

Wrap your LLM call

Use vibe.track() without changing your response flow.

03

Track spend and caps

Monitor cost by model, provider, alias, project, and environment.

track-summarizer.ts
const result = await vibe.track(
  "summarizer",
  () => openai.responses.create({
    model: "gpt-4.1-mini",
    input: prompt,
  }),
  { environment: "prod", project: "app" }
);

Simple pricing. No surprises.

One plan for teams that need real visibility before the provider invoice lands.

$10 / month

Everything included. No usage tiers.

Pro plan
  • Up to 5 API keys
  • Unlimited LLM calls tracked
  • Real-time spend dashboard
  • Environment-level spend caps
  • Alerts at 80% and 100% of cap
  • Provider breakdown (OpenAI, Anthropic, etc.)
  • Spend by alias and environment
  • API key management

Cancel anytime. No contracts.

One line of code. Real-time visibility.

Add CostCap to your LLM calls and see exactly what your AI features cost by alias, environment, provider, and project.

View setup

Details people check before trusting a spend tracker.

What counts as a tracked call?

Any LLM call wrapped with vibe.track(). Supports OpenAI, Anthropic, Google, Mistral, Cohere, and more. Unknown models still log token counts and request counts — cost will show as $0.

Does tracking add latency?

No. CostCap uses a fire-and-forget pattern. Your LLM call completes and returns first. Usage is logged asynchronously — zero latency added to your response times.
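The idea behind fire-and-forget is that the wrapped call resolves first and logging runs in a detached promise. Here's a minimal sketch of how such a wrapper could work — the `track` signature and the injected `log` callback are illustrative assumptions, not CostCap's actual implementation:

```typescript
// Illustrative fire-and-forget wrapper (not CostCap's real code).
// The caller's promise resolves as soon as the LLM call does;
// the usage event is logged in the background without being awaited.
type Meta = { environment: string; project?: string };

async function track<T>(
  alias: string,
  call: () => Promise<T>,
  meta: Meta,
  log: (event: object) => Promise<void>, // e.g. a POST to a tracking endpoint
): Promise<T> {
  const started = Date.now();
  const result = await call(); // response path ends here: nothing else is awaited
  // Fire and forget: start logging, swallow any failure so it
  // can never reach the caller or delay the response.
  log({ alias, ...meta, durationMs: Date.now() - started, at: started })
    .catch(() => { /* logging errors are intentionally dropped */ });
  return result;
}
```

The key property: a tracking-backend outage degrades to lost usage events, never to a failed or slower LLM call.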

Can I set different caps for dev and prod?

Yes. Caps are per-environment. Set a low limit for dev and staging, a higher one for prod. Each environment tracks and alerts independently.

What triggers an alert?

Alerts fire at 80% and 100% of your monthly cap per environment. You can configure an alert email per cap. No alert email? The cap still enforces — you just won't get a notification.
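The threshold logic is easy to reason about: an alert should fire exactly once, the first time spend crosses each configured fraction of the cap. A hypothetical sketch of that check (not CostCap's code):

```typescript
// Illustrative: which alert thresholds does a new spend total cross?
// Each threshold fires once, when spend first reaches that fraction of the cap.
const THRESHOLDS = [0.8, 1.0];

function crossedThresholds(
  previousSpend: number,
  currentSpend: number,
  cap: number,
): number[] {
  return THRESHOLDS.filter(
    (t) => previousSpend < cap * t && currentSpend >= cap * t,
  );
}
```

Comparing against the previous total is what prevents duplicate alerts: once spend sits above a threshold, later calls no longer cross it.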

How are costs calculated?

Server-side, from our pricing table. We never trust client-sent estimates. Covers 40+ models across 5 providers. Pricing is updated as providers change their rates.
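Conceptually, recomputation is a lookup and a multiply: the server keeps per-million-token input and output rates and applies them to the logged token counts, ignoring any dollar figure the client sent. A sketch with made-up rates — the table, model names, and prices below are placeholders, not CostCap's actual pricing:

```typescript
// Illustrative server-side cost recomputation. Rates are USD per 1M tokens
// and are placeholders, not real provider pricing.
type Rate = { inputPerM: number; outputPerM: number };

const PRICING: Record<string, Rate> = {
  "example-small": { inputPerM: 0.4, outputPerM: 1.6 },
  "example-large": { inputPerM: 3.0, outputPerM: 15.0 },
};

function recomputeCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const rate = PRICING[model];
  if (!rate) return 0; // unknown model: tokens still logged, cost shows as $0
  return (
    (inputTokens / 1_000_000) * rate.inputPerM +
    (outputTokens / 1_000_000) * rate.outputPerM
  );
}
```

Because the pricing table lives server-side, a rate change is a single update there — no SDK release, and no opportunity for a buggy or malicious client to under-report cost.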

How do I deploy?

CostCap runs on your own infrastructure. Deploy to Vercel, Railway, or Coolify. Set your DATABASE_URL (Supabase or Postgres), Stripe keys, and BETTER_AUTH_SECRET, then run prisma migrate deploy.

CostCap — Real-time AI spend tracking and budget caps