Track every LLM call in real time. Set hard caps per environment. Get alerted before you hit the ceiling, not after.
Built for small teams, solo builders, and client AI work.
[Dashboard preview: monthly AI spend projected at $690; prod at its 80% alert; staging at $32.16; the summarizer alias at 70% of the bill]
What teams are saying
“We were flying blind on LLM costs for months. CostCap showed us our summarizer alias was eating 70% of our bill.”
4-person team
“I set a $50/mo cap on my dev environment and stopped worrying about runaway test loops racking up bills.”
Solo product builder
“Finally able to bill clients accurately. We track each project under its own alias and export the numbers directly.”
Client AI work
What you get
See every LLM call by model, provider, alias, environment, and project. Token counts, cost estimates, and duration are captured server-side.
Set monthly limits per environment. Dev, staging, and prod get separate budgets. Alerts fire at 80% and 100% of each cap.
See spend across OpenAI, Anthropic, Google, Mistral, and more in one view. Know which provider is costing you the most.
Wrap any LLM call with vibe.track(). Fire-and-forget logging keeps your response path clean.
Generate and revoke keys. Each key gets its own usage history. Hash-stored and never readable after creation.
We never trust client-sent estimates. Costs are recomputed from our pricing table for 40+ models across 5 providers.
Good for
Know which features are expensive before they go to prod. Track spend by alias so each team owns its costs.
Set a monthly ceiling and get warned before you hit it. No more discovering you've blown the budget at month-end.
Dev team can't accidentally run up a prod-scale bill. Each environment gets its own independent budget.
Bill clients accurately. Track spend by alias per client to match invoices to actual LLM usage.
How it works
Add CostCap around your LLM call, tag the environment, and see spend update instantly in your dashboard.
Add the package and connect your API key.
Use vibe.track() without changing your response flow.
Monitor cost by model, provider, alias, project, and environment.
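Setup is minimal. As a sketch, assuming a package named "costcap" with a constructor-based client (the import and constructor below are illustrations, not confirmed API):

// Hypothetical setup: package name and constructor are assumed.
import { CostCap } from "costcap";

// The API key comes from your CostCap dashboard.
const vibe = new CostCap({ apiKey: process.env.COSTCAP_API_KEY });

Once the client exists, wrapping a call looks like this: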
const result = await vibe.track(
  "summarizer",
  () => openai.responses.create({
    model: "gpt-4.1-mini",
    input: prompt,
  }),
  { environment: "prod", project: "app" }
);

Pricing
One plan for teams that need real visibility before the provider invoice lands.
Everything included. No usage tiers.
Cancel anytime. No contracts.
Ready to stop guessing?
Add CostCap to your LLM calls and see exactly what your AI features cost by alias, environment, provider, and project.
FAQ
What does CostCap track?
Any LLM call wrapped with vibe.track(). Supports OpenAI, Anthropic, Google, Mistral, Cohere, and more. Unknown models still log token and request counts; cost will show as $0.
Does CostCap slow down my LLM calls?
No. CostCap uses a fire-and-forget pattern. Your LLM call completes and returns first, and usage is logged asynchronously, with zero latency added to your response times.
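Under the hood, the pattern is roughly this. A simplified sketch, not CostCap's actual implementation; the endpoint URL is a placeholder:

// Fire-and-forget tracking, sketched: the wrapped call resolves and
// returns first; the usage record is posted without being awaited.
async function track<T>(
  alias: string,
  call: () => Promise<T>,
  meta: { environment: string; project?: string }
): Promise<T> {
  const start = Date.now();
  const result = await call(); // your LLM call runs unchanged
  void fetch("https://costcap.example/usage", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ alias, ...meta, durationMs: Date.now() - start }),
  }).catch(() => {
    // Swallow logging failures so they can never break the response path.
  });
  return result;
}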
Can I set different caps for dev, staging, and prod?
Yes. Caps are per-environment. Set a low limit for dev and staging and a higher one for prod. Each environment tracks and alerts independently.
When do alerts fire?
Alerts fire at 80% and 100% of your monthly cap, per environment. You can configure an alert email per cap. No alert email? The cap is still enforced; you just won't get a notification.
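Conceptually the check is simple. A sketch of the threshold logic; only the 80% and 100% thresholds come from the product, the rest is assumed:

// Per-cap alert evaluation, sketched. Each threshold fires once.
interface Cap {
  limitUsd: number;
  spentUsd: number;
  alertEmail?: string;
  fired: Set<number>;
}

function checkCap(cap: Cap, sendAlert: (pct: number, email: string) => void) {
  for (const pct of [80, 100]) {
    if (cap.spentUsd >= (cap.limitUsd * pct) / 100 && !cap.fired.has(pct)) {
      cap.fired.add(pct);
      // Without an alert email the cap is still enforced; only the
      // notification is skipped.
      if (cap.alertEmail) sendAlert(pct, cap.alertEmail);
    }
  }
}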
How are costs calculated?
Server-side, from our pricing table. We never trust client-sent estimates. Covers 40+ models across 5 providers, and pricing is updated as providers change their rates.
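In outline, the recomputation looks like this. The rates below are illustrative placeholders, not CostCap's actual pricing table:

// Server-side cost recomputation from per-model rates
// (USD per million tokens). Placeholder numbers, real structure.
type Usage = { model: string; inputTokens: number; outputTokens: number };

const RATES: Record<string, { input: number; output: number }> = {
  "gpt-4.1-mini": { input: 0.40, output: 1.60 }, // placeholder rates
};

function costUsd({ model, inputTokens, outputTokens }: Usage): number {
  const rate = RATES[model];
  if (!rate) return 0; // unknown model: tokens logged, cost shown as $0
  return (inputTokens * rate.input + outputTokens * rate.output) / 1_000_000;
}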
How do I deploy CostCap?
CostCap runs on your own infrastructure. Deploy to Vercel, Railway, or Coolify. Set your DATABASE_URL (Supabase or Postgres), Stripe keys, and BETTER_AUTH_SECRET, then run prisma migrate deploy.