Guide · Apr 2, 2026 · 5 min read

4 New Ways AISpendGuard Stops Runaway AI Agent Costs

Runaway detection, session caps, graduated enforcement, and per-span cost breakdowns — shipped this week

A Hacker News user lost $187 in 10 minutes to a runaway agent loop. A startup's CrewAI pipeline burned $4,800 in a month with no one noticing which step was responsible. These stories repeat daily across r/OpenAI, r/ClaudeAI, and the LangChain Discord.

The problem isn't AI agents. It's that no tool tells you which step costs what, when costs spike, or how to stop runaway spend before it drains your budget.

This week we shipped four features that change that.

1. Agent Runaway Detection

The problem: An agent enters a retry loop. Each retry costs $0.05–$0.50, so ten retries in 3 minutes add $0.50–$5.00 to a single run, and a loop that never exits has no ceiling. No error. No timeout. Just a perfectly functioning agent doing what it was told, repeatedly.

What we built: AISpendGuard now detects traces with 10+ events in under 5 minutes and flags them as potential runaways. You get:

Automatic identification of retry-loop patterns
Monthly waste estimate ("this pattern costs you ~$X/month if it continues")
Actionable advice: set max retries, add exponential backoff, use a cheaper retry model
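To make the detection rule concrete, here is a minimal sketch of the "10+ events in under 5 minutes" check. It is not AISpendGuard's implementation: the `TraceEvent` shape, the `detectRunaways` function, and the sliding-window approach are assumptions made for illustration.

```typescript
// Hypothetical event shape; the real AISpendGuard event schema may differ.
interface TraceEvent {
  traceId: string;
  timestampMs: number; // Unix epoch milliseconds
  costUsd: number;
}

interface RunawayFlag {
  traceId: string;
  eventCount: number;    // total events seen in the trace
  windowCostUsd: number; // cost of the first window that tripped the rule
}

const MIN_EVENTS = 10;           // "10+ events"
const WINDOW_MS = 5 * 60 * 1000; // "in under 5 minutes"

function detectRunaways(events: TraceEvent[]): RunawayFlag[] {
  // Group events by trace.
  const byTrace = new Map<string, TraceEvent[]>();
  for (const e of events) {
    const list = byTrace.get(e.traceId) ?? [];
    list.push(e);
    byTrace.set(e.traceId, list);
  }

  const flags: RunawayFlag[] = [];
  byTrace.forEach((list, traceId) => {
    list.sort((a, b) => a.timestampMs - b.timestampMs);
    // Slide over every run of MIN_EVENTS consecutive events and flag the
    // trace as soon as one run fits inside the time window.
    for (let i = 0; i + MIN_EVENTS <= list.length; i++) {
      const run = list.slice(i, i + MIN_EVENTS);
      const elapsedMs = run[MIN_EVENTS - 1].timestampMs - run[0].timestampMs;
      if (elapsedMs < WINDOW_MS) {
        const windowCostUsd = run.reduce((sum, e) => sum + e.costUsd, 0);
        flags.push({ traceId, eventCount: list.length, windowCostUsd });
        break;
      }
    }
  });
  return flags;
}
```

A trace with twelve calls spaced ten seconds apart would be flagged, while ten calls spread over nine minutes would not.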

Why it matters: No competitor detects this. LangSmith, Langfuse, and Helicone track costs — but none of them identify runaway patterns and tell you what to do about them.

2. Agent Session Spend Caps

The problem: You want each agent run to cost at most $2. But the agent decides how many LLM calls to make. You need a kill switch — at the SDK level, not in the dashboard.

What we built: SessionBudget in the AISpendGuard SDK lets you set soft and hard limits per agent session:

import { SessionBudget } from '@aispendguard/sdk';

const budget = new SessionBudget({
  maxBudget: 2.00,          // Hard limit: $2.00
  softLimitPercent: 75,     // Warn at 75% ($1.50)
  onSoftLimit: (info) => console.warn(`Agent spent $${info.currentSpendUsd} of $${info.maxBudgetUsd}`),
  onHardLimit: (info) => { throw new Error(`Budget exceeded: $${info.currentSpendUsd}`); }
});

The SDK tracks cumulative cost per session and fires callbacks at your thresholds. No gateway required — it works alongside your existing code.
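For readers who want to see the mechanics, the sketch below shows one way cumulative tracking with soft and hard thresholds can work. `SketchSessionBudget` and its `record` method are hypothetical stand-ins written for this post, not the SDK's internals or its public API; only the option names and the callback payload fields mirror the snippet above. Treating "spend reaches the cap" as the hard-limit trigger is also an assumption.

```typescript
// Payload passed to the callbacks, mirroring the fields used above.
interface BudgetInfo {
  currentSpendUsd: number;
  maxBudgetUsd: number;
}

interface SketchBudgetOptions {
  maxBudget: number;        // hard limit in USD
  softLimitPercent: number; // soft limit as a percentage of maxBudget
  onSoftLimit: (info: BudgetInfo) => void;
  onHardLimit: (info: BudgetInfo) => void;
}

// Hypothetical stand-in class, for illustration only.
class SketchSessionBudget {
  private readonly opts: SketchBudgetOptions;
  private spentUsd = 0;
  private softFired = false;

  constructor(opts: SketchBudgetOptions) {
    this.opts = opts;
  }

  // Call once per LLM call with that call's cost (hypothetical method).
  record(costUsd: number): void {
    this.spentUsd += costUsd;
    const info: BudgetInfo = {
      currentSpendUsd: this.spentUsd,
      maxBudgetUsd: this.opts.maxBudget,
    };

    // Assumption: the hard limit fires once spend reaches the cap.
    if (this.spentUsd >= this.opts.maxBudget) {
      this.opts.onHardLimit(info);
      return;
    }

    // The soft limit fires at most once per session.
    const softLimitUsd = (this.opts.maxBudget * this.opts.softLimitPercent) / 100;
    if (!this.softFired && this.spentUsd >= softLimitUsd) {
      this.softFired = true;
      this.opts.onSoftLimit(info);
    }
  }
}
```

Because the check runs after each call's cost is recorded, the call that crosses the cap has already been paid for. A throwing `onHardLimit` stops the next call, not the current one.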

Competitive context: AgentBudget, Portkey, and Lava are marketing similar capabilities. Our approach is SDK-cooperative (no proxy lock-in) and works with any provider.

3. Graduated Budget Enforcement

The problem: Binary enforcement (alert or block) is too blunt. You want to warn your team at 50%, throttle at 80%, suggest cheaper models at 95%, and only hard-block at 100%.

What we built: The first graduated enforcement ladder in the market:

Budget %	Action	What happens
50%	Warn	Dashboard + email notification
80%	Throttle	SDK receives slowdown signal
95%	Suggest downgrade	SDK receives model-switch suggestion
100%	Block	SDK receives reject signal

Actions are configurable per threshold. Works with both workspace budgets and tag-level budgets (per team, feature, or route).
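The ladder boils down to "find the highest threshold the current spend has crossed". Here is a minimal sketch of that lookup. The `Rung` type, the action names, and `resolveAction` are illustrative assumptions, not the SDK's actual signal format.

```typescript
// Hypothetical action names matching the table above.
type EnforcementAction = "warn" | "throttle" | "suggest_downgrade" | "block";

interface Rung {
  thresholdPercent: number; // percentage of budget at which the action applies
  action: EnforcementAction;
}

// Default ladder taken from the table; each rung is configurable.
const defaultLadder: Rung[] = [
  { thresholdPercent: 50, action: "warn" },
  { thresholdPercent: 80, action: "throttle" },
  { thresholdPercent: 95, action: "suggest_downgrade" },
  { thresholdPercent: 100, action: "block" },
];

// Returns the action for the highest rung reached, or null below every rung.
function resolveAction(
  spentUsd: number,
  budgetUsd: number,
  ladder: Rung[] = defaultLadder,
): EnforcementAction | null {
  if (budgetUsd <= 0) {
    throw new RangeError("budgetUsd must be positive");
  }
  let best: Rung | null = null;
  for (const rung of ladder) {
    const thresholdUsd = (budgetUsd * rung.thresholdPercent) / 100;
    const reached = spentUsd >= thresholdUsd;
    if (reached && (best === null || rung.thresholdPercent > best.thresholdPercent)) {
      best = rung;
    }
  }
  return best === null ? null : best.action;
}
```

With a $100 budget, $85 of spend resolves to `throttle`, and anything at or above $100 resolves to `block`. Passing a custom `ladder` models per-threshold configuration, and the same function could be called per workspace or per tag.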

Key principle: Your dashboard numbers are always accurate. We track everything — we just tell the SDK to slow down. Events are never dropped.

Competitive gap: Portkey and LiteLLM offer alert-only or hard-block. Nobody else offers a configurable ladder.

4. Per-Span Cost Breakdown in Trace View

The problem: Your agent pipeline costs $4.80 per run. But which step is expensive — the planning call? The retrieval? The synthesis? Without per-step attribution, you're optimizing blind.

What we built: The trace view now breaks down cost per span:

Cost distribution bar — visual breakdown showing which spans consume the most budget
Per-span detail — model, tokens, cost, and duration for every step in the trace
Sort by cost — find the expensive spans instantly

This is the feature agent builders have been asking for. When you see that your "planning" step costs 60% of the run while the "execution" step costs 5%, you know exactly where to optimize.
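The arithmetic behind the distribution bar is simple: each span's share is its cost divided by the trace total. The sketch below shows that calculation and the sort by cost. The `Span` shape and `breakdownByCost` are assumptions for illustration, not the trace view's real data model.

```typescript
// Hypothetical span record carrying the per-span details listed above.
interface Span {
  label: string;
  model: string;
  tokens: number;
  costUsd: number;
  durationMs: number;
}

interface SpanShare extends Span {
  sharePercent: number; // this span's share of the total trace cost
}

// Adds each span's cost share and returns spans sorted most expensive first.
function breakdownByCost(spans: Span[]): SpanShare[] {
  const totalUsd = spans.reduce((sum, s) => sum + s.costUsd, 0);
  return spans
    .map((s) => ({
      ...s,
      sharePercent: totalUsd > 0 ? (s.costUsd / totalUsd) * 100 : 0,
    }))
    .sort((a, b) => b.costUsd - a.costUsd);
}
```

For a $4.80 run in which planning costs $2.88 and execution costs $0.24, planning comes out at about 60% of the total and execution at about 5%, which matches the scenario described above.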

Competitive context: Braintrust shipped per-span cost in March 2026. We match that capability at roughly 1/13th the list price (€19/mo vs $249/mo, before currency conversion).

The Full Picture

These four features work together:

Runaway detection catches loops you didn't know about
Session caps prevent any single run from exceeding your budget
Graduated enforcement gives you a smooth ramp from warning to blocking
Per-span cost shows you exactly where to optimize

No other tool combines all four. And none of the four features requires a proxy gateway: our SDK is passive, adds zero latency, and works with OpenAI, Anthropic, Google, and every major provider.

Try It Now

All four features are live today.

Free tier: 50,000 events/month, no credit card required
Pro: €19/month for 500,000 events and full enforcement features

Start tracking for free →

If you're building AI agents with LangChain, CrewAI, or custom loops, these features were built for you.

Covers features: Agent Runaway Detection (F29), Agent Session Spend Caps (F35), Graduated Budget Enforcement (F33), Per-Span Cost Breakdown (F36).