Why Your AI Cost Monitor Should Never Touch Your API Keys
On March 24, 2026, malicious versions of the LiteLLM Python package appeared on PyPI. For approximately three hours, versions 1.82.7 and 1.82.8 contained credential stealers that exfiltrated SSH keys, cloud provider sessions, and Terraform state from every environment that installed them.
LiteLLM is present in an estimated 36% of cloud environments. The blast radius was enormous.
This isn't a post about blaming LiteLLM. Supply chain attacks can happen to any open-source project. The LiteLLM team responded quickly, engaged Mandiant for forensic analysis, and published verified checksums for clean versions. They handled the incident well.
But the breach exposed something deeper: a fundamental architectural risk in how most AI monitoring tools work.
The Gateway Problem
Most AI cost monitoring and observability tools use a gateway (proxy) architecture. You change your API base URL so that every LLM request routes through the tool's servers before reaching OpenAI, Anthropic, or Google.
This means the monitoring tool:
- Sits in your request path — every API call goes through it
- Handles your API keys — it needs them to forward requests to providers
- Sees your prompts and completions — it's a man-in-the-middle by design
- Becomes a single point of failure — if the gateway goes down, your AI features stop working
- Becomes a supply chain attack surface — a compromised gateway has access to everything
Helicone, Portkey, LiteLLM, and most other tools in this space use this pattern. It's simple to set up (change one URL), but the security tradeoff is significant.
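To make the tradeoff concrete, here is a minimal sketch of what a gateway integration typically asks for. The hostnames and key are placeholders, not any vendor's real endpoint — the point is that the only visible change is the base URL, while the API key now travels to the proxy, which must hold it in order to forward your requests:

```python
API_KEY = "sk-example-not-real"  # placeholder, never a real key

# Direct: the key is presented only to the provider.
direct = {
    "base_url": "https://api.openai.com/v1",
    "api_key": API_KEY,
}

# Gateway: one-line change, but the same key (and every prompt and
# completion) now passes through a third party's servers.
gateway = {
    "base_url": "https://llm-gateway.example.com/v1",  # hypothetical proxy
    "api_key": API_KEY,
}
```

The configs are identical except for `base_url` — which is exactly why the pattern is so easy to adopt, and why a compromised gateway sees everything.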
When the LiteLLM packages were compromised, the malicious code had access to everything LiteLLM had access to — which, by design, included credentials, API keys, and the full content of every request passing through the proxy.
The Passive SDK Alternative
There's another way to monitor AI costs that doesn't require sitting in the request path.
A passive SDK works differently:
- Your code calls the AI provider directly — no proxy, no middleman
- After the call completes, you send a lightweight event with just the metadata: model name, token counts, cost, and your custom tags (feature name, user type, environment)
- The SDK never sees your prompts, API keys, or completions — it physically cannot, because it's not in the request path
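The passive pattern can be sketched in a few lines. The function and field names below are illustrative, not any particular SDK's API — the point is what the event contains and, more importantly, what it cannot contain:

```python
import json

def usage_event(model: str, prompt_tokens: int, completion_tokens: int,
                cost_usd: float, tags: dict) -> dict:
    # Metadata only: no prompt text, no completion text, no API key.
    # The SDK never handles those, so it cannot leak them.
    return {
        "model": model,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "cost_usd": cost_usd,
        "tags": tags,
    }

# After your direct call to the provider returns, report the numbers:
event = usage_event("gpt-4o-mini", 412, 96, 0.00031,
                    {"feature": "summarize", "env": "prod"})
payload = json.dumps(event).encode()
```

The serialized `payload` is the entire wire format: a few integers, a model name, and your labels.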
Here's the security difference in practice:
| | Gateway (Proxy) | Passive SDK |
|---|---|---|
| Handles your API keys | Yes | No |
| Sees prompts/completions | Yes | No |
| In the request path | Yes | No |
| Single point of failure | Yes | No |
| Adds latency | Yes | No |
| If compromised, can steal credentials | Yes | No |
| If compromised, can intercept data | Yes | No |
A compromised passive SDK is essentially harmless. The worst case? It sends garbage metadata to the monitoring server. It cannot steal your OpenAI key, intercept customer prompts, or exfiltrate cloud credentials — because it never has access to any of those things.
"But I Need Full Tracing"
Fair point. Gateway tools offer features that passive SDKs don't: prompt logging, response caching, automatic retries, load balancing across providers.
If you need those features, a gateway might be the right choice — but you should treat it like any other critical infrastructure component. Pin versions, audit dependencies, use lock files, and have a kill switch.
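One concrete hardening step — useful whether or not you run a gateway — is verifying downloaded packages against published checksums before installing, as the LiteLLM team's post-incident checksums allow. A minimal stdlib-only sketch (function names are ours, not a pip API):

```python
import hashlib

def sha256_of(path: str) -> str:
    # Stream the file so large wheels are not loaded into memory at once.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str, expected_sha256: str) -> bool:
    return sha256_of(path) == expected_sha256.lower()
```

In practice you would pin this into your workflow with pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`), which refuses to install anything whose hash doesn't match the lock file.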
But if your primary goal is cost visibility — knowing which features cost what, finding waste, tracking spend trends — you don't need a gateway. You don't need to route traffic through a third party. You don't need to hand over your API keys.
You need metadata. That's it.
What to Look for in a Cost Monitoring Tool
If the LiteLLM incident has you rethinking your monitoring stack, here's a security checklist:
1. Does it require your API keys? If yes, it's in your blast radius. Any compromise gives attackers access to your AI provider accounts.
2. Does it sit in your request path? If yes, it's a single point of failure and a supply chain attack surface. A compromised dependency can intercept every request.
3. Does it store your prompts? If yes, you have a data liability. Prompt data often contains PII, internal business logic, and customer information. A breach exposes all of it.
4. Can you verify what data it sends? Open-source SDKs let you inspect exactly what's transmitted. If the monitoring tool is a black-box proxy, you're trusting it with everything.
5. What happens if it goes down? With a gateway, your AI features stop working. With a passive SDK, your AI features keep running — you just temporarily lose visibility.
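Point 5 is worth seeing in code. In a passive design, the metadata send can be best-effort by construction — here is a hedged sketch, where `send_event` stands in for whatever transport a hypothetical SDK uses:

```python
def send_event_safely(send_event, event: dict) -> bool:
    # Monitoring is best-effort: a failed send is dropped, never raised,
    # so a monitoring outage cannot take down the application's AI calls.
    try:
        send_event(event)
        return True
    except Exception:
        return False  # lose one data point, keep serving traffic
```

A gateway cannot make this choice: it sits between you and the provider, so its outage is your outage.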
How AISpendGuard Handles This
We built AISpendGuard around a simple architectural principle: never touch what you don't need to see.
Our SDK is passive. It sends tags — model name, token counts, cost, and your custom labels. That's the complete list of what leaves your environment. No prompts, no completions, no API keys, no request bodies.
This isn't a policy decision ("we promise not to look"). It's an architectural guarantee. The SDK physically cannot access your prompts or credentials because it's never in the request path.
The result:
- No supply chain risk — even a compromised SDK version can't steal credentials it never touches
- No latency — your API calls go directly to OpenAI/Anthropic/Google
- No single point of failure — if our servers go down, your app keeps working
- No data liability — we never store prompts, so there's nothing to breach
- Full cost visibility — you still see exactly which feature, model, and tag is costing you money
The Tradeoff Is Real — And Worth It
We give up some things by not being a gateway. We can't cache your responses. We can't auto-retry failed calls. We can't do prompt-level tracing.
What we keep: your security, your privacy, and your independence.
For teams that need cost visibility without the risk of putting another dependency in their critical path, that's the right tradeoff.
AISpendGuard is a privacy-first AI cost monitoring tool. Free tier with 50,000 events/month, no credit card required. Start tracking your AI spend →