AI Model Pricing Compared: OpenAI vs Anthropic vs Google — Which Saves You More?
Choosing the right AI model isn't just about quality — it's about cost. A 10x pricing difference between models that perform similarly on your task means you're either saving thousands or burning them.
This guide compares every major AI model's pricing as of March 2026, organized by use case so you can find the cheapest option that meets your quality bar.
We update this page monthly. Last update: March 22, 2026.
The Full Pricing Table
Flagship Models (Best Quality)
| Provider | Model | Input/1M tokens | Output/1M tokens | Context | Best For |
|---|---|---|---|---|---|
| OpenAI | GPT-4o | $2.50 | $10.00 | 128K | General-purpose, creative, analysis |
| OpenAI | GPT-4.1 | $2.00 | $8.00 | 1M | Long-context coding, document analysis |
| OpenAI | o3 | $2.00 | $8.00 | 200K | Complex reasoning, math, planning |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | 200K | Nuanced writing, analysis, coding |
| Anthropic | Claude Opus 4.6 | $5.00 | $25.00 | 200K | Hardest tasks, research, long reasoning |
| Google | Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Multimodal, long context, cost-efficient |
| Google | Gemini 3.1 Pro | $2.00 | $12.00 | 200K | Latest quality tier |
| Mistral | Mistral Large | $2.00 | $6.00 | 128K | Multilingual, EU-hosted option |
| Cohere | Command-A | $2.50 | $10.00 | 128K | RAG, enterprise search |
Mid-Tier Models (Good Balance)
| Provider | Model | Input/1M tokens | Output/1M tokens | Context | Best For |
|---|---|---|---|---|---|
| OpenAI | GPT-4.1-mini | $0.40 | $1.60 | 1M | Long docs at low cost |
| OpenAI | o3-mini | $1.10 | $4.40 | 200K | Reasoning at lower cost |
| OpenAI | o4-mini | $1.10 | $4.40 | 200K | Latest reasoning, mid-tier |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Fast, capable, affordable |
| Google | Gemini 2.5 Flash | $0.30 | $2.50 | 1M | Fast responses, 1M context |
| Mistral | Mistral Medium | $0.40 | $2.00 | 128K | Balanced European option |
Budget Models (Cheapest)
| Provider | Model | Input/1M tokens | Output/1M tokens | Context | Best For |
|---|---|---|---|---|---|
| OpenAI | GPT-4o-mini | $0.15 | $0.60 | 128K | Classification, extraction, simple gen |
| OpenAI | GPT-4.1-nano | $0.10 | $0.40 | 1M | Cheapest OpenAI with huge context |
| Anthropic | Claude 3.5 Haiku | $0.80 | $4.00 | 200K | Budget Anthropic |
| Anthropic | Claude 3 Haiku | $0.25 | $1.25 | 200K | Cheapest Anthropic |
| Google | Gemini 2.0 Flash | $0.10 | $0.40 | 1M | Ultra-cheap with 1M context |
| Google | Gemini 2.5 Flash Lite | $0.10 | $0.40 | 1M | Latest budget tier |
| Mistral | Mistral Small | $0.10 | $0.30 | 128K | Cheapest Mistral |
| Mistral | Mistral Nemo | $0.02 | $0.04 | 128K | Ultra-budget tasks |
| Groq | Llama 3.3 70B | $0.59 | $0.79 | 128K | Fast inference, open weights |
| Groq | Llama 3.1 8B | $0.05 | $0.08 | 128K | Fastest, cheapest hosted |
Cost Per Task: Real Comparisons
Let's compare what common tasks actually cost across providers.
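All of the per-call and monthly figures below come from one formula: tokens divided by one million, multiplied by the per-million price for that direction. Here is a minimal sketch in Python; the prices are copied from the tables above, and the token counts are the Task 1 profile.

```python
# Per-call cost = input_tokens/1M * input_price + output_tokens/1M * output_price.
# Prices are USD per 1M tokens, taken from the pricing tables above.

def call_cost(input_tokens: int, output_tokens: int,
              input_price: float, output_price: float) -> float:
    """Return the USD cost of a single API call."""
    return (input_tokens / 1_000_000) * input_price \
         + (output_tokens / 1_000_000) * output_price

# Task 1 profile: classify a support ticket (300 in, 5 out) on GPT-4o-mini.
per_call = call_cost(300, 5, input_price=0.15, output_price=0.60)
print(f"per call: ${per_call:.6f}")                      # ~$0.000048
print(f"at 10K calls/month: ${per_call * 10_000:.2f}")   # ~$0.48
```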
Task 1: Classify a support ticket (300 input tokens, 5 output tokens)
| Model | Cost Per Call | Cost at 10K/month |
|---|---|---|
| GPT-4o | $0.0008 | $8.00 |
| Claude Sonnet 4.6 | $0.0010 | $9.75 |
| Gemini 2.5 Pro | $0.0004 | $4.25 |
| GPT-4o-mini | $0.00005 | $0.48 |
| Gemini 2.0 Flash | $0.00003 | $0.32 |
| Mistral Nemo | $0.000006 | $0.06 |
Winner: Mistral Nemo at $0.06/month, if its quality is sufficient for your tickets. GPT-4o-mini at $0.48/month is a safe bet. Spending $8.00/month on GPT-4o for classification means paying roughly 130x more than Mistral Nemo would cost.
Task 2: Summarize a 5,000-word document (7,500 input tokens, 300 output tokens)
| Model | Cost Per Call | Cost at 500/month |
|---|---|---|
| GPT-4o | $0.022 | $10.88 |
| Claude Sonnet 4.6 | $0.027 | $13.50 |
| Gemini 2.5 Pro | $0.012 | $6.19 |
| GPT-4o-mini | $0.001 | $0.65 |
| Gemini 2.5 Flash | $0.003 | $1.50 |
| GPT-4.1-nano | $0.001 | $0.45 |
Winner: GPT-4.1-nano at $0.45/month for simple summaries. Gemini 2.5 Flash at $1.50/month for better quality. Using a flagship model costs 10-30x more.
Task 3: Chatbot conversation (20 turns, ~50K input tokens total, ~4K output tokens)
| Model | Cost Per Conversation | Cost at 1K conversations/month |
|---|---|---|
| GPT-4o | $0.165 | $165.00 |
| Claude Sonnet 4.6 | $0.210 | $210.00 |
| Gemini 2.5 Pro | $0.103 | $102.50 |
| GPT-4o-mini | $0.010 | $9.90 |
| Gemini 2.5 Flash | $0.025 | $25.00 |
Winner: GPT-4o-mini at $9.90/month. Claude Sonnet costs 21x more for a general chatbot.
Task 4: Code generation (2K input tokens, 1K output tokens)
| Model | Cost Per Call | Cost at 5K/month |
|---|---|---|
| GPT-4o | $0.015 | $75.00 |
| GPT-4.1 | $0.012 | $60.00 |
| Claude Sonnet 4.6 | $0.021 | $105.00 |
| Claude Opus 4.6 | $0.035 | $175.00 |
| Gemini 2.5 Pro | $0.013 | $62.50 |
| GPT-4.1-mini | $0.002 | $12.00 |
Winner: GPT-4.1-mini at $12.00/month for most code tasks. GPT-4.1 at $60/month when quality matters. Claude Opus at $175/month only for the hardest problems.
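To rerun these comparisons with your own token profiles, the whole exercise is a loop over a small price table. A sketch in Python; the prices are copied from the tables at the top of the page, and the dictionary keys are just labels for this snippet.

```python
# USD per 1M tokens (input, output), from the pricing tables above.
PRICES = {
    "gpt-4o":            (2.50, 10.00),
    "gpt-4.1":           (2.00,  8.00),
    "claude-sonnet-4.6": (3.00, 15.00),
    "gemini-2.5-pro":    (1.25, 10.00),
    "gpt-4.1-mini":      (0.40,  1.60),
}

def monthly_cost(input_tokens: int, output_tokens: int,
                 calls_per_month: int, model: str) -> float:
    """Monthly USD cost for a fixed per-call token profile."""
    inp, out = PRICES[model]
    per_call = (input_tokens / 1e6) * inp + (output_tokens / 1e6) * out
    return per_call * calls_per_month

# Task 4 profile: 2K in, 1K out, 5K calls/month.
for model in PRICES:
    print(f"{model}: ${monthly_cost(2_000, 1_000, 5_000, model):.2f}")
# gpt-4o $75.00, gpt-4.1 $60.00, claude-sonnet-4.6 $105.00,
# gemini-2.5-pro $62.50, gpt-4.1-mini $12.00
```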
Hidden Cost Multipliers
The base prices above don't tell the full story. These multipliers significantly affect your actual bill:
Cache Discounts (save 50-90%)
| Provider | Cache Read Discount | How It Works |
|---|---|---|
| Anthropic | 90% off | Explicit cache control headers; 5-min TTL |
| OpenAI | 50% off | Automatic for prompts >1,024 tokens with matching prefix |
| Google | 90% off | Automatic caching |
| Groq | 50% off | Automatic caching |
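Caching changes your effective input price, so it belongs in any cost model. Below is a rough sketch of the blended input price, assuming only the cache-read discounts from the table above; it ignores any cache-write premium a provider may charge, and the hit rate is something you measure from your own traffic rather than assume.

```python
def effective_input_price(base_price: float, cache_discount: float,
                          cache_hit_rate: float) -> float:
    """Blended input price per 1M tokens, given the share of input tokens
    served from cache and the provider's cache-read discount.
    Ignores any cache-write premium."""
    cached = base_price * (1 - cache_discount) * cache_hit_rate
    uncached = base_price * (1 - cache_hit_rate)
    return cached + uncached

# Claude Sonnet 4.6 ($3.00/1M input) with a 90% cache-read discount:
# if 80% of input tokens are cache reads, input effectively costs ~$0.84/1M.
print(effective_input_price(3.00, cache_discount=0.90, cache_hit_rate=0.80))
```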
Batch API Discount (save 50%)
| Provider | Batch Discount | Turnaround |
|---|---|---|
| OpenAI | 50% off | 24 hours |
| Anthropic | 50% off | 24 hours |
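If the user isn't waiting, the batch discount is the easiest 50% you will ever save. Here is a minimal sketch of the OpenAI Batch API flow (a JSONL file of requests, upload, submit); the field names and SDK calls follow OpenAI's documented format at the time of writing, so verify them against the current docs before relying on this.

```python
# Sketch: submit non-urgent classification work through OpenAI's Batch API
# (50% off, ~24h turnaround). Requires the `openai` package and an API key.
import json
from openai import OpenAI

client = OpenAI()

# One JSON object per line; each request needs a unique custom_id.
requests = [
    {"custom_id": f"ticket-{i}",
     "method": "POST",
     "url": "/v1/chat/completions",
     "body": {"model": "gpt-4o-mini",
              "messages": [{"role": "user", "content": text}]}}
    for i, text in enumerate(["Ticket text 1", "Ticket text 2"])
]
with open("batch_requests.jsonl", "w") as f:
    f.write("\n".join(json.dumps(r) for r in requests))

batch_file = client.files.create(file=open("batch_requests.jsonl", "rb"),
                                 purpose="batch")
batch = client.batches.create(input_file_id=batch_file.id,
                              endpoint="/v1/chat/completions",
                              completion_window="24h")
print(batch.id, batch.status)  # poll until status == "completed"
```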
Long-Context Surcharge (pay more)
| Provider | When | Surcharge |
|---|---|---|
| Google (Pro models) | Input >200K tokens | 2x on input AND output |
| Google (Flash models) | Never | No surcharge |
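If your Gemini Pro prompts sometimes cross the 200K threshold, model the surcharge explicitly rather than discovering it on the invoice. A rough sketch, assuming the 2x multiplier on both input and output stated in the table above:

```python
def gemini_pro_call_cost(input_tokens: int, output_tokens: int,
                         input_price: float = 1.25,
                         output_price: float = 10.00,
                         surcharge_threshold: int = 200_000) -> float:
    """Per-call USD cost with the long-context surcharge from the table:
    2x on both input and output once input exceeds the threshold."""
    multiplier = 2.0 if input_tokens > surcharge_threshold else 1.0
    return multiplier * ((input_tokens / 1e6) * input_price
                         + (output_tokens / 1e6) * output_price)

# 150K input: no surcharge.  250K input: the whole call is billed at 2x.
print(gemini_pro_call_cost(150_000, 2_000))  # ~$0.21
print(gemini_pro_call_cost(250_000, 2_000))  # ~$0.67
```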
Reasoning Token Overhead
Models like o1, o3, and o4-mini use "thinking tokens" that count as output tokens but aren't shown in the response. Your actual output token bill can be 2-10x higher than the visible response length.
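For budgeting, treat the visible response length as a floor and apply an overhead factor. A rough sketch follows; the 3x default is an illustrative pick from the 2-10x range above, and in practice you should read the actual reasoning-token count from your provider's usage metadata instead of guessing.

```python
def reasoning_output_cost(visible_tokens: int, output_price: float,
                          reasoning_overhead: float = 3.0) -> float:
    """Estimated output cost for a reasoning model (o3, o4-mini, etc.).
    Billed output ~= visible tokens * overhead factor; pick the factor
    from your own usage logs, not from this default."""
    billed_tokens = visible_tokens * reasoning_overhead
    return (billed_tokens / 1e6) * output_price

# o3 at $8.00/1M output: a 500-token visible answer with ~3x thinking
# overhead bills roughly like 1,500 output tokens.
print(reasoning_output_cost(500, output_price=8.00))  # ~$0.012
```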
Provider Comparison: Beyond Price
OpenAI
- Strength: Widest model range (GPT-4.1-nano to o3), largest ecosystem, best batch API
- Weakness: No EU hosting, output tokens are expensive (4x input)
- Best value: GPT-4o-mini ($0.15/$0.60) and GPT-4.1-nano ($0.10/$0.40)
Anthropic
- Strength: Best prompt caching (90% off), strong code/analysis quality
- Weakness: Fewer budget models, expensive flagship (Opus $5/$25)
- Best value: Claude 3 Haiku ($0.25/$1.25) for simple tasks with caching
Google
- Strength: Cheapest flagship (Gemini 2.5 Pro $1.25 input), huge context windows (1M tokens), 90% cache discount
- Weakness: Long-context surcharge on Pro models doubles the price past 200K tokens
- Best value: Gemini 2.0 Flash ($0.10/$0.40) — incredible value with 1M context
Mistral
- Strength: EU-hosted, competitive pricing, excellent multilingual support
- Weakness: Smaller ecosystem, fewer integrations
- Best value: Mistral Nemo ($0.02/$0.04) — cheapest hosted inference available
Groq
- Strength: Fastest inference (sub-second), runs open-weight models
- Weakness: Limited model selection (Llama, Mixtral only)
- Best value: Llama 3.1 8B ($0.05/$0.08) — fastest and cheapest option
Decision Framework: Which Model Should You Use?
Is the user waiting for a real-time response?
├── No → Use Batch API (50% off any model)
├── Yes
│   └── Does it need complex reasoning/creativity?
│       ├── Yes → GPT-4o, Claude Sonnet, or Gemini Pro ($1.25-3.00/1M in)
│       └── No
│           ├── Does it need >128K context?
│           │   ├── Yes → GPT-4.1-nano ($0.10/1M) or Gemini 2.0 Flash ($0.10/1M)
│           │   └── No → GPT-4o-mini ($0.15/1M) or Mistral Small ($0.10/1M)
│           └── Is it classification/extraction only?
│               └── Yes → Mistral Nemo ($0.02/1M) or Groq Llama 8B ($0.05/1M)
└── Does it need EU hosting?
    └── Yes → Mistral models (EU-native)
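The same routing logic, as a sketch in code. The model names and branches mirror the tree above; the boolean inputs are placeholders for whatever signals your application actually has, and EU hosting is checked first because it constrains the provider regardless of the other answers.

```python
def pick_model(realtime: bool, needs_reasoning: bool, needs_long_context: bool,
               classification_only: bool, needs_eu_hosting: bool) -> str:
    """Mirror of the decision tree above; returns a suggested model tier."""
    if needs_eu_hosting:
        return "Mistral (EU-hosted)"
    if not realtime:
        return "Any model via Batch API (50% off)"
    if needs_reasoning:
        return "GPT-4o / Claude Sonnet 4.6 / Gemini 2.5 Pro"
    if classification_only:
        return "Mistral Nemo or Groq Llama 3.1 8B"
    if needs_long_context:
        return "GPT-4.1-nano or Gemini 2.0 Flash"
    return "GPT-4o-mini or Mistral Small"

print(pick_model(realtime=True, needs_reasoning=False, needs_long_context=False,
                 classification_only=True, needs_eu_hosting=False))
# -> "Mistral Nemo or Groq Llama 3.1 8B"
```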
Track It All in One Place
If you're using multiple models across multiple providers, your cost data is scattered across 3-4 different dashboards. Each gives you one number with no breakdown.
AISpendGuard unifies all your AI spend — OpenAI, Anthropic, Google, Mistral, Cohere, Groq — into a single dashboard with cost attribution by feature, customer, model, and environment. It automatically detects when you're using an expensive model where a cheaper one would work.
Free tier: 50,000 events/month. No credit card. Tags only — we never store your prompts.
Compare your AI costs for free →
This pricing data is updated monthly. All prices are from official provider pricing pages, verified March 2026. Prices shown in USD per 1 million tokens.