Operations · February 28, 2025 · 6 min read

AI Cost Attribution: Know Where Every Dollar Goes

AI spend is growing fast. Most teams have no idea where it's going. Here's how to fix that.

Empress Team
AI Operations & Observability

Your AI costs increased 40% last month. Quick: can you explain why?

For most organizations, the answer is no. They know total spend. They might know spend by provider. But they can't answer the questions that actually matter:

  • Which agents are most expensive?
  • Which actions cost the most?
  • Which customers drive the most AI spend?
  • Where are we wasting money?

This is the cost attribution problem. And solving it starts with observability.

The Attribution Challenge

AI costs are inherently distributed:

```mermaid
flowchart TD
    A[AI Spend $10,000/month] --> B[OpenAI $6,000]
    A --> C[Anthropic $3,000]
    A --> D[Other $1,000]
    B --> B1[???]
    C --> C1[???]
    D --> D1[???]
```

Provider invoices tell you which APIs you called. They don't tell you why.

To attribute costs meaningfully, you need to connect API calls to:

  • The agent that made them
  • The action being performed
  • The business context (customer, workflow, use case)
  • The outcome (success, failure, value delivered)

Capturing Cost at the Source

Every agent action should capture cost data:

{
  "actor": { "name": "Analysis Agent v2.3" },
  "verb": { "id": "analyzed" },
  "object": { "id": "customer-report-892" },
  "result": {
    "success": true,
    "extensions": {
      "cost": {
        "total_usd": 0.45,
        "breakdown": {
          "input_tokens": 12500,
          "output_tokens": 3200,
          "model": "gpt-4-turbo",
          "provider": "openai"
        }
      },
      "duration_ms": 4200
    }
  },
  "context": {
    "extensions": {
      "customer_id": "enterprise-127",
      "workflow": "monthly-analysis",
      "triggered_by": "schedule"
    }
  }
}

This single statement captures:

  • What happened (analysis completed)
  • What it cost ($0.45)
  • Why it cost that (12.5k input tokens, 3.2k output tokens)
  • Business context (which customer, which workflow)
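The `total_usd` figure can be derived at capture time from the token counts and the provider's price sheet. A minimal sketch (the per-million-token prices below are illustrative placeholders, not current OpenAI rates):

```python
# Illustrative per-1M-token prices in USD -- NOT current provider rates.
PRICING = {
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
}

def action_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Compute the USD cost of a single agent action from its token usage."""
    p = PRICING[model]
    cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
    return round(cost, 4)

# The action above: 12,500 input + 3,200 output tokens on gpt-4-turbo
print(action_cost("gpt-4-turbo", 12_500, 3_200))
```

Computing the cost at the moment the action is recorded, rather than reconciling against invoices later, is what makes per-action attribution possible at all.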

Five Dimensions of Cost Attribution

1. By Agent

Which agents are most expensive?

| Agent | Actions/Month | Cost/Action | Total Cost |
|---|---|---|---|
| Analysis Agent | 45,000 | $0.38 | $17,100 |
| Support Agent | 892,000 | $0.012 | $10,704 |
| Research Agent | 23,000 | $0.42 | $9,660 |
| Routing Agent | 1,200,000 | $0.002 | $2,400 |
The Analysis Agent costs more per action but handles fewer requests. The Support Agent is cheap per action but volume drives total spend.
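This kind of table falls out of a simple roll-up over the action stream. A minimal sketch, assuming statements shaped like the JSON example above (field names are illustrative):

```python
from collections import defaultdict

def cost_by_agent(statements):
    """Roll up a stream of action statements into per-agent cost totals."""
    totals = defaultdict(lambda: {"actions": 0, "cost": 0.0})
    for s in statements:
        agent = s["actor"]["name"]
        totals[agent]["actions"] += 1
        totals[agent]["cost"] += s["result"]["extensions"]["cost"]["total_usd"]
    # Derive cost per action for each agent.
    return {
        agent: {**t, "cost_per_action": t["cost"] / t["actions"]}
        for agent, t in totals.items()
    }

stream = [
    {"actor": {"name": "Analysis Agent"},
     "result": {"extensions": {"cost": {"total_usd": 0.38}}}},
    {"actor": {"name": "Routing Agent"},
     "result": {"extensions": {"cost": {"total_usd": 0.002}}}},
    {"actor": {"name": "Analysis Agent"},
     "result": {"extensions": {"cost": {"total_usd": 0.42}}}},
]
print(cost_by_agent(stream))
```

The same roll-up works for any dimension in the statement: swap the grouping key from `actor.name` to `context.extensions.customer_id` and you get the per-customer view below.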

2. By Action Type

What are you paying for?

```mermaid
pie title Cost by Action Type
    "Analysis" : 35
    "Response Generation" : 28
    "Research" : 18
    "Classification" : 12
    "Other" : 7
```

If analysis is 35% of spend and classification is 12%, optimizing analysis prompts has nearly 3x the impact of optimizing classification.

3. By Customer

Which customers drive the most AI cost?

This matters for pricing, resource allocation, and unit economics. If enterprise customers cost 10x more to serve but only pay 3x more, you have a margin problem.

{
  "aggregation": "by_customer",
  "period": "2025-02",
  "data": [
    { "customer": "enterprise-127", "cost": 4200, "actions": 8500 },
    { "customer": "enterprise-089", "cost": 3800, "actions": 7200 },
    { "customer": "startup-442", "cost": 890, "actions": 12000 }
  ]
}

Notice: startup-442 has more actions but lower cost. Their use case is cheaper to serve.

4. By Outcome

Are you paying for success or failure?

{
  "outcome_costs": {
    "successful_actions": {
      "count": 892000,
      "cost": 38000,
      "cost_per": 0.043
    },
    "failed_actions": {
      "count": 45000,
      "cost": 8500,
      "cost_per": 0.189
    },
    "retried_actions": {
      "count": 23000,
      "cost": 4200,
      "cost_per": 0.183
    }
  }
}

Failed and retried actions cost 4x more than successful ones. Reducing failures directly reduces cost.
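That gap compounds across the whole stream. A quick roll-up of the figures above shows what share of total spend goes to non-successful work:

```python
def outcome_summary(outcome_costs):
    """Derive total spend and the share going to failures and retries."""
    total = sum(o["cost"] for o in outcome_costs.values())
    wasted = sum(o["cost"] for key, o in outcome_costs.items()
                 if key != "successful_actions")
    return {
        "total_spend": total,
        "non_success_spend": wasted,
        "non_success_share": round(wasted / total, 3),
    }

# Figures from the JSON above.
data = {
    "successful_actions": {"count": 892_000, "cost": 38_000},
    "failed_actions": {"count": 45_000, "cost": 8_500},
    "retried_actions": {"count": 23_000, "cost": 4_200},
}
print(outcome_summary(data))  # ~25% of spend is non-successful work
```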

5. By Time

When does spend occur?

```mermaid
xychart-beta
    title "Hourly AI Spend"
    x-axis [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
    y-axis "Cost ($)" 0 --> 500
    bar [50, 30, 20, 15, 10, 25, 80, 200, 380, 420, 450, 470, 460, 440, 420, 400, 350, 280, 180, 120, 90, 70, 60, 55]
```

If 70% of spend happens during business hours, batch processing at night could reduce peak costs.

Cost Optimization Strategies

With proper attribution, optimization becomes systematic:

Strategy 1: Right-Size Models

Track cost and quality by model:

| Task | GPT-4 Cost | GPT-3.5 Cost | Quality Delta |
|---|---|---|---|
| Classification | $0.08 | $0.003 | -2% accuracy |
| Analysis | $0.45 | $0.02 | -15% quality |
| Simple Response | $0.12 | $0.005 | -1% quality |

For classification and simple responses, the cheaper model is 98-99% as good at a fraction of the cost. For analysis, the premium model is worth it.
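One way to act on these measurements is a static routing table keyed by task type. A hypothetical sketch (the model names and task labels are illustrative):

```python
# Route each task to the cheapest model whose measured quality loss is
# acceptable; fall back to the premium model for unknown tasks.
MODEL_BY_TASK = {
    "classification": "gpt-3.5-turbo",   # -2% accuracy, ~96% cost savings
    "simple_response": "gpt-3.5-turbo",  # -1% quality
    "analysis": "gpt-4-turbo",           # -15% quality loss is too steep
}

def pick_model(task: str, default: str = "gpt-4-turbo") -> str:
    """Select a model for a task, defaulting to the premium model."""
    return MODEL_BY_TASK.get(task, default)

print(pick_model("classification"))  # gpt-3.5-turbo
print(pick_model("analysis"))        # gpt-4-turbo
```

Defaulting unknown tasks to the premium model keeps quality safe; the table only downgrades tasks you have actually measured.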

Strategy 2: Prompt Optimization

Input tokens often dominate cost. Track prompt length by action:

{
  "prompt_analysis": {
    "action": "customer_response",
    "avg_input_tokens": 2800,
    "avg_output_tokens": 450,
    "token_ratio": 6.2,
    "cost_breakdown": {
      "input": 0.084,
      "output": 0.018,
      "total": 0.102
    }
  }
}

A 6:1 input-to-output ratio suggests prompt bloat. Reducing input tokens by 30% cuts this action's cost by 25%.
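The 25% figure follows directly from the breakdown, since only the input side of the cost shrinks:

```python
def prompt_trim_savings(input_cost: float, output_cost: float,
                        input_reduction: float) -> float:
    """Fractional total-cost savings from shrinking the prompt,
    assuming input cost scales linearly with input tokens."""
    total = input_cost + output_cost
    return (input_cost * input_reduction) / total

# From the breakdown above: $0.084 input, $0.018 output, trim input by 30%
print(round(prompt_trim_savings(0.084, 0.018, 0.30), 3))  # ~0.247, i.e. ~25%
```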

Strategy 3: Caching

Identify repeated queries:

{
  "cache_opportunity": {
    "action": "product_lookup",
    "daily_volume": 45000,
    "unique_queries": 1200,
    "cache_hit_potential": 0.97,
    "current_cost": 4500,
    "potential_cost": 135,
    "savings": 4365
  }
}

97% of product lookups are redundant. Caching could save $4,365/day.

Strategy 4: Batch Processing

Some actions don't need real-time processing:

{
  "batch_candidates": [
    {
      "action": "daily_report",
      "current_mode": "realtime",
      "latency_requirement": "< 4 hours",
      "batch_savings": 0.35
    },
    {
      "action": "sentiment_analysis",
      "current_mode": "realtime",
      "latency_requirement": "< 1 minute",
      "batch_savings": 0.05
    }
  ]
}

Daily reports can batch for 35% savings. Sentiment analysis needs real-time.
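Deciding what can batch reduces to comparing each action's latency requirement against the batch window. A sketch, assuming a one-hour window:

```python
from datetime import timedelta

# An action can move to batch mode if its latency requirement
# tolerates waiting for the next batch run.
BATCH_WINDOW = timedelta(hours=1)

candidates = [
    {"action": "daily_report",
     "latency_requirement": timedelta(hours=4), "batch_savings": 0.35},
    {"action": "sentiment_analysis",
     "latency_requirement": timedelta(minutes=1), "batch_savings": 0.05},
]

batchable = [c["action"] for c in candidates
             if c["latency_requirement"] >= BATCH_WINDOW]
print(batchable)  # ['daily_report']
```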

Building a Cost Dashboard

Essential cost visibility includes:

  1. Total spend trend - Are we growing, stable, or declining?
  2. Spend by agent - Which agents cost most?
  3. Cost per action - Is efficiency improving?
  4. Cost by customer tier - Are unit economics healthy?
  5. Anomaly detection - Are there unexpected spikes?

```mermaid
flowchart LR
    A[Action Stream] --> B[Cost Calculator]
    B --> C[Attribution Engine]
    C --> D[Time Series DB]
    D --> E[Dashboard]
    D --> F[Alerts]
```
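The alerting stage can start as simple as a threshold on a z-score over recent daily spend (a sketch, not a production anomaly detector):

```python
from statistics import mean, stdev

def spend_anomaly(history: list[float], today: float,
                  threshold: float = 3.0) -> bool:
    """Flag today's spend if it sits more than `threshold`
    standard deviations above the recent trend."""
    mu, sigma = mean(history), stdev(history)
    z = (today - mu) / sigma if sigma else 0.0
    return z > threshold

daily_spend = [410, 395, 420, 405, 415, 400, 410]  # last week, USD
print(spend_anomaly(daily_spend, 800))  # True: unexpected spike
print(spend_anomaly(daily_spend, 420))  # False: within normal variation
```

Real deployments would segment this per agent and per customer, since a spike hidden inside a stable total is exactly what attribution is meant to surface.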

The Empress Approach

Empress automatically captures cost data for every action:

  • Token counts by model
  • Provider-specific pricing
  • Custom cost dimensions
  • Real-time cost tracking
  • Anomaly detection and alerts

You see not just what you spend, but where and why.

AI costs shouldn't be a mystery. With proper attribution, they become another optimization lever.
