Traceminder

Find where your LLM bill is leaking.

Upload a sanitized usage export and get a clear cost leak report within 24 hours. No prompts, no completions, no API keys.

Delivery: Cost leak report in 24 hours
Input: Sanitized usage metadata only
Privacy: Your file is deleted after delivery

Upload usage metadata, not private data

Traceminder only needs sanitized cost metadata from OpenAI, LiteLLM, Langfuse, Helicone, or your own logs.

Fields we can use: timestamp, model, input_tokens, output_tokens, cost, endpoint, status, retry_count, cache_hit, user_hash, tenant_hash

Do not upload: prompts, completions, API keys, secrets, customer names, raw requests, raw responses
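
One way to produce such an export is to keep only the fields listed above and hash any identifiers before writing the file. The sketch below is an illustration under assumptions, not Traceminder tooling: the `sanitize_record` and `hash_id` helpers, the `user_id` and `tenant_id` input keys, and the salt are all hypothetical names for whatever your own logs contain.

```python
import hashlib

# Metadata fields from the "Fields we can use" list; everything else is dropped.
ALLOWED_FIELDS = {
    "timestamp", "model", "input_tokens", "output_tokens", "cost",
    "endpoint", "status", "retry_count", "cache_hit",
}


def hash_id(value: str, salt: str) -> str:
    """Return a short salted SHA-256 digest so raw identifiers stay in your system."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def sanitize_record(record: dict, salt: str) -> dict:
    """Keep only usage metadata and replace identifiers with hashes."""
    clean = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    # `user_id` and `tenant_id` are assumed names for raw identifiers in your logs.
    if "user_id" in record:
        clean["user_hash"] = hash_id(str(record["user_id"]), salt)
    if "tenant_id" in record:
        clean["tenant_hash"] = hash_id(str(record["tenant_id"]), salt)
    return clean
```

Because the function works from an allow-list rather than a block-list, a new field that appears in your logs later (for example a raw prompt) is dropped by default instead of leaking into the export.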

See a sample report before you buy. We never need prompts, completions, API keys, or customer data, only sanitized usage metadata.

Top cost leaks

See which routes, models, tenants, retries, or endpoints are driving spend.

Estimated waste

Understand how much each issue may be costing per month.

Fix list

Get practical changes your team can make first.

Shareable report

Receive a clean PDF or HTML report you can send to engineering or product.

How Traceminder finds cost leaks

Retry waste

Repeated failed or retried calls that still cost money.

Prompt bloat

Large static prompts or long contexts increasing every request.

Wrong model routing

Expensive models used where cheaper models may be enough.

Tenant outliers

Users or tenants spending far more than normal.

Cache misses

Repeated similar calls that could be cached.
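
Checks like these can be approximated from usage metadata alone. The sketch below is not Traceminder's actual method, which this page does not describe; it assumes each row is a dict with the fields from the upload schema, that `retry_count > 0` marks a retried call, and that a tenant counts as an outlier above three times the median tenant spend (an arbitrary threshold chosen for illustration).

```python
import statistics
from collections import defaultdict


def retry_waste(rows):
    """Sum the cost of calls flagged as retries, grouped by endpoint."""
    waste = defaultdict(float)
    for row in rows:
        if row.get("retry_count", 0) > 0:
            waste[row["endpoint"]] += row["cost"]
    return dict(waste)


def cache_hit_rate(rows):
    """Fraction of calls served from cache (0.0 when there are no rows)."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if row.get("cache_hit"))
    return hits / len(rows)


def tenant_outliers(rows, factor=3.0):
    """Tenants whose total spend exceeds `factor` times the median tenant spend."""
    spend = defaultdict(float)
    for row in rows:
        spend[row["tenant_hash"]] += row["cost"]
    if not spend:
        return []
    median = statistics.median(spend.values())
    return sorted(t for t, s in spend.items() if s > factor * median)
```

None of these functions read prompt or completion text, which is why a sanitized export is enough for this kind of analysis.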

Where spend leaks

How leaks show up in your traces.

Sample report

See the report before you buy.

Review a sample Traceminder report generated from demo usage data so you know exactly what you will receive. Every finding follows the same structure, so the output stays predictable no matter what your file contains.

Report structure
1. Top cost leaks

Ranked by estimated monthly waste, each tied to a route, model, or tenant.

2. Evidence

The usage pattern behind each leak, shown in your own numbers.

3. Estimated waste and fix

How much each issue may cost per month, with the change to make first and a confidence level.

Sample report · demo usage data

Cost leak report

Based on 426 logged calls over roughly 0.9 days, projected to 30 days.

projected_monthly_spend: $25.49
estimated_monthly_waste: $16.60 (about 65% of spend, de-duplicated so nothing is counted twice)
  1. Cache misses $11.10 / mo

    Identical requests repeat but are never served from cache, so each repeat is paid for.

    Evidence: /answer: 66 calls share only 4 input sizes; cache-hit rate is 0%.

    Fix: Add caching keyed on the normalized request, enable provider prompt caching, and set sensible TTLs.

    Confidence: medium
  2. Tenant outliers $9.10 / mo

    A small number of tenants drive a disproportionate share of spend.

    Evidence: One tenant drives 55% of spend ($0.41), while the median tenant spends $0.10.

    Fix: Review limits and pricing for these tenants, apply the other fixes in this report to their traffic first, and add per-tenant budget alerts.

    Confidence: high
  3. Prompt bloat $5.34 / mo

    A large, near-constant prompt prefix is sent on every call to some endpoints.

    Evidence: /answer: 66 calls average 1315 tokens in and 249 out over a static floor of 1314 tokens.

    Fix: Trim the system prompt, move static context to provider prompt caching, and retrieve only relevant chunks.

    Confidence: medium
  4. Retry waste $3.10 / mo

    Calls are billed multiple times because failed or rejected responses are retried.

    Evidence: 108 billed retry attempts and 32 rate-limited or error attempts, most of them on /extract.

    Fix: Cap retries, add exponential backoff, fix the validation that rejects good responses, and alert on retry spikes.

    Confidence: high
  5. Wrong model routing $0.74 / mo

    Trivial tasks run on an expensive model where a smaller model is typically sufficient.

    Evidence: /classify: 110 calls on gpt-4o producing only 6 output tokens each.

    Fix: Route classification and short tasks to a mini model behind a quality check, and add a model-selection policy.

    Confidence: medium-high

Per-finding figures are standalone and can overlap, so they are not summed. The headline waste attributes each billed call to a single leak, so no call is counted twice. Analysis uses only sanitized usage metadata, never prompt or completion text.
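
To make the difference between standalone and headline figures concrete, here is a minimal sketch under stated assumptions. The report does not say which leak a call is attributed to when several apply; this example assumes the largest one. The `leaks` mapping on each call and all function names are hypothetical, and the linear projection to 30 days is likewise an assumption about how a short window might be scaled.

```python
from collections import defaultdict


def standalone_totals(calls):
    """Per-leak totals; a call flagged by two leaks contributes to both."""
    totals = defaultdict(float)
    for call in calls:
        for leak, waste in call["leaks"].items():
            totals[leak] += waste
    return dict(totals)


def headline_waste(calls):
    """De-duplicated total: each call counts once, under its largest leak."""
    return sum(max(call["leaks"].values()) for call in calls if call["leaks"])


def project_monthly(observed, window_days, horizon_days=30.0):
    """Scale an observed amount linearly from the logged window to the horizon."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return observed * horizon_days / window_days
```

With this attribution, the sum of the standalone totals can exceed the headline figure whenever one call is flagged by more than one leak, which matches the way the findings above add up to more than the headline waste.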

Two ways to audit your LLM spend

Deep Cost Audit $149

For teams that want deeper technical recommendations.

  • Everything in Cost Leak Audit
  • Route-by-route analysis
  • Model downgrade candidates
  • Before and after savings plan
  • Engineering action notes
Buy deep audit

Before you pay, we run a quick file check to confirm your export has enough cost metadata. You only purchase once your file is ready for a useful report.

Transparent by design

Traceminder shows the sample report, upload schema, and analysis method before you buy. Your file is used only to generate your report and is deleted after delivery.

Built by an independent engineer, not a black-box SaaS. Your file is never stored for training or resale.

We never need prompts, completions, API keys, secrets, customer names, or raw request or response data.

Fair use promise: if your uploaded usage file does not contain enough cost metadata to generate a meaningful report, we cancel and refund the order. Once the report is delivered, payments are final.

Questions before you upload

Do I need to connect my OpenAI account?

No. Upload a sanitized CSV or JSON export.

Do you need my prompts?

No. Traceminder works from usage metadata.

How fast is delivery?

Most reports are delivered within 24 hours.

What if my file is missing some fields?

Before you pay, we run a quick file check to see what your export contains. If anything important is missing, we show what to add or ask for a safer alternative export. You only purchase once your file has enough metadata for a useful report.

Is this a dashboard?

No. Traceminder is a cost leak report, not another observability dashboard.

Notes from real LLM bills

Practical notes on real LLM cost leaks: retry waste, prompt bloat, model routing mistakes, cache misses, and tenant outliers. Each note shows the data pattern, how Traceminder detects it, and what to change to reduce spend.

Coming soon

The first field notes are being written from real usage data. Check back shortly.

Traceminder turns a sanitized usage export into a ranked cost leak report for teams running OpenAI, Claude, Gemini, RAG, or agent workflows.

See where the AI budget went.

Upload usage metadata only. We check that your file has enough data before you pay, deliver the report within 24 hours, and delete the file after.
Get a Cost Leak Report