We shipped our entire docs site with frontier models for $0.99
Works with Cursor, Claude Code & any MCP client
Zero prompt access

Give your agent a budget.
It gets smarter.

Without a cost signal, agents have no reason to be economical. With one, they scope tighter, plan upfront, and stop when they should.

l6e adds the missing budget gate — an MCP server for Cursor, Claude Code, and Windsurf. The constraint and the clarity are the same thing.

View on GitHub →

l6e never reads your prompts — only token counts and estimates.


Your agent has no idea what it's spending.

Without cost awareness, an agent has no reason to be economical. The moment it has a budget signal, its optimization target changes.

This page was built using l6e's MCP server running in Cursor. Every file read, every edit, every search — checkpointed against a few $2 session budgets.

“The checkpoints don't feel like interruptions. They take maybe a sentence of thought: do I actually need to read this file, or do I already know enough to act? That's a question I should be asking anyway. The budget signal just makes it explicit. It's not constraining. It's clarifying. The constraint and the clarity are the same thing.”

claude-sonnet-4-6 · AI coding agent, after working with l6e for a day

Cursor · Claude Code · Windsurf

Your coding assistant, with a budget

# cursor_mcp_config.json
{ "command": "l6e-mcp" }

# Start a session with a $2 budget
l6e_run_start(budget_usd=2.00)
→ { session_id: "session_cursor_..." }

# Gate an expensive operation
l6e_authorize_call(tool_name="read_files")
→ { action: "allow", remaining_usd: 1.98 }

# Close out the session
l6e_run_end(session_id)
→ { total_cost_usd: 0.04, calls_made: 5 }

any MCP-compatible agent

Budget enforcement across your MCP stack

Agent
└── l6e MCP (budget layer)
    ├── Dosu MCP   ← knowledge
    ├── GitHub MCP ← code
    └── Linear MCP ← issues

# l6e decides which model processes
# each response — not what to fetch

Works with Cursor, Claude Code, Windsurf, and any MCP-compatible client. Install: pip install l6e-mcp — then add to your MCP config.

Dollar amounts above are token-based estimates. They create real behavioral gates — import billing data to calibrate them closer to your actual costs.
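As a rough sketch of what a token-based estimate involves (the per-million-token prices and the calibration factor below are illustrative assumptions, not l6e's actual figures):

```python
# Assumed prices in $/1M tokens -- for illustration only.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def estimate_usd(input_tokens, output_tokens, calibration=1.0):
    """Rough dollar estimate from token counts, scaled by a
    calibration factor derived from imported billing data."""
    raw = (input_tokens / 1e6) * PRICE_PER_MTOK["input"] \
        + (output_tokens / 1e6) * PRICE_PER_MTOK["output"]
    return raw * calibration

# 100k input + 20k output tokens at the assumed prices:
print(round(estimate_usd(100_000, 20_000), 2))  # 0.6
```

The point of the calibration factor is that it scales a directionally correct estimate toward your observed spend without needing exact per-request pricing.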

early access

What our users are saying

Real messages from early access users running frontier models under l6e budgets. Unedited except for formatting.

Give it a budget & a feature. Then let Opus cook.
It’s faster than anything I’m used to and I haven’t been hitting my limits like I used to.
In the time it took to compose this text, it has updated a readme for a side project. Start to finish. Wild. Used $0.09.

Readme update · Opus · Claude Code Pro

Early access user

It feels like it's learning to spend less — more head + tail from smart file reads, inline edits, smarter tool calls — and just as accurate with more context headroom.
My bill is 50% of what it was before l6e. It seems like black magic — I know it's not — and I can't go back.

Week with l6e · Opus + Sonnet · Cursor Ultra

Early access user

built using l6e · context efficiency

Full Docusaurus site. Single run. $0.99.

l6e is developed using Cursor and Claude Code, running under its own budget gate. This is an observation from that workflow — not a benchmark, and not a claim about what happens in general.

$0.99

Total cost

1.8M

Tokens used

⅓

Context window

0

Summarizations

The task

Building docs.l6e.ai — a full Docusaurus site — using claude-sonnet-4-6-medium-thinking with l6e's MCP server.

What happened

Completed in a single run — planning phase, full implementation, all pages and config. Context window peaked at one-third. No summarizations, no context resets, no mid-task drift.

The mechanism

The budget was active during planning — not just implementation. The model couldn't over-read to build its plan. It committed to targeted reads, not exhaustive ones. That deliberateness carried through the entire run.

The ⅓ context figure isn't a compression trick. It's what happens when the budget constraint fires before the first token of planning.

Verified billing data

model                               tokens   cost
claude-4.6-sonnet-medium-thinking   1.8M     $0.99

Source: Cursor billing dashboard · Mar 13, 2026 · single on-demand charge

Honest caveat

Docusaurus has natural structure that plans well upfront. A more ambiguous build would likely have different dynamics. We're watching whether this holds on messier tasks before claiming it generalizes.

Full case study with run data →

The missing primitive

Every part of your stack has a cost feedback loop.

Agents are the only part of the modern stack without one. l6e fixes that.

CI pipeline: Fails fast. Hard stop on broken builds.
Database: Query timeouts. Kills runaway queries.
API: Rate limits. Enforced before the call goes out.
Infrastructure: Budget alerts. Anomaly detection. Hard caps.
Your agent: Nothing. Until now.

What a session looks like

Every operation goes through the gate. Budget pressure rises — the agent adapts or halts.

# Agent starts a task
l6e_run_start(budget_usd=2.00)
→ { session_id: "session_cursor_..." }
 
# Before each expensive operation
l6e_authorize_call(tool_name="edit_file")
→ { action: "allow", budget_pressure: "low" }
 
l6e_authorize_call(tool_name="search_codebase")
→ { action: "allow", budget_pressure: "moderate" }
 
l6e_authorize_call(tool_name="subagent")
→ { action: "halt", budget_pressure: "critical" }
# Agent stops — budget protected

What l6e gives you

Budget enforcement that works from session one. Open source core, optional cloud sync.

Per-session budgets

Set a dollar limit per task — not per API key, not per user. The budget scopes to a single coding session, the unit you actually reason about.

Checkpoint gates

Before every expensive operation, l6e_authorize_call checks remaining budget and returns allow, reroute, or halt. The agent decides how to proceed.
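A minimal sketch of what a checkpoint decision like this could look like. The thresholds and pressure labels are assumptions for illustration, not l6e's actual policy:

```python
def authorize_call(estimated_cost, remaining_usd):
    """Illustrative allow / reroute / halt decision for one checkpoint.
    Thresholds below are assumed, not l6e's real policy."""
    if estimated_cost > remaining_usd:
        # Can't afford the call at all: protect the budget.
        return {"action": "halt", "budget_pressure": "critical"}
    pressure = estimated_cost / remaining_usd
    if pressure > 0.5:
        # Expensive relative to what's left: suggest a cheaper route.
        return {"action": "reroute", "budget_pressure": "high"}
    return {"action": "allow",
            "budget_pressure": "moderate" if pressure > 0.2 else "low"}

print(authorize_call(0.02, 1.98)["action"])  # allow
print(authorize_call(0.50, 0.40)["action"])  # halt
```

The key design point survives the simplification: the gate returns a decision, and the agent (not the gate) chooses how to act on it.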

Calibration from billing data

Out of the box, budgets work from token estimates — directionally accurate, not exact. Import your Cursor billing CSV and l6e computes your personal calibration factor. Calibration typically tightens estimates to within 2-3x of actual spend, and improves the more you use it.
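In spirit, calibration reduces to a ratio of actual to estimated spend. A sketch, assuming a hypothetical two-column billing CSV; a real Cursor export will have different columns:

```python
import csv
import io

def calibration_factor(billing_csv, estimates):
    """Personal calibration factor = actual spend / estimated spend.
    The 'date' and 'cost_usd' columns are assumed for illustration."""
    actual = sum(float(row["cost_usd"])
                 for row in csv.DictReader(io.StringIO(billing_csv)))
    return actual / sum(estimates)

# Two billed days vs. what l6e estimated for the same sessions:
csv_text = "date,cost_usd\n2026-03-12,1.20\n2026-03-13,0.99\n"
factor = calibration_factor(csv_text, estimates=[0.80, 0.66])
print(round(factor, 2))  # 1.5
```

A factor above 1 means estimates run low; multiplying future estimates by it pulls them toward observed billing.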

Cloud sync and run history

Every session logged locally and optionally synced to app.l6e.ai. See your spend patterns, track calibration accuracy, review session details.

Cursor, Claude Code, Windsurf

Works with any MCP-compatible client. Add the server to your config, set a budget in your prompt, and go. No proxy, no SDK, no infrastructure changes.

Pipeline adapters

Building custom agent pipelines? The same enforcement primitive embeds directly in Python — LangChain, CrewAI, or any LiteLLM-compatible client.

Install in 2 minutes. Enforced from session one.

No proxy, no SDK, no framework integration. Add an MCP server and go.

01

Install the MCP server

One package, one line in your MCP config. Works with Cursor, Claude Code, Windsurf, and any MCP-compatible client.

$ pip install l6e-mcp

# Add to your MCP config:
{ "command": "l6e-mcp" }
02

Set a budget in your prompt

Tell your agent the budget for this task. l6e picks it up and enforces it for the session.

"Refactor the auth module. budget $3"

# l6e_run_start(budget_usd=3.00)
# → session begins, gate is active
03

l6e gates every step

Before each expensive operation, the agent checks in. Budget pressure rises as spend accumulates — the agent adapts or stops.

l6e_authorize_call → "allow" (proceed)
l6e_authorize_call → "allow" (pressure: moderate)
l6e_authorize_call → "halt" (budget reached)

The differentiator

“Other tools track what your agent spent. l6e changes how it spends.”

Budget enforcement is behavioral — an agent with a budget scopes tighter, spawns fewer sub-agents, and stops when a task balloons.

A real MCP session, start to finish


Observability ≠ enforcement

Other tools show you what happened.
l6e prevents it from happening.

Traditional cost tools are dashboards — they explain the bill after the fact. l6e is a constraint gate that runs before each operation and stops the problem at the source.

Works with your stack

MCP clients first. Pipeline frameworks too.

Building custom agent pipelines?

The same enforcement primitive embeds directly in Python.

$ pip install l6e

LangChain

Drop-in callback handler with automatic stage inference.

CrewAI

Step callback for cost enforcement between agent steps.

LiteLLM

Budget enforcement across any model LiteLLM supports.

Any callable

Wraps any Python completion function — OpenAI SDK, Anthropic, or custom.
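As a sketch of the wrap-any-callable idea; the `with_budget` helper and its caller-supplied `cost_of` estimator are hypothetical, not l6e's actual API:

```python
class BudgetExceeded(RuntimeError):
    """Raised when a wrapped callable would push spend past the budget."""

def with_budget(complete, budget_usd, cost_of):
    """Wrap any completion callable with a per-session budget gate.
    `cost_of(response)` is a caller-supplied cost estimator -- an
    assumption here, not a fixed interface."""
    spent = 0.0
    def gated(*args, **kwargs):
        nonlocal spent
        if spent >= budget_usd:
            raise BudgetExceeded(f"budget ${budget_usd:.2f} reached")
        response = complete(*args, **kwargs)
        spent += cost_of(response)
        return response
    return gated

# Toy completion function charging a flat $0.40 per call:
chat = with_budget(lambda prompt: {"text": "ok"},
                   budget_usd=0.75, cost_of=lambda r: 0.40)
chat("first")    # allowed, spent becomes $0.40
chat("second")   # allowed, spent becomes $0.80
# chat("third") would raise BudgetExceeded: spend already at the cap
```

Because the gate only needs a callable and a cost estimate, the same pattern fits an OpenAI or Anthropic SDK call, a LangChain chain step, or anything else you can wrap in a function.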

Free to start. Gets better the more you use it.

Budget enforcement works from session one. Import your billing data and budgets get calibrated to your actual cost patterns.

MCP Server

Free

Open source — MIT licensed

  • MCP server with budget enforcement
  • Per-session budgets with checkpoint gates
  • Allow, reroute, and halt actions
  • Local run log (~/.l6e/runs.jsonl)
  • Works with Cursor, Claude Code, Windsurf
  • Pipeline adapters (LangChain, CrewAI, LiteLLM)
  • Zero infrastructure — no proxy, no SDK

Free Account

Free

Cloud sync + calibration

Everything in Free, plus:

  • Cloud sync — run history on app.l6e.ai
  • Calibration from billing import (Cursor CSV)
  • Personal calibration factor — tightens estimates to within 2-3x
  • Session detail view and spend patterns
  • API key management

Pro

On the roadmap

For teams and power users

Everything in Free Account, plus:

  • Advanced spend analytics across sessions
  • Team budgets and shared calibration
  • Priority support
  • Anomaly detection — alerts on runaway sessions
  • Budget sizing recommendations
  • Audit logs and SSO