June 6, 2026 · Updated June 12, 2026 · 3 min read

GitHub Copilot pricing explained: premium requests, AI Credits, and the agentic bill shock

GitHub Copilot pricing changed shape twice in twelve months: premium requests arrived in June 2025, and on June 1, 2026 Copilot moved to usage-based billing with GitHub AI Credits. Most "Copilot pricing explained" articles still describe the old model. This guide covers what a premium request was, what an AI Credit is, who's on which system, and why agentic workflows are where the money goes.

What did Copilot cost before — and what was a premium request?

From June 18, 2025, paid Copilot plans metered "premium requests": each chat or agent interaction counted as one or more requests depending on the model. Business included 300 premium requests per user per month and Enterprise included 1,000; overage was billed at $0.04 per premium request.

The catch was model multipliers — one prompt to a frontier model could count as many requests. And in June 2026 the multipliers for legacy annual subscribers jumped sharply: Claude Opus-class models went to 27× (from 7.5×) and GPT-5-class to 6× (from 1×). Copilot code review alone now carries a 13× multiplier. The same prompt, the same plan — a very different bill.
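To see what the multiplier jump does to a legacy bill, here is a minimal sketch of the premium-request arithmetic. It assumes each prompt consumes exactly `multiplier` requests and that overage starts once the included requests run out; the function names and the 100-prompt workload are hypothetical, while the multipliers, the 300-request allowance and the $0.04 rate are the figures quoted above.

```python
OVERAGE_CENTS_PER_REQUEST = 4  # $0.04 per premium request, as quoted in this guide


def premium_requests_used(prompts: int, multiplier: float) -> float:
    """Premium requests consumed, assuming every prompt costs `multiplier` requests."""
    return prompts * multiplier


def legacy_overage_usd(requests_used: float, included: int) -> float:
    """Dollar overage once the monthly included requests are exhausted."""
    extra = max(0.0, requests_used - included)
    return extra * OVERAGE_CENTS_PER_REQUEST / 100


if __name__ == "__main__":
    prompts = 100  # hypothetical monthly prompts to an Opus-class model
    for label, multiplier in [("old 7.5x", 7.5), ("new 27x", 27)]:
        used = premium_requests_used(prompts, multiplier)
        cost = legacy_overage_usd(used, included=300)
        print(f"{label}: {used:.0f} requests, ${cost:.2f} overage")
```

Under these assumptions, the same 100 prompts on a Business seat go from $18 of overage to $96.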

What changed on June 1, 2026?

Copilot's metered features now bill by actual token usage, denominated in GitHub AI Credits (1 credit = $0.01), charged on input, output and cached tokens at per-model rates. Plans include a credit allowance roughly equal to their price; usage past it bills per credit.

Plan         Price              Included
Business     $19/user/mo        $19 of AI Credits
Enterprise   $39/user/mo        $39 of AI Credits (shared pool)
Overage      $0.01 per credit
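The token-to-credit conversion can be sketched as follows. The per-model rates here are placeholders chosen for illustration, not GitHub's published figures, and the function names are hypothetical. The sketch also treats the allowance as exactly equal to the plan price, which the table suggests but the billing docs should confirm.

```python
# Placeholder rates in AI Credits per million tokens.
# These are NOT GitHub's published rates; substitute the real per-model figures.
HYPOTHETICAL_RATES = {"input": 300, "output": 1500, "cached": 30}


def credits_for_usage(input_tokens, output_tokens, cached_tokens,
                      rates=HYPOTHETICAL_RATES):
    """AI Credits for one model's usage, billed per token type."""
    total = (input_tokens * rates["input"]
             + output_tokens * rates["output"]
             + cached_tokens * rates["cached"])
    return total / 1_000_000


def allowance_credits(plan_price_usd):
    """Included credits, assuming the allowance equals the plan price (1 credit = $0.01)."""
    return plan_price_usd * 100


def credit_overage_usd(used_credits, plan_price_usd):
    """Dollars billed beyond the allowance at $0.01 per credit."""
    return max(0, used_credits - allowance_credits(plan_price_usd)) / 100


if __name__ == "__main__":
    used = credits_for_usage(1_000_000, 100_000, 500_000)
    print(used, allowance_credits(19), credit_overage_usd(2500, 19))
```

Swapping in the real rate table for each model is the only change needed to turn this into a rough estimator.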

Still unmetered: code completions and Next Edit Suggestions. Metered: chat, agent mode, CLI, cloud agents, Spaces, and third-party agents (billing docs). Annual subscribers ride the legacy premium-request model (with the new multipliers) until renewal — so two developers on the same team can be on different billing systems right now.

Why did your agentic workflow drain a month of credits in a day?

Because agent mode bills by tokens, and an agent loop's cumulative token count grows roughly quadratically with the number of steps: the full context is re-sent on every internal roundtrip, so each step pays again for everything read before it. Within days of the switch, developers reported bills 10–50× their previous spend, with Reddit threads describing a month's credits gone in a single day of agent use.

The mechanics are the same for every coding agent: roughly 76% of agent tokens go to reading and navigating code, and every file the agent reads travels with every subsequent request in the loop. Autocomplete users barely noticed the billing change; agent users absorbed nearly all of it. The bill tracks how efficiently your agent finds its way around your codebase.
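The loop math can be sketched in a few lines. This is a simplified model, not a description of Copilot's internals: it assumes a fixed starting context, a constant number of tokens added per step, and no discount for cached tokens. All names and numbers are hypothetical.

```python
def cumulative_input_tokens(steps, base_context, added_per_step):
    """Total input tokens billed when the whole context is re-sent every step."""
    total = 0
    context = base_context
    for _ in range(steps):
        total += context            # the full context is sent (and billed) again
        context += added_per_step   # file reads and tool output join the context
    return total


def closed_form(steps, base_context, added_per_step):
    """Same total without the loop: a linear term plus a quadratic term in `steps`."""
    return steps * base_context + added_per_step * steps * (steps - 1) // 2


if __name__ == "__main__":
    for steps in (10, 20, 40):
        print(steps, cumulative_input_tokens(steps, 2_000, 1_000))
```

In this model, doubling the number of steps roughly quadruples the growing part of the bill, which is why long agent runs cost far more than several short ones covering the same work.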

How do you estimate and cap Copilot spend?

Translate your agent sessions to tokens first, then to credits. A worked example: an agent task with 150 tool calls accumulating 200K input + 20K output tokens on a Sonnet-class model costs roughly $1.00–1.50 in API terms, or 100–150 credits. Ten such tasks per developer per day is $10–15 a day, or about $200–300 per developer per month over roughly 20 working days, an order of magnitude over the included allowance.
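The projection above can be reproduced with a short sketch. The per-task cost range and the plan allowances come from this guide; the 20 working days per month and the function names are assumptions for illustration.

```python
CREDITS_PER_USD = 100  # 1 AI Credit = $0.01


def task_credits(task_cost_usd):
    """Credits consumed by one agent task of a given dollar cost."""
    return task_cost_usd * CREDITS_PER_USD


def monthly_spend_usd(task_cost_usd, tasks_per_day, working_days=20):
    """Projected monthly spend per developer (working_days is an assumption)."""
    return task_cost_usd * tasks_per_day * working_days


def allowance_multiple(monthly_usd, allowance_usd):
    """How many times over the included allowance the projected spend lands."""
    return monthly_usd / allowance_usd


if __name__ == "__main__":
    for cost in (1.00, 1.50):
        monthly = monthly_spend_usd(cost, tasks_per_day=10)
        print(f"${cost:.2f}/task: {task_credits(cost):.0f} credits, "
              f"${monthly:.0f}/month, "
              f"{allowance_multiple(monthly, 19):.1f}x Business allowance")
```

With these inputs the projected spend is roughly 10 to 16 times the $19 Business allowance, and about 5 to 8 times the $39 Enterprise one.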

Practical controls, in order: set org-level budgets and alerts in billing settings; watch per-model consumption (one frontier-model agent workflow can outspend a whole team's chat usage); and keep the multiplier table in view if you're on the legacy annual plan — a 27× model choice is a pricing decision, not just a quality one.

How do you make Copilot agent use cheaper without giving it up?

Cut the tokens per task, not the tasks. The levers, by impact:

  1. Fix what the agent reads. Reading is ~76% of the spend. A structural map that serves the agent the exact function or slice instead of whole files is the biggest lever — we measured −86% navigation / −90% read tokens, fidelity-gated. unerr serves that map locally over MCP, so the same layer works for Copilot, Claude Code and Cursor side by side.
  2. Scope agent tasks tightly. Short, well-scoped agent runs avoid the quadratic tail of long loops.
  3. Match the model to the task. On legacy plans, multipliers make frontier models cost 6–27× as many premium requests; completions-style work doesn't need them.
  4. Keep completions, meter the rest. Completions are still unmetered — the expensive Copilot is the agent, so that's where efficiency tooling pays.
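To put the first lever in perspective, here is a back-of-the-envelope sketch of how a cut in read tokens translates to the whole bill. It assumes the ~76% read share and the 86–90% reductions quoted above, that the remaining tokens are unchanged, and a simple linear model that ignores how context re-sending compounds inside long loops. The function name is hypothetical.

```python
def blended_reduction(read_share, read_reduction):
    """Fraction of total tokens saved when only the read/navigation share shrinks."""
    return read_share * read_reduction


if __name__ == "__main__":
    low = blended_reduction(0.76, 0.86)
    high = blended_reduction(0.76, 0.90)
    print(f"overall token reduction: {low:.0%} to {high:.0%}")
```

Under those assumptions the overall saving is about two thirds of total tokens, not the full 86–90%, because the other quarter of the spend is untouched.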

Team-level visibility — which repo, which workflow, which developer's configuration burns the credits — is the difference between capping spend and capping usefulness; that's the operational layer unerr adds.


Billing details as of June 2026 — check GitHub's billing docs for current figures. Related: token optimization for coding agents · the benchmark · unerr pricing.
