What is LLM steering?

LLM steering is shaping a model's behavior at inference time — through what you put in context, the constraints you enforce around it, and the feedback you return to it — so output lands inside your team's conventions without retraining or fine-tuning anything. For coding agents, steering is the answer to a specific failure: the model writes plausible code, but plausible-for-the-internet, not consistent-with-your-codebase.

Steering operates at three layers:

Context steering — the model imitates what it sees, so what enters the window is the steering signal: your conventions, your patterns, the relevant slice of your code (see context ops)
Constraint steering — gates around the model: deny an edit that breaks callers, require references to be checked, enforce rules at the tool layer
Feedback steering — structured signals back into the loop (test results, lint, convention violations) so the next attempt corrects

The economics make context steering the workhorse: it needs no training run, it works across every model and agent, and it rides infrastructure you need anyway. The same structural map that cuts read tokens 86–90% is also the steering channel — the agent that's served your conventions and your patterns produces your code, on whichever agent each developer runs.

Steering is the consistency half of agent ops; the cost half is token ops. Related: why agents forget your conventions every session.

See it on your own repo