Thoughts on Welcome to Gas Town
I read (and re-read, a few times) Steve Yegge's Welcome to Gas Town, and my main reaction was: ambitious. It’s less a project pitch and more a call to action, a manifesto wrapped around an attempt to sketch a coordination layer for a world filled with agents. It’s worth paying attention to, and worth reading, for that alone.
The core idea, a structured environment where agents operate within defined roles, mediated by arbiters, feels like a plausible first approximation for how AI-backed systems, with fleets of agents, might actually work together at scale. Instead of assuming emergence, Gas Town proposes what we’re now starting to call a harness. That seems credible: durable, always-on systems need some kind of intentional design in their early phases. I’m biased in that I like the harness concept in general, primarily because it isn’t LLM-specific. A harness can include LLM components while still integrating with existing tools, and it leaves room to swap in better approaches tomorrow.
Steve’s description of Gas Town itself reads like a mash-up of social deduction game mechanics, control-plane software, and management theory. It feels, especially on that last aspect, heavily designed in a modernist urban planning sense. Personally, for AI and agent-driven development, I find myself veering toward constraint-based approaches: establish invariants/laws and let agents explore within them. Gas Town can be read as an attempt to define invariants through roles that allow agents to make progress in an eyes-off/hands-off way. And so my first take is that it leans /hard/ into roles and division of labour. I keep wondering whether some kind of ‘physics’ or rubric-based approach might ultimately make for better primitives. To be fair, you have to start somewhere; there needs to be a bootstrap. And Gas Town, importantly, actually exists.
Looking into the project code itself, a few things stood out, as best I understand things:
A human operator (Overseer) talks to a coordinator (Mayor), who dispatches work to a fleet of worker agents (Polecats) running in isolated git worktrees. There are additional roles, such as Witnesses to nudge stuck agents and Dogs to run background work. It’s 19th-century naming, but a very 20th-century org chart.
The system is built around three core abstractions: (i) a Bead, a versioned, queryable data record, written as JSONL, where all work and signals are captured; (ii) Dolt, a database agents write to for transactional backing and query support; and (iii) tmux as the agents’ runtime, with each agent living in a tmux session.
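To make the "record written as JSONL" idea concrete, here is a minimal sketch. The field names (id, kind, status, body, version, depends_on) are my own assumptions for illustration and are not taken from Gas Town's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Bead:
    """Hypothetical bead shape; field names are assumptions, not Gas Town's schema."""
    id: str
    kind: str      # e.g. "task" or "mail" (illustrative values only)
    status: str
    body: str
    version: int = 1
    depends_on: list = field(default_factory=list)


def to_jsonl(beads):
    """Serialise beads as one JSON object per line, which is all JSONL means."""
    return "\n".join(json.dumps(asdict(b), sort_keys=True) for b in beads)


def from_jsonl(text):
    """Parse JSONL back into Bead objects, skipping blank lines."""
    return [Bead(**json.loads(line)) for line in text.splitlines() if line.strip()]


beads = [
    Bead(id="b-1", kind="task", status="live", body="fix flaky test"),
    Bead(id="b-2", kind="mail", status="live", body="handoff notes", depends_on=["b-1"]),
]
text = to_jsonl(beads)
restored = from_jsonl(text)
```

The appeal of a line-per-record format is that it diffs and merges cleanly under git, which fits a system that already leans on git for persistence.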
There’s a layered memory model: sessions for ephemeral work, sandboxes for persistence across runs (backed by git), and a permanent identity for each agent backed by a bead. Agents also have mailboxes, which immediately reminded me of Erlang/Pekko-style actor systems. Mail can be delivered directly to an individual agent, queued in a competing-consumer / first-come-first-served model, or broadcast to members of a channel.
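The three delivery modes are easier to compare side by side. This is a toy in-memory router under my own naming (PostOffice, claim, and so on are hypothetical); it says nothing about how Gas Town actually stores or routes mail.

```python
from collections import defaultdict, deque


class PostOffice:
    """Toy router for three delivery modes; all names here are illustrative."""

    def __init__(self):
        self.inboxes = defaultdict(deque)   # agent name -> private mailbox
        self.queues = defaultdict(deque)    # queue name -> shared work queue
        self.channels = defaultdict(set)    # channel name -> subscribed agents

    def send_direct(self, agent, msg):
        """Direct: exactly one named recipient."""
        self.inboxes[agent].append(msg)

    def enqueue(self, queue, msg):
        """Queued: parked until some consumer claims it."""
        self.queues[queue].append(msg)

    def claim(self, queue, agent):
        """Competing consumers: the first agent to claim a message gets it, once."""
        if not self.queues[queue]:
            return None
        msg = self.queues[queue].popleft()
        self.inboxes[agent].append(msg)
        return msg

    def subscribe(self, channel, agent):
        self.channels[channel].add(agent)

    def broadcast(self, channel, msg):
        """Broadcast: every subscriber gets its own copy."""
        for agent in self.channels[channel]:
            self.inboxes[agent].append(msg)


post = PostOffice()
post.send_direct("polecat-1", "review b-7")
post.enqueue("merge", "merge b-9")
first = post.claim("merge", "polecat-1")
second = post.claim("merge", "polecat-2")   # the queue is already empty
for name in ("polecat-1", "polecat-2"):
    post.subscribe("town", name)
post.broadcast("town", "pausing for upgrade")
```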
Beads appear to have six states: Create, Live, Close, Decay, Compact, Flatten. Create is the step that writes to Dolt, and Live means an agent is actively working. Close/Decay/Compact/Flatten are essentially controls to stop records from growing without bound, and those lifecycle steps are handled by Dog agents rather than the worker. I found it interesting that bead docs themselves carry state, rather than the docs being purely something agents use. It’s reminiscent of ‘data is code’, in a LISP-ish way.
As a result, Dolt looks like a single point of failure. If you can’t create beads, the system stops, and it reads like both control-plane and work-plane agents depend on Dolt being up. I’m not 100% sure on this one, to be fair.
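The bead lifecycle and its split ownership can be sketched as a small state machine. Two assumptions are baked in and may not match the project: that the states advance strictly in the order listed, and that the worker only moves a bead to Live while Dogs own every later step.

```python
from enum import Enum


class State(Enum):
    CREATE = "create"
    LIVE = "live"
    CLOSE = "close"
    DECAY = "decay"
    COMPACT = "compact"
    FLATTEN = "flatten"


# Assumption: states advance strictly in the order they are listed.
ORDER = list(State)

# Assumption: who is allowed to move a bead INTO each state.
OWNER = {
    State.LIVE: "worker",
    State.CLOSE: "dog",
    State.DECAY: "dog",
    State.COMPACT: "dog",
    State.FLATTEN: "dog",
}


def advance(state, actor):
    """Move a bead to its next state, rejecting actors that do not own that step."""
    idx = ORDER.index(state)
    if idx == len(ORDER) - 1:
        raise ValueError("flatten is terminal in this sketch")
    nxt = ORDER[idx + 1]
    if OWNER[nxt] != actor:
        raise PermissionError(f"{actor} may not move a bead to {nxt.value}")
    return nxt


state = advance(State.CREATE, "worker")   # the worker picks the bead up
state = advance(state, "dog")             # a Dog closes it out later
```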
When an agent session starts (or restarts after a crash), it runs a prime command to load context: identity, current work task (a hook), work stage, pending mail, and so on. Layered memory architectures are becoming a norm as a way to manage context, and Gas Town seems to push that pattern hard. The overall design feels aligned with the “build a reliable system from unreliable parts” school of engineering.
To handle context bloat, agents use a handoff command to write a mail message (a bead) before ending the session. That message is picked up and a new session is spun up. This is interesting because it looks like agents are killed based on context bloat rather than running a compaction step inside a long-lived session. Insert your Memento memes here. But it also means the system is self-documenting and operationally transparent. Even if we don’t fully understand what’s going on inside an agent, we can understand what’s going on in Gas Town and, to a degree, what went on. That’s a powerful property for building confidence and interpretability around what the agents are actually doing and why.
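A minimal sketch of the kill-and-hand-off cycle, as I read it. The Store class stands in for durable beads, and the budget counts steps as a crude proxy for context size; both are assumptions made for this example, not how Gas Town measures anything.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Stand-in for durable state; in Gas Town this role is played by beads."""
    mail: list = field(default_factory=list)


def run_session(store, steps, budget):
    """Work through steps until the budget is spent, then hand off.

    Returns the steps left undone. The session's context is a local variable,
    so it is discarded when the function returns: only the mail survives.
    """
    context = list(store.mail)   # load whatever the previous session left behind
    store.mail.clear()
    done = 0
    for step in steps:
        if done >= budget:
            break
        context.append(f"did {step}")
        done += 1
    remaining = steps[done:]
    if remaining:
        # Handoff: record progress durably, then let this session end.
        store.mail.append(f"handoff: next up is {remaining[0]}")
    return remaining


store = Store()
work = ["a", "b", "c", "d", "e"]
sessions = 0
while work:
    work = run_session(store, work, budget=2)
    sessions += 1
```

The point of the shape is that nothing important lives only inside a session, which is also what makes the trail readable after the fact.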
Instead of RAG or vector search, agents use structured queries, and a graph (DAG) provides steering. It reminded me of approaches like VexP to reduce token overhead, or modelling work as dependency graphs rather than doing endless context stuffing. It’s closer to a Recursive Language Model reasoning pattern than to a giant prompt. The design seems to want to offload the toil of context management from people to the system. That alone makes Gas Town worth studying given how important a topic this is becoming.
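Steering by dependency graph reduces to a simple question: which items have all their prerequisites done? A minimal sketch follows, with a plain filter standing in for what would presumably be a structured query against the database. The task names are made up.

```python
def ready_work(deps, done):
    """Return items whose dependencies are all done, in sorted order.

    deps maps each item to the set of items it depends on.
    """
    return sorted(
        item
        for item, needs in deps.items()
        if item not in done and needs <= done   # subset test: every need is done
    )


# Hypothetical work graph for illustration.
deps = {
    "design": set(),
    "schema": {"design"},
    "api": {"schema"},
    "docs": {"design"},
}

step1 = ready_work(deps, set())
step2 = ready_work(deps, {"design"})
step3 = ready_work(deps, {"design", "schema"})
```

An agent only ever needs the short list this returns, not the whole backlog, which is where the token saving comes from.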
GUPP, Gas Town Universal Propulsion Principle: ‘If you find work on your hook, YOU RUN IT.’ This is the prime directive: always make progress. When an agent session starts and sees a hook in loaded context, it executes immediately.
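Read as code, the rule is a single branch taken at session start. This is a sketch only: the dictionary keys and the prime and start_session function names are my own, not the project's commands or data model.

```python
def prime(agent_state):
    """Load the context a fresh session needs; keys are assumptions for this sketch."""
    return {
        "identity": agent_state["identity"],
        "hook": agent_state.get("hook"),            # current work item, if any
        "mail": list(agent_state.get("mail", [])),
    }


def start_session(agent_state, run):
    """GUPP as a rule: if the primed context contains a hook, run it first."""
    context = prime(agent_state)
    if context["hook"] is not None:
        return run(context["hook"])
    return "idle"


def run(hook):
    """Placeholder for doing the actual work on a hook."""
    return f"ran {hook}"


busy = start_session({"identity": "polecat-1", "hook": "b-3"}, run)
idle = start_session({"identity": "polecat-2"}, run)
```

Because the hook lives in durable state rather than in the session, a crashed agent resumes the same work on restart without anyone telling it to.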
Monitoring looks like three layers: a dumb heartbeat daemon; a boot agent that decides whether a deacon is needed to drill down; and deacon-level diagnosis, which uses witness and refinery information to assess whether a polecat agent doing the actual work needs fixing.
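The layering makes sense as a cost ladder: the cheapest check runs constantly, and anything that would need a model call only runs when the cheap check fails. A sketch of that shape follows. The thresholds, report values, and actions are invented for illustration and are not Gas Town's actual policy.

```python
def stale_agents(last_beat, now, timeout):
    """Layer 1: a dumb check with no model calls, just timestamps."""
    return sorted(name for name, t in last_beat.items() if now - t > timeout)


def needs_escalation(stale, max_tolerated=0):
    """Layer 2: decide whether deeper diagnosis is worth paying for.

    The threshold is a made-up policy for this sketch.
    """
    return len(stale) > max_tolerated


def diagnose(stale, reports):
    """Layer 3: combine observer reports into a per-agent action.

    `reports` is a hypothetical map of agent name to an observer's verdict.
    """
    return {
        name: ("restart" if reports.get(name) == "stuck" else "nudge")
        for name in stale
    }


last_beat = {"polecat-1": 100, "polecat-2": 40, "polecat-3": 35}
stale = stale_agents(last_beat, now=110, timeout=30)
actions = diagnose(stale, {"polecat-3": "stuck"}) if needs_escalation(stale) else {}
```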
Git worktrees are doing a lot of work here (not unlike Claude-based workflows). The merge strategy reads like a Bors-style approach: rebase commits into a batch, fast-forward, and bisect on failure. It’s efficient and should scale in principle with more agents. Me? I still pine for Mercurial. But it’s a very reasonable approach.
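The batch-then-bisect idea is worth seeing in miniature. This sketch models a branch as a list of commit ids and takes the test suite as a passes function; it illustrates the general technique and is not the project's merge code.

```python
def merge_batch(base, batch, passes):
    """Land as many commits from batch as possible on top of base.

    Test the whole batch at once. If it passes, land it in one go. If it
    fails, split it in half and retry each half, so a bad commit is isolated
    without testing every commit individually. Returns (new_base, rejected).
    """
    if not batch:
        return base, []
    candidate = base + batch
    if passes(candidate):
        return candidate, []        # fast-forward to the tested state
    if len(batch) == 1:
        return base, list(batch)    # the culprit: reject it, keep base as-is
    mid = len(batch) // 2
    base, rejected_left = merge_batch(base, batch[:mid], passes)
    base, rejected_right = merge_batch(base, batch[mid:], passes)
    return base, rejected_left + rejected_right


def passes(commits):
    """Hypothetical test suite: fails whenever the bad commit is present."""
    return "bad" not in commits


main, rejected = merge_batch(["m0"], ["a", "bad", "b", "c"], passes)
```

When most batches are clean, the common case costs one test run for many commits, which is why the approach should hold up as the number of agents grows.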
Gas Town’s software, written in Go, exists solely to shuttle state between agents and provide scaffolding, an approach the project calls Zero Framework Cognition (ZFC). It’s an obvious design choice, but in practice logic bleeds into harness frameworks very easily (routing, escalation, arbitration). It’s one reason, as far as I can tell, the design ends up with so many roles and so much overwatch (the monitoring described above isn’t the half of it).
Token usage is the flip side of ZFC. If you move logic into agents, you spend more on inference. Every meaningful activity in Gas Town involves tokens, and the overwatch agents seem to scale in line with the agents doing the work. Steve, to his credit, is very clear about cost ramping with more agents. Ideally, the design would allow at least control-plane agents to use local/open-weight models (e.g., via Ollama). It’s a bit Claude-specific right now, but not fatally so, and the document-as-record and crash-handling approaches suggest Gas Town should be robust to less capable models.
What I can’t tell is whether trying to balance the token budget with local models becomes impractical because local inference chews too much compute to be operationally viable and therefore vendors/cloud are a necessity, or whether it’s just an implementation/optimisation gap for now. I do think it needs to be squared away for an architecture where control and data planes don’t scale independently of workers.
On re-reading, I noticed, or felt, a serious underlying urgency, even anxiety, to Steve’s post. He’s bullish on the project, but at the same time paradigm changes are uncomfortable if you’re not doing the paradigming. Maybe that undertone is better read as someone with decades of perspective recognizing a discontinuity and a need to act. On the other hand, it’s easy to overcompensate for a sense of falling behind by over-designing the option space ahead of us to regain a sense of control, and maybe that’s why I keep wondering whether the role-based design is an optimal primitive. Relatedly, the emphasis on ‘you’re not ready for this’ backed by an eight-level maturity model felt overdone. I can read it charitably as an attempt to jolt people out of incremental thinking, but it wore on me in the end.
Anyway. What makes Steve’s ideas in this post worth paying attention to, and what made me study the project more than I expected to, isn’t whether this is the specific blueprint we’ll end up building from. It’s that it’s trying to address that bigger question: what does coordination, decision-making—literally agency—look like when the participants aren’t just people anymore? Gas Town feels like a good opening answer, and a more hinged one than Steve is letting on.