In Plain Sight
A few days ago, in The Rungs, I reached for a phrase to describe something I'd found reassuring: two doors, one room. I'd been teaching a Rails application to compile itself into nine languages, and learned that someone I'd never met had, in the same months, written a complete Swift compiler and SwiftUI runtime in dependency-free C, retargeting Apple's platform into a browser tab. Different stack, no coordination, the same move — a single person building the kind of whole-language compiler that used to require a vendor. When the cost of building the tool falls far enough, I wrote, you don't get one person climbing. You get strangers arriving at the same rung from opposite sides.
I have since found a third door, and it is the most instructive of the three, because this stranger did not make the same move. They arrived at the same room by a different method — and the difference in method is the part worth a post.
The project is Plain, an open-source specification language, with a company, Codeplain, behind it. Their thesis, in the founder's words, is that "code should not be maintained — code should be regenerated. Specs should be reviewed, and it's the specs that you maintain." You write a structured specification; a renderer produces the implementation; when something breaks, you fix the spec and regenerate the code from scratch. The code is ash. The spec is the thing you keep.
I read that and felt the particular vertigo of meeting someone who has been having your thoughts. So let me do the honest thing and pull apart what I agree with from what I don't, because the disagreement turns out to be precise, and a couple of the choices in my own project exist for no other reason than to answer it.
Where I agree, which is most of the way
I agree with the inversion, completely, and I think it is the important thing about Plain. The implementation is no longer the asset. When the agent holds the depth and the cost of producing code falls toward nothing, the durable thing relocates upward, to whatever the code is generated from — and the right response is to stop maintaining the output and start maintaining the source of truth. I have made that same argument about my own work, and watched the rigor move off the code I never read and onto the thing the code answers to. The broader framing isn't even theirs: Chad Fowler — a veteran engineer and investor — calls it a Phoenix architecture, software designed to burn down and be reborn from a more durable artifact, and it gave Codeplain's founder the vocabulary he'd been missing. The instinct is exactly right. Review intent, not implementation. Regenerate, don't patch.
I also think the most useful thing about finding Plain is not that I can argue with it. It's that it is here at all. A funded company, with customers, arriving independently at the conclusion I'd been treating as a sample-of-one peculiarity, is not a threat to the thesis. It is the thesis: when authoring the high-level artifact stops requiring an institution, you do not get one eccentric retiree. You get a market. Plain is a witness I didn't have when I wrote that the construction had gotten cheap, and a witness who came in a different door — a company with customers, not one more lone builder — is worth more than one who came in mine.
So when I say I'm skeptical, I'm skeptical of one thing, and it is downstream of one choice. Everything else, we share.
What licenses you to throw the code away
Throwing the implementation away is the easy part. The hard part is what gives you the right to — what guarantees the regenerated thing is still correct, so that "I deleted it and made another" is an engineering practice and not a prayer. And here history is unkind and clarifying: it has only ever issued two licenses.
One is a proof. SQL is the great example. You write a declarative query, the optimizer produces an execution plan, and you throw plans away without a second thought — nobody version-controls a query plan — because every plan the optimizer considers is a provably equivalent rewrite of your query under relational algebra. The plan is disposable because the transformation is correctness-preserving by construction. That license is real, total, and available only in a closed domain that has an algebra.
The other is a reference. When you cannot prove equivalence, you point at something concrete and define correct as matches this. That is the whole idea of conformance, and it's the only thing that generalizes to open domains, because it doesn't ask for an algebra that isn't there.
The fourth-generation languages are the graveyard of everyone who wanted SQL's disposability without SQL's algebra and never supplied a reference in its place: declarative "trust me" in an open domain, backed by neither leg. They didn't fail because business logic is too hard. They failed because they removed the proof and put nothing in the empty socket.
The most successful thing I ever stood near, in the standards world, was watching the reference license get issued properly. Before HTML5, browsers were "compliant" against prose specifications that said what a valid document meant and fell silent exactly where the browsers actually diverged — error handling, the messy real input — so they interoperated worst precisely where the spec stopped talking. The WHATWG fixed it by writing the reference behavior into the spec itself: a parsing algorithm, step by step, defined for any byte stream including garbage, so that "conforms to the spec" and "matches the reference" became the same sentence. The executable half of standardization — the part that was always too expensive to build, the part that often never got built — finally got built, and the browsers converged.
My own project gets the second license, and I want to be careful about how, because I used to describe it lazily — as if I'd been handed the reference. I wasn't. The reference is not Rails, and it is not something I inherited. The reference is my application, and I write it. I write it quickly, in Rails' DSL, with all of Ruby's programmer happiness, because in a greenfield the scarce thing is finding out what I even mean, and Rails is the lowest-latency, most pleasant way I know to put a behavior in front of myself and look at it. Then I do the one act that mints an oracle out of a program: I run it, observe it, and validate that it does what I intended. That validated, running application is the reference — and from that point my oracle is a diff against it, fetched URL by fetched URL, byte for byte. I am not in the proof business; I can't prove my Rust equals my Ruby. I'm in the reference business, and the reference is a thing I authored and confirmed by eye, not a thing that was lying around.
A court that holds its own trials
Now look at where that leaves Plain, and be fair about it, because Plain is not careless — it has tests. It even calls them conformance tests, and it ships ***acceptance tests*** besides. But read the documentation and its :ConformanceTests: are, in its own words, "generated from the specifications." In the distinction I drew in Conformance vs Comprehension, that is a spec test wearing the name of a conformance test: it descends from the written authority, not from observed behavior. And the renderer that writes the code is a large language model — "probabilistic in nature," the whitepaper says, plainly and to its credit — while the tests that check the code are generated from the same prose by the same renderer. The court and the defendant were briefed by the same lawyer. There is no independent reference; the thing the regeneration answers to is the prose it was generated from.
Plain knows prose is treacherous. It spends an entire, genuinely good whitepaper on the problem, introducing a :Concept: notation to pin down the words that carry too much meaning, because — its own line — an LLM "has no basis for determining which interpretation is intended beyond statistical inference." I agree with every sentence of that whitepaper. I just think that, read straight, it is an argument against the pipeline it is defending. If probabilistic interpretation of prose is unreliable enough to need a special notation to fence it in, the deeper fix is not to annotate the prose more carefully. It is to stop interpreting it probabilistically at all.
Why my project looks the way it does
Which is the choice my project makes, and the reason it can make it is that I don't start from prose.
Rails is not a description of a program. It is a program — more precisely, an interpreter. Its routes, associations, and validations are structures the framework walks at request time, and an emitted target is what you get by specializing that interpreter to one specific application: the router stops dispatching over a generic table and becomes concrete handlers, the metaprogramming resolves into typed queries, the interpretive generality is compiled away. That's an old idea with a name — the first Futamura projection — and what's new is only that it's cheap now. The contrast with Plain is exact: a prose spec is a description you must re-interpret, by guess, every single time; a framework is an interpreter you can specialize, deterministically, once. And a deterministic specialization gives you something to stand on: run it a thousand times and you get the same bytes, which you can check against the reference once and trust thereafter. You cannot do either with the output of a model that may read the same paragraph two ways on two Tuesdays.
So the eccentric-looking choices line up, each one answering a place where I was skeptical of the other door. Author the behavior on a blank page in an executable framework-DSL rather than in prose — fast, because Rails is built for speed, in a language whose conventions were already carrying the types so intent is there to infer rather than annotate, and pleasant enough that I'll run the loop a hundred times to find what I mean. Lower with a deterministic compiler instead of a probabilistic renderer, because regeneration you can trust has to be byte-stable. And let the oracle be the running application I validated by hand, not a test apparatus spun from the same prose that wrote the code, so the court and the defendant have different briefs. None of those is cleverness. Each is a direct response to "what licenses throwing the code away," answered with a reference I author and check, because I had no algebra and refused to settle for a prayer.
The part I owe them
Here is what I owe Plain, and it took a wrong turn to see it. I almost wrote that I have the easy job — that my reference comes free because I work in brownfield, carrying an application someone already built and debugged, while Plain labors in a greenfield with nothing to point at. That is wrong, and the way it is wrong is the actual difference between us. I am in greenfield too. The key asset was never Rails; it is my application, and on a blank morning it doesn't exist yet either. We both start from nothing.
So the distinction was never brownfield versus greenfield — it is what you author onto the blank page, and whether the thing you author can validate itself. I author a running application, in a DSL fast and pleasant enough that I'll write it before I fully know what I want and iterate until I do, and I confirm it the only way that counts: by watching it run and checking it against what I meant. Plain authors a prose specification, which cannot run, and so cannot be watched. The same blank page, two media — and only one of them can become its own validated reference. That is my whole skepticism, stated fairly: not that Plain took the harder ground, but that it chose a medium that can never itself be the oracle, and so must keep a separate apparatus to stand in for one — and at the moment builds that apparatus out of the very prose it is meant to check.
The repair is not to abandon the medium; it is to stop asking the prose to validate itself. When your authoring medium can't be its own reference, you do what the WHATWG did — you legislate one: write the observable behavior down, separately, until conformance has something independent to mean. The tool for it is already in Plain's hands: those ***acceptance tests***, developer-authored, observable, independent of the generated prose. Every one of them is a small piece of legislated reference behavior, a recognition frozen into something that outlives the moment you recognized it. The friendly-stranger suggestion I'd leave on the table is to lean on those as the thing the regeneration answers to — to grow them into the trial court — and to stop letting spec-derived tests wear the word conformance, because the word is the one part of standardization you can't afford to spend loosely.
I've argued before that exploring a greenfield by producing a thing, watching it fail, and fixing the description rather than the output is a legitimate way to work now that an iteration costs minutes — it is the rehabilitated version of "bring me a rock," and it is how a great deal of real discovery has always happened. That loop is the heart of both our projects, and on it I think Plain is right and I'm a little envious of how directly it's aimed. The only law the loop has ever had is that the recognition behind each rejection must be banked into something that survives you looking away. When I validate a running app and keep it, that banking is automatic — the thing I looked at is the thing I keep. In prose you have to bank it on purpose, and the acceptance test is where you do. A prose edit is not.
And I owe them a correction to my own concession, because the obvious version is wrong. I started to say prose asks less of its author — no Ruby, no framework, just structured English. Plain's own user testing found the opposite: developers resist writing specs even as they're happy to read them. Writing the durable artifact is exactly the part people won't do — the same rock that sank spec-driven development every prior time. Plain's answer is honest and clever: plain-forge, where the agent drafts the spec from conversation and the human reviews it and checks the running software, "building relations with the spec" an increment at a time. But notice what that does. The promise was that the spec is the durable thing the human authors and maintains; once the agent authors it, the human's contribution is no longer the prose but the validating and the steering — judgments against running software that the agent transcribes. Which is precisely where I'd put the irreducible human work, and where Drucker and the change that transformed my own results put it. Automating the part nobody wanted to write quietly concedes that the spec was never where the authorship lived. The validation was. So the difference isn't that prose is easier to write than Ruby — it's that Rails is pleasant enough to write that the human stays the direct author of the durable artifact, no agent in the middle, while Plain found its medium unpleasant enough to author that it routed the human out. What prose keeps is reach: a non-programmer can read and steer it without a framework, and for much of the eventual audience that may matter more than anything I've said. That bet is real. But "maintain the spec, not the code" lands differently once the spec is something the agent writes and you approve — at which point we are both, again, doing the same job.
In The Scarce Thing I said the input about to become scarce is judgment about which behaviors are worth pinning down. Standing next to Plain, I'd add a second axis to that judgment, and I'm grateful to them for making me see it: not only which behaviors, but what your regeneration answers to — an executable thing you author and validate, or a separate reference you legislate around a description. Two doors, one room. We agree the code is ash and the durable thing sits above it. Where we differ is on what the durable thing is allowed to be — a description, or the running, validated behavior a description is only ever the last appeal to.
I am still a sample of one, with a strong prior and a single project, and I would rather find the cracks in this than not. Finding out that a stranger walked into the same room through a different door is the best way I know to do that — and the most useful thing about the third door is that it makes the shape of my own a great deal easier to see.
Roundhouse is open source: dual-licensed MIT / Apache-2.0. Issues and discussion welcome.