Rails Was Already Typed

2026-04-21T17:45:18Z

Compiling Ruby is essentially an impossible task. Compiling a Rails application is surprisingly tractable. The difference is that Rails has been carrying a declarative type system in its conventions for twenty years — and once a compiler can read it, the apps type without annotations.

Three legs

The type information lives in three places. None is enough on its own. Combined, they cover most of what an analyzer needs to type a typical Rails application.

Schema. db/schema.rb is the ground truth. t.string :title, null: false declares an attribute named title that holds a non-null string. t.references :article, null: false, foreign_key: true declares a non-null foreign key. Migrations produce it; the schema file freezes it. Roundhouse reads it as the root of the model type graph.
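
To make the schema leg concrete, here is a minimal sketch of turning schema-style DSL calls into typed attributes. It is not Roundhouse's actual reader: Column, TableRecorder, read_table, and the :String / :Integer type names are all hypothetical, and only the three column kinds used below are handled.

```ruby
# Hypothetical sketch: record column types from schema.rb-style DSL calls.
# Column, TableRecorder, and read_table are illustrative names only.
Column = Struct.new(:name, :type, :nullable)

class TableRecorder
  attr_reader :columns

  def initialize
    @columns = []
  end

  # t.string :title, null: false => non-null String attribute
  def string(name, null: true, **_opts)
    @columns << Column.new(name, :String, null)
  end

  # t.text :body => nullable String attribute
  def text(name, null: true, **_opts)
    @columns << Column.new(name, :String, null)
  end

  # t.references :article, null: false, foreign_key: true
  # => integer foreign key column named article_id
  def references(name, null: true, **_opts)
    @columns << Column.new(:"#{name}_id", :Integer, null)
  end
end

# Yields a recorder to the block, the way create_table yields `t`.
def read_table
  recorder = TableRecorder.new
  yield recorder
  recorder.columns
end

comment_columns = read_table do |t|
  t.string :commenter
  t.text :body, null: false
  t.references :article, null: false, foreign_key: true
end

# A trailing "?" marks a nullable attribute.
comment_columns.each do |c|
  puts "#{c.name}: #{c.type}#{c.nullable ? '?' : ''}"
end
```

The point of the sketch is that nullability falls out of the null: option without any annotation: columns default to nullable, and null: false removes the "?".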

Rails conventions. The Rails DSL threads typed information across files. belongs_to :article declares a non-null Article association on the side that holds the foreign key (unless optional: true is given). has_many :comments declares the inverse. before_action :set_article, only: [:show, :edit, :update, :destroy] declares which @article ivar is in scope in those actions, and which controller method assigns it. validates :title, presence: true narrows the type: a saved Article has a non-null title even if the database column doesn't enforce it. Custom inflections (inflect.irregular "leaf", "leaves") ensure the same pluralization is used in table names, model lookups, and route helpers. None of this is annotation. It's the application code itself.
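
As one illustration of a convention carrying type information, the sketch below resolves which ivars an action can see from its before_action declarations. The BeforeAction struct, the CALLBACK_ASSIGNS table, and ivars_in_scope are hypothetical stand-ins for facts an analyzer would extract from the controller source, not Roundhouse's data model.

```ruby
# Hypothetical sketch: which ivars does each controller action see?
# The data below stands in for facts an analyzer would extract from source.
BeforeAction = Struct.new(:callback, :only)

# Ivars assigned by each callback method, with their (assumed) inferred types.
CALLBACK_ASSIGNS = {
  set_article: { :@article => "Article" }
}

# Merge the ivars of every callback that applies to the given action.
def ivars_in_scope(action, before_actions)
  before_actions
    .select { |ba| ba.only.nil? || ba.only.include?(action) }
    .map { |ba| CALLBACK_ASSIGNS.fetch(ba.callback, {}) }
    .reduce({}, :merge)
end

filters = [BeforeAction.new(:set_article, [:show, :edit, :update, :destroy])]

p ivars_in_scope(:show, filters)  # @article is in scope, typed as Article
p ivars_in_scope(:index, filters) # no callback applies, so no ivars
```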

Type inferencing. Standard compiler machinery fills the rest. Literals — {} is a hash, [] is an array. Method return types — to_a returns an array, count returns a number, group_by returns a hash (or a Map, depending on target). Block parameter types flow from the receiver. Local variables propagate types across assignments. Where the schema and conventions don't directly declare a type, inference extends them through expressions the developer wrote.
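
A toy version of that machinery fits in a few lines. The sketch below types a tiny hand-built AST; the node shapes, the RETURN_TYPES table, and the infer function are assumptions made for illustration, and a real analyzer would work from parsed Ruby and a much larger method table.

```ruby
# Hypothetical sketch of expression typing over a tiny hand-built AST.
# Node shapes and the RETURN_TYPES table are illustrative assumptions.
RETURN_TYPES = {
  [:Array, :count]    => :Integer,
  [:Array, :to_a]     => :Array,
  [:Array, :group_by] => :Hash
}

# Returns the type of a node, recording local variable types in env.
def infer(node, env)
  kind, *rest = node
  case kind
  when :hash_lit  then :Hash             # {}
  when :array_lit then :Array            # []
  when :lvar      then env.fetch(rest[0]) # look up a previously typed local
  when :assign
    name, value = rest
    env[name] = infer(value, env)        # locals carry the assigned type
  when :send
    receiver, meth = rest
    RETURN_TYPES.fetch([infer(receiver, env), meth])
  else
    raise ArgumentError, "unknown node: #{kind}"
  end
end

env = {}
infer([:assign, :items, [:array_lit]], env)              # items = []
groups = infer([:send, [:lvar, :items], :group_by], env) # items.group_by { ... }
puts groups
```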

Each leg alone is insufficient. Schema is a database spec; without code semantics it doesn't type method calls. Rails conventions are patterns; without ground truth or inference they don't produce concrete types. Type inference on plain Ruby falls short by itself, which is why tools like Sorbet and RBS rely on explicit annotations: the language doesn't surface enough static information on its own. Combine all three and the types materialize. The information was already in the source. It was just scattered across files that heuristic-based transpilers couldn't cross-reference coherently.

Where things stand

Three days ago I introduced Roundhouse, an experimental compiler that reads Rails applications and emits standalone projects in other target languages. At that point, the analyzer fully typed the MVC fixture — every ivar, every block parameter, every partial local — but no target produced runnable code yet.

As of today, the TypeScript and Rust targets produce the same DOM as Rails on the standard blog scaffold:

$ scripts/compare typescript

  / ... match
  /articles ... match
  /articles/1 ... match
  /articles/new ... match
  /articles/1/edit ... match

5/5 paths match

$ scripts/compare rust

  / ... match
  /articles ... match
  /articles/1 ... match
  /articles/new ... match
  /articles/1/edit ... match

5/5 paths match

Same tag tree. Same attribute sets. Same text content. The only masked values are per-deployment noise — CSRF token, asset fingerprints, Turbo stream signatures, Rails view annotation comments. This is the bar the juntos compare tool established last month; roundhouse's TypeScript and Rust targets both clear it.
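
The masking idea can be sketched as follows. This is a string-level simplification under stated assumptions: the real comparison is described above as operating on the tag tree, and the MASKS patterns, normalize, and pages_match? are hypothetical names, not the compare tool's actual rules. Turbo stream signatures are left out for brevity.

```ruby
# Hypothetical sketch: mask per-deployment noise before comparing two pages.
# The patterns are illustrative, not the actual compare tool's rules.
MASKS = [
  # CSRF token value in the meta tag
  [/(<meta name="csrf-token" content=")[^"]*(")/, '\1MASKED\2'],
  # Digest fingerprint in asset filenames
  [/-[0-9a-f]{8,}(\.(?:css|js))/, '-MASKED\1'],
  # Rails view annotation comments (BEGIN/END markers)
  [/<!-- (?:BEGIN|END) [^>]*-->\n?/, '']
]

# Apply every mask in order and return the normalized markup.
def normalize(html)
  MASKS.reduce(html) { |out, (pattern, replacement)| out.gsub(pattern, replacement) }
end

def pages_match?(a, b)
  normalize(a) == normalize(b)
end

rails_html = <<~HTML
  <!-- BEGIN app/views/articles/index.html.erb -->
  <meta name="csrf-token" content="abc123">
  <link rel="stylesheet" href="/assets/application-0123456789abcdef.css">
  <h1>Articles</h1>
  <!-- END app/views/articles/index.html.erb -->
HTML

target_html = <<~HTML
  <meta name="csrf-token" content="zzz999">
  <link rel="stylesheet" href="/assets/application-fedcba9876543210.css">
  <h1>Articles</h1>
HTML

puts pages_match?(rails_html, target_html)
```

Anything not covered by a mask, such as the heading text, still has to agree exactly, which is what keeps the check meaningful.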

The other five targets (Crystal, Elixir, Go, Python, plus a Ruby round-trip emitter) compile and emit code, with runtime glue at varying maturity levels. With two targets at 5/5, the architectural patterns described below have their second data point, and extending across the rest gets cheaper.

Four days from first commit.

Layer by layer

Three of my projects in this lineage have used different development methodologies. Each was right for its phase.

Juntos was demo-driven. Pick a demo. Make it work. Fix what breaks. Move on. Appropriate when the design space is unexplored and you're discovering what's even possible.

Railcar was language-first. Get one target language end-to-end, then tackle the next. Appropriate when proving the multi-target thesis but each target is still research.

Roundhouse is layer-by-layer. Horizontal sweeps across all targets at once: get the analyzer to type every expression, then get every target to compile, then get controller tests to pass, then get the rendered output to match Rails' DOM once per-deployment noise is masked. Acceptance testing (humans plus browsers) is the next layer. Each layer is a forcing function. Once green, it's non-regressable: any future change must keep it green or it's a bug, not a feature.

What enables horizontal layers is compiler architecture. Specifically, a typed IR plus a refactoring move I'll call lifts — pulling per-target emitter logic up into a shared pre-emit pass so all targets inherit it. The Red-Green-Refactor cycle Kent Beck described for unit tests turns out to apply at the architectural level too:

Red. A forcing function fails. The compare tool reports /articles doesn't match Rails because article ordering is wrong — Article.includes(:comments).order(created_at: :desc) is being emitted as a bare Article.all(), dropping the modifiers.
Green. A targeted change in the per-target emitter makes the check pass. Enrich the IR's QueryChain to carry method, args, and receiver; walk the chain bottom-up; map order to .sort and limit to .slice, and pass everything else through as a no-op. Compare returns to 5/5.
Refactor. Lift the new logic from the per-target emitter into a shared walker so the next target inherits it for free. Or defer the lift with a named trigger — "do this when the second target hits 5/5 and reveals which patterns are truly shared rather than accidentally similar."
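
The Green step above can be sketched roughly like this. QueryCall, emit, and the emitted JavaScript-style syntax are hypothetical illustrations of a bottom-up chain walk, not Roundhouse's real IR or emitter, and the limit(10) call is added to the example chain only to exercise the .slice mapping.

```ruby
# Hypothetical sketch of a query-chain walker; QueryCall and the emitted
# target syntax are illustrative assumptions, not Roundhouse's real IR.
QueryCall = Struct.new(:receiver, :method_name, :args)

# Article.includes(:comments).order(created_at: :desc).limit(10)
chain =
  QueryCall.new(
    QueryCall.new(
      QueryCall.new("Article", :includes, [:comments]),
      :order, [{ created_at: :desc }]
    ),
    :limit, [10]
  )

def emit(node)
  return "#{node}.all()" if node.is_a?(String) # innermost receiver: the model

  inner = emit(node.receiver) # bottom-up: emit the receiver first
  case node.method_name
  when :order
    column, direction = node.args.first.first
    cmp = direction == :desc ? "b.#{column} - a.#{column}" : "a.#{column} - b.#{column}"
    "#{inner}.sort((a, b) => #{cmp})"
  when :limit
    "#{inner}.slice(0, #{node.args.first})"
  else
    inner # e.g. includes: no-op pass-through
  end
end

puts emit(chain)
# prints: Article.all().sort((a, b) => b.created_at - a.created_at).slice(0, 10)
```

Because the walker depends only on the chain's structure, it is a natural candidate for the lift described in the Refactor step: the traversal can be shared while each target supplies its own strings for sort and slice.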

Heuristic codebases can't refactor. They can only accumulate. Each heuristic that works today gets harder to remove tomorrow because every downstream consumer has silently come to depend on its specific quirks. A typed IR makes refactoring mechanical rather than archaeological — once the types make intent explicit, lifts are safe enough to be routine.

How far this goes

Honest scoping: roundhouse will not transpile every Rails application.

The realistic target is something like 80% of the patterns typical Rails apps actually use. Past that, each additional pattern costs more — rarer, harder to type, more likely to touch genuinely runtime-only behavior — while serving fewer apps. That maps onto application size:

Small apps. Zero-touch. The 80% covers them. Push-button transpilation, nothing asked of the developer.
Mid-size apps. Typable Rails-DSL patterns continue to Just Work. The wobble tends to be concerns — ActiveSupport::Concern modules whose bodies look more like arbitrary Ruby than the Rails DSL. They benefit from sidecar signatures or inline pragmas. Whether roundhouse adopts the RBS-style sidecar or the inline-comment pragmas juntos uses is a UX choice, not a forced one.
Large apps. Some code changes required. Refactor the untypable hot spots. Justified by the economic payoff: a compiled binary that loads faster than CRuby loads Rails, or an interpreter targeted to cloud needs (cold start, requests per second per gigabyte). For apps where infrastructure spend is meaningful, the refactoring cost is cheap by comparison.

The remaining 20% isn't roundhouse's failure. Covering it isn't the job. Long-tail apps are served by escape hatches and tier-aware tooling, not by chasing universal coverage indefinitely.

What this enables next

Once transpiling Rails becomes a compiler problem, the well-known compiler toolkit applies. Multiple targets — in place. Per-target runtime glue — in place. Cross-target shared lowering passes — in progress.

The next interesting direction is depth, not width. A bytecode emitter family. Reuse the Rust runtime — DB connection, HTTP, Action Cable, all the parts that don't belong in generated code. Instead of emitting application code as Rust, emit:

A bytecode representation of the application.
An interpreter for that bytecode.
The same Rust runtime underneath.

Bytecode design — register vs stack vs threaded vs combinator — is well-studied. There's no need to commit upfront. Emit multiple, benchmark, select. Garbage collection has an unusual opportunity here: because framework-aware whole-program analysis sees the entire application, it knows the per-request boundary. An arena-style allocator that frees on response is dramatically simpler than general-purpose tracing GC, and a Rails-specific runtime is exactly the place that assumption is safe.
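
To illustrate the shape of that idea, here is a toy stack-based bytecode loop paired with a per-request arena. Everything in it is an assumption for illustration: the :push and :concat opcodes, the Arena class, and handle_request are hypothetical, and a Ruby array of objects only models the lifetime rule, whereas a real arena would manage raw memory inside the runtime.

```ruby
# Hypothetical sketch: a stack-based bytecode loop with a per-request arena.
# Opcodes and the Arena class are illustrative; real arenas manage raw memory.
class Arena
  def initialize
    @objects = []
  end

  # Track an object for the lifetime of the current request.
  def alloc(obj)
    @objects << obj
    obj
  end

  def size
    @objects.size
  end

  # Drop every allocation at once, at the end of the request.
  def release
    @objects.clear
  end
end

def run(bytecode, arena)
  stack = []
  bytecode.each do |op, arg|
    case op
    when :push then stack.push(arena.alloc(arg))
    when :concat
      right = stack.pop
      left = stack.pop
      stack.push(arena.alloc(left + right))
    else raise ArgumentError, "unknown opcode: #{op}"
    end
  end
  stack.pop
end

def handle_request(bytecode)
  arena = Arena.new
  body = run(bytecode, arena).dup # copy the response out of the arena
  arena.release                   # free everything allocated for this request
  body
end

program = [[:push, "<h1>"], [:push, "Articles"], [:concat], [:push, "</h1>"], [:concat]]
puts handle_request(program)
# prints: <h1>Articles</h1>
```

The interpreter never frees individual values; the only deallocation point is the end of the request, which is the simplification the paragraph above relies on.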

Hypothesized payoff: a portable bytecode that loads faster than CRuby loads Rails, with better requests per second per gigabyte. Both metrics are CRuby's weak spots. Both get easier when the GC knows about request boundaries and the interpreter isn't carrying the weight of general Ruby.

This is speculative direction, not in-flight work. But it's a natural extension once the Rust runtime is solid — the architectural separation between application code and per-target runtime already factors the right way.

Bytecode is one way to cash out a broader claim: Rails as source of truth, deployment shape as compiler flag. Whether the substrate can also retarget into entirely different framework idioms — emitting Phoenix instead of Plug, or SvelteKit-with-a-companion-API instead of server-rendered MVC — is further exploration. It may produce something interesting. It may turn out to be a dead end.

Agentic engineering at architectural scale

Last October, Simon Willison proposed "vibe engineering" for the productive counterpart to vibe coding — the senior-engineer practices that let LLMs extend rather than replace expertise. He listed automated testing, advance planning, comprehensive documentation, version control, automation, code review, manual QA, research skills. He observed: "almost all of these are characteristics of senior software engineers already." In a February update, he conceded the term that had won the terminology competition was Agentic Engineering.

The roundhouse work is a case study of those practices — but at the architectural scale of a compiler rather than the feature scale Willison's list implicitly describes. Test-first becomes forcing functions per layer. Advance planning becomes lift planning. Documentation becomes the typed IR itself, where every expression carries its type and effect set. Strong version control becomes commit messages that serialize the Red-Green-Refactor narrative. Automation becomes the compare tool. Research skills become the move from heuristic transpilation to compiler theory.

The surprise isn't that Agentic Engineering works for a four-day solo compiler project. It's that the practice list is scale-invariant. Opus 4.7 one-shotted an IR design I'd been warned would need multiple rewrites; four days and seven emitters of downstream pressure later, it hasn't been rewritten. That's a model-quality observation, but it only happened because the practices around the model — typed IR up front, forcing functions on every layer, lifts as the refactor step — gave it what it needed to succeed.

The economic implication worth naming: a team weighing a multi-target Rails compiler under institutional expected-value constraints would never green-light it. Too speculative, too many bets stacked, too unclear when payoff arrives. The same project is feasible for a retired software engineer who finds it interesting and has copious spare time. That changes which projects get attempted, not just how fast they move.

Rails was already typed. The compiler that reads it is four days old. There's a lot left to do.

Roundhouse is open source: dual-licensed MIT / Apache-2.0. Issues and discussion welcome.