intertwingly

It’s just data

Show Your Work


In "The Compilers Were Ready" I argued that modern compilers (YJIT, ZJIT, V8's TurboFan, Spinel) are operating near the limit of what their input shape allows, and that the bottleneck for Rails performance under any of them isn't the compiler; it's the input. The compilers were ready. The input wasn't.

That post made the architectural argument. This one shows the work. Three moves the Roundhouse lowerer makes, each one demonstrated on the same real-blog application, with the Rails source on one side and the lowered Ruby — what every backend compiler sees — on the other. Where I link to short snippets, the complete emitted output for every target is browseable at rubys.github.io/roundhouse/browse/.

The reason to show this in code rather than describe it in prose is that benchmark numbers in this space tend to provoke a "that can't be real" reaction by default, and the right answer to that reaction is mechanism, not more benchmarks. The output is browseable. The mechanism is concrete. What's actually limited is something else entirely, and the honest version of that limit is the "The actual limit" section near the end of this post.

Three moves

The lowerer's job is to take Rails source and produce Ruby that any downstream compiler can specialize against. The three moves below are the recurring transformations. Each one shows up dozens of times across a typical Rails app. Each one lands once in the lowerer and benefits every emit target — CRuby, TypeScript, Crystal, Rust, Spinel.

The shared micro-move beneath all three: replace runtime dispatch indirection with a directly-resolved method call. Inline caches stabilize. Whole-program inference closes. AOT compilers can monomorphize. The same input that helps the JITs is what makes strict-typed compilation possible at all.

1. Polymorphic containers → typed structs

The most common shape-instability source in a Rails app is ActionController::Parameters, Rails' wrapper around HashWithIndifferentAccess. Every key access returns Object. Every call site downstream of it has no static information about what came back.

Source:

class ArticlesController < ApplicationController
  def update
    @article = Article.find(params[:id])
    if @article.update(article_params)
      redirect_to @article
    else
      render :edit
    end
  end

  private
    def article_params
      params.expect(article: [ :title, :body ])
    end
end

Lowered. A per-resource ArticleParams class is emitted alongside the controller:

class ArticleParams
  def initialize
    @title = ""
    @body = ""
  end

  def title;        @title; end
  def title=(value); @title = value; end
  def body;         @body; end
  def body=(value); @body = value; end

  def self.from_raw(params)
    raw_sub = params.fetch("article", {})
    sub = raw_sub.is_a?(Hash) ? raw_sub : {}
    instance = ArticleParams.new
    raw_title = sub.fetch("title", "")
    instance.title = raw_title.is_a?(String) ? raw_title : ""
    raw_body = sub.fetch("body", "")
    instance.body = raw_body.is_a?(String) ? raw_body : ""
    instance
  end
end

And the controller becomes:

class ArticlesController < ApplicationController
  def update
    @article = Article.find(@params.fetch("id", "0").to_s.to_i)
    if @article.update(article_params)
      redirect_to(RouteHelpers.article_path(@article.id),
                  notice: "Article was successfully updated.",
                  status: :see_other)
    else
      render(Views::Articles.edit(@article, @flash[:notice], @flash[:alert]),
             status: :unprocessable_content)
    end
  end

  def article_params
    ArticleParams.from_raw(@params)
  end
end

The typed struct flows all the way through to the model. Article#update receives ArticleParams directly — no hash, no symbol-keyed access, no introspection:

def update(p)
  self.title = p.title unless p.title.nil?
  self.body  = p.body  unless p.body.nil?
  save
end

The params.expect call is gone. ActionController::Parameters is gone. The polymorphic key-access pattern of stock Rails is replaced by two distinct constructs: @params is a plain Hash[String, untyped] carrying raw HTTP input, accessed through @params.fetch("id", "0").to_s.to_i with explicit defaults and explicit type conversion at the boundary; and the strong-parameter shape becomes the typed ArticleParams class with explicit accessors. The from_raw factory is the single place where untyped HTTP input meets the typed surface — is_a? checks defensively at the boundary, with sensible defaults on every field. Everything downstream sees only the typed surface.
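To make the boundary behavior concrete, here is a small self-contained sketch that feeds well-formed, wrongly typed, and missing input through from_raw. The class body mirrors the emitted one above, except that attr_accessor stands in for the explicit accessor pairs (an assumption made only to keep the sketch short), and the sample hashes are invented for illustration.

```ruby
# Compact stand-in for the emitted ArticleParams; attr_accessor replaces the
# explicit reader/writer pairs purely for brevity.
class ArticleParams
  attr_accessor :title, :body

  def initialize
    @title = ""
    @body = ""
  end

  # Single place where untyped HTTP input meets the typed surface.
  def self.from_raw(params)
    raw_sub = params.fetch("article", {})
    sub = raw_sub.is_a?(Hash) ? raw_sub : {}
    instance = ArticleParams.new
    raw_title = sub.fetch("title", "")
    instance.title = raw_title.is_a?(String) ? raw_title : ""
    raw_body = sub.fetch("body", "")
    instance.body = raw_body.is_a?(String) ? raw_body : ""
    instance
  end
end

# Illustrative inputs (not taken from the real-blog application).
well_formed = ArticleParams.from_raw({ "article" => { "title" => "Hello", "body" => "World" } })
wrong_types = ArticleParams.from_raw({ "article" => { "title" => ["x"], "body" => { "a" => 1 } } })
missing     = ArticleParams.from_raw({})

# The same explicit-default pattern the lowered controller uses for :id.
article_id = { "id" => "42" }.fetch("id", "0").to_s.to_i
bad_id     = { "id" => ["x"] }.fetch("id", "0").to_s.to_i
```

Whatever arrives, every field downstream is a String and every id is an Integer; malformed input collapses to the default rather than leaking an unexpected type past the boundary.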

V8 sees a stable hidden class on ArticleParams. YJIT specializes on a known receiver. Spinel's whole-program inference closes — the twenty sp_RbVal errors that this single idiom produced before the closed-axis specialization PR are gone.

The same transformation applies to HashWithIndifferentAccess itself, which was retired from the runtime entirely earlier this month. Per-app ActionDispatch::Flash and Session structs replaced it. The flash and session of every Rails app now have known field types at the framework boundary.

2. ActiveRecord machinery → direct typed code

This is the largest single move and the most surprising one to see in emitted form. Stock Rails uses an elaborate runtime for every database access: ActiveRecord::Relation proxies queries lazily, Arel builds a query tree, ConnectionAdapter dispatches to the database driver, CollectionProxy wraps results, method_missing resolves column reads. Every layer is a polymorphic dispatch surface. Every layer prevents the compiler from seeing what's actually happening.

The lowerer eliminates all of it.

Source:

class Article < ApplicationRecord
  has_many :comments
  validates :title, presence: true
end

# Usage in a controller or view:
@article.comments.each { |c| ... }

Lowered (this is the actual emitted code for Article#comments):

def comments
  stmt = Db.prepare("SELECT id, article_id, body, commenter, created_at, updated_at FROM comments" + " WHERE " + "article_id = " + Db.escape_int(@id))
  results = []
  while Db.step?(stmt)
    instance = Comment.new
    instance.id = Db.column_int(stmt, 0)
    instance.article_id = Db.column_int(stmt, 1)
    instance.body = Db.column_text(stmt, 2)
    instance.commenter = Db.column_text(stmt, 3)
    instance.created_at = Db.column_text(stmt, 4)
    instance.updated_at = Db.column_text(stmt, 5)
    instance.mark_persisted!
    results << instance
  end
  Db.finalize(stmt)
  results
end

Take a moment with this. There is no ActiveRecord::Relation. There is no Arel tree being walked at runtime. There is no ConnectionAdapter dispatch. There is no CollectionProxy. There is no method_missing resolving column reads. There is a SQL string whose static parts are constants known at lowering time and whose only runtime input is @id, passed through Db.escape_int before concatenation; there is a prepared statement, a typed column read for each schema column, an instance constructed and assigned to, and an array returned. That is the entire comments method.

Validations get the same treatment. validates :title, presence: true becomes an explicit conditional in a validate method (errors << "title can't be blank" if @title.nil? || @title.empty?) — no validator instance, no macro dispatch, no callback chain. Lifecycle callbacks (before_destroy, after_create_commit, etc.) become explicit methods on the model rather than entries registered via the before_destroy :method_name DSL.
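As a rough illustration of that lowered shape, the sketch below puts the conditional quoted above inside a minimal model. The errors array, the valid? wrapper, and the attr_accessor are assumptions made for this example; only the conditional itself comes from the emitted code.

```ruby
# Hypothetical minimal model showing the lowered form of
# `validates :title, presence: true`.
class Article
  attr_accessor :title
  attr_reader :errors

  def initialize
    @title = ""
    @errors = []
  end

  # The macro becomes a plain conditional; no validator object is allocated.
  def validate
    errors << "title can't be blank" if @title.nil? || @title.empty?
  end

  # Assumed wrapper: reset, run the explicit checks, report the outcome.
  def valid?
    @errors = []
    validate
    @errors.empty?
  end
end

blank = Article.new
blank_ok = blank.valid?

titled = Article.new
titled.title = "Show Your Work"
titled_ok = titled.valid?
```

Nothing here is looked up at runtime: a compiler reading validate sees one string comparison and one array append.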

This is the move that lets strict-typed targets compile at all. Rust cannot emit a generic Relation<unknown> class proxying for every model; Crystal struggles to. After lowering, both have direct per-model code with Db.column_int returning i64 or its target equivalent — typed primitives all the way down.

The Db module is the per-target primitives layer: roughly twelve methods (prepare, step?, column_int, column_text, escape_int, escape_string, finalize, execute, and a handful more) implemented natively per target. Ruby's Db wraps the sqlite3 gem; Crystal's wraps Crystal's DB module; Rust's wraps rusqlite. The lowered IR above runs unchanged on all of them; only Db differs per target. That twelve-method surface is the boundary between framework-Ruby-transpiled-to-each-target and per-target hand-written runtime.
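To give a feel for how small that surface is, here is a hypothetical in-memory stand-in covering only the calls the emitted comments method uses. It is not the real Ruby Db implementation (which wraps the sqlite3 gem): the SQL is recorded rather than executed, and the rows are hard-coded sample data.

```ruby
# Hypothetical in-memory stand-in for the per-target Db layer.
module Db
  # Sample rows in schema order: id, article_id, body, commenter,
  # created_at, updated_at. Invented for illustration.
  ROWS = [
    [1, 7, "First!", "alice", "2025-01-01", "2025-01-01"],
    [2, 7, "Nice post", "bob", "2025-01-02", "2025-01-02"]
  ].freeze

  Statement = Struct.new(:sql, :rows, :current)

  # Integer() rejects non-numeric strings such as "1 OR 1=1".
  def self.escape_int(value)
    Integer(value).to_s
  end

  def self.prepare(sql)
    Statement.new(sql, ROWS.dup, nil)
  end

  # Advance the cursor; true while a row is available.
  def self.step?(stmt)
    stmt.current = stmt.rows.shift
    !stmt.current.nil?
  end

  def self.column_int(stmt, index)
    Integer(stmt.current[index])
  end

  def self.column_text(stmt, index)
    stmt.current[index].to_s
  end

  def self.finalize(stmt)
    stmt.rows = []
    stmt.current = nil
  end
end

# Same call sequence as the lowered method: prepare, step, read, finalize.
stmt = Db.prepare("SELECT id, article_id, body FROM comments WHERE article_id = " + Db.escape_int(7))
bodies = []
while Db.step?(stmt)
  bodies << Db.column_text(stmt, 2)
end
Db.finalize(stmt)
```

Because the lowered code only ever talks to these module methods, swapping this stub for a driver-backed module changes nothing above the boundary.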

3. Runtime builders → inline structure

form_with is the canonical example. The runtime builder pattern in Rails — form_with model: @article do |form|; form.text_field :title; end — relies on the builder being a runtime object whose model: parameter carries enough type information to construct the right URLs, field names, and value attributes. Strict-typed targets can't reconstruct that information from the runtime builder, so the lowerer deletes the builder entirely.

Source (ERB view):

<%= form_with model: @article do |form| %>
  <div>
    <%= form.label :title %>
    <%= form.text_field :title %>
  </div>
  <div>
    <%= form.submit %>
  </div>
<% end %>

Lowered (trimmed for brevity — error display and CSS class attributes elided; the structure is faithful to the actual emit):

module Views
  module Articles
    def self.form(article)
      io = String.new
      form_method = article.persisted? ? :patch : :post
      io << "<form action=\"#{ActionView::ViewHelpers.html_escape(article.persisted? ? RouteHelpers.article_path(article.id) : RouteHelpers.articles_path)}\" accept-charset=\"UTF-8\" method=\"post\">"
      io << ActionView::ViewHelpers.method_override_input(form_method)
      io << ActionView::ViewHelpers.csrf_token_hidden_input
      io << "<div>"
      io << "<label for=\"article_title\">Title</label>"
      io << "<input type=\"text\" name=\"article[title]\" id=\"article_title\"#{ActionView::ViewHelpers.optional_value_attr(article[:title])}>"
      io << "</div>"
      io << "<div>"
      io << "<input type=\"submit\" name=\"commit\" value=\"#{ActionView::ViewHelpers.html_escape(form_method == :patch ? "Update Article" : "Create Article")}\">"
      io << "</div>"
      io << "</form>"
      io
    end
  end
end

FormBuilder doesn't exist at runtime. The HTML structure is literally the method body. The ERB parser doesn't run. View helpers (html_escape, method_override_input, csrf_token_hidden_input, optional_value_attr) are direct method calls with computed arguments. Field names like "article[title]" are literal strings, not method-missing-resolved attributes on a runtime builder.
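For readers wondering what such helpers might look like, here is a guess at two of them. Only the method names come from the emitted code; the bodies, the ViewHelpers module name without its ActionView namespace, and the use of ERB::Util for escaping are all assumptions for this sketch.

```ruby
require "erb"

# Hypothetical helper bodies; the behavior is assumed, not taken from Roundhouse.
module ViewHelpers
  def self.html_escape(value)
    ERB::Util.html_escape(value.to_s)
  end

  # Emit a value attribute only when there is something to show.
  def self.optional_value_attr(value)
    return "" if value.nil? || value.to_s.empty?
    " value=\"#{html_escape(value)}\""
  end

  # Browsers submit forms as GET or POST, so PATCH travels in a hidden field.
  def self.method_override_input(form_method)
    return "" if form_method == :post
    "<input type=\"hidden\" name=\"_method\" value=\"#{html_escape(form_method)}\">"
  end
end

value_attr = ViewHelpers.optional_value_attr("Tom & Jerry")
field = "<input type=\"text\" name=\"article[title]\"" + value_attr + ">"
empty_field_attr = ViewHelpers.optional_value_attr("")
override = ViewHelpers.method_override_input(:patch)
```

Each helper is a plain function from values to a String, which is what lets the view method above be compiled as straight-line string concatenation.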

The polymorphism that prevents strict-typed compilation is the same polymorphism that pessimizes JITs. Inside Rails' FormBuilder, the line that fetches each field's current value is roughly @object.send(attr_name) — and @object is Article in one form, Comment in another, User in a third. The FormBuilder methods themselves see a polymorphic receiver across the program even though every individual application call site is monomorphic. V8's inline caches turn megamorphic. YJIT sees varied basic-block shapes. send(symbol) is independently a known JIT pessimization. Stacked, FormBuilder is structurally hostile to specializing compilers, not just strict-typed ones. After the lowerer deletes FormBuilder and inlines the HTML, the polymorphism is gone for every compiler downstream — JIT and AOT alike.
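The contrast can be reduced to a few lines. In this sketch, field_value plays the role of the shared builder call site and the two *_value methods play the role of lowered code; all names are hypothetical and the Struct-based models are stand-ins.

```ruby
# Stand-in models for the sketch.
Article = Struct.new(:title)
Comment = Struct.new(:body)

# Builder style: one shared call site whose receiver class and method name
# both vary at runtime.
def field_value(object, attr_name)
  object.send(attr_name)
end

# Lowered style: one method per call site, receiver and method name fixed.
def article_title_value(article)
  article.title
end

def comment_body_value(comment)
  comment.body
end

article = Article.new("Show Your Work")
comment = Comment.new("Nice post")

dynamic = [field_value(article, :title), field_value(comment, :body)]
direct  = [article_title_value(article), comment_body_value(comment)]
```

Both arrays hold the same values. The difference is what a compiler can know in advance: the send inside field_value sees every model in the program, while each lowered method sees exactly one.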

The same move applies to the routes DSL: config/routes.rb's resources :articles becomes a RouteHelpers module of class methods that return plain strings. RouteHelpers.article_path(id) is literally "/articles/#{id}", with no RouteSet and no URL-helper method_missing.
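A sketch of that module's shape for resources :articles follows. Only article_path and articles_path appear in the emitted code shown in this post; new_article_path and edit_article_path are assumed by analogy with the standard resourceful routes.

```ruby
# Sketch of a lowered RouteHelpers module; no RouteSet, no method_missing.
module RouteHelpers
  def self.articles_path
    "/articles"
  end

  def self.article_path(id)
    "/articles/#{id}"
  end

  # Assumed by analogy; not shown in the emitted snippets above.
  def self.new_article_path
    "/articles/new"
  end

  # Assumed by analogy; not shown in the emitted snippets above.
  def self.edit_article_path(id)
    "/articles/#{id}/edit"
  end
end

show_url = RouteHelpers.article_path(42)
edit_url = RouteHelpers.edit_article_path(42)
```

Each helper takes an id and returns a String, so a strict-typed target can give it a concrete signature with no route table in sight.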

Jbuilder — the API-server case. For Rails applications used primarily as API servers — where every request renders JSON, and Jbuilder is in the hot path of every response — the same builder-elimination move applies, and it's the place where the throughput dividend is most direct.

Source (app/views/articles/_article.json.jbuilder):

json.extract! article, :id, :title, :body, :created_at, :updated_at
json.url article_url(article, format: :json)

Lowered:

module Views
  module Articles
    def self.article_json(article)
      io = String.new
      io << "{"
      io << "\"id\":"
      io << JsonBuilder.encode_value(article.id)
      io << ","
      io << "\"title\":"
      io << JsonBuilder.encode_value(article.title)
      io << ","
      io << "\"body\":"
      io << JsonBuilder.encode_value(article.body)
      io << ","
      io << "\"created_at\":"
      io << JsonBuilder.encode_datetime(article.created_at)
      io << ","
      io << "\"updated_at\":"
      io << JsonBuilder.encode_datetime(article.updated_at)
      io << ","
      io << "\"url\":"
      io << JsonBuilder.encode_value(RouteHelpers.article_path(article.id) + ".json")
      io << "}"
      io
    end
  end
end

The Jbuilder runtime is entirely gone. json.extract!, json.url, the method_missing that resolves arbitrary keys, the Jbuilder buffer, the rendering-pipeline integration — none of it survives lowering. What remains is a method that writes literal JSON syntax with typed encoding primitives (JsonBuilder.encode_value, JsonBuilder.encode_datetime). Collection responses (json.array! @articles, partial: ..., as: :article) lower the same way, producing a method that maps over the typed collection and joins.
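The collection case can be sketched in the same style. The method name index_json, the trimmed two-field Article, and the JsonBuilder body (which simply delegates to Ruby's json standard library) are assumptions for this example rather than the actual emit.

```ruby
require "json"

# Trimmed stand-in model: two fields instead of the full schema.
Article = Struct.new(:id, :title)

# Hypothetical encoding primitive backed by the json standard library.
module JsonBuilder
  def self.encode_value(value)
    JSON.generate(value)
  end
end

module Views
  module Articles
    # Per-item method, same shape as the lowered partial above.
    def self.article_json(article)
      io = String.new
      io << "{"
      io << "\"id\":"
      io << JsonBuilder.encode_value(article.id)
      io << ","
      io << "\"title\":"
      io << JsonBuilder.encode_value(article.title)
      io << "}"
      io
    end

    # json.array! lowered: map each record through the per-item method, join.
    def self.index_json(articles)
      "[" + articles.map { |article| article_json(article) }.join(",") + "]"
    end
  end
end

out = Views::Articles.index_json([Article.new(1, "Hello"), Article.new(2, "Say \"hi\"")])
```

The only runtime work left is encoding each value; the brackets, commas, and keys are literals decided at lowering time.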

Builder DSLs are recognizable by the pattern "a runtime object accepts method calls in order to construct a data structure." Every such builder — FormBuilder, Jbuilder, the routes DSL — gets the same treatment: deleted, with the data structure it would have produced inlined into the call site.

Three kinds of payoff

Not every move pays off in the same way. Three distinct payoff types are worth keeping separate, because conflating them invites the "do they really know what's making it faster?" reaction.

Move | Enables strict-typed targets | Specialization unlock for JITs | Direct CPU/memory win
1. Polymorphic containers → typed structs
2. ActiveRecord machinery → direct typed code
3. Runtime builders → inline structure

Enabling is binary — a target compiles or it doesn't. Specialization unlock is observable in the lowered code's shape stability; the JIT's response per-move hasn't been directly measured, and the relationship with YJIT/V8/ZJIT is additive, not competitive, so a head-to-head comparison isn't the goal. Direct CPU/memory win is what shows up in throughput benchmarks of roundhouse-CRuby with no JIT involved — fewer indirections, less coercion, smaller framework.

The actual limit

The forthcoming benchmark runs are honest: same Rails source you'd write in stock Rails, byte-identical HTTP responses across every target, real database with real schema, no async tricks or stripped-down models. The methodology isn't the meaningful limit.

The meaningful limit is the implicit claim that any Rails application will work today. Roundhouse covers a subset of Rails, and the width of that subset is what determines whether the work in this post applies to your application.

Today's subset, as exercised by real-blog and the framework-test suite:

What's currently stubbed or partial:

What's not yet covered:

This list is the roadmap. Each item is a lowerer job. Each lowerer lands once and benefits every emit target. The forcing function is the compare-gate ratchet — emitted Rails apps that must produce byte-identical responses across CRuby, TypeScript, Crystal, Spinel, and (in progress) Rust. As each gate closes, the subset widens.

Is your Rails app in the subset today?

If you want a rough self-assessment without reading the lowerer source, here's a checklist. Mostly "no" answers mean your app probably fits today. Mostly "yes" means there are specific blockers, and they're identifiable.

  1. Do you use HashWithIndifferentAccess directly in application code, outside of params/flash/session?
  2. Do you have method_missing or respond_to_missing? in models or controllers?
  3. Do you use delegated_type, single-table inheritance, or polymorphic associations heavily?
  4. Do you have custom DSLs that call define_method at class-body time (e.g., acts_as_taggable-style gems)?
  5. Do you use Action Cable, Action Text, or Active Storage at production scale?
  6. Do you construct Arel queries directly (Article.arel_table[:title].matches(...))?
  7. Do you eval strings as Ruby code in production paths?

A single "yes" doesn't mean Roundhouse won't work for your app — it means there's a specific lowerer or runtime piece you'd be waiting on, and it's something I can name. If you run this checklist and want to know how your specific blockers map to landed/in-flight work, open an issue and I'll point you at the relevant commits or the punch list entry.

What the benchmarks are for

The next set of benchmark runs is in flight — covering all five target surfaces, with Spinel and Rust both close to producing useful numbers. The runs are diagnostic, not declarative: short runs, modest databases, intended to surface where the next lowerer should land, where Spinel's specialization is leaving performance on the table, which paths through the framework Ruby are still allocating more than they should. Roundhouse and Spinel are both works in progress; benchmarks at this stage indicate where to look next, not what to conclude. Phasing, workload choices, publication timing, and the cross-project synergy with tep and Spinel are tracked openly at the benchmark roadmap issue.

What this post documents is the mechanism by which any benchmark improvement gets made. The three moves above land once in the lowerer and apply across every target. The mechanism is browseable, the lowering is reproducible, the HTTP output is byte-identical across targets. The limit is the subset. The work ahead is widening it — guided by the diagnostic benchmarks, not claimed by them.

The compilers were ready. The input wasn't. The input now exists. What comes next is the work of widening the subset.


Roundhouse is open source: dual-licensed MIT / Apache-2.0. Issues and discussion welcome.