The Ruby JRuby Was Built to Run

2026-06-11T10:36:00Z

Rails developers who know JRuby have a fair question for Roundhouse: what does compiling Rails applications to other languages offer that two decades of running Rails on the JVM doesn't already provide?

Let me start with the part that isn't in dispute. JRuby is mature — a complete Ruby implementation, production-proven, with a compatibility story measured in decades. Roundhouse is weeks old, handles a subset of Rails, and its emit is still calibrated to one fixture within that subset. If you need to ship a Rails application on the JVM this quarter, JRuby is the answer and Roundhouse is not a candidate. Nothing below changes that.

What follows is instead an observation about where the two approaches meet — because they meet in a more interesting place than "competitors."

What JRuby already wins

JRuby's value proposition has always been: keep your app, change the runtime. And for traditional HTML-rendering Rails applications, the win is real. On the Roundhouse benchmark — the same small Rails 8 blog application measured across every target, methodology in Numbers Without Conclusions — stock Rails serving the HTML index does 481 req/sec under CRuby with YJIT, and 1,057 req/sec under JRuby, at less than half the median latency. A 2.2× throughput gain from changing nothing but the runtime, consistent with what production JRuby shops have reported for years.

(On the JSON endpoint the two are at rough parity, 1,272 req/sec under CRuby against 1,080 under JRuby, which is worth keeping in mind whenever someone generalizes about "JRuby performance" in either direction. Where the request spends its time determines what the JVM can win back.)

The experiment the benchmark happens to contain

Roundhouse's bet is different: keep your app, change the code — where "change the code" is shorthand for something specific. An application serving thousands of requests per second is making the same decisions, with the same outcomes, on every one of them. Every decision whose answer cannot differ between requests can be made once, at transpile time, instead. Roundhouse reads a Rails application, makes those decisions, and emits the residue as standalone, metaprogramming-free projects — in Rust, Go, Crystal, Kotlin, Swift, Elixir, TypeScript, Python, and Ruby itself. That last target is the interesting one here, because the emitted Ruby is just Ruby. It runs anywhere Ruby runs.

So the benchmark includes a jruby row: the same emitted tree the CRuby target runs — byte-identical except that the SQLite backend swaps the sqlite3 C extension for the JDBC driver — executed under JRuby 10. That makes for an unusually clean experiment. Same source application. Same emitted code. Two runtimes.

HTML index (/articles):

configuration	req/sec	p50 (ms)	RSS (MB)
Rails on CRuby+YJIT	481	128.9	328
Rails on JRuby	1,057	55.7	1,169
Roundhouse emit on CRuby+YJIT	5,283	12.0	135
Roundhouse emit on JRuby	26,108	2.3	982

JSON show (/articles/1.json):

configuration	req/sec	p50 (ms)	RSS (MB)
Rails on CRuby+YJIT	1,272	47.6	416
Rails on JRuby	1,080	57.0	1,502
Roundhouse emit on CRuby+YJIT	7,700	8.0	165
Roundhouse emit on JRuby	46,886	1.3	1,048

Reading the 2×2

Those tables are a 2×2: two codebases × two runtimes. Read them that way and two different things are going on — and they compose.

The first effect: stripping the interpretive layers. Hold the runtime constant and swap the codebase. Under CRuby+YJIT, the emit serves the HTML index at 11× the throughput of Rails; under JRuby, the same swap is worth 25× (on JSON: 6× and 43×). And the emit does not get there by doing less of the work that produces the response. The Rails index action is one line:

@articles = Article.includes(:comments).order(created_at: :desc)

The emitted action is that line's mechanism, spelled out:

def index
  stmt = Db.prepare("SELECT id, body, created_at, title, updated_at FROM articles" +
                    " ORDER BY created_at DESC")
  results = []
  while Db.step?(stmt)
    results << Article.from_stmt(stmt)
  end
  Db.finalize(stmt)
  __comments_ids = results.map { |a| a.id }
  __comments_stmt = Db.prepare("SELECT id, article_id, body, commenter, created_at, updated_at" +
                               " FROM comments WHERE article_id IN (" +
                               Db.escape_int_list(__comments_ids) + ")")
  # … drain, group by article_id, a._preload_comments(group) …
  @articles = results
  if request_format == :json
    render(Views::Articles.index_json(@articles), content_type: "application/json")
  else
    render(Views::Articles.index(@articles, @flash[:notice], @flash[:alert]))
  end
end

Same two queries — the ordered select, the IN-list eager load — and the same response: a cross-target compare gate holds every emitted app's output to Rails' (HTML compared DOM-node-for-DOM-node with whitespace preserved, JSON value-for-value). The JSON view tells the same story: jbuilder's json.extract! article, :id, :title, :body, :created_at, :updated_at becomes two dozen lines of direct string appends with real escaping and Rails-canonical timestamps. Both of these lowerings — and a third, the deletion of FormBuilder — are walked through at full length, Rails source against emitted output, in Show Your Work, for readers who want the mechanism in detail.

Notice which decisions survive to runtime in that excerpt: which format the client asked for, what the flash holds, which rows are in the table — the things that can actually differ between requests. Everything else has already been decided. What's gone is the machinery for re-deciding it, request after request, with the same outcome every time: Arel re-building the same SQL string, route recognition re-resolving the same path to the same action, template resolution re-finding the same template and partials (with an instrumentation event per partial), attribute type-casting re-parsing each timestamp into a TimeWithZone so that jbuilder can re-serialize it straight back into a string — the emit never lets it stop being a string — plus the middleware stack, log lines, and session-cookie verification. Most of that list is interpretive generality — work Rails does per-request to stay general, which compilation does once, with output the compare gate certifies as equivalent. Removing it is not benchmark inflation; it is the entire claim.

It's also why the usual one-line description of Roundhouse's enabling analysis — whole-program type inference — names the enabler rather than the payoff. The types are what make nine typed targets reachable; the performance story is partial evaluation. Rails is, operationally, an interpreter for your application: routes, associations, validations, and templates are data it consults on every request. Specialize the interpreter with respect to the program it interprets and you get the compiled program — the partial-evaluation literature has called that the first Futamura projection since 1971. Roundhouse reaches it by hand-built pattern recognizers rather than automatic specialization, but the dividend is the same: the number of decisions made per request collapses, and only the ones with genuinely per-request answers remain.
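To make the interpreter framing concrete, here is a toy sketch, not Roundhouse's code. A route table is first treated as data consulted on every request, then specialized once into a chain of literal comparisons. The names ROUTES, GenericRouter, specialize, and SpecializedRouter are all invented for the illustration, and the eval call stands in for a transpiler writing the generated source to a file.

```ruby
# Toy illustration of specializing an "interpreter" with respect to its program.
# ROUTES, GenericRouter, specialize, and SpecializedRouter are hypothetical names;
# none of them come from Roundhouse or Rails.

ROUTES = [
  { verb: "GET",  path: "/articles",     action: :index  },
  { verb: "POST", path: "/articles",     action: :create },
  { verb: "GET",  path: "/articles/new", action: :new    }
].freeze

# Interpreter style: the route table is data, scanned again on every request.
module GenericRouter
  def self.dispatch(verb, path)
    route = ROUTES.find { |r| r[:verb] == verb && r[:path] == path }
    route ? route[:action] : :not_found
  end
end

# Specializer: runs once, ahead of time, and returns Ruby source with the
# table folded into literal comparisons. No table lookup survives to run time.
def specialize(routes)
  lines = ["module SpecializedRouter", "  def self.dispatch(verb, path)"]
  routes.each do |r|
    lines << "    return #{r[:action].inspect} " \
             "if verb == #{r[:verb].inspect} && path == #{r[:path].inspect}"
  end
  lines += ["    :not_found", "  end", "end"]
  lines.join("\n")
end

source = specialize(ROUTES)
puts source

# A real transpiler would write `source` to a file; eval keeps the sketch short.
eval(source, TOPLEVEL_BINDING)

[["GET", "/articles"], ["POST", "/articles"], ["GET", "/missing"]].each do |verb, path|
  generic = GenericRouter.dispatch(verb, path)
  special = SpecializedRouter.dispatch(verb, path)
  puts "#{verb} #{path} -> #{special} (agrees with generic: #{generic == special})"
end
```

The two dispatchers agree on every input. The difference is when the table is read: once, while generating the source, rather than once per request.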

A smaller part of that machinery list — signed cookies, CSRF, logging, instrumentation — is genuinely absent service that a production deployment would put back. The compare gate knows it, too: its documented ignore-list — CSRF token metas and values, the HMAC suffix on signed stream names, asset fingerprints — is precisely this gap, masked knowingly rather than papered over. That is the honest discount to carry against the 11× and the 25×: real, but bounded, enumerable, and tracked — CSRF has a filed plan for real masked-token generation and validation, scheme-compatible with Rails so Turbo picks it up unchanged. Bounded more tightly than it first appears, even: CSRF validation runs on non-GET requests, which this benchmark never issues — of the listed services, only cookie verification and logging would put per-request cost on the measured endpoints at all.

The second effect: monomorphization lets the JIT do what it was built to do. Hold the codebase constant and swap the runtime. Stock Rails gains 2.2× moving to JRuby on HTML, and nothing on JSON, where it slips slightly to 0.85×. The emitted code gains 4.9× on HTML and 6.1× on JSON. Same JVM, same JIT; what changed is the input. Thirty years of JVM JIT engineering are aimed at monomorphic call sites, resolved constants, and shapes that hold still; Rails' runtime metaprogramming manufactures the opposite, and JRuby has carried it admirably despite that. The emit's case-statement dispatch, explicit per-column accessors, and direct string building are the program that JIT was built for. This is the JVM instance of an argument made earlier in this series about YJIT and V8: the compilers were ready; the input wasn't. The 5–6× isn't JRuby getting faster. It's JRuby finally being fed.
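A rough sketch of the difference in input, using hypothetical classes rather than ActiveRecord's or Roundhouse's actual code. The first model resolves attribute reads at run time through method_missing. The second spells each column out as an ordinary reader and builds its string by direct appends. Both return the same values; the second hands a JIT a fixed method table and a fixed set of instance variables.

```ruby
# Hypothetical models for illustration only; neither is ActiveRecord
# nor Roundhouse output.

# Dynamic style: attribute reads are resolved at run time via method_missing.
class DynamicArticle
  def initialize(attrs)
    @attrs = attrs
  end

  def method_missing(name, *args)
    key = name.to_s
    return @attrs[key] if args.empty? && @attrs.key?(key)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @attrs.key?(name.to_s) || super
  end
end

# Lowered style: one plain reader per column, fixed when the class is defined.
class StaticArticle
  attr_reader :id, :title, :body

  def initialize(id, title, body)
    @id = id
    @title = title
    @body = body
  end
end

# Direct string building: each column is named explicitly at the call site.
def summary(article)
  out = +""
  out << "#" << article.id.to_s << " " << article.title
  out
end

dynamic = DynamicArticle.new("id" => 1, "title" => "Hello", "body" => "First post")
static  = StaticArticle.new(1, "Hello", "First post")

puts dynamic.title == static.title  # same value, different lookup path
puts summary(static)
```

Nothing here measures anything; it only shows the two shapes of code side by side. Whether a given JIT rewards the second shape, and by how much, is what the benchmark rows are for.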

And they compose. The full diagonal — emitted Ruby on JRuby versus stock Rails on CRuby — is 54×, and it factors consistently from either margin: 11 × 4.9 through one corner, 2.2 × 24.7 through the other. The composition is visible inside the table itself: the JVM rewards the lowered code more than YJIT does (25× vs 11×), and the lowered code rewards the JVM more than Rails does (4.9× vs 2.2×). Stripping the abstraction is worth an order of magnitude on its own; handing the result to a JIT that can finally see through it multiplies that again. Neither factor alone is the story — the story is that they multiply.
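The factoring is plain arithmetic on the HTML table. A small Ruby check, using only the req/sec figures already quoted there (the constant name, hash keys, and helper are illustrative labels, not anything Roundhouse emits):

```ruby
# Requests/sec for the HTML index, copied from the table in this post.
# Keys are [codebase, runtime]; the labels are illustrative only.
HTML_RPS = {
  [:rails, :cruby] => 481.0,
  [:rails, :jruby] => 1_057.0,
  [:emit,  :cruby] => 5_283.0,
  [:emit,  :jruby] => 26_108.0
}.freeze

# Throughput of cell `to` relative to cell `from`, rounded to one decimal.
def ratio(table, to, from)
  (table.fetch(to) / table.fetch(from)).round(1)
end

puts "codebase swap on CRuby: #{ratio(HTML_RPS, [:emit, :cruby], [:rails, :cruby])}x"
puts "codebase swap on JRuby: #{ratio(HTML_RPS, [:emit, :jruby], [:rails, :jruby])}x"
puts "runtime swap for Rails: #{ratio(HTML_RPS, [:rails, :jruby], [:rails, :cruby])}x"
puts "runtime swap for emit:  #{ratio(HTML_RPS, [:emit, :jruby], [:emit, :cruby])}x"
puts "full diagonal:          #{ratio(HTML_RPS, [:emit, :jruby], [:rails, :cruby])}x"

# Either path around the square multiplies out to the same diagonal,
# because the shared corner cancels.
via_cruby_emit  = (HTML_RPS[[:emit, :cruby]] / HTML_RPS[[:rails, :cruby]]) *
                  (HTML_RPS[[:emit, :jruby]] / HTML_RPS[[:emit, :cruby]])
via_jruby_rails = (HTML_RPS[[:rails, :jruby]] / HTML_RPS[[:rails, :cruby]]) *
                  (HTML_RPS[[:emit, :jruby]] / HTML_RPS[[:rails, :jruby]])
puts "paths agree: #{(via_cruby_emit - via_jruby_rails).abs < 1e-9}"
```

That the two paths agree is an identity, not a finding. The finding is in the asymmetry of the factors: 11.0 against 24.7 for the codebase swap, 2.2 against 4.9 for the runtime swap.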

The usual caveats, undiminished

Everything in Numbers Without Conclusions applies here without exception: this is a CPU-bound microbenchmark of a five-endpoint blog app on a single laptop, with local SQLite and no production I/O, and your application almost certainly does not transpile today. Absolute numbers shift between bench rounds as perf gates land and hardware conditions vary; the ratios have been more stable than the absolutes, and the ratios are what this post is about.

Two caveats specific to the JVM rows. First, memory: both JRuby configurations sit around 1–1.5GB RSS where CRuby sits at 135–416MB — the familiar JVM trade of heap for throughput, unchanged by anything Roundhouse does. Second, the stock-Rails-on-JRuby baseline runs Rails 8.0 rather than 8.1, because the only activerecord-jdbc release that drives JRuby 10 today pins to the 8.0 line. Same application code, one minor version of framework skew.

Reach

JRuby offers Rails developers one alternative ecosystem. Roundhouse, if it matures, aims to offer several — and it reaches the JVM by two independent routes. The route above keeps you in Ruby: emitted Ruby under JRuby, your debugging tools and your language intact. The other route is the Kotlin target, which compiles the same application into the JVM's native idiom and reaches 94,746 req/sec on the JSON endpoint — twice the emitted-Ruby-on-JRuby figure, as code a Kotlin team could read, audit, and maintain without knowing Ruby existed.

But "more ecosystems" is shorthand too, the same way "change the code" was, because JRuby's original pitch was never speed — it was reach. The JVM was a place Ruby couldn't otherwise go, and JRuby carried it there: into Java shops, onto application servers, behind enterprise deployment policies. Roundhouse generalizes the move: this is about taking Rails applications to places they can't currently reach. Some targets are faster places to run the application you have. The more interesting ones are places where it cannot run at all today: in the browser as an offline-first application — your models, validations, and views running against local data, no server required; on a Cloudflare Worker, with a memory cap no Rails process fits (a typescript-worker archive is already in the build matrix); inside an iOS or Android device as on-device Swift or Kotlin, no round-trip, no connectivity requirement; on Erlang's BEAM, where the Elixir emit picks up that runtime's supervision and distribution. The Worker and BEAM rungs are real today, within the demo's subset; browser and on-device are the roadmap. But none of these is a gradient where Rails is merely slower. They are binary gates: an interpreter and its framework either fit the deployment target or they don't.

That's the shape of the eventual value proposition for someone who already knows JRuby's: not "a faster JRuby," but the same keep-your-app promise with the deployment decision deferred — JVM via Ruby or via Kotlin, a 20MB Rust binary, an edge worker, the BEAM, a phone, a browser tab — from one Rails source, with the destination list no longer bounded by where an interpreter can follow.

And it cuts the other way too: if Roundhouse's emit turns out to be the kind of Ruby that unlocks a 5–6× JIT dividend, that's not a threat to the JRuby ecosystem. It's a gift to it.

Reproduce it without installing anything of mine

The emitted projects are self-contained archives — jruby.tgz and ruby.tgz — each with the generated application, its runtime, its tests, its Playwright suite, seed data, and a README whose commands a CI job executes verbatim against the published archive (the architecture document ships alongside as SPECIMEN.md). No Roundhouse install, no Rails install. Under JRuby 10 (and with wrk and the sqlite3 CLI available):

curl -sL https://rubys.github.io/roundhouse/browse/jruby.tgz | tar xz
cd jruby
jruby -S bundle install
make seed          # load db/seed.sql into tmp/blog.sqlite3
BLOG_DB=tmp/blog.sqlite3 WEB_CONCURRENCY=0 jruby -S bundle exec puma -C config/puma.rb

This is the quintessential Rails demo, modernized with Tailwind CSS, Turbo Streams, and Action Cable. Browse it at http://localhost:3000/articles, then from another terminal — after giving the JIT the warmup any JVM measurement deserves:

wrk -t4 -c64 -d30s --latency http://127.0.0.1:3000/articles

The same steps against ruby.tgz under CRuby (bundle install, make seed, rake dev) give you the other half of the comparison on your own hardware. The generated code for every other target is browseable at rubys.github.io/roundhouse/browse/, and the full benchmark data is at bench/.

If you run Rails on JRuby in production and these ratios don't match what you'd expect — or if they match exactly — both are calibration data I want, and Discussions is the place. The people best positioned to tell me whether the claim in this post's title is an insight or a benchmark artifact are the people who've been running the other kind for two decades.

Coda

JRuby's wager, twenty years ago, was that Ruby deserved the JVM — and the JVM has held up its end ever since, JIT-compiling whatever Ruby it was handed. What it was mostly handed was Rails: the most metaprogramming-dense Ruby ever shipped at scale, the input a JIT can reward least. That JRuby wins anyway on rendering-heavy workloads is a testament to the engineering underneath it.

Roundhouse set out to compile Rails applications to nine languages, and produced — almost incidentally, in its Ruby target — an existence proof of the Ruby that wager assumed: static shapes, resolved constants, monomorphic sends, every request-invariant decision already made. Hand that Ruby to the JVM and the JIT pays a further 5–6×, on top of the order of magnitude the compilation was worth on its own. Two bets, placed twenty years apart, that turn out to multiply.

The Ruby JRuby was built to run, it turns out, is a Ruby you can generate from the Rails application you already have. That seems worth saying out loud — and worth the JRuby community checking my math.

Roundhouse is open source: dual-licensed MIT / Apache-2.0. Issues and discussion welcome.