Testing Frozen String Literals in Production

2025-10-15T12:48:14Z

My Showcase application is a multi-tenant Rails app on Fly.io, where each ballroom dance event runs as a separate Rails instance with its own SQLite database. This makes it easy to run controlled experiments.

Having seen success with jemalloc and cgroups, I decided it was time to look at the application itself. The first thing I looked at was frozen string literals.

The theory was compelling: with frozen string literals enabled, identical string literals can be shared as a single immutable object instead of being allocated again each time the code runs, which should reduce overall memory consumption.
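To make the theory concrete, here is a minimal sketch (not taken from Showcase) that evaluates the same snippet with the magic comment switched on and off. The compare_literals helper and the "ballroom" string are made up for illustration, and the sharing behavior shown is what I would expect from CRuby specifically.

```ruby
# Illustrative only: compare_literals is a hypothetical helper, not Showcase code.
# The snippet is evaluated as its own compilation unit, so the magic comment
# placed on its first line decides whether its string literals are frozen.
SNIPPET = <<~'RUBY'
  first  = "ballroom"
  second = "ballroom"
  { frozen: first.frozen?, shared: first.equal?(second) }
RUBY

def compare_literals(frozen:)
  eval("# frozen_string_literal: #{frozen}\n" + SNIPPET)
end

with_flag    = compare_literals(frozen: true)
without_flag = compare_literals(frozen: false)

puts "frozen:   #{with_flag}"
puts "unfrozen: #{without_flag}"
```

With the comment set to true, both literals should be frozen and refer to the same object; with it set to false, each literal is a separate mutable String.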

The Experiment

Phase 1: Local Testing

First, I wanted to verify my application was compatible with frozen string literals before deploying to production.

Running the full test suite with frozen strings enabled:

RUBYOPT="--enable-frozen-string-literal" bin/rails test

Results:

✅ 1007 tests run
✅ 0 failures, 0 errors
✅ 13 skips (same as without frozen strings)

Running system tests with frozen strings enabled:

RUBYOPT="--enable-frozen-string-literal" bin/rails test:system

Results:

✅ 122 system tests run
✅ 0 failures, 0 errors
✅ 4 skips (normal)
✅ Full browser integration tests passing

Conclusion: My entire 41,000-line codebase was already compatible with frozen string literals. All 1,129 tests passed without any code changes.

Phase 2: Staging Deployment

Rather than add # frozen_string_literal: true to every Ruby file, I decided to enable it globally via environment variable. I configured my staging environment (smooth-nav on Fly.io) to test with real production workloads.

Configuration change in app/controllers/concerns/configurator.rb:

def build_tenants_list
  # ... existing tenant configuration ...

  # Add frozen string literal flag for staging (smooth-nav)
  if ENV['FLY_APP_NAME'] == 'smooth-nav'
    tenants.each do |tenant|
      tenant['env'] ||= {}
      tenant['env']['RUBYOPT'] = '--enable-frozen-string-literal'
    end
  end

  tenants
end

This configuration adds RUBYOPT="--enable-frozen-string-literal" to every tenant's environment only on the smooth-nav staging app, leaving production (smooth) unchanged for comparison.
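One way to confirm that a RUBYOPT value set in a process environment really changes how a child Ruby treats string literals is to spawn one and ask it. The sketch below is an assumption about how such a check could look, not part of Showcase or Navigator; literal_frozen_with? is a hypothetical helper name.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: start a child Ruby with the given RUBYOPT and report
# whether a plain string literal in that child process is frozen.
def literal_frozen_with?(rubyopt)
  env = { "RUBYOPT" => rubyopt }
  out, status = Open3.capture2(env, RbConfig.ruby, "-e", 'print "probe".frozen?')
  raise "child Ruby failed" unless status.success?
  out == "true"
end

enabled  = literal_frozen_with?("--enable-frozen-string-literal")
disabled = literal_frozen_with?("--disable-frozen-string-literal")

puts "enabled: #{enabled}, disabled: #{disabled}"
```

The same probe could be run once inside a tenant on each app to verify that staging and production really differ only in this flag.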

Phase 3: Production Comparison

After deploying to staging, I visited the same demanding view (/showcase/2025/raleigh/disney/heats, which renders a 5.3MB page) on both staging (with frozen strings) and production (without), then reviewed the logs.

Results

Performance Impact

Comparing the /heats page (most demanding view in the application):

Environment	Frozen Strings	Request Time	Machine
smooth-nav	✅ Enabled	1.703s	286e340f991548 (iad)
smooth	❌ Disabled	1.458s	d890d65f622428 (iad)

The request with frozen string literals enabled took 17% longer (245ms of overhead).
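For reference, the 17% and 245ms figures follow directly from the two request times in the table. The variable names here are mine.

```ruby
# Request times copied from the table above, in seconds.
frozen_time   = 1.703 # smooth-nav, frozen string literals enabled
unfrozen_time = 1.458 # smooth, frozen string literals disabled

overhead_ms  = ((frozen_time - unfrozen_time) * 1000).round
slowdown_pct = ((frozen_time - unfrozen_time) / unfrozen_time * 100).round(1)

puts "overhead: #{overhead_ms}ms, slowdown: #{slowdown_pct}%"
# prints "overhead: 245ms, slowdown: 16.8%"
```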

Memory Impact

When Fly.io suspends machines due to inactivity, Navigator logs memory statistics for each tenant. Here's what was recorded:

smooth-nav (WITH frozen_string_literal):

Index tenant:

Peak usage: 220.5 MiB
Current usage: 181.7 MiB

2025/raleigh/disney tenant:

Peak usage: 175.8 MiB
Current usage: 175.5 MiB

Total peak memory: 396.3 MiB

smooth (WITHOUT frozen_string_literal):

Index tenant:

Peak usage: 151.0 MiB
Current usage: 113.7 MiB

2025/raleigh/disney tenant:

Peak usage: 169.4 MiB
Current usage: 169.0 MiB

Total peak memory: 320.4 MiB

Summary Table

Metric	smooth-nav (frozen)	smooth (unfrozen)	Difference
Index peak	220.5 MiB	151.0 MiB	+69.5 MiB (+46%)
Index current	181.7 MiB	113.7 MiB	+68.0 MiB (+60%)
Disney peak	175.8 MiB	169.4 MiB	+6.4 MiB (+3.8%)
Disney current	175.5 MiB	169.0 MiB	+6.5 MiB (+3.8%)
Total peak	396.3 MiB	320.4 MiB	+75.9 MiB (+23.7%)
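The differences in the summary table can be recomputed from the logged figures with a few lines of Ruby. This is only a sketch; the Usage struct and its method names are my own, and the percentages are rounded to one decimal place, so 59.8% here corresponds to the +60% shown above.

```ruby
# Usage figures in MiB, copied from the Navigator log output quoted above.
Usage = Struct.new(:label, :frozen, :unfrozen) do
  # Absolute increase with frozen string literals enabled, in MiB.
  def delta
    (frozen - unfrozen).round(1)
  end

  # Relative increase over the unfrozen baseline, in percent.
  def percent
    ((frozen - unfrozen) / unfrozen * 100).round(1)
  end
end

rows = [
  Usage.new("Index peak",     220.5, 151.0),
  Usage.new("Index current",  181.7, 113.7),
  Usage.new("Disney peak",    175.8, 169.4),
  Usage.new("Disney current", 175.5, 169.0)
]

# Total peak is the sum of the two tenants' peak values on each app.
rows << Usage.new("Total peak", (220.5 + 175.8).round(1), (151.0 + 169.4).round(1))

rows.each do |row|
  puts format("%-15s %+6.1f MiB (%+.1f%%)", row.label, row.delta, row.percent)
end
```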

Lessons Learned

Test Your Assumptions: Conventional wisdom about frozen string literals doesn't apply universally.
Measure in Production: Local tests showed compatibility but couldn't reveal the memory impact. Only production measurement with real data showed the true cost.
Incremental Rollout: Using FLY_APP_NAME environment checks allowed safe A/B testing in production without impacting users.
Performance vs. Memory Trade-off: Even if memory had improved, the 17% performance regression would have made this a poor trade-off.

The experiment was valuable because it challenged an assumption with real data. Sometimes the best optimization is the one you don't deploy. Perhaps frozen string literals make a difference in long-running applications or in scenarios other than the one I tested; but if my first test results are any indication, frozen string literals first have to make up a significant deficit before they show any benefit.