Testing Frozen String Literals in Production
My Showcase application runs as a multi-tenant Rails app on Fly.io, where each ballroom dance event runs as a separate Rails instance with its own SQLite database. This makes it easy to run controlled experiments.
Having seen success with jemalloc and cgroups, I turned to the application itself. The first thing I looked at was frozen string literals.
The theory was compelling: with frozen string literals enabled, identical string literals could be shared instead of allocated anew each time, reducing overall memory consumption.
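To make the theory concrete, here is a minimal sketch of the sharing it relies on. It does not depend on the global flag: String#-@ returns a frozen, deduplicated string, which stands in for what a literal evaluates to when frozen string literals are enabled. The helper name distinct_objects is made up for this illustration.

```ruby
# Evaluate the block `count` times, keep every result alive, and report
# how many distinct objects were produced.
def distinct_objects(count)
  Array.new(count) { yield }.map(&:object_id).uniq.size
end

# String.new always allocates a fresh, mutable string.
mutable_copies = distinct_objects(1_000) { String.new("ballroom") }

# String#-@ returns a frozen, deduplicated string (a stand-in here for
# a literal evaluated with frozen string literals enabled).
shared_copies = distinct_objects(1_000) { -"ballroom" }

puts "mutable: #{mutable_copies} objects, shared: #{shared_copies} object(s)"
```

Whether this sharing adds up to a meaningful saving for a whole application is exactly what the experiment below sets out to measure.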
The Experiment
Phase 1: Local Testing
First, I wanted to verify my application was compatible with frozen string literals before deploying to production.
Running the full test suite with frozen strings enabled:
RUBYOPT="--enable-frozen-string-literal" bin/rails test
Results:
- ✅ 1007 tests run
- ✅ 0 failures, 0 errors
- ✅ 13 skips (same as without frozen strings)
Running system tests with frozen strings enabled:
RUBYOPT="--enable-frozen-string-literal" bin/rails test:system
Results:
- ✅ 122 system tests run
- ✅ 0 failures, 0 errors
- ✅ 4 skips (normal)
- ✅ Full browser integration tests passing
Conclusion: My entire 41,000-line codebase was already compatible with frozen string literals. All 1,129 tests passed without any code changes.
Phase 2: Staging Deployment
Rather than add the magic comment # frozen_string_literal: true to every Ruby file, I decided to enable it globally via an environment variable. I configured my staging environment (smooth-nav on Fly.io) to test with real production workloads.
Configuration change in app/controllers/concerns/configurator.rb:
def build_tenants_list
  # ... existing tenant configuration ...
  # Add frozen string literal flag for staging (smooth-nav)
  if ENV['FLY_APP_NAME'] == 'smooth-nav'
    tenants.each do |tenant|
      tenant['env'] ||= {}
      tenant['env']['RUBYOPT'] = '--enable-frozen-string-literal'
    end
  end
  tenants
end
This configuration sets RUBYOPT="--enable-frozen-string-literal" in every tenant's environment, but only on the smooth-nav staging app, leaving production (smooth) unchanged for comparison.
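A quick way to confirm that the flag actually reaches a Ruby process through RUBYOPT is to start a child interpreter and ask whether a literal is frozen. The sketch below is an illustration rather than part of Showcase; the helper name literal_frozen_with? is made up, and it assumes the child can be started from the same Ruby binary.

```ruby
require "open3"
require "rbconfig"

# Run a one-line script in a child Ruby with the given RUBYOPT value
# (nil removes the variable) and report whether a string literal is frozen.
def literal_frozen_with?(rubyopt)
  output, status = Open3.capture2(
    { "RUBYOPT" => rubyopt },
    RbConfig.ruby, "-e", 'print "x".frozen?'
  )
  raise "child Ruby exited with status #{status.exitstatus}" unless status.success?

  output == "true"
end

puts "with flag:    #{literal_frozen_with?('--enable-frozen-string-literal')}"
puts "without flag: #{literal_frozen_with?(nil)}"
```

The same one-liner can be run by hand inside a tenant's environment to check that the setting took effect there.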
Phase 3: Production Comparison
After deploying to staging, I visited the same demanding view (/showcase/2025/raleigh/disney/heats, a page that renders to 5.3 MB) on both staging (with frozen strings) and production (without), then reviewed the logs.
Results
Performance Impact
Comparing the /heats page (the most demanding view in the application):
Environment | Frozen Strings | Request Time | Machine |
---|---|---|---|
smooth-nav | ✅ Enabled | 1.703s | 286e340f991548 (iad) |
smooth | ❌ Disabled | 1.458s | d890d65f622428 (iad) |
The request with frozen string literals enabled took 17% longer (245 ms of overhead).
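The request times in the table come from the logs. For anyone who wants to repeat the comparison with several samples per environment, a minimal sketch of a timing helper follows. The name median_seconds and the placeholder workload are assumptions for illustration; in practice the block would issue the HTTP request to the page under test.

```ruby
require "benchmark"

# Time the given block `runs` times and return the median duration in seconds.
def median_seconds(runs)
  timings = Array.new(runs) { Benchmark.realtime { yield } }.sort
  middle = runs / 2
  runs.odd? ? timings[middle] : (timings[middle - 1] + timings[middle]) / 2.0
end

# Placeholder workload standing in for a request to the page being measured.
median = median_seconds(5) { 10_000.times.map(&:to_s) }
puts format("median: %.4fs", median)
```

The median is used here because a single slow sample, such as a cold start, would otherwise skew an average.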
Memory Impact
When Fly.io suspends machines due to inactivity, Navigator logs memory statistics for each tenant. Here's what was recorded:
smooth-nav (WITH frozen_string_literal):
Index tenant:
- Peak usage: 220.5 MiB
- Current usage: 181.7 MiB
2025/raleigh/disney tenant:
- Peak usage: 175.8 MiB
- Current usage: 175.5 MiB
Total peak memory: 396.3 MiB
smooth (WITHOUT frozen_string_literal):
Index tenant:
- Peak usage: 151.0 MiB
- Current usage: 113.7 MiB
2025/raleigh/disney tenant:
- Peak usage: 169.4 MiB
- Current usage: 169.0 MiB
Total peak memory: 320.4 MiB
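The figures above are taken from Navigator's log. As an illustration of where such numbers can come from on Linux, the sketch below reads the cgroup v2 files memory.current and memory.peak and converts them to MiB. This is an assumption about one possible source, not Navigator's actual code, and the directory path shown is hypothetical.

```ruby
BYTES_PER_MIB = 1024.0 * 1024.0

# Read current and peak memory (in MiB) from a cgroup v2 directory.
# A missing file (memory.peak needs a fairly recent kernel) yields nil.
def cgroup_memory_mib(cgroup_dir)
  %w[current peak].to_h do |name|
    path = File.join(cgroup_dir, "memory.#{name}")
    value = File.exist?(path) ? (File.read(path).to_i / BYTES_PER_MIB).round(1) : nil
    [name.to_sym, value]
  end
end

# Hypothetical cgroup directory for one tenant.
p cgroup_memory_mib("/sys/fs/cgroup/showcase-example")
```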
Summary Table
Metric | smooth-nav (frozen) | smooth (unfrozen) | Difference |
---|---|---|---|
Index peak | 220.5 MiB | 151.0 MiB | +69.5 MiB (+46%) |
Index current | 181.7 MiB | 113.7 MiB | +68.0 MiB (+60%) |
Disney peak | 175.8 MiB | 169.4 MiB | +6.4 MiB (+3.8%) |
Disney current | 175.5 MiB | 169.0 MiB | +6.5 MiB (+3.8%) |
Total peak | 396.3 MiB | 320.4 MiB | +75.9 MiB (+23.7%) |
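The Difference column can be reproduced from the logged values. The short script below is only a check of the arithmetic: it copies the numbers from the logs and computes each difference relative to the unfrozen baseline, rounded to one decimal place.

```ruby
# Peak and current memory in MiB, copied from the logs above.
MEASUREMENTS = {
  "Index peak"     => { frozen: 220.5, unfrozen: 151.0 },
  "Index current"  => { frozen: 181.7, unfrozen: 113.7 },
  "Disney peak"    => { frozen: 175.8, unfrozen: 169.4 },
  "Disney current" => { frozen: 175.5, unfrozen: 169.0 }
}.freeze

# Absolute difference in MiB and relative difference in percent,
# both measured against the unfrozen baseline.
def difference(frozen, unfrozen)
  delta = (frozen - unfrozen).round(1)
  [delta, (delta / unfrozen * 100).round(1)]
end

MEASUREMENTS.each do |metric, values|
  delta, percent = difference(values[:frozen], values[:unfrozen])
  puts format("%-15s %+.1f MiB (%+.1f%%)", metric, delta, percent)
end

# Total peak is the sum of the two tenants' peaks in each environment.
peaks = MEASUREMENTS.values_at("Index peak", "Disney peak")
delta, percent = difference(peaks.sum { |v| v[:frozen] }, peaks.sum { |v| v[:unfrozen] })
puts format("%-15s %+.1f MiB (%+.1f%%)", "Total peak", delta, percent)
```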
Lessons Learned
- Test Your Assumptions: Conventional wisdom about frozen string literals doesn't apply universally.
- Measure in Production: Local tests showed compatibility but couldn't reveal the memory impact. Only production measurement with real data showed the true cost.
- Incremental Rollout: Using FLY_APP_NAME environment checks allowed safe A/B testing in production without impacting users.
- Performance vs. Memory Trade-off: Even if memory had improved, the 17% performance regression would have made this a poor trade-off.
The experiment was valuable because it challenged an assumption with real data. Sometimes the best optimization is the one you don't deploy. Perhaps frozen string literals make a difference in long-running applications or in scenarios other than the one I tested, but if my first results are any indication, they first have to make up a significant deficit before they show any benefit.