intertwingly

It’s just data

Frozen String Literals Redux - A More Rigorous Test


A few days ago I published results from testing frozen string literals in production. The experiment showed frozen string literals as 17% slower and using 24% more memory than the baseline.

Expert Feedback

Jean Boussier, a Ruby core team member who's much smarter than me, provided crucial feedback:

"It's impossible for frozen string literals to be slower than mutable string literals"

And on Lobsters:

"At the very least you'd need to hit that endpoint a few thousands of time to have any sort of statistically relevant result."

He was absolutely right. My initial test used a single request to each environment - hardly a rigorous benchmark. Time to do this properly.

The original post also generated discussion on Reddit and Hacker News, with commenters echoing the need for better methodology.

A Better Test

I removed the app/controllers/concerns/configurator.rb change from the previous post and instead configured frozen string literals via Fly.io secrets, allowing me to enable/disable the feature without code changes or deployments.

The test approach:

  1. Run thousands of requests without frozen strings enabled
  2. Wait for the machine to suspend (about 5 minutes of inactivity)
  3. Enable frozen strings via RUBYOPT secret
  4. Run thousands of requests with frozen strings enabled
  5. Wait for the machine to suspend again
  6. Compare the results

Test Script

I created a script to automate this benchmark, handling warmup, per-request timeouts, and result collection.
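
The post doesn't include the script itself, so here is a minimal Ruby sketch of the shape it would take. The 0.5s warmup threshold and 60-second timeout come from the methodology described below; the `BENCH_URL` environment variable and the helper names are my own assumptions, not the actual script.

```ruby
require "net/http"
require "uri"

TIMEOUT = 60            # seconds; hung requests are dropped, not averaged in
WARMUP_THRESHOLD = 0.5  # a real Rails response to this page takes > 0.5s

# Time one GET request with a monotonic clock; nil on timeout.
def timed_request(uri)
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https",
                  open_timeout: TIMEOUT, read_timeout: TIMEOUT) do |http|
    http.get(uri.request_uri)
  end
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
rescue Net::OpenTimeout, Net::ReadTimeout
  nil # timed-out requests are excluded from the sample
end

def median(xs)
  sorted = xs.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

def mean(xs)
  xs.sum / xs.length.to_f
end

if ENV["BENCH_URL"] # only hit the network when explicitly asked to
  uri = URI(ENV["BENCH_URL"])
  # Warm up: keep requesting until responses look like a loaded Rails app,
  # skipping fast maintenance-page responses and cold starts.
  loop do
    t = timed_request(uri)
    break if t && t > WARMUP_THRESHOLD
  end
  times = 1000.times.filter_map { timed_request(uri) }
  puts "median: #{median(times).round(3)}s  mean: #{mean(times).round(3)}s"
end
```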

Results

After refining the methodology to properly warm up the application and exclude timeout errors, I ran 1,000 authenticated requests to the most demanding page in my application. The script ensures all measured requests hit a fully-loaded Rails instance with proper timeout handling (60 seconds).

Without Frozen String Literals

After 1,000 requests without frozen strings:

Performance:

Memory (at suspension):

With Frozen String Literals

After 1,000 requests with frozen strings enabled:

Performance:

Memory (at suspension):

Comparison

Metric           Without Frozen   With Frozen   Difference
Median Response  1.345s           1.340s        -0.005s (-0.3%)
Mean Response    1.377s           1.399s        +0.022s (+1.6%)
Peak Memory      324.1 MiB        337.2 MiB     +13.1 MiB (+4.0%)
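
For the curious, the percentage column is just the relative change, (with - without) / without. A quick arithmetic check (mine, not part of the benchmark script):

```ruby
# Relative change between the two runs, as a percentage.
def pct_diff(without, with_frozen)
  (with_frozen - without) / without * 100
end

puts pct_diff(1.377, 1.399).round(1)  # mean response, in percent
puts pct_diff(324.1, 337.2).round(1)  # peak memory, in percent
```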

The results show frozen string literals have essentially no impact on performance: the differences are well under 2%. Surprisingly, frozen strings used slightly more memory (+4%), which contradicts the expected benefit of reducing string duplication; GC timing, allocation patterns, and ordinary run-to-run variance are all plausible explanations.

The performance result aligns with Jean Boussier's assertion that frozen strings can't be slower than mutable strings - they're effectively identical in this real-world scenario.

What I Learned

  1. Sample Size Matters: A single request told me nothing. With 1,000 requests, the performance differences converge to under 2%, showing that frozen strings have essentially no performance impact in this real-world Rails application.

  2. Warmup Is Critical: Initial benchmark runs included maintenance page responses and cold starts in the measurements. Proper warmup logic that waits for real application responses (>0.5s) ensures all measured requests hit a fully-loaded Rails instance.

  3. Timeout Handling Matters: Without timeouts, occasional hung requests (1000+ seconds) completely skewed the mean. A 60-second timeout keeps the data clean and realistic.

  4. Proper Benchmarking Is Hard: Real benchmarks require controlled conditions, multiple runs, proper warmup, timeout handling, and waiting for machine suspension to capture peak memory usage. Getting the methodology right took several iterations.

  5. Fly.io Secrets for Configuration: Using RUBYOPT as a Fly secret is cleaner than conditional code and allows testing the same deployed code with different configurations.

  6. Expert Guidance Validated: Jean Boussier was right - frozen strings aren't slower than mutable strings. The performance is effectively identical.

  7. Memory Results Are Complex: The unexpected 4% memory increase with frozen strings suggests that real-world memory behavior is more complex than simple theory. GC timing, allocation patterns, and Ruby's internal optimizations all play a role.
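
Point 3 above is easy to demonstrate with made-up numbers: a single hung request among a thousand barely moves the median but wrecks the mean (illustrative values, not my measured data):

```ruby
def mean(xs)
  xs.sum / xs.length.to_f
end

clean  = Array.new(1000, 1.4)            # every request takes ~1.4s
skewed = Array.new(999, 1.4) << 1000.0   # one hung request sneaks in

puts mean(clean).round(3)   # 1.4
puts mean(skewed).round(3)  # 2.399 -- a single outlier inflates the mean by ~1s
```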

The moral of the story: when your results contradict established knowledge, fix your methodology. Proper measurement confirms the experts were right about performance, though the memory story remains more nuanced than expected.

Addendum: Startup Time Impact

While analyzing the warmup behavior during the main benchmark, I noticed that frozen strings seemed to require more warmup attempts before Rails fully loaded (6 attempts vs 2 in one run). This led to an unexpected question: does --enable-frozen-string-literal affect Rails startup time?

Startup Time Experiment

I created a separate benchmark that restarts the machine 10 times with and without frozen strings, measuring the total time from restart until the first successful Rails response:
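
Measuring "restart until first successful response" needs a monotonic clock rather than wall-clock time. A sketch of one iteration, with the Fly-specific restart call left as a comment; the machine ID and `rails_responding?` helper are hypothetical stand-ins:

```ruby
# Time how long a block takes, using a clock immune to system time changes.
def elapsed
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

# Hypothetical shape of one iteration (assumes the flyctl CLI is installed):
# boot_time = elapsed do
#   system("fly", "machine", "restart", machine_id)
#   sleep 1 until rails_responding?(app_url)  # poll until Rails answers
# end

puts format("%.2fs", elapsed { sleep 0.05 })
```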

Without Frozen Strings:

With Frozen Strings:

Result: 40% slower startup with frozen strings enabled

Why This Might Happen

My application's startup process includes:

When RUBYOPT=--enable-frozen-string-literal is set globally, it affects all Ruby code that runs during startup, including these initialization scripts. The frozen string overhead compounds across the entire boot sequence.
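
The global effect is easy to verify by shelling out to a fresh interpreter. This isn't from the post, just a quick sanity check; the `false` result assumes a current Ruby 3.x, where literals are mutable by default:

```ruby
require "rbconfig"

ruby = RbConfig.ruby         # path to the running Ruby binary
env  = { "RUBYOPT" => nil }  # make sure no inherited RUBYOPT interferes

without   = IO.popen([env, ruby, "-e", 'puts "hi".frozen?'], &:read).strip
with_flag = IO.popen([env, ruby, "--enable-frozen-string-literal",
                      "-e", 'puts "hi".frozen?'], &:read).strip

puts without    # "false" on Ruby 3.x defaults
puts with_flag  # "true" once the flag applies to every literal
```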

A Better Approach?

This suggests that setting frozen strings globally via RUBYOPT may not be optimal. A more targeted approach would be:

The 10-iteration sample size is too small to be conclusive, but the consistent pattern (iterations 1-3 all took 3 attempts with frozen strings vs 1 attempt without) suggests this is worth investigating further. The startup time impact may be specific to applications with complex initialization sequences rather than a universal issue with frozen strings.
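
One widely used targeted option (the standard alternative to a global RUBYOPT, though not spelled out above) is the per-file magic comment, which freezes literals only in files that opt in and leaves boot-time scripts untouched. A sketch using temp files to run each variant in a fresh interpreter:

```ruby
require "tempfile"
require "rbconfig"

# Run a Ruby source string in a fresh interpreter and capture its stdout.
def run_file(source)
  f = Tempfile.new(["frozen_demo", ".rb"])
  f.write(source)
  f.close
  IO.popen([{ "RUBYOPT" => nil }, RbConfig.ruby, f.path], &:read).strip
ensure
  f&.unlink
end

opted_in = run_file(<<~RUBY)
  # frozen_string_literal: true
  puts "hi".frozen?
RUBY

opted_out = run_file('puts "hi".frozen?')

puts opted_in   # "true"  - literals frozen only where the comment appears
puts opted_out  # "false" - files without the comment are untouched
```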