intertwingly

It’s just data

Shared-Nothing Multi-Tenancy with SQLite, TurboCable, and Navigator


Part 2 of 3. Part 1: TurboCable | Part 3: Development Process


The showcase application runs 350+ independent dance competition events across 75 studios in 8 countries on 4 continents. Each event has its own database, handles real-time score updates from multiple judges, and serves live displays to audiences. The infrastructure cost? Around $100 per month on Fly.io.

This is possible because of a coherent architectural choice: shared-nothing multi-tenancy. Each tenant (event) runs in complete isolation with its own SQLite database, its own Rails process, and its own WebSocket connections. No shared database, no Redis pub/sub, no distributed state.

Three technologies make this work: SQLite for storage, TurboCable for real-time updates, and Navigator for routing. What makes this interesting is that all three share the same fundamental constraint: they work within a single machine. This isn't a limitation to work around—it's a design choice that eliminates entire categories of complexity.


TL;DR (2 min read)

Running 350+ dance competition events across 8 countries on 4 continents for ~$100/month, using SQLite for storage, TurboCable for real-time updates, and Navigator for routing.

Key insight: When data naturally partitions by tenant, shared-nothing architecture eliminates Redis, complex connection pooling, distributed locking, and cross-tenant bugs.

Sweet spot: B2B SaaS with dozens to hundreds of tenants where each customer's data is completely independent.

Memory savings: 163MB → 18MB per machine (89% reduction)

Tenant provisioning: ~30 seconds without deployment via live config updates

Growth path: Single machine → Multi-region → Multi-machine → Multi-cloud (vendor independence)

Read full post ↓ | ARCHITECTURE.md | Try TurboCable


The Problem with Default Architectures

Open any Rails tutorial on building a SaaS application and you'll likely see the same default stack: Postgres, Redis, and horizontal scaling behind a load balancer.

This architecture makes sense for many applications. But it assumes:

  1. You need to scale to millions of users
  2. Your data doesn't naturally partition by tenant
  3. You need to run complex cross-tenant analytics
  4. You're comfortable with the operational overhead

For showcase, none of these assumptions hold. Each dance competition is completely independent. There's no need to query across events, no shared leaderboards, no cross-tenant analytics. The natural data model is one database per event.

Enter Shared-Nothing Architecture

Shared-nothing architecture means each tenant runs in complete isolation:

Event "Boston 2025"          Event "NYC 2025"
├── SQLite database          ├── SQLite database
├── Rails process            ├── Rails process
└── WebSocket connections    └── WebSocket connections

Navigator (a custom Go reverse proxy) routes requests to the correct tenant based on URL path. When a request comes in for /showcase/2025/boston/, Navigator:

  1. Checks if this tenant should run on this machine (based on configuration)
  2. If not, uses Fly-Replay to route to the correct machine
  3. If yes, checks if a Rails process exists for that event
  4. If not, spawns one with DATABASE_URL=/data/db/2025-boston.sqlite3
  5. Routes the HTTP request to that process
  6. Routes WebSocket /cable connections to the same process

Each Rails process is completely unaware of other tenants. It just sees a normal Rails application with a SQLite database.
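The routing decision above can be sketched in a few lines of Ruby (the tenant table, helper names, and the "ord" region are illustrative, not showcase's actual configuration; Navigator itself is written in Go, but the logic is the same):

```ruby
# Map of URL path prefixes to tenant configuration (illustrative data).
TENANTS = {
  "/showcase/2025/boston" => { database: "db/2025-boston.sqlite3", machine: "iad" },
  "/showcase/2025/nyc"    => { database: "db/2025-nyc.sqlite3",    machine: "ord" },
}

LOCAL_MACHINE = "iad"

def route(path)
  _prefix, tenant = TENANTS.find { |prefix, _| path.start_with?(prefix) }
  return [:not_found, nil] unless tenant

  if tenant[:machine] == LOCAL_MACHINE
    [:proxy, tenant[:database]]       # spawn if needed, with DATABASE_URL set
  else
    [:fly_replay, tenant[:machine]]   # let the owning machine serve it
  end
end
```

Every request, HTTP or WebSocket, passes through the same lookup, which is what guarantees a tenant's traffic always lands on the same process.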

There's also a fourth component: the index database (index.sqlite3) which manages tenant configuration across all regions. It stores event metadata, generates navigator.yml dynamically, and provides an admin UI for provisioning new events—all without requiring deployment.

The SQLite Choice

SQLite is a natural fit for this pattern: each tenant's data is a single file, reads and writes are local with zero network latency, and backing up an event means copying that file.

The showcase application stores databases on Fly volumes and syncs to Tigris when idle. When a machine restarts or a request comes in for a new event, Navigator's ready hook downloads the database from Tigris to local storage. First request after restart takes ~2 seconds; subsequent requests are instant.
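The lazy-download behaviour can be sketched as follows (hypothetical helper; the real ready hook is built into Navigator and talks to Tigris):

```ruby
# Download a tenant database from object storage only when it is not
# already on the local volume. `fetch` stands in for the Tigris client.
def ensure_database(slug, local_dir:, fetch:)
  path = File.join(local_dir, "#{slug}.sqlite3")
  # Cold start: ~2 seconds to download. Afterwards the File.exist?
  # check short-circuits and requests are instant.
  fetch.call(slug, path) unless File.exist?(path)
  path
end
```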

The Real-Time Challenge

Dance competitions need real-time updates: judges entering scores on tablets, live displays showing current heat numbers, audience screens updating, and progress bars for long operations.

The standard Rails approach is Action Cable backed by Redis or Solid Cable, which requires ~163MB per machine. With 8 regional machines, that's 1.3GB just for WebSocket infrastructure.

Enter TurboCable

TurboCable provides the same Turbo Streams API as Action Cable, but uses in-process WebSocket handling via Rack hijack: ~18MB total (89% reduction). No Redis, no Solid Cable, no external dependencies.

Here's the key insight: TurboCable only broadcasts within a single Rails process. For horizontally-scaled architectures, this would be a deal-breaker. But in shared-nothing multi-tenancy, this constraint doesn't matter. Each tenant runs in its own process, and Navigator ensures all requests for a tenant go to the same process.

The constraint that makes TurboCable unsuitable for traditional horizontal scaling is exactly what makes it perfect for process-isolated multi-tenancy.

All five of showcase's Action Cable channels were pure stream_from channels with no custom actions, so the migration required zero code changes: just delete the channel files and use turbo_stream_from in views.

Production Memory Measurements

Real measurements from production machines in the iad region (November 2025):

Action Cable + Redis (smooth production):

navigator          1.0%    21 MB
puma (cable)       7.6%   153 MB
redis-server       0.6%    13 MB
─────────────────────────────────
Total WebSocket:   8.2%   163 MB

TurboCable (smooth-nav staging):

navigator          0.9%    18 MB
─────────────────────────────────
Total WebSocket:   0.9%    18 MB

Memory savings: 145 MB per machine (89% reduction)

With 8 regional machines, that's 1.16 GB saved just by eliminating Action Cable and Redis. The TurboCable WebSocket handling is integrated directly into Navigator with minimal overhead.

For complete details on TurboCable, including installation, implementation details, custom JSON broadcasting, and real-world examples, see TurboCable - Real-Time Rails Without Redis.

The Navigator Routing Layer

Navigator is a Go-based reverse proxy that makes multi-tenancy transparent. It handles:

Process management: spawning a Rails process per tenant on demand, each with its own DATABASE_URL, and terminating it after 5 minutes of inactivity.

Request routing: matching URL paths to tenants and sending both HTTP requests and WebSocket /cable connections to the same tenant process.

Cross-region routing: using Fly-Replay to forward requests to the machine that owns a tenant.

Configuration management: regenerating navigator.yml from the index database and reloading it on SIGHUP.

Here's a simplified example of Navigator's configuration:

server:
  port: 3000
  rails:
    command: bin/rails server -p %{port}
    idle: 300  # Terminate after 5 minutes of inactivity
  maintenance_page: public/503.html

tenants:
  boston-2025:
    database: db/2025-boston.sqlite3
    paths:
      - /showcase/2025/boston
  nyc-2025:
    database: db/2025-nyc.sqlite3
    paths:
      - /showcase/2025/nyc

Navigator spawns processes with different DATABASE_URL environment variables, and Rails reads the correct database:

# config/database.yml
production:
  adapter: sqlite3
  database: <%= ENV.fetch("DATABASE_URL") { "db/production.sqlite3" } %>

The Rails application is completely unaware of multi-tenancy. It just sees a standard Rails app with a SQLite database.
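The spawn side of this contract can be sketched as follows (illustrative helper names and port; Navigator builds the equivalent command in Go):

```ruby
# Build the environment and argv for one tenant: the only thing that
# distinguishes one tenant's process from another's is DATABASE_URL.
def tenant_command(database, port)
  env  = { "DATABASE_URL" => database }
  argv = ["bin/rails", "server", "-p", port.to_s]
  [env, argv]
end

env, argv = tenant_command("db/2025-boston.sqlite3", 4001)
# Process.spawn(env, *argv) would start this tenant's isolated process.
```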

For deeper technical details, see the full ARCHITECTURE.md in the showcase repository.

Live Provisioning

Adding a new tenant takes about 30 seconds without any deployment:

  1. Admin fills out event request form
  2. Creates record in index.sqlite3
  3. Background job (ConfigUpdateJob) syncs to all machines
  4. Each machine's CGI script regenerates navigator.yml
  5. Navigator reloads configuration via SIGHUP
  6. Ready hook downloads/prepares databases
  7. New event is live

The admin sees real-time progress updates via WebSocket as each machine is configured. No fly deploy, no downtime, no waiting.
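Step 4 above, regenerating navigator.yml from index records, can be sketched like this (record fields and helper names are illustrative, not the actual index schema):

```ruby
require "yaml"

# Turn tenant records from index.sqlite3 into the tenants section of
# navigator.yml. The :slug/:year/:city fields are hypothetical.
def generate_config(events)
  {
    "tenants" => events.to_h do |e|
      [e[:slug], {
        "database" => "db/#{e[:slug]}.sqlite3",
        "paths"    => ["/showcase/#{e[:year]}/#{e[:city]}"],
      }]
    end,
  }.to_yaml
end
```

A SIGHUP then tells Navigator to pick up the regenerated file, so provisioning never requires a deploy.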

Three-Level Auto-Scaling

The architecture minimizes costs through auto-scaling at three levels:

1. Machine-level - Fly.io suspends machines after 30 minutes of inactivity. Suspended machines consume zero compute and memory. Fly automatically resumes on incoming requests.

2. Tenant-level - Navigator terminates Rails processes after 5 minutes of inactivity. Dozens of events coexist on one machine, but only active ones consume memory.

3. Appliance-level - Resource-intensive operations (PDF generation, potentially video encoding or audio transcription) run on separate on-demand machines that spin up when needed and stop when idle.

This is why 350+ events cost ~$100/month: most are idle most of the time, consuming only storage costs.
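Level 2, the per-tenant idle timeout, reduces to tracking a last-access timestamp (a Ruby sketch with illustrative names; Navigator implements this in Go):

```ruby
# Track when a tenant's Rails process was last used, and report whether
# it has crossed the 5-minute idle threshold.
class TenantProcess
  IDLE_LIMIT = 300 # seconds

  def initialize(now: Time.now)
    @last_access = now
  end

  # Called for every request Navigator routes to this process.
  def touch(now: Time.now)
    @last_access = now
  end

  # A reaper loop would terminate the process when this returns true.
  def idle?(now: Time.now)
    now - @last_access > IDLE_LIMIT
  end
end
```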

PDF Generation Appliance

PDF generation uses puppeteer and Chrome, requiring significantly more memory than the main app. Navigator routes PDF requests to dedicated machines via Fly-Replay:

routes:
  fly:
    replay:
      - path: "^/showcase/.+\\.pdf$"
        target: "app=smooth-pdf"

Five PDF machines are available but cost almost nothing because they're only running during actual PDF generation. This pattern can extend to any resource-intensive operation without compromising the main architecture's simplicity.
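The route above amounts to a single regex match (a Ruby sketch; note that the doubled backslash in the YAML is a single backslash in the actual pattern):

```ruby
# Requests for any PDF under /showcase/ get replayed to the PDF app;
# everything else is served locally.
PDF_ROUTE = %r{\A/showcase/.+\.pdf\z}

def replay_target(path)
  "app=smooth-pdf" if path.match?(PDF_ROUTE)
end
```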

No CDN Needed

Static assets (CSS, JavaScript, images, pre-rendered HTML) are stored on every machine and served directly by Navigator without Rails. Since each region has a complete copy, assets are always local—no CDN, no cache invalidation, no additional infrastructure.

The Architecture Coheres

What makes this interesting is how the constraints align:

Technology | Constraint        | Benefit
SQLite     | Single machine    | Zero network latency, simple backups
TurboCable | Single process    | No Redis, no distributed state
Navigator  | Process isolation | No cross-tenant data leakage, independent scaling

You don't need to work around these constraints; they reinforce each other, and the result is architectural simplicity.

These guarantees are structural, not disciplinary. You can't accidentally query the wrong tenant because each process only has access to its own database. You can't leak data via WebSocket because connections are process-local.

Compare this to row-level security or tenant scoping in a shared database, which requires constant vigilance. Miss a where(tenant_id: current_tenant) clause and you've got a security issue.

The Economics

Running 350+ events across 8 regions:

Current infrastructure: 8 regional machines, 5 on-demand PDF machines, Fly volumes for active databases, and Tigris object storage for idle ones.

Estimated monthly cost: ~$100 (before Fly.io plan allowances)

Most of these 350+ events are idle most of the time. Fly.io suspends machines after 30 minutes of inactivity, and Navigator stops Rails processes after 5 minutes. An idle event consumes nothing but storage: its SQLite file on a Fly volume and a synced copy in Tigris.

When someone accesses the event:

  1. Fly.io resumes machine (~instant)
  2. Navigator downloads database from Tigris (~2 seconds, cached afterward)
  3. Navigator spawns Rails process on first request
  4. Subsequent requests are instant (process stays alive for 5 minutes)

Compare to a traditional stack: managed Postgres, managed Redis, a load balancer, and always-on application servers.

You'd easily hit $250-400/month before handling a single tenant. At showcase's scale (350+ tenants), a traditional architecture would cost thousands per month.

The Growth Path

This architecture scales in stages:

Phase 1: Single machine (works for dozens of tenants)

Phase 2: Geographic distribution (showcase's current state)

Phase 3: Density limits (future, if needed)

Phase 4: Multi-cloud (future, vendor independence)

Key insight: You scale by adding machines, not by adding middleware. Each machine remains architecturally simple—just SQLite + Rails + TurboCable. Complexity is added only when economically justified.

This is the opposite of the typical growth path:

  1. Start simple (single Postgres instance)
  2. Add caching (Redis)
  3. Add connection pooling (pgBouncer)
  4. Add read replicas
  5. Add job queue (Sidekiq)
  6. Add message queue (RabbitMQ)
  7. Add service mesh
  8. Add...

Each step adds operational overhead, new failure modes, and increased costs.

Development/Production Parity

One underrated benefit: developers run the exact production architecture on their laptop.

# Run showcase with specific event database
bin/dev db/2025-boston.sqlite3

This starts Navigator locally, which spawns a Rails process with the correct database. No Docker, no Kubernetes, no docker-compose with 12 services. Just SQLite + Rails.

Want to test multi-tenancy locally? Run Navigator:

bin/nav

Navigator spawns and manages multiple Rails processes based on configuration, routing to the correct process based on URL path. This makes debugging trivial: everything is an ordinary local process reading an ordinary local SQLite file.

Compare to typical development environments, where a docker-compose stack of mocked services inevitably drifts from production.

With showcase, my machine IS production. Same SQLite, same TurboCable, same Navigator.

When This Works

This architecture has a sweet spot:

Ideal for: B2B SaaS with dozens to hundreds of tenants, event-based applications, and any domain where each customer's data is completely independent.

Not appropriate for: applications that need cross-tenant queries or analytics, consumer apps with millions of users, or data that doesn't partition cleanly by tenant.

The key question: Does your data naturally partition by tenant?

If yes, shared-nothing multi-tenancy eliminates entire categories of complexity. If no, you need shared infrastructure (Postgres, Redis, load balancers).

Real-World Validation

Showcase isn't a toy example or proof of concept. It has been running in production for several years, across 350+ events, 75 studios, 8 countries, and 4 continents.

During live events, the system handles real-time score entry from multiple judges' tablets, live displays showing current heat numbers, and audience screens updating simultaneously.

All with SQLite, TurboCable, and Navigator. No Postgres, no Redis, no Kubernetes.

The SQLite Renaissance

This architecture fits into a broader trend: SQLite is being taken seriously for production applications.

For applications that fit SQLite's constraints (single writer, moderate write volume, local access), it offers unbeatable simplicity.

Showcase proves SQLite scales to real applications at real scale, not just prototypes.

Try It Yourself

All the pieces are open source: the showcase application, Navigator, and the TurboCable gem.

To use TurboCable in your Rails app:

# Gemfile
gem 'turbo_cable'

# Then install and generate the configuration:
bundle install
rails generate turbo_cable:install

Your views and models stay the same. Just remove Action Cable channels that only do stream_from with no custom actions.

See EXAMPLES.md for real-world patterns: live scoring, progress tracking, streaming command output, and more.

Conclusion

Most tutorials teach the default architecture: Postgres + Redis + horizontal scaling. This makes sense for applications that need it.

But if your data naturally partitions by tenant, there's a simpler path: shared-nothing multi-tenancy with SQLite, TurboCable, and process isolation.

The constraints that make these technologies "unsuitable" for traditional architectures are exactly what makes them perfect for multi-tenant isolation: SQLite's single machine, TurboCable's single process, Navigator's process-per-tenant routing.

The result is radical simplification: no Redis, simple connection management, no distributed locking, no cross-tenant query bugs.

Start simple. Scale by adding machines, not middleware. Let your architecture cohere around aligned constraints.

Running 350+ real-time applications for $100/month isn't magic—it's just embracing the right constraints.

Building this architecture required both technical implementation and thorough documentation at multiple levels. For insights into the development methodology that made this possible, see Disciplined use of Claude.


For complete architectural details including operations, maintenance mode, backups, and deployment patterns, see ARCHITECTURE.md in the showcase repository.