Essay · Web Architecture · 2026

The Data Stays With You

A manifesto for the next generation of web apps — where your database lives in your browser, peers talk directly over WebRTC, and the cloud is just an optional courier.

~18 min read · Local-First SPA + sql.js · WebRTC P2P · Privacy by Design

What if your web app's database never touched a server? What if the richest, most structured store of your personal data lived entirely inside your own device — and the internet was just the road, not the warehouse?

We have spent two decades building the web in one direction: toward the centre. Every form submission, every search query, every click funnelled upward into server farms that grow larger, more expensive, and more politically fraught with each passing year. We called it "the cloud" to make it sound diffuse, but it is anything but. It is a handful of data centres in Virginia, Ireland, and Singapore holding a mirror of your life.

There is another way. It is not new — it is, in fact, the oldest way. The data stays with you. The application runs in your browser. The server, if it exists at all, is a humble messenger — not a keeper of secrets.

01 — The Architecture of Dependence

The modern web application stack — React frontend, REST API, PostgreSQL on AWS — is a marvel of engineering. It is also a monument to a particular power relationship: you use the application, but you do not own your data. It lives at a URL you do not control, backed up on hardware you will never see, subject to terms of service that can change on a Tuesday.

The business logic is sound: centralised storage is easy to operate, easy to monetise, easy to query across users. But the externalities are paid by everyone else — by you, in privacy; by the planet, in energy; by the open web, in structural fragility.

Figure: the two models compared.
Traditional model: Browser (thin client) → API server → Centralised DB (all data lives here).
Local-first model: Browser + sql.js SQLite DB (data lives here) → optional sync → Relay / edge node → optional backup → User-controlled storage.

02 — Enter sql.js — SQLite in the Browser

sql.js is SQLite compiled to WebAssembly. It gives you the full power of a relational database — transactions, joins, indices, full-text search — running entirely inside the browser's JavaScript runtime. No server. No network round-trip. No latency beyond the speed of your CPU's memory bus.

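Paired with the Origin Private File System (OPFS), a sandboxed, persistent filesystem available in modern browsers, your SQLite database can survive page reloads, browser restarts, and reboots. One caveat: sql.js keeps the database in memory, so the app has to write the exported bytes back to OPFS after changes. Do that, and the data is yours. It does not evaporate when you close the tab.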

// Bootstrapping a local database in a SPA
import initSqlJs from 'sql.js';

async function openLocalDB() {
  const SQL = await initSqlJs({ locateFile: f => `/wasm/${f}` });

  // Load from OPFS if it exists, else start fresh
  const root = await navigator.storage.getDirectory();
  const file = await root.getFileHandle('app.db', { create: true });
  const existing = await (await file.getFile()).arrayBuffer(); // empty for a new file

  const db = existing.byteLength > 0
    ? new SQL.Database(new Uint8Array(existing))
    : new SQL.Database();

  db.run(`CREATE TABLE IF NOT EXISTS notes (
    id    INTEGER PRIMARY KEY,
    body  TEXT,
    tags  TEXT,
    ts    INTEGER DEFAULT (unixepoch())
  )`);

  return db;
}

An exported database is just a Uint8Array, a chunk of bytes returned by db.export(). You can save it, export it, email it to yourself, sync it via any transport, or hand it to another application. It is a file, not a service.
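Because sql.js holds the whole database in memory, persistence means writing those bytes back out. The following is a minimal sketch, not a definitive implementation. It assumes a `FileSystemDirectoryHandle` such as the one returned by `navigator.storage.getDirectory()` and a browser that supports `createWritable()` on OPFS file handles; `saveLocalDB` is a hypothetical helper name, not part of sql.js.

```javascript
// Serialise the in-memory database and write it to a file in OPFS.
// `dir` is a FileSystemDirectoryHandle; `name` defaults to the file used at load time.
async function saveLocalDB(db, dir, name = 'app.db') {
  const bytes = db.export();                        // sql.js: whole database as a Uint8Array
  const handle = await dir.getFileHandle(name, { create: true });
  const writable = await handle.createWritable();   // write stream for the file
  await writable.write(bytes);
  await writable.close();                           // contents are committed on close
  return bytes.byteLength;                          // number of bytes persisted
}

// Usage (in the browser):
//   const root = await navigator.storage.getDirectory();
//   await saveLocalDB(db, root);
```

Calling it after each batch of writes, or when the page is hidden, keeps the file on disk close to the in-memory state. How often to save is a trade-off between durability and write volume that each app has to choose for itself.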

Why SQLite specifically?

SQLite is the most widely deployed database engine in the world: it ships in every iPhone, Android device, Firefox profile, and Chromium installation. It is extraordinarily well-tested, single-file, and zero-configuration. Its file format is openly documented, cross-platform, and backward compatible, and the US Library of Congress lists it as a recommended storage format for datasets. There is every reason to expect your data to be readable in 30 years.

03 — Privacy Is Architecture, Not Policy

Privacy policies are promises. Architecture is physics. When data never leaves the device, no server-side breach can expose it, no subpoena served on a provider can compel it, no acquisition can sell it. The privacy guarantee is structural: it does not depend on the hosting provider or on the jurisdiction their servers happen to occupy. What remains to protect is the device itself.

When data never leaves the device, privacy is not a feature — it is an inevitability.

This matters more than it might seem in 2026. The regulatory landscape for data residency is fragmenting. Cross-border data flows face new friction. Building an application on centralised storage means your architecture becomes entangled in geopolitics. The local-first SPA sidesteps this entirely: the data is in whatever country the user is sitting in, because it is on their device.

04 — Data That Breathes — Organic Enrichment and Degradation

One of the most underappreciated properties of user-owned data is that it can evolve at the pace of the user's life, not the pace of a product roadmap.

Enrichment: Because the database is local and full-featured, users (and the applications they choose) can add columns, create derived views, attach embeddings, link records to new schemas — all without a migration script running on a remote server. The data grows richer as habits form.

Degradation: Equally, the user can let data age gracefully. A journal app might automatically archive entries older than a year into a compressed blob. A fitness tracker might roll up daily records into weekly summaries after 90 days. These policies are enforced locally, transparently, and can be inspected or reversed by the user at any time. There is no mysterious data retention schedule buried in a privacy policy.

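-- Example: organic data lifecycle in the local DB
-- Run as a scheduled task inside the SPA (e.g., on app focus)
-- Assumes notes has gained reading_secs and archived columns (archived DEFAULT 0)
-- and that a weekly_summary(week_start, total_words) table exists.

-- Enrich: tag notes with a rough reading-time estimate (about 20 characters per second)
UPDATE notes
   SET reading_secs = length(body) / 20
 WHERE reading_secs IS NULL;

-- Degrade: summarise old detailed logs into weekly rollups.
-- Rollup and archive flag change in one transaction, so a note is never counted twice.
-- Note: length(body) counts characters, used here as a stand-in for words.
BEGIN;

INSERT INTO weekly_summary (week_start, total_words)
  SELECT strftime('%Y-%W', ts, 'unixepoch'), sum(length(body))
    FROM notes
   WHERE ts < unixepoch('now', '-90 days')
     AND archived = 0
   GROUP BY 1;

UPDATE notes SET archived = 1
 WHERE ts < unixepoch('now', '-90 days')
   AND archived = 0;

COMMIT;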
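The statements above touch columns that the `notes` table created earlier does not have yet. Since the schema lives on the user's device, the app can add them on demand. Here is a sketch of that enrichment step, assuming the sql.js result shape for `db.exec` (`[{ columns, values }]`); `ensureColumn` is a hypothetical helper name. Table and column names are interpolated into the SQL, so they must come from the app's own code, never from user input.

```javascript
// Add a column to a table only if it is not already present.
// Returns true when the column was added, false when it already existed.
function ensureColumn(db, table, column, decl) {
  // PRAGMA table_info yields one row per column:
  // [cid, name, type, notnull, dflt_value, pk]
  const res = db.exec(`PRAGMA table_info(${table})`);
  const existing = res.length ? res[0].values.map((row) => row[1]) : [];
  if (existing.includes(column)) return false;
  db.run(`ALTER TABLE ${table} ADD COLUMN ${column} ${decl}`);
  return true;
}

// Usage, before running the lifecycle statements:
//   ensureColumn(db, 'notes', 'reading_secs', 'INTEGER');
//   ensureColumn(db, 'notes', 'archived', 'INTEGER NOT NULL DEFAULT 0');
```

Checking before altering makes the step safe to run on every app start, which suits a database whose schema version can differ from one device to the next.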

05 — Reducing the Data Centre Footprint

Every read query that resolves locally is a query that did not traverse a network, did not wake a database server, did not increment a cloud bill, and did not consume a joule of energy in a data centre. At the scale of millions of users, this is not a rounding error. It is a meaningful reduction in the operational carbon footprint of software.

Dimension | Centralised SaaS | Local-First SPA
Read latency | 20–300 ms (network) | <1 ms (memory)
Write latency | Round-trip + ACK | Synchronous, instant
Works offline? | Rarely / degraded | Fully, by default
Privacy breach surface | Entire user base | Individual device only
Server infra cost | Scales with users | Near-zero (static hosting)
Data longevity | Company-dependent | User-controlled
Energy per read | Data centre + network | CPU cache only

The server does not disappear entirely. Collaboration requires a shared medium. But the server's role shrinks dramatically: it becomes a relay and an optional backup endpoint, not the system of record. You can serve an entire application from a CDN edge node and a thin sync service — infrastructure that costs pennies per thousand users rather than dollars.

06 — The Principles of a Local-First SPA

🗄️

Database in the Browser

sql.js + OPFS gives you a full SQLite engine. The data file is yours to export, back up, or migrate.

✈️

Offline First

Every operation resolves locally. The network is an enhancement, not a requirement. Service workers cache the app shell.

🔄

Sync is Optional

CRDTs or event logs enable multi-device sync without central authority. Conflict resolution happens at the edges.

🔐

Encrypt Before Sync

If data ever leaves the device, encrypt it client-side first. The relay sees ciphertext, never plaintext.

🌱

Organic Lifecycle

Data enriches or degrades on the user's schedule, governed by local policies the user can inspect and change.

📦

Portable by Default

One-click export to a standard SQLite file. Import into any tool. No vendor lock-in, ever.

07 — What This Looks Like in Practice

Consider a personal knowledge management app built on these principles. The SPA is a static bundle served from a CDN. On first load, it initialises a SQLite database in the browser's OPFS. Every note, link, and tag is written there, full-text indexed and instantly queryable. A search is answered from local memory, so its latency is set by the CPU rather than by the network.

The user can optionally connect a sync endpoint — a minimal relay that stores encrypted deltas. Device B downloads those deltas on next open and merges them using a CRDT log. The relay never sees the plaintext. If the relay service shuts down, nothing is lost: both devices have the full database.

Six months later, the user exports their database as a .sqlite file, opens it in DB Browser for SQLite, and runs whatever query they like. Their data is not hostage to an API. It was never anything other than theirs.

Ecosystem Signals

This architecture is gaining momentum. ElectricSQL, PowerSync, and Turso's embedded replicas all explore the local-database pattern. OPFS, part of the WHATWG File System standard, had shipped in all major browsers by 2023. wa-sqlite and @sqlite.org/sqlite-wasm are separate SQLite-on-WebAssembly builds that go a step beyond sql.js, adding VFS adapters that persist directly to browser storage. The primitives are here.

08 — WebRTC — The Serverless Communication Layer

If sql.js gives each peer a brain, WebRTC gives them a voice. Web Real-Time Communication is a browser-native API that establishes encrypted, direct peer-to-peer channels — for data, audio, and video — without routing a single byte through an application server. It is the missing transport layer for a genuinely distributed web.

The key primitive is the RTCDataChannel: a low-latency, message-oriented channel between two browsers that can be configured as ordered or unordered, reliable or lossy, and that carries strings or binary data. Once open, it is as fast as the internet path between those two machines, with no application server, no database write, and no cloud bill in between. Send a CRDT delta, a file chunk, or a heartbeat. The channel does not care.
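One practical wrinkle when sending something as large as an exported database: data channels are message-based, and the largest message that can be sent safely varies between browsers. A common workaround is to split payloads into small chunks. The sketch below assumes a 16 KiB chunk size, which is a conservative choice rather than a limit taken from the specification, and the function names are illustrative.

```javascript
const CHUNK_SIZE = 16 * 1024; // assumed conservative per-message size

// Split a byte payload into channel-sized pieces (views, not copies).
function chunkBytes(bytes, size = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0; offset < bytes.byteLength; offset += size) {
    chunks.push(bytes.subarray(offset, offset + size));
  }
  return chunks;
}

// Reassemble the pieces on the receiving side.
// Relies on an ordered, reliable channel so chunks arrive in sequence.
function joinChunks(chunks) {
  const total = chunks.reduce((n, c) => n + c.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.byteLength;
  }
  return out;
}

// Usage, given an open RTCDataChannel `ch`:
//   for (const piece of chunkBytes(db.export())) ch.send(piece);
```

A real transfer would also tell the receiver how many chunks to expect and would pause when the channel's `bufferedAmount` grows large. Both are left out to keep the sketch short.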

Figure: peer topology with no central server in the data path. Peers A, B, C, and D each run their own sql.js database and are linked by encrypted RTCDataChannels; a CRDT delta is shown in flight.

The elephant in the room: WebRTC peers must find each other before they can go direct. This requires a brief handshake through a signalling server — a lightweight process that exchanges ICE candidates and SDP session descriptions. Crucially, the signalling server sees only connection metadata, never the payload. Once the handshake completes, it can go offline and the peers continue talking uninterrupted.

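Handshake between Peer A (initiator) and Peer B (responder), each holding an RTCPeerConnection, with the signalling server passing messages between them:
1. Peer A: createOffer() → SDP, sent via the signalling server
2. Peer B: setRemoteDescription(offer)
3. Peer B: createAnswer() → SDP, sent back via the signalling server
4. Peer A: setRemoteDescription(answer)
5. Both peers: ICE candidates exchanged ↔ (STUN/TURN)
✓ Direct P2P channel open, signalling server no longer needed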
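The server side of this handshake can be very small. The sketch below shows only the routing logic of a signalling relay: it groups connections into rooms and forwards each message, unread, to the other members. It is an illustration under assumptions, not a complete server. `socket` stands for anything with a `send(string)` method, such as a server-side WebSocket connection, and `createSignallingRooms` is a hypothetical name.

```javascript
// In-memory room registry for a signalling relay.
function createSignallingRooms() {
  const rooms = new Map(); // roomId -> Set of sockets

  return {
    join(roomId, socket) {
      if (!rooms.has(roomId)) rooms.set(roomId, new Set());
      rooms.get(roomId).add(socket);
    },

    leave(roomId, socket) {
      const room = rooms.get(roomId);
      if (!room) return;
      room.delete(socket);
      if (room.size === 0) rooms.delete(roomId); // drop empty rooms
    },

    // Forward an opaque message (offer, answer, or ICE candidate) to everyone
    // else in the room. The relay neither parses nor stores it.
    // Returns the number of peers the message was delivered to.
    relay(roomId, sender, message) {
      let delivered = 0;
      for (const peer of rooms.get(roomId) ?? []) {
        if (peer !== sender) {
          peer.send(message);
          delivered++;
        }
      }
      return delivered;
    },
  };
}
```

Nothing is persisted: when a room empties, its entry is deleted, and the relay has never held anything beyond connection metadata.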

With the channel open, the two sql.js databases can sync directly. Each peer maintains a CRDT operation log — an append-only table of timestamped mutations. On connect, peers exchange their log tails and merge. No server coordinates the merge. No server even knows what was merged.

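// Open a WebRTC data channel and sync the CRDT log with a peer.
// Exactly one peer per room passes initiator = true; the other answers.
// (The initiator flag is an addition so that two peers do not both send offers.)
async function syncWithPeer(db, signalingUrl, roomId, initiator = true) {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
  });

  const wireChannel = (ch) => {
    ch.onopen = () => {
      // Send our CRDT log tail: all ops not yet marked as synced.
      // db.exec returns [{ columns, values }], or [] when nothing matches.
      const [res] = db.exec(
        `SELECT op_id, table_name, row_id, col, val, ts
           FROM crdt_log
          WHERE synced = 0
          ORDER BY ts`
      );
      ch.send(JSON.stringify({ type: 'log_tail', ops: res ? res.values : [] }));
    };

    ch.onmessage = ({ data }) => {
      const { type, ops } = JSON.parse(data);
      if (type === 'log_tail') applyRemoteOps(db, ops);
    };
  };

  // The initiator opens a reliable, ordered data channel; the responder receives it
  if (initiator) wireChannel(pc.createDataChannel('crdt-sync', { ordered: true }));
  else pc.ondatachannel = ({ channel }) => wireChannel(channel);

  // Minimal signalling via a tiny WebSocket relay
  const ws = new WebSocket(`${signalingUrl}?room=${encodeURIComponent(roomId)}`);
  ws.onmessage = async ({ data }) => {
    const msg = JSON.parse(data);
    if (msg.sdp) {
      await pc.setRemoteDescription(msg.sdp);
      if (msg.sdp.type === 'offer') {
        // Responder side: reply with an answer
        await pc.setLocalDescription(await pc.createAnswer());
        ws.send(JSON.stringify({ sdp: pc.localDescription }));
      }
    }
    if (msg.candidate) await pc.addIceCandidate(msg.candidate);
  };

  // ws.send throws while the socket is still connecting, so wait for it to open
  await new Promise((resolve, reject) => {
    ws.onopen = resolve;
    ws.onerror = reject;
  });

  pc.onicecandidate = ({ candidate }) =>
    candidate && ws.send(JSON.stringify({ candidate }));

  if (initiator) {
    await pc.setLocalDescription(await pc.createOffer());
    ws.send(JSON.stringify({ sdp: pc.localDescription }));
  }

  return pc; // caller can close when done
}

function applyRemoteOps(db, ops) {
  // Append remote ops to the local log. INSERT OR IGNORE skips ops we already
  // have (assumes op_id is the PRIMARY KEY). The last-write-wins decision is
  // made afterwards, when the live tables are rebuilt from the log.
  const stmt = db.prepare(`
    INSERT OR IGNORE INTO crdt_log(op_id,table_name,row_id,col,val,ts,synced)
    VALUES (?,?,?,?,?,?,1)`);
  db.run('BEGIN');
  ops.forEach(row => stmt.run(row));
  db.run('COMMIT');
  stmt.free();
  materializeFromLog(db); // your helper: rebuild live tables from the log
}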
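The listing above leaves `materializeFromLog` undefined. One rule such a helper could apply is last-write-wins per cell, sketched below as a pure function over log rows. This is an assumption about the merge policy, not the only possible one, and it inherits the usual weakness of wall-clock timestamps: a device with a fast clock tends to win. `lastWriteWins` is a hypothetical name.

```javascript
// Pick the winning value for each (table, row, column) cell.
// ops: arrays shaped like crdt_log rows: [op_id, table_name, row_id, col, val, ts]
function lastWriteWins(ops) {
  const winners = new Map();
  for (const [opId, table, rowId, col, val, ts] of ops) {
    const key = JSON.stringify([table, rowId, col]);
    const current = winners.get(key);
    // A newer timestamp wins. Equal timestamps fall back to comparing op_id,
    // so every peer picks the same winner regardless of arrival order.
    const wins =
      !current ||
      ts > current.ts ||
      (ts === current.ts && String(opId) > String(current.opId));
    if (wins) winners.set(key, { opId, table, rowId, col, val, ts });
  }
  return [...winners.values()];
}

// Usage: feed it every row of crdt_log, then write each winner's val
// into the matching column of the live table.
```

The deterministic tie-break is what lets two peers that received the same operations in a different order still converge on identical tables.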

What about NAT traversal?

Most direct connections succeed with the help of STUN (a tiny stateless server that tells peers their public IP and port). When both peers sit behind strict (symmetric) NATs, a TURN relay is needed. TURN does carry the traffic, but only as encrypted packets it cannot read, and it can be self-hosted at modest cost. Neither the signalling server nor the TURN server ever sees application data in plaintext; they are infrastructure, not custodians.
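In code, the difference between the two cases is a single extra entry in the `iceServers` list passed to `RTCPeerConnection`. A small sketch follows; the TURN hostname and credentials are placeholders for a self-hosted server, and `buildIceConfig` is a hypothetical helper.

```javascript
// Build an RTCPeerConnection configuration: STUN always, TURN only if provided.
function buildIceConfig({ stunUrl = 'stun:stun.l.google.com:19302', turn } = {}) {
  const iceServers = [{ urls: stunUrl }];
  if (turn) {
    // TURN requires credentials; the relay forwards encrypted packets only
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

// Usage (placeholder TURN server):
//   const pc = new RTCPeerConnection(buildIceConfig({
//     turn: { url: 'turn:turn.example.org:3478', username: 'demo', credential: 'secret' },
//   }));
```

The browser tries direct candidates first and falls back to the relay only when no direct path works, so listing a TURN server does not by itself route traffic through it.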

08b — The Full Stack: sql.js + WebRTC Together

The combination produces something qualitatively new. Each browser is simultaneously a database node, a compute node, and a network node. The application server degrades gracefully into a static file host and an optional thin signalling relay — both of which can run on commodity edge infrastructure for a fraction of a cent per user per month.

Figure: the full stack.
Browser A (full stack): SPA · sql.js DB · CRDT log · RTCPeerConnection
←→ WebRTC DataChannel ←→
Browser B (full stack): SPA · sql.js DB · CRDT log · RTCPeerConnection
↑ only for initial handshake ↑
CDN edge (static files): SPA bundle · WASM · service worker
Signalling relay (tiny): WebSocket room · sees SDP only · stateless
STUN / TURN (optional): NAT traversal · self-hostable · no readable app data

When Peer A edits a document, the change is written to their local sql.js database in under a millisecond. The CRDT log records the operation. If Peer B has an open data channel, the delta is pushed immediately — real-time collaboration with no server round-trip. If Peer B is offline, the delta waits in the log and syncs the next time a channel opens. The experience is instant; the architecture is honest.
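For a delta to "wait in the log", the log has to know which operations a peer has actually received. One way to track that, sketched here under assumptions, is an acknowledgement message: the receiver replies with the ids it stored, and the sender flips their `synced` flag. The `{ type: 'ack', opIds }` message shape and both function names are hypothetical; the SQL uses the `crdt_log` columns from the sync listing.

```javascript
// Mark local ops as delivered once the peer confirms them.
// opIds comes from an assumed { type: 'ack', opIds: [...] } message.
function markAcked(db, opIds) {
  if (opIds.length === 0) return 0;
  const placeholders = opIds.map(() => '?').join(',');
  db.run(`UPDATE crdt_log SET synced = 1 WHERE op_id IN (${placeholders})`, opIds);
  return opIds.length;
}

// Ops still waiting for a peer: everything unacknowledged, oldest first.
function pendingOps(db) {
  const [res] = db.exec(
    `SELECT op_id, table_name, row_id, col, val, ts
       FROM crdt_log
      WHERE synced = 0
      ORDER BY ts`
  );
  return res ? res.values : []; // db.exec returns [] when nothing matches
}
```

With more than two devices a single `synced` flag is too coarse, since an op acknowledged by one peer may still be unknown to another. A per-peer cursor would be the natural next step. Very long id lists would also need batching to stay under SQLite's bound-parameter limit.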

09 — An Invitation

The centralised web is not evil — it solved real problems and built remarkable things. But it has extracted a price in privacy, energy, and structural power that we are only beginning to reckon with.

The local-first SPA is not a regression to desktop software. It is a more honest architecture — one that treats the user's device as a first-class compute node, their data as genuinely their own, and the network as a gift rather than a dependency. It is quieter, faster, cheaper, and more private almost as a side-effect of how it is built.

The data centre footprint shrinks not because we passed a regulation or bought a carbon credit, but because we stopped sending data there in the first place. That is the kind of environmental win that compounds.

Build things that work when the network doesn't. Store data where the user is. Let it grow old gracefully, on their terms.

The tools are ready. The browser is more than ready. The question is whether we are willing to give up the comfortable centre and trust the edge.

🌐 Chat App Demo

Experience a local-first chat application built with the principles in this manifesto. View the demo — see WebRTC peer discovery, P2P connections, and E2EE in action.

Further reading: Martin Kleppmann et al., Local-first software (Ink & Switch, 2019) · SQLite OPFS VFS documentation · CRDTs: An Overview (Conflict-free Replicated Data Types) · wa-sqlite on GitHub · MDN WebRTC API · RFC 8825 — WebRTC Overview · High Performance Browser Networking ch.18 (Ilya Grigorik)