
SSH Workbench — Cloud Backend Technical Architecture

Status: Planning / Vision document. Nothing here is implemented.
Audience: senior developer (likely solo) building this incrementally alongside the existing Android app.
Companion documents: FUTURE_BACKEND.md (product spec, threat model, concept model), FUTURE.md §"Vault Export — Option C" (X25519 design that team sharing extends).
Last updated: 2026-04-11


1. Purpose of this document

FUTURE_BACKEND.md describes what we are building and why. This document describes how — concrete stack choices, schema, endpoints, deployment, and the order to ship things in.

It is written to be handed to one senior developer who will build this without a team. Every decision in this document optimizes for that constraint: simplicity over cleverness, boring proven tools over fashionable ones, and a delivery sequence where each phase ships something that already pays its own way.

If the developer reading this disagrees with a choice, the right move is to push back before writing code. Every section ends with the "why" so disagreements can be reasoned about, not just inherited.


2. Hard constraints

These are not negotiable. They shape every other decision in this document.

  1. Solo developer. Simplicity and maintainability beat cleverness. No microservices. No event sourcing. No GraphQL. No message queues unless we cannot avoid them. One language on the backend. One framework. One database.
  2. Zero-knowledge. The server never stores anything that could decrypt user credentials. All cryptography happens on the client (Android, CLI, web JS). The backend is a dumb store of opaque ciphertext blobs plus some metadata it needs to route things. There is exactly one documented exception (the web SSH terminal — see §10) and it is mitigated, disclosed, and optional.
  3. Incremental delivery. Each phase must be shippable and useful on its own. We do not build Phase 2 until Phase 1 has real users. We do not build Phase 4 until Phase 3 has paying customers asking for it.
  4. The mobile app already exists and works. It has Argon2id + AES-256-GCM in lib-vault-crypto (JNI/C++), an X25519 keypair design sketched in FUTURE.md Option C, and a .swb vault format with mode bytes. The backend reuses these — never replaces them.
  5. The backend is never a single point of failure. If the backend is offline for a week, the Android app keeps working with full functionality. If the company shuts down, the user's local vault and .swb exports keep working forever. Backend is convenience, not dependency.

3. Stack choices

Every layer of the stack is a decision with consequences. The default for a solo developer should be "what do you already know, and what has the largest community when you Google an obscure error message at 2am?" Optimize for that.

3.1 Runtime: Node.js (LTS) with TypeScript

Why. Familiarity for most web developers, fast iteration, the largest package ecosystem of any backend language, and TypeScript gives us the type safety we need to maintain a non-trivial schema solo. Reasonable cold-start and memory footprint on Railway's small instances.

Why not Go for the backend (we use Go for the CLI — see §11). Go is excellent for CLIs and systems code, but for a CRUD-heavy backend with JSON endpoints, JWT, ORMs, webhooks, and email integrations, Node + TS lets us move 2-3× faster and the resulting code is shorter. Solo dev velocity matters more here than raw throughput.

Why not Rust. Rust is great. Rust is also slower to write, slower to refactor, and slower to onboard a future contributor to. A solo developer cannot afford the borrow checker friction on a moving target.

Why not Kotlin. Tempting because of code sharing with the Android app. But the Kotlin/JVM web ecosystem (Ktor, Spring) is smaller and less battle-tested for the specific solo-dev sweet spot we are in.

3.2 Framework: Fastify

Why. Faster than Express (we will care once we have real load), built-in JSON schema validation on every route (which doubles as input validation and OpenAPI generation), excellent TypeScript story, plugin ecosystem covers everything we need (auth, rate limit, helmet, cors, websocket, swagger), actively maintained, written by people who actually run it in production.

Why not Express. The default. Slower. No built-in validation. We would end up adding Joi or Zod and reinventing what Fastify gives us for free.

Why not NestJS. Heavier, opinionated, more abstractions to learn. Good for teams enforcing structure. Overhead for one developer.

Why not Hono / Elysia. Newer. Smaller communities. Either is a defensible alternative if the developer knows them well. Fastify is the lower-risk default.

3.3 Database: PostgreSQL (managed, single instance)

Why. Boring, proven, the right answer for 95% of new backends. Excellent JSONB support which we use heavily for opaque encrypted blobs. Row-level security policies let us scope queries to the current user at the database layer (defense in depth — a buggy query cannot leak across users). Strong constraints, transactions, and a sane SQL dialect. Free-tier managed instances on Railway and Render.

No sharding, no read replicas, no Citus. Premature for years. A single Postgres instance handles tens of thousands of users for a workload this small (encrypted blob reads + metadata writes). When we need more, we add a read replica. When we need more than that, we are profitable enough to hire someone to do it properly.

Why not SQLite. Tempting for simplicity. But we want managed backups, point-in-time recovery, and the ability to scale up without rewriting persistence. Not worth the savings.

Why not MongoDB / DynamoDB. We have a relational model with foreign keys, transactions, and a strict schema. A document store would be the wrong shape and leak its weakness into the application code.

3.4 ORM: Prisma

Why. Type-safe, generated TypeScript client matches the schema exactly, migration story is clean (prisma migrate dev / prisma migrate deploy), works well solo because the schema lives in one file, good Postgres feature coverage (JSONB, enums, RLS via raw SQL). Excellent error messages.

Why not Drizzle. Newer, lighter, closer to SQL. Defensible alternative. Pick Drizzle if the developer prefers writing SQL-flavored query builders. Prisma's stronger migration tooling tips the choice for solo dev.

Why not raw SQL with a thin client. Tempting for the 10× developer aesthetic. But every schema change is then a hand-written migration, every query is hand-written and untyped, and we will spend hours debugging things Prisma would have caught at compile time. Not worth the purity.

3.5 Authentication: email + password (Argon2id), JWT + rotating refresh tokens

Why this and not OAuth. OAuth (Google sign-in, GitHub sign-in) would be convenient for users but adds non-trivial complexity for a solo dev: client IDs, redirect URI management, account linking edge cases, token refresh semantics, "what if the user revokes Google access" handling. We can add it in Phase 2 once the core auth flow is stable. Phase 1 ships with email + password only.

Argon2id parameters. Same as lib-vault-crypto (memory=64 MiB, t=3, p=1, output=32). Two important details:

  1. The Argon2id hash stored on the server is for authentication only. It is computed on the server over a value the client derives from the password (the Argon2id-derived auth key, see §5.2), never over the password itself. It is not, and cannot be used to recover, the master key that decrypts the vault. The two derivations must be cryptographically distinct so a database leak cannot be replayed against the vault.
  2. The vault master key is derived on the client from the password and the user's salt. The server never sees the password and never sees the master key. The client sends a different value for authentication.

JWT access tokens. 15-minute lifetime, signed with HS256 using a 256-bit secret stored in the runtime environment. Short lifetime so that a leaked access token has a small blast radius without us needing token revocation infrastructure.

Refresh tokens. Stored hashed in the database with a family_id, 30-day lifetime, single-use (rotation on every use). On refresh, the old token is marked used and a new token is issued with the same family_id. If a used token is presented again, the entire family is invalidated immediately — that is the standard refresh-token-theft detection pattern. The client logs out and prompts re-login.

No OAuth, no SAML, no SSO until we have Teams customers asking for it specifically. Adding these prematurely is a classic time sink for solo developers.

3.6 Blob storage: Cloudflare R2

Why. S3-compatible API (any S3 SDK works), zero egress fees (the killer feature — egress on AWS S3 is what kills self-funded SaaS budgets), cheap storage, simple billing, no surprise charges. Good free tier.

What we store there. Larger encrypted blobs that do not belong in Postgres rows: vault snapshots (for backup/restore), exported .swb files generated server-side for download, possibly per-pack ciphertext blobs once they grow past a few KB. Postgres holds metadata + small blobs; R2 holds large blobs.

What we do not store there. Anything plaintext. Same zero-knowledge rules as the database.

Why not S3. Egress fees. We will be downloading vaults to multiple devices. AWS would bill us for every sync. R2 does not.

Why not Backblaze B2. Also great. R2 has the edge on free tier and on global network locality. Either works.

3.7 Email: Resend

Why. Simple HTTP API, good deliverability out of the box, generous free tier (3,000 emails/month, 100/day), modern developer experience, reasonable pricing as we grow. Ships with React Email components if we want templated transactional mail.

What we send. Account verification, password reset confirmations (the reset flow itself starts in-app and only resets the auth password, never the vault — see §5.2 and §5.9), team member invites, billing receipts (optional — Stripe handles its own), security alerts (new device login).

Why not SendGrid / Mailgun / Postmark. All fine. Resend is simpler to set up and the free tier is enough until we have real users.

Why not run our own SMTP. This is the classic solo-dev trap. Don't.

3.8 Hosting: Railway for backend + Postgres + WebSocket; Vercel for the Next.js frontend

Why Railway for the backend. Simple deployment (git push → live), managed Postgres available in the same project with private networking, environment variable management, persistent processes (essential for WebSockets — see below), reasonable pricing for a solo dev's load. No DevOps overhead.

Why explicitly NOT Vercel for the backend. Vercel is great for static and SSR Next.js but it is a serverless platform. Serverless means no persistent processes, which means no long-lived WebSocket sessions — and the web SSH terminal is exactly that. Trying to put WebSockets on a serverless function would force us to add a separate WebSocket service anyway, which negates the simplicity argument entirely.

Why Vercel is fine for the Next.js frontend. Pages, API routes that are truly stateless, image optimization, edge caching, ISR — all of this is what Vercel was built for. The frontend lives on Vercel; the backend lives on Railway; the WebSocket terminal endpoint is part of the backend on Railway.

Why not Render. Equivalent to Railway. Either works. Pick whichever the developer knows. Railway has slightly better DX as of this writing.

Why not AWS / GCP / Azure. Solo dev. Do not do this. The time spent learning IAM, VPCs, ALBs, and ECS is time not spent shipping. Migrate later if a customer demands it.

Why not self-hosted on a Hetzner VPS. Tempting for cost. But we are also taking on backups, monitoring, OS patching, and on-call. For a solo dev, the $20/month savings are not worth the operational tail.

3.9 Web frontend: Next.js 14 (App Router) + Tailwind + shadcn/ui

Why Next.js. Mature, large ecosystem, good docs, App Router gives us server components for the parts that should be SSR (marketing pages, dashboard shell), client components for everything that touches WebCrypto and the WebSocket terminal. TypeScript native. Vercel deploys it for free at the scale we will have.

Why Tailwind. Solo dev cannot afford to maintain a custom CSS system. Utility-first scales without a designer in the loop. Pairs perfectly with shadcn/ui.

Why shadcn/ui. Not a dependency — copy-paste components into the repo, customize freely, no upgrade treadmill, accessible by default, responsive by default. Exactly the right shape for a solo dev who needs a polished UI without building a component library.

Why not Svelte / Solid / Remix. All fine. Next.js wins on ecosystem and on Vercel integration. Pick a different framework if the developer is significantly more productive in it.

3.10 Web terminal: xterm.js + node-pty bridge

Why xterm.js. The de facto browser terminal. Used by VS Code, Hyper, and every other web SSH tool. Mature, accessible, fast, good API, large addon ecosystem.

How the bridge works. A WebSocket endpoint on the Railway backend accepts an authenticated connection, opens a node-pty PTY into an ssh subprocess (or, alternatively, uses the ssh2 Node library for an in-process SSH client), and proxies bytes both ways. See §10 for the security model and the explicit zero-knowledge exception this introduces.

3.11 CLI tool: Go

Why Go for the CLI specifically. Single static binary, cross-compiles to Linux/macOS/Windows from one machine, no runtime dependency on the user's system, fast startup, reasonable standard library for the things we need (HTTP, JSON, crypto, exec). Distribution story is "download a binary from GitHub releases".

Why not Rust for the CLI. Equally good single-binary story. Slower to write. Pick Rust if the developer is more fluent in Rust.

Why not Node for the CLI. Pkg/nexe style bundling exists but the resulting binaries are heavier and slower to start. The "single binary, no dependencies" promise is what makes a CLI usable.

Cross-compatibility with Android. The Go CLI must be able to read and write .swb vault files produced by the Android app. This means implementing Argon2id (use golang.org/x/crypto/argon2 — same RFC 9106 reference parameters) and AES-256-GCM (standard library) with byte-for-byte the same encoding as lib-vault-crypto. There must be a test file in the Go repo that contains a known .swb produced by the Android JNI and asserts the Go decrypt produces the expected plaintext. This test is the contract.

3.12 Encryption on the backend: never

Critical. The backend has no crypto code beyond TLS termination and JWT signing. Argon2id, AES-GCM, X25519 — none of these run on the server. Vault material arrives encrypted, is stored encrypted, and is served encrypted.

If a backend route ever needs to call into a crypto library to manipulate vault content, it is wrong and the design must be reconsidered.


4. Repository layout

Single monorepo:

ssh-workbench-cloud/
├── backend/                      # Node + Fastify + Prisma
│   ├── src/
│   │   ├── routes/               # Fastify route plugins
│   │   ├── plugins/              # auth, rate-limit, helmet, etc.
│   │   ├── lib/                  # JWT, email, billing, R2
│   │   └── server.ts             # entry point
│   ├── prisma/
│   │   ├── schema.prisma
│   │   └── migrations/
│   ├── tests/
│   ├── package.json
│   └── tsconfig.json
│
├── web/                          # Next.js 14 dashboard
│   ├── app/                      # App Router pages
│   ├── components/               # shadcn/ui + custom
│   ├── lib/                      # crypto wrappers, API client
│   └── package.json
│
├── cli/                          # Go CLI
│   ├── cmd/swb/                  # main package
│   ├── internal/
│   │   ├── crypto/               # Argon2id + AES-GCM (must match lib-vault-crypto byte-for-byte)
│   │   ├── api/                  # backend client
│   │   └── vault/                # .swb format reader/writer
│   ├── go.mod
│   └── Makefile
│
├── shared/                       # cross-language artifacts
│   ├── proto/                    # if we ever go gRPC
│   ├── openapi.yaml              # generated from Fastify schemas
│   └── test-vectors/             # known-good .swb files for cross-impl tests
│
├── docker-compose.yml            # local dev: postgres + backend + web
├── .env.example
└── README.md

A monorepo simplifies cross-language refactors (rename a field, update all three clients in one PR), test vectors live in one place, and CI builds everything together. For a solo dev, the alternative — three separate repos — adds friction with no upside.


5. Security architecture

This is the most important section in the document. If anything in here is unclear, stop and resolve it before writing code.

5.1 What the server stores vs what it never sees

The server stores:

| Data | Form |
| --- | --- |
| User email | Plaintext (for auth, billing, communication) |
| Argon2id hash of an auth key (see §5.2) | Plaintext, salted, server-side hash |
| User's per-account password salt | Plaintext (must be sent on login so client can derive the same master key) |
| User's X25519 public key | Plaintext (other members need it to wrap pack keys to this user) |
| User's X25519 private key | Encrypted under the master key (server cannot decrypt) |
| Vault entries | Encrypted under master key or pack data key (server cannot decrypt) |
| Pack metadata (names) | Encrypted under master key or pack data key (server cannot decrypt) |
| Pack data key, wrapped per member | Encrypted to each member's X25519 public key (server cannot decrypt) |
| Subscription status, billing IDs | Plaintext (operational metadata) |
| Refresh tokens | Hashed (so a DB leak does not yield session takeover) |
| Org membership, role | Plaintext (used for ACL enforcement) |
| Pack subscription cursor (which version a device is at) | Plaintext (incremental sync metadata) |

The server never sees:

| Data | Where it lives instead |
| --- | --- |
| User password | Client memory only, zeroed after use |
| Master key | Derived on client from password + salt, never transmitted |
| X25519 private key in plaintext | Decrypted on client only when needed |
| Pack data keys in plaintext | Decrypted on client only when needed |
| Hostnames, usernames, ports of saved connections | Inside the encrypted blob |
| Private SSH keys | Inside the encrypted blob |
| Saved passwords | Inside the encrypted blob |
| Snippets, keyboard layouts, QB configs | Inside the encrypted blob |
| Pack names in plaintext | Encrypted (name_ciphertext column) |

The principle: anything the server holds in plaintext must be either (a) operationally necessary (email for billing, salt for re-derivation, public keys for wrapping) or (b) metadata that does not compromise the secret material (updated_at, version, kind enum). Everything else is ciphertext.

5.2 Master key derivation and the auth key separation

The user has one password. The client derives two cryptographic keys from it, with different salts and different domain separation tags. They must be cryptographically distinct so a leak of one cannot be replayed against the other.

master_key = Argon2id(
    password   = user_password,
    salt       = user_salt || "swb-vault-master-v1",
    memory     = 65536 KiB,
    iterations = 3,
    parallelism= 1,
    output     = 32 bytes
)

auth_key   = Argon2id(
    password   = user_password,
    salt       = user_salt || "swb-auth-v1",
    memory     = 65536 KiB,
    iterations = 3,
    parallelism= 1,
    output     = 32 bytes
)

The client sends auth_key to the server (over TLS) as the "password" for the login flow. The server then runs its own server-side Argon2id over auth_key with a server-generated salt — this is the standard "double hashing" pattern used to prevent a database leak from yielding usable login credentials. The result is what gets stored in users.argon2_hash.

Why two layers of Argon2id? The client-side derivation is what makes the vault zero-knowledge. The server-side derivation is what makes the auth database leak-resistant. They protect against different threats and they are both cheap because the heavy memory cost happens only at login time.

The client never sends the password or the master key. The server never sees the master key.
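The split above can be sketched as follows. Node's standard library has no Argon2id, so this sketch substitutes scrypt purely to show the domain-separation structure (same password, same user salt, different tags, two unrelated keys); real clients use the Argon2id parameters from lib-vault-crypto.

```typescript
import { scryptSync, randomBytes } from "node:crypto";

// Illustrative only: scrypt stands in for Argon2id, which is not in Node's
// stdlib. What matters here is the shape: one password, one user salt, two
// distinct domain tags, two keys that cannot be computed from each other.
function deriveKeys(password: string, userSalt: Buffer) {
  const masterKey = scryptSync(
    password, Buffer.concat([userSalt, Buffer.from("swb-vault-master-v1")]), 32);
  const authKey = scryptSync(
    password, Buffer.concat([userSalt, Buffer.from("swb-auth-v1")]), 32);
  // authKey goes to the server as the login credential; masterKey never leaves the client
  return { masterKey, authKey };
}

const salt = randomBytes(16);
const { masterKey, authKey } = deriveKeys("correct horse battery staple", salt);
console.log(masterKey.equals(authKey)); // false: distinct tags, distinct keys
```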

Password change. When the user changes their password, the client:

  1. Derives the new master_key' and the new auth_key'.
  2. Decrypts the vault locally with the old master_key.
  3. Re-encrypts everything with the new master_key'.
  4. Decrypts the X25519 private key with the old master_key, re-encrypts with master_key'.
  5. Uploads new ciphertexts and the new auth_key'.

This is the only way to do password change in a zero-knowledge system. There is no shortcut. The full vault round-trip is the cost of the security property.

5.3 Vault encryption

Each vault entry is serialized as JSON, then encrypted with AES-256-GCM using the master key (for solo entries) or the pack data key (for shared entries):

ciphertext = nonce(12) || AES-256-GCM(key, plaintext_json) || tag(16)

This is the same wire format lib-vault-crypto already produces. No new code path.
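A minimal sketch of that framing using Node's built-in AES-256-GCM (the buffer layout matches the wire format above; the entry fields are illustrative):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Vault wire format: nonce(12) || ciphertext || tag(16). The same framing is
// used whether `key` is the master key (solo entry) or a pack data key.
function seal(key: Buffer, plaintextJson: string): Buffer {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, nonce);
  const ct = Buffer.concat([cipher.update(plaintextJson, "utf8"), cipher.final()]);
  return Buffer.concat([nonce, ct, cipher.getAuthTag()]); // tag is 16 bytes by default
}

function open(key: Buffer, blob: Buffer): string {
  const nonce = blob.subarray(0, 12);
  const tag = blob.subarray(blob.length - 16);
  const ct = blob.subarray(12, blob.length - 16);
  const decipher = createDecipheriv("aes-256-gcm", key, nonce);
  decipher.setAuthTag(tag); // GCM authenticates; any tampering makes final() throw
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}

const key = randomBytes(32);
const blob = seal(key, JSON.stringify({ host: "example.internal", port: 22 }));
console.log(open(key, blob)); // round-trips to the original JSON
```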

5.4 Team key sharing — X25519 wrapping, step by step

Suppose Alice (admin) wants to share a pack containing some hosts with Bob (new member). Here is exactly what happens:

Setup (one-time, at Bob's signup):

  1. Bob installs the app, signs up.
  2. Bob's client generates an X25519 keypair: (bob_priv, bob_pub).
  3. Bob's client encrypts bob_priv with his master key (AES-GCM): enc_bob_priv = AES-256-GCM(bob_master_key, bob_priv).
  4. Bob's client uploads bob_pub (plaintext) and enc_bob_priv (ciphertext) to the server.
  5. Server stores both. Server can read bob_pub. Server cannot decrypt enc_bob_priv because it does not have Bob's master key.

Pack creation (Alice creates the pack):

  1. Alice's client generates a random 32-byte pack_data_key.
  2. Alice's client encrypts pack content (host entries, etc.) with pack_data_key using AES-GCM.
  3. Alice wraps pack_data_key to herself: she fetches her own alice_pub (or has it cached), generates an ephemeral X25519 keypair, computes a shared secret, derives a wrap key via HKDF, and wraps pack_data_key. The wrapped key, ephemeral public key, and salt are uploaded.
  4. Server stores: pack metadata, the encrypted entries (opaque), and Alice's wrapped key.

Granting Bob access to the pack:

  1. Alice's client decrypts pack_data_key for itself using her own wrapped copy.
  2. Alice's client fetches bob_pub from the server.
  3. Alice's client generates an ephemeral keypair (eph_priv, eph_pub), computes the shared secret shared = X25519(eph_priv, bob_pub).
  4. Alice derives a wrap key: wrap_key = HKDF-SHA256(shared, salt=pack_id, info="swb-pack-wrap-v1").
  5. Alice wraps the pack data key: wrapped = AES-256-GCM(wrap_key, pack_data_key).
  6. Alice uploads { pack_id, member_user_id=bob, ephemeral_pub=eph_pub, wrapped_data_key=wrapped }.
  7. Server stores the row in pack_members. Server still cannot decrypt anything — it has the wrapped key but not Bob's private key.

Bob receiving the pack:

  1. Bob logs in. His client downloads enc_bob_priv and decrypts it with his master key, recovering bob_priv in client memory.
  2. Bob's client lists his accessible packs from the server. The server returns the rows from pack_members where user_id = bob.
  3. For each pack, Bob's client computes shared = X25519(bob_priv, ephemeral_pub) (the same scalar multiplication from the other side — ECDH magic), derives the same wrap_key via HKDF, and unwraps pack_data_key.
  4. Bob's client downloads the pack's encrypted entries and decrypts them with pack_data_key.
  5. Bob now sees the hosts. The server still does not.

Revoking Bob's access:

  1. Alice tells the server "remove Bob from this pack". Server deletes Bob's row from pack_members. Bob's client will no longer be able to fetch the wrapped key for this pack.
  2. Important: Bob already had pack_data_key cached locally up to this point. We cannot reach into his device and unsay it. To prevent future entries in the pack from being readable to Bob, Alice rotates pack_data_key — she generates a new random key, re-encrypts all pack entries with it, and re-wraps it for all remaining members. Bob's old cache decrypts the old entries (which he already had) but not new ones.
  3. The audit log records the revocation with timestamp and acting admin.

This is a standard pattern (similar in shape to 1Password's secret sharing, Bitwarden's organization keys, and the Signal sender keys protocol). Nothing here is novel cryptography. Use a vetted X25519 implementation on each client: libsodium.js in the web dashboard, golang.org/x/crypto/curve25519 in the Go CLI, and the existing lib-vault-crypto path on Android. Per §3.12, none of this runs on the backend.
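The wrap/unwrap round trip can be sketched with Node's built-in X25519, HKDF, and AES-GCM. This is an illustration of the steps above, not the production code path; packId plays the HKDF-salt role from step 4.

```typescript
import {
  generateKeyPairSync, diffieHellman, hkdfSync,
  createCipheriv, createDecipheriv, randomBytes,
} from "node:crypto";

const packId = randomBytes(16);      // stands in for the pack_id used as HKDF salt
const packDataKey = randomBytes(32); // the random key Alice generated at pack creation

// Bob's long-term keypair. His private key reaches the server only encrypted
// under his master key; here it simply stays in this process.
const bob = generateKeyPairSync("x25519");

// Alice's side: fresh ephemeral keypair per grant, ECDH against bob_pub,
// HKDF to a wrap key, AES-GCM to wrap the pack data key.
const eph = generateKeyPairSync("x25519");
const sharedA = diffieHellman({ privateKey: eph.privateKey, publicKey: bob.publicKey });
const wrapKeyA = Buffer.from(hkdfSync("sha256", sharedA, packId, "swb-pack-wrap-v1", 32));
const nonce = randomBytes(12);
const c = createCipheriv("aes-256-gcm", wrapKeyA, nonce);
const wrapped = Buffer.concat([nonce, c.update(packDataKey), c.final(), c.getAuthTag()]);
// Server stores { ephemeral_pub, wrapped }; it holds neither private key.

// Bob's side: the same shared secret from the other half of the ECDH.
const sharedB = diffieHellman({ privateKey: bob.privateKey, publicKey: eph.publicKey });
const wrapKeyB = Buffer.from(hkdfSync("sha256", sharedB, packId, "swb-pack-wrap-v1", 32));
const d = createDecipheriv("aes-256-gcm", wrapKeyB, wrapped.subarray(0, 12));
d.setAuthTag(wrapped.subarray(wrapped.length - 16));
const unwrapped = Buffer.concat([d.update(wrapped.subarray(12, wrapped.length - 16)), d.final()]);

console.log(unwrapped.equals(packDataKey)); // true: Bob recovers the pack key; the server never can
```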

5.5 JWT and refresh token security

Access token.

  • HS256 signed with a 256-bit secret stored in the runtime environment (Railway secrets, never in code or git).
  • 15-minute lifetime.
  • Payload: { sub: user_id, iat, exp, ver }. ver is a token format version so we can change the schema without invalidating sessions.
  • Sent in the Authorization: Bearer <token> header.
  • No revocation list. Short lifetime is the revocation mechanism.
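A sketch of what minting and checking such a token involves. It is hand-rolled here only to make the moving parts visible; production code would use a vetted JWT library rather than this.

```typescript
import { createHmac, timingSafeEqual, randomBytes } from "node:crypto";

const b64u = (b: Buffer) => b.toString("base64url");

// HS256 JWT: base64url(header) . base64url(payload) . base64url(HMAC-SHA256).
function sign(payload: object, secret: Buffer): string {
  const head = b64u(Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })));
  const body = b64u(Buffer.from(JSON.stringify(payload)));
  const mac = createHmac("sha256", secret).update(`${head}.${body}`).digest();
  return `${head}.${body}.${b64u(mac)}`;
}

function verify(token: string, secret: Buffer): any {
  const [head, body, sig] = token.split(".");
  const mac = createHmac("sha256", secret).update(`${head}.${body}`).digest();
  if (!timingSafeEqual(mac, Buffer.from(sig, "base64url"))) throw new Error("bad signature");
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
  if (payload.exp <= Math.floor(Date.now() / 1000)) throw new Error("expired");
  return payload;
}

const secret = randomBytes(32); // in production: the 256-bit key from the environment
const now = Math.floor(Date.now() / 1000);
const token = sign({ sub: "user-123", iat: now, exp: now + 15 * 60, ver: 1 }, secret);
console.log(verify(token, secret).sub); // user-123
```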

Refresh token.

  • 256-bit random, stored hashed in the database (token_hash is SHA-256(token)).
  • 30-day lifetime.
  • Single-use. Each /auth/refresh call returns a new refresh token and marks the old one used.
  • Each refresh token has a family_id. When the client first logs in, we generate a new family. Each rotation reuses the family.
  • Theft detection: if a used refresh token is presented again, we invalidate the entire family (UPDATE refresh_tokens SET used = true WHERE family_id = ?). The user is logged out everywhere and must re-authenticate. This is the standard OAuth2 refresh token rotation pattern (RFC 6749 §10.4).
  • Stored in EncryptedSharedPreferences on Android, in the OS keychain for the CLI (Linux: Secret Service / libsecret, macOS: Keychain, Windows: Credential Manager), and in an HTTP-only secure cookie on the web dashboard.

Logout. Deletes the current refresh token and the entire family. Does not invalidate the access token (which expires in <15 min anyway).
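The rotation and family-revocation logic above can be sketched with an in-memory store standing in for the hashed refresh_tokens table:

```typescript
import { randomBytes, createHash } from "node:crypto";

type Row = { hash: string; familyId: string; used: boolean };
const rows: Row[] = []; // stands in for the refresh_tokens table
const sha256 = (t: string) => createHash("sha256").update(t).digest("hex");

function issue(familyId: string): string {
  const token = randomBytes(32).toString("base64url");
  rows.push({ hash: sha256(token), familyId, used: false }); // stored hashed, never raw
  return token;
}

function refresh(token: string): string {
  const row = rows.find((r) => r.hash === sha256(token));
  if (!row) throw new Error("unknown token");
  if (row.used) {
    // A spent token presented again means theft: kill the whole family.
    for (const r of rows) if (r.familyId === row.familyId) r.used = true;
    throw new Error("token reuse detected: family revoked");
  }
  row.used = true;            // rotation: the old token is spent...
  return issue(row.familyId); // ...and a fresh one joins the same family
}

const t1 = issue("fam-1");         // login creates a family
const t2 = refresh(t1);            // normal rotation: t1 spent, t2 live
try { refresh(t1); } catch {}      // attacker replays t1, whole family revoked
try { refresh(t2); } catch (e) { console.log((e as Error).message); }
// prints "token reuse detected: family revoked": even the newest token is dead
```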

5.6 API security

Rate limiting. @fastify/rate-limit plugin, applied globally with per-route overrides:

  • Login / register / refresh: 5 attempts per minute per IP, then exponential backoff.
  • Vault read: 60 requests per minute per user.
  • Vault write: 20 requests per minute per user.
  • Web terminal connect: 10 per minute per user.
  • All other authenticated endpoints: 120 per minute per user.

Rate limit state in Redis when we have one (Phase 2+); in-memory for Phase 1, which is fine for a single backend instance.
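Assuming the @fastify/rate-limit plugin named above, the tiers might wire up roughly like this. The route name and the per-user key generator are illustrative, not the final API.

```typescript
import Fastify from "fastify";
import rateLimit from "@fastify/rate-limit";

const app = Fastify();

// Global default tier; in-memory store for Phase 1.
app.register(rateLimit, {
  global: true,
  max: 120,
  timeWindow: "1 minute",
  // Hypothetical: key on user id once authenticated, on IP before that.
  keyGenerator: (req) => (req as any).userId ?? req.ip,
});

// Stricter per-route override for the auth endpoints.
app.post("/v1/auth/login", {
  config: { rateLimit: { max: 5, timeWindow: "1 minute" } },
}, async () => ({ ok: true }));
```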

Helmet headers. @fastify/helmet plugin:

  • Content-Security-Policy: strict, no inline scripts, no eval.
  • Strict-Transport-Security: max-age=63072000; includeSubDomains; preload.
  • X-Content-Type-Options: nosniff.
  • X-Frame-Options: DENY (the dashboard is not embeddable).
  • Referrer-Policy: no-referrer.

Input validation. Every Fastify route declares a JSON Schema for body, params, querystring, and response. Fastify enforces this at the framework level — invalid input is rejected with 400 before the handler runs. This is not optional. There is no route in the API without a schema.
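A hypothetical vault-write route showing the rule in practice; field names are illustrative, not the final API.

```typescript
import Fastify from "fastify";

const app = Fastify();

// Every route declares a schema; invalid bodies get a 400 before the handler runs.
const putEntrySchema = {
  body: {
    type: "object",
    required: ["id", "ciphertext", "version"],
    additionalProperties: false, // extra fields stripped or rejected, per Ajv config
    properties: {
      id: { type: "string", minLength: 1 },
      ciphertext: { type: "string" }, // opaque base64 blob; the server never inspects it
      version: { type: "integer", minimum: 1 },
    },
  },
  response: {
    200: { type: "object", properties: { version: { type: "integer" } } },
  },
};

app.put("/v1/vault/entries", { schema: putEntrySchema }, async (req) => {
  // Handler only ever sees schema-valid input.
  return { version: (req.body as { version: number }).version + 1 };
});
```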

CORS. Strict allowlist: only the dashboard origin and the marketing site origin. The CLI is not a browser, sends no Origin header, and is not subject to CORS at all. Wildcards are forbidden.

SQL injection. Prisma parameterizes all queries. We never construct raw SQL with string concatenation. The few places we need raw SQL (RLS policies, complex aggregations) use Prisma's $queryRaw with template literal binding, which is also safe.

XSS. React + Next.js auto-escapes output. We do not render dangerouslySetInnerHTML from user content. Markdown content (if we ever render it) goes through DOMPurify.

CSRF. API routes are JSON only and require Authorization: Bearer. Browsers cannot forge that header cross-origin without the cookie. We do not use cookie-based session auth for API routes, which sidesteps CSRF entirely. The dashboard uses cookies only for the refresh token, with SameSite=Strict and HttpOnly.

5.7 Database security: row-level security (RLS)

PostgreSQL row-level security adds a defense-in-depth layer below the application. Even if a route handler accidentally queries vault_entries without a WHERE owner_id = $current_user, RLS will prevent rows from leaking.

We enable RLS on every user-scoped table:

ALTER TABLE vault_entries ENABLE ROW LEVEL SECURITY;

CREATE POLICY vault_entries_owner ON vault_entries
    FOR ALL
    USING (owner_id = current_setting('app.current_user_id')::uuid);

The Prisma client sets app.current_user_id at the start of each request via SET LOCAL. Subsequent queries within the same transaction inherit it. RLS enforces it.

This is belt and braces. We still write the WHERE clauses in the application — RLS is the safety net for the case where we forget.

5.8 Secrets management

  • All secrets (database URL, JWT signing key, R2 access key, Stripe key, Resend key, Google Play service account JSON) live in Railway environment variables.
  • .env.example in the repo with placeholder values. .env.local is gitignored.
  • The Google Play service account JSON is base64-encoded into a single environment variable to avoid file management.
  • Rotating a secret is a Railway dashboard action plus a redeploy. No code changes.
  • We never log secrets. Pino's redaction config removes authorization, cookie, password, token, secret from log output.

5.9 Threat model

Threat: full database dump.

  • Attacker gets: emails, server-side Argon2id hashes (which are themselves hashes of derived auth keys, not of passwords), password salts, X25519 public keys, encrypted X25519 private keys, encrypted vault blobs, encrypted pack content, wrapped pack data keys, subscription metadata, hashed refresh tokens.
  • Attacker cannot: decrypt vault content, decrypt pack content, log in as any user (refresh tokens are hashed, access tokens have expired, password authentication requires the actual password to derive the auth key), unwrap any pack data keys (no private keys).
  • Best the attacker can do: an offline dictionary attack against the server-side Argon2id hash to recover auth_key, then a second offline Argon2id attack against the user salt to recover the password, then a third Argon2id derivation to compute the master key. Each step costs ~64 MiB and ~250 ms per password tried. Targeted attack against a high-value user is feasible if they used a weak password. Mass attack against the database is computationally infeasible — Argon2id is specifically designed for this.

Threat: full server compromise (attacker has shell on the production server).

  • Attacker gets: everything from the database dump scenario, plus the JWT signing key (so they can forge access tokens), plus the ability to capture future requests in real time, plus the ability to silently replace the JavaScript served to web dashboard users with a malicious version that exfiltrates the master key when the user next logs in.
  • This is the worst case. Mitigations:
    • Subresource integrity on the dashboard JavaScript, with assets served from a separate origin, so a swapped bundle is detectable by the browser (SRI alone does not help if the attacker also controls the HTML that carries the hashes).
    • Reproducible builds of the Android app and CLI so users can verify what they install.
    • Monitoring: alerting on unexpected process activity, unauthorized config changes, anomalous traffic patterns.
    • Transparency: a public security incident response policy.
  • We cannot make a server compromise invisible from the user's vault perspective in the long run. We can make it noisy, expensive, and detectable. That is the realistic ceiling.

Threat: compromised admin / rogue developer.

  • Same as full server compromise. The mitigations are organizational, not technical: principle of least privilege for production access, audit logging, two-person review for sensitive changes. For a solo dev: two-factor on Railway, two-factor on GitHub, signed commits, hardware security key.

Threat: phishing.

  • The dashboard URL is app.sshworkbench.com (or whatever we end up with). DNS controlled tightly. HSTS preload. Browser autofill on the canonical URL trains the user to suspect lookalikes.
  • We send signed transactional email so phishing emails impersonating us are easier to flag.
  • We never email a password reset link. Password reset is in-app only. Anyone receiving an email "click here to reset your SSH Workbench password" is being phished, full stop.

Threat: subpoena / legal compulsion.

  • We can be compelled to hand over data we have. We have ciphertext, public keys, and email addresses. We do not have the master keys, the private keys, or the means to decrypt anything. We cannot be compelled to produce what we do not possess.
  • We publish a transparency report annually noting any law enforcement requests and what we provided.

5.10 What we as developers can see (operationally)

It is important to be honest and explicit about what the operators of the service have visibility into, because users will ask and trust depends on the answer being clear:

| We can see | We cannot see |
|---|---|
| Email addresses | Passwords |
| Account creation date | Master keys |
| Login times and IPs | Vault contents (hostnames, usernames, ports) |
| Number of vault entries | Private SSH keys |
| Number of packs | Saved passwords |
| Pack metadata (count, last modified) | Pack names (encrypted) |
| Subscription status and billing info | Snippets, keyboard layouts, QB configs |
| Active web terminal session count | Which specific host a user is connected to via web terminal (host details are decrypted client-side and routed through the WebSocket; see §10) |
| Aggregate usage stats | Individual user behavior |

This list goes verbatim into the public privacy policy. If anything on the right column ever moves to the left, it is a breaking change to the trust model and must be announced.


6. Database schema (Prisma)

The full Prisma schema for Phase 1 + Phase 4. Phase 1 only needs the first half; the rest is added when its phase ships.

// prisma/schema.prisma

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

// ─────────────────────────────────────────────────────────────
// Phase 1 — Auth + Vault sync
// ─────────────────────────────────────────────────────────────

model User {
  id                          String   @id @default(uuid())
  email                       String   @unique
  emailVerifiedAt             DateTime?

  // Authentication
  passwordSalt                Bytes    // 16 bytes — used by client to derive both auth_key and master_key
  argon2Hash                  String   // server-side Argon2id of the client-derived auth_key

  // Zero-knowledge key material
  x25519PublicKey             Bytes    // 32 bytes, plaintext
  encryptedX25519PrivateKey   Bytes    // wrapped with master_key (AES-256-GCM)

  // Subscription
  subscriptionStatus          SubscriptionStatus @default(FREE)
  subscriptionTier            SubscriptionTier   @default(FREE)

  // Org membership (Phase 4)
  orgId                       String?
  org                         Org?     @relation(fields: [orgId], references: [id])
  orgRole                     OrgRole?

  // Relations
  vault                       Vault?
  ownedPacks                  Pack[]   @relation("PackOwner")
  packMemberships             PackMember[]
  refreshTokens               RefreshToken[]
  keyboardLayouts             KeyboardLayout[]
  subscriptions               Subscription[]
  ownedOrgs                   Org[]    @relation("OrgOwner")

  createdAt                   DateTime @default(now())
  updatedAt                   DateTime @updatedAt

  @@index([email])
}

enum SubscriptionStatus {
  FREE
  ACTIVE
  PAST_DUE
  CANCELED
  GRANDFATHERED
}

enum SubscriptionTier {
  FREE
  PRO
  TEAMS
}

enum OrgRole {
  ADMIN
  MEMBER
}

// One vault per user. Stores the entire encrypted vault as a single blob
// in Phase 1 for simplicity. Phase 2+ may shard into per-entry rows once
// the size warrants it.
model Vault {
  id            String   @id @default(uuid())
  userId        String   @unique
  user          User     @relation(fields: [userId], references: [id], onDelete: Cascade)

  encryptedBlob Bytes    // AES-256-GCM(master_key, vault_json)
  version       Int      @default(1)

  updatedAt     DateTime @updatedAt
  createdAt     DateTime @default(now())
}

model RefreshToken {
  id         String   @id @default(uuid())
  userId     String
  user       User     @relation(fields: [userId], references: [id], onDelete: Cascade)

  tokenHash  String   @unique  // SHA-256 of the plaintext token
  familyId   String              // for theft detection: invalidate the whole family on reuse
  used       Boolean  @default(false)
  expiresAt  DateTime
  createdAt  DateTime @default(now())

  @@index([userId])
  @@index([familyId])
}

// ─────────────────────────────────────────────────────────────
// Phase 2 — Web dashboard + KB editor
// ─────────────────────────────────────────────────────────────

model KeyboardLayout {
  id            String   @id @default(uuid())
  userId        String
  user          User     @relation(fields: [userId], references: [id], onDelete: Cascade)

  // Layout metadata that the server may need (display name) is stored
  // encrypted under the user's master key. The server cannot read the name.
  encryptedBlob Bytes    // contains { name, json_layout }

  version       Int      @default(1)
  updatedAt     DateTime @updatedAt
  createdAt     DateTime @default(now())

  @@index([userId])
}

// Pack metadata for Phase 2 (solo) and Phase 4 (team).
// In Phase 2 the owner is also the only member; pack_members has one row.
model Pack {
  id            String   @id @default(uuid())
  ownerId       String
  owner         User     @relation("PackOwner", fields: [ownerId], references: [id], onDelete: Cascade)

  // Pack name and entry list, encrypted under pack_data_key
  encryptedBlob Bytes
  version       Int      @default(1)

  members       PackMember[]

  createdAt     DateTime @default(now())
  updatedAt     DateTime @updatedAt

  @@index([ownerId])
}

model PackMember {
  packId           String
  pack             Pack    @relation(fields: [packId], references: [id], onDelete: Cascade)

  userId           String
  user             User    @relation(fields: [userId], references: [id], onDelete: Cascade)

  // X25519 wrapping of the pack data key for this user
  ephemeralPubKey  Bytes   // 32 bytes
  wrappedDataKey   Bytes   // AES-256-GCM(HKDF(X25519(eph_priv, user_pub)), pack_data_key)

  addedAt          DateTime @default(now())

  @@id([packId, userId])
  @@index([userId])
}

// ─────────────────────────────────────────────────────────────
// Phase 4 — Teams
// ─────────────────────────────────────────────────────────────

model Org {
  id              String   @id @default(uuid())
  name            String
  ownerId         String
  owner           User     @relation("OrgOwner", fields: [ownerId], references: [id])

  members         User[]
  packAssignments OrgPackAssignment[]
  invites         OrgInvite[]

  // Billing
  stripeCustomerId String?
  seatCount        Int     @default(1)

  createdAt       DateTime @default(now())
  updatedAt       DateTime @updatedAt
}

// Records which org members have access to which packs.
// PackMember (above) is the actual cryptographic record;
// this table is the human-facing assignment that drives PackMember row creation.
model OrgPackAssignment {
  orgId          String
  org            Org    @relation(fields: [orgId], references: [id], onDelete: Cascade)

  packId         String
  memberUserId   String

  assignedBy     String   // user_id of admin who granted
  assignedAt     DateTime @default(now())

  @@id([orgId, packId, memberUserId])
}

model OrgInvite {
  id        String   @id @default(uuid())
  orgId     String
  org       Org      @relation(fields: [orgId], references: [id], onDelete: Cascade)

  email     String
  token     String   @unique  // signed invite token sent in email
  role      OrgRole
  expiresAt DateTime
  acceptedAt DateTime?
  createdAt DateTime @default(now())

  @@index([email])
}

// ─────────────────────────────────────────────────────────────
// Billing — used from Phase 1 onwards
// ─────────────────────────────────────────────────────────────

model Subscription {
  id                  String   @id @default(uuid())
  userId              String
  user                User     @relation(fields: [userId], references: [id], onDelete: Cascade)

  provider            BillingProvider
  productId           String              // google_play sku or stripe price id
  status              String              // provider-specific status string
  currentPeriodEnd    DateTime?
  externalSubscriptionId String?         // Google Play purchase token or Stripe sub id

  rawWebhookPayload   Json?               // last received payload, for debugging

  createdAt           DateTime @default(now())
  updatedAt           DateTime @updatedAt

  @@index([userId])
  @@unique([provider, externalSubscriptionId])
}

enum BillingProvider {
  GOOGLE_PLAY
  STRIPE
}

A few notes on the schema:

  • Phase 1 only requires User, Vault, RefreshToken, Subscription. Everything else is added in its phase. Prisma migrations make this clean.
  • The single-blob vault in Phase 1 is a deliberate simplification. We do not need per-entry sync until packs ship in Phase 2. A single blob fits in a Postgres row up to about 1 MiB comfortably (we set a hard limit of 5 MiB at the API layer to keep things sane). If a user genuinely has more than 5 MiB of vault, we shard then.
  • encryptedBlob is Bytes, not Json, because it is an opaque binary AEAD ciphertext. The server must never try to parse it.
  • Row-level security policies are added in a separate migration that runs raw SQL after Prisma's schema migration. We script this in prisma/migrations/<timestamp>_enable_rls/migration.sql.
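The RLS migration referenced above is plain SQL appended after Prisma's own migration output. A sketch of what it might contain; the policy names are illustrative, the quoted table and column names depend on Prisma's default mapping (adjust with @@map if needed), and it assumes the API layer runs `SET LOCAL app.current_user_id = '<uuid>'` inside each transaction before touching these tables:

```sql
-- prisma/migrations/<timestamp>_enable_rls/migration.sql (sketch)
ALTER TABLE "Vault" ENABLE ROW LEVEL SECURITY;
ALTER TABLE "RefreshToken" ENABLE ROW LEVEL SECURITY;

-- Each policy compares the row's owner to a per-transaction setting
-- that the request handler establishes before querying.
CREATE POLICY vault_owner ON "Vault"
  USING ("userId" = current_setting('app.current_user_id'));

CREATE POLICY refresh_token_owner ON "RefreshToken"
  USING ("userId" = current_setting('app.current_user_id'));
```

This gives defense in depth: even a query with a missing WHERE clause cannot leak another user's rows.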

7. API design (REST)

The API is REST + JSON, served over HTTPS only, versioned by URL prefix (/v1/...). Phase 1 needs only the auth + vault routes; later phases add more.

7.1 Authentication

POST   /v1/auth/register
       Body: { email, auth_key (base64), password_salt (base64),
               x25519_public_key (base64), encrypted_x25519_private_key (base64) }
       Response: 201 { user_id }
       Side effect: sends email verification

POST   /v1/auth/verify
       Body: { token }
       Response: 200 { ok: true }

POST   /v1/auth/login
       Body: { email, auth_key (base64) }
       Response: 200 { access_token, refresh_token, user: {...},
                       password_salt, encrypted_x25519_private_key }

POST   /v1/auth/refresh
       Body: { refresh_token }
       Response: 200 { access_token, refresh_token }
       Errors: 401 if reused (invalidates entire family)

POST   /v1/auth/logout
       Headers: Authorization: Bearer <access>
       Body: { refresh_token }
       Response: 204

POST   /v1/auth/change-password
       Headers: Authorization: Bearer <access>
       Body: { new_auth_key, new_password_salt,
               new_encrypted_vault, new_encrypted_x25519_private_key }
       Response: 200 { ok: true }
       Note: client must re-encrypt vault locally before this call.
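The refresh endpoint above depends on rotation with family-level theft detection: each refresh marks the presented token as used and issues a successor in the same family, and any replay of an already-used token invalidates the entire family. A minimal sketch of that logic, with an in-memory array standing in for the RefreshToken table (the issue/rotate names are illustrative, not the real route handlers):

```typescript
import { createHash, randomBytes } from "node:crypto";

// In-memory stand-in for the RefreshToken table (illustration only).
type TokenRow = { tokenHash: string; familyId: string; used: boolean };
const rows: TokenRow[] = [];

const sha256 = (token: string) =>
  createHash("sha256").update(token).digest("hex");

// Issue a token in a family; only its SHA-256 hash is stored server-side.
function issue(familyId: string): string {
  const token = randomBytes(32).toString("base64url");
  rows.push({ tokenHash: sha256(token), familyId, used: false });
  return token;
}

// Rotate: consume the presented token and issue a successor.
// Replay of a consumed token means theft: burn the whole family.
function rotate(presented: string): string | null {
  const row = rows.find((r) => r.tokenHash === sha256(presented));
  if (!row) return null;
  if (row.used) {
    for (const r of rows) if (r.familyId === row.familyId) r.used = true;
    return null;
  }
  row.used = true;
  return issue(row.familyId);
}
```

In production the same two lookups run against Postgres inside a transaction, and the 401 on reuse forces both the legitimate client and the thief to reauthenticate.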

7.2 Vault (Phase 1)

GET    /v1/vault
       Headers: Authorization: Bearer <access>
       Response: 200 { encrypted_blob (base64), version, updated_at }
                 404 if no vault yet (new user)

PUT    /v1/vault
       Headers: Authorization: Bearer <access>
       Body: { encrypted_blob (base64), expected_version }
       Response: 200 { version, updated_at }
                 409 if expected_version does not match (concurrent write conflict)

Phase 1 vault sync is simple: the client uploads the entire encrypted blob, the server stores it. The expected_version field gives us optimistic concurrency to detect concurrent writes from multiple devices. On conflict, the client re-fetches, merges, and retries.
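That compare-and-swap is the whole conflict mechanism. A sketch of the core check, with an in-memory row standing in for the Vault table (with Prisma the equivalent is an `updateMany` filtered on both `userId` and `version`, treating an update count of zero as the 409 case):

```typescript
// In-memory stand-in for the Vault row (illustration only).
type VaultRow = { encryptedBlob: Buffer; version: number };

type PutResult =
  | { status: 200; version: number }
  | { status: 409; currentVersion: number };

// Optimistic concurrency: the write succeeds only when the client's
// expected_version matches the stored version. On 409 the client
// re-fetches, merges locally, and retries with the new version.
function putVault(
  row: VaultRow,
  blob: Buffer,
  expectedVersion: number
): PutResult {
  if (row.version !== expectedVersion) {
    return { status: 409, currentVersion: row.version };
  }
  row.encryptedBlob = blob;
  row.version += 1;
  return { status: 200, version: row.version };
}
```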

7.3 Packs (Phase 2+)

GET    /v1/packs
       Response: 200 { packs: [{ id, encrypted_blob, version, updated_at, member_count }] }

POST   /v1/packs
       Body: { encrypted_blob, wrapped_data_key_for_self }
       Response: 201 { pack_id }

GET    /v1/packs/:id
       Response: 200 { id, encrypted_blob, version, members (with wrapping), updated_at }

PUT    /v1/packs/:id
       Body: { encrypted_blob, expected_version }
       Response: 200 { version, updated_at }

DELETE /v1/packs/:id
       Response: 204

POST   /v1/packs/:id/members
       Body: { invite_email }                       // if user does not exist yet
              OR
              { user_id, ephemeral_pub, wrapped_data_key }   // if user exists
       Response: 201 { added: true } | { invite_sent: true }

DELETE /v1/packs/:id/members/:user_id
       Response: 204
       Note: client should rotate the pack data key and PUT new entries afterwards
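Member addition and post-removal key rotation both hinge on the client-side wrapping primitive recorded in PackMember (§6): ECDH against the member's X25519 public key with a fresh ephemeral keypair, HKDF to a wrapping key, AES-256-GCM over the pack data key. A sketch using Node's built-in crypto; the `swb-pack-wrap` HKDF label is hypothetical, and for brevity keys travel as KeyObjects / SPKI DER rather than the raw 32-byte format the schema stores:

```typescript
import {
  KeyObject, createCipheriv, createDecipheriv, createPublicKey,
  diffieHellman, generateKeyPairSync, hkdfSync, randomBytes,
} from "node:crypto";

// Wrap a pack data key for one member (runs on the admin's client).
function wrapDataKey(memberPub: KeyObject, dataKey: Buffer) {
  const eph = generateKeyPairSync("x25519"); // fresh per wrapping
  const shared = diffieHellman({ privateKey: eph.privateKey, publicKey: memberPub });
  const wrapKey = Buffer.from(hkdfSync("sha256", shared, Buffer.alloc(0), "swb-pack-wrap", 32));
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", wrapKey, iv);
  const ct = Buffer.concat([cipher.update(dataKey), cipher.final()]);
  return {
    // Stored as PackMember.ephemeralPubKey / wrappedDataKey
    ephemeralPubKey: eph.publicKey.export({ type: "spki", format: "der" }) as Buffer,
    wrappedDataKey: Buffer.concat([iv, ct, cipher.getAuthTag()]),
  };
}

// Unwrap on the member's client using their X25519 private key.
function unwrapDataKey(memberPriv: KeyObject, ephPubDer: Buffer, wrapped: Buffer): Buffer {
  const ephPub = createPublicKey({ key: ephPubDer, format: "der", type: "spki" });
  const shared = diffieHellman({ privateKey: memberPriv, publicKey: ephPub });
  const wrapKey = Buffer.from(hkdfSync("sha256", shared, Buffer.alloc(0), "swb-pack-wrap", 32));
  const iv = wrapped.subarray(0, 12);
  const tag = wrapped.subarray(wrapped.length - 16);
  const ct = wrapped.subarray(12, wrapped.length - 16);
  const d = createDecipheriv("aes-256-gcm", wrapKey, iv);
  d.setAuthTag(tag);
  return Buffer.concat([d.update(ct), d.final()]);
}
```

The server only ever stores the two outputs of wrapDataKey; it never holds the pack data key or any private key.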

7.4 Keyboard layouts (Phase 2+)

GET    /v1/keyboard-layouts
POST   /v1/keyboard-layouts
PUT    /v1/keyboard-layouts/:id
DELETE /v1/keyboard-layouts/:id

Same shape as packs — opaque encrypted blobs with versioning.

7.5 Web SSH terminal (Phase 2)

WebSocket  wss://api.sshworkbench.com/v1/terminal
           Headers: Authorization: Bearer <access>
           Initial message from client: { host_details: {...} }
                                         (decrypted client-side, sent over TLS)
           Subsequent messages: terminal bytes (binary frames)

See §10 for the security model and the explicit zero-knowledge exception.

7.6 Subscriptions (Phase 1+)

POST   /v1/subscriptions/google-play/verify
       Body: { purchase_token, product_id, package_name }
       Response: 200 { status, current_period_end }
       Backend calls Google Play Developer API to verify the receipt server-side.

POST   /v1/subscriptions/stripe/webhook
       Body: Stripe webhook payload
       Headers: Stripe-Signature
       Response: 200 (must be idempotent — Stripe retries)

7.7 OpenAPI

Fastify schemas are exported as an OpenAPI 3 document at /v1/openapi.json (in development only) and committed as shared/openapi.yaml for the CLI and web client to consume. Both clients use generated code from this file.


8. Phase plan

Each phase is independently shippable. A phase only starts after the prior phase has real users.

Phase 1 — Auth + Vault sync (6-8 weeks solo)

Goal: vault backup and multi-device sync. No web dashboard, no packs, no teams.

Backend deliverables:

  • Fastify project skeleton with TypeScript, ESLint, Prettier, Vitest.
  • Prisma schema with User, Vault, RefreshToken, Subscription tables and migrations.
  • Routes: /v1/auth/{register,login,refresh,logout,change-password,verify}, /v1/vault GET/PUT, /v1/subscriptions/google-play/verify.
  • Email verification flow via Resend.
  • Server-side Google Play receipt verification.
  • Rate limiting, helmet, input validation, CORS.
  • RLS policies on Vault and RefreshToken.
  • Local Docker Compose dev environment.
  • Deployment to Railway with managed Postgres.
  • CI on GitHub Actions: lint, typecheck, test on PR.
  • Health check endpoint and basic Pino logging.

Mobile deliverables:

  • "Sign in" / "Create account" screens in the existing app.
  • Argon2id derivation of auth_key and master_key (reuse lib-vault-crypto, add a salt-domain-tag wrapper).
  • X25519 keypair generation on signup (use libsodium for Android — org.libsodium:libsodium-android).
  • Vault encryption + upload on every local mutation, debounced.
  • Vault download + decrypt on app start when logged in.
  • Conflict resolution UI (version mismatch → "another device made changes, merge?").
  • Offline mode: queue uploads, replay on reconnect.

Acceptance: A user signs up on phone A, adds 5 hosts, signs in on phone B, sees the same 5 hosts. They edit a host on phone B, the change appears on phone A within seconds. They go offline on phone A, edit, come back online, the change uploads. They lose phone A entirely; they buy phone C, sign in, vault is restored.

Value to ship: vault backup, multi-device sync. This alone is a credible Pro feature.

Not in this phase: packs, web dashboard, CLI, teams, keyboard layout sync, web terminal, Stripe.

Phase 2 — Web dashboard + KB editor + Solo Packs (8-10 weeks)

Goal: the visual keyboard layout editor that justifies a Pro subscription on the web. Solo packs as a bonus.

Backend deliverables:

  • New routes: /v1/keyboard-layouts/*, /v1/packs/* (solo flow only — owner is the only member), /v1/subscriptions/stripe/webhook.
  • Stripe integration for web signups.
  • Pack data key wrapping endpoint (server stores blobs; the wrapping happens client-side).

Web deliverables:

  • Next.js 14 project on Vercel.
  • Auth flow (login, signup, refresh) with the same wire protocol as the mobile app.
  • WebCrypto + argon2-browser for client-side key derivation.
  • Vault decryption in the browser, displayed in a list UI.
  • Visual keyboard layout editor (drag-and-drop key cells, live preview, save as encrypted blob). This is the headline feature.
  • Pack composer (drag hosts into named groups, save as encrypted blob).
  • Billing screen with Stripe Checkout.
  • Account settings: change password, change email, log out everywhere, export .swb.
  • Mobile-responsive layout (see §12 for the hard requirements).

Mobile deliverables:

  • Pack subscription UI: list packs, subscribe, see synced entries.
  • Keyboard layout sync: download user's layouts from cloud, apply.

Value to ship: the web dashboard with the KB editor is the main pro hook. Real subscription revenue starts here.

Phase 3 — CLI tool (4-6 weeks)

Goal: desktop and server users can swb connect from their terminal.

Deliverables:

  • Go module under cli/.
  • swb login, swb sync, swb list, swb connect <name>, swb export, swb import, swb logout.
  • Argon2id + AES-256-GCM in Go, byte-compatible with lib-vault-crypto.
  • OS keychain integration for refresh token storage.
  • Cross-compilation in CI for linux/amd64, linux/arm64, darwin/arm64, darwin/amd64, windows/amd64.
  • GitHub Releases with signed binaries.
  • Homebrew tap.
  • Test vector cross-compatibility test in CI: load a .swb produced by Android, decrypt in Go, assert equality.

Value to ship: no more "Termius is closed and limited and I want a real CLI". This is differentiator territory.

Phase 4 — Teams (10-12 weeks)

Goal: B2B revenue via per-seat billing.

Backend deliverables:

  • New tables: Org, OrgInvite, OrgPackAssignment. (PackMember was already added in Phase 2 but only ever had one row per pack.)
  • Org creation, invite, accept, member removal endpoints.
  • Per-seat billing via Stripe (subscription quantity = seat count).
  • Audit log table (append-only, admin-visible).

Web deliverables:

  • Team management screen: invite by email, see pending invites, see active members and their pack assignments, remove members.
  • Pack ACL UI: assign pack to specific members.
  • Audit log viewer.
  • Mobile-responsive (see §12).

Mobile/CLI deliverables:

  • X25519 wrapping flow: when admin adds a member, admin's client wraps pack_data_key for the new member.
  • Member's client unwraps on next sync.

Value to ship: B2B revenue, higher ARPU. Build only when there is real demand from companies asking.


9. Billing

9.1 Two providers, one truth

Mobile signups go through Google Play Billing (the existing in-app billing flow). Web signups go through Stripe. Both write to the same Subscription table. The provider and externalSubscriptionId columns let us disambiguate.

The user has a single subscription tier (Free / Pro / Teams) regardless of where they paid. If they have an active subscription via either provider, they get the features. We do not double-charge.

9.2 Google Play receipt verification

When the mobile app completes a purchase, it sends the purchase token to POST /v1/subscriptions/google-play/verify. The backend:

  1. Calls the Google Play Developer API (androidpublisher.purchases.subscriptionsv2.get) with a service account credential.
  2. Verifies the purchase is ACKNOWLEDGED, not refunded, not in a grace period that has elapsed.
  3. Acknowledges the purchase if not already acknowledged (Google requires this within 3 days).
  4. Upserts a Subscription row keyed on (provider, externalSubscriptionId).
  5. Updates User.subscriptionStatus and User.subscriptionTier.

The service account JSON is stored as a base64-encoded environment variable.
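Step 5's mapping from Google's subscription state to our enum can live in one pure function. A sketch; the SUBSCRIPTION_STATE_* values follow the androidpublisher subscriptionsv2 response as understood at writing time, so verify them against the current API reference before relying on this:

```typescript
type SubscriptionStatus = "FREE" | "ACTIVE" | "PAST_DUE" | "CANCELED";

// Collapse Google Play's subscription state into our coarse enum.
// Unknown states downgrade safely; the nightly reconciliation job
// (§9.4) corrects any drift.
function mapPlayState(subscriptionState: string): SubscriptionStatus {
  switch (subscriptionState) {
    case "SUBSCRIPTION_STATE_ACTIVE":
    case "SUBSCRIPTION_STATE_IN_GRACE_PERIOD": // access continues during grace
      return "ACTIVE";
    case "SUBSCRIPTION_STATE_ON_HOLD":
    case "SUBSCRIPTION_STATE_PAUSED": // judgment call: treat as past due
      return "PAST_DUE";
    case "SUBSCRIPTION_STATE_CANCELED":
    case "SUBSCRIPTION_STATE_EXPIRED":
      return "CANCELED";
    default:
      return "FREE";
  }
}
```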

9.3 Stripe webhooks

Stripe pushes subscription lifecycle events to POST /v1/subscriptions/stripe/webhook. The handler:

  1. Verifies the Stripe-Signature header against the signing secret. Reject if invalid.
  2. Parses the event type (customer.subscription.created, .updated, .deleted, invoice.payment_failed, etc.).
  3. Idempotency: every Stripe event has a unique id. The handler records processed event IDs in a small table and short-circuits duplicates. Stripe retries on 5xx, so the handler must be safe to run twice.
  4. Updates the Subscription row and the User.subscriptionStatus.
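The idempotency gate in step 3 can be as small as a record-if-absent check on the event ID before any side effects run. A sketch with an in-memory set standing in for the processed-events table; in production, an INSERT into a table with a unique index on the event ID is what makes the check atomic across instances:

```typescript
// Processed Stripe event IDs (a small Postgres table in production).
const seen = new Set<string>();
let updatesApplied = 0; // stands in for the Subscription row update

// Stripe retries delivery on 5xx, so this must be safe to run twice:
// the second delivery of the same event is a no-op.
function handleStripeEvent(event: { id: string; type: string }): "processed" | "duplicate" {
  if (seen.has(event.id)) return "duplicate";
  seen.add(event.id);
  // ...update Subscription row and User.subscriptionStatus here...
  updatesApplied += 1;
  return "processed";
}
```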

9.4 Periodic reconciliation

A cron job (Railway scheduled task) runs nightly to:

  • Re-verify all Google Play subscriptions (catch refunds and chargebacks that did not webhook).
  • Reconcile any drift between Subscription.status and User.subscriptionStatus.
  • Downgrade users whose subscription has ended past the grace period.

This is the safety net for missed webhooks.


10. Web SSH terminal — the one zero-knowledge exception

The web terminal is the single explicit exception to the zero-knowledge model in this entire architecture. It exists because users want it, and a browser cannot open a raw TCP socket. This section documents the exception honestly.

10.1 What happens

  1. The user clicks "Open in web terminal" on a host in the dashboard.
  2. The browser already has the host's plaintext details — it decrypted the vault locally to display the connection list. Hostname, port, username, private key (or password) are in JavaScript memory.
  3. The browser opens an authenticated WebSocket to wss://api.sshworkbench.com/v1/terminal.
  4. The browser sends an initial JSON message: { host, port, username, auth: { type: "key" | "password", value: "..." } }. This message contains the plaintext credentials necessary to open the SSH session.
  5. The backend receives this message, spawns a node-pty PTY into an ssh subprocess (or uses the ssh2 library in-process), authenticates to the target host, and begins relaying bytes over the WebSocket.
  6. xterm.js in the browser displays the session.
  7. On disconnect (user closes tab, network drops, server kills idle session), the backend tears down the SSH connection and the PTY.

10.2 What this means for zero-knowledge

The backend, for the duration of an active web terminal session, holds the plaintext credentials for the connection in process memory. They are not written to disk, not logged, and not stored in the database. They exist for the lifetime of the WebSocket and are zeroed when the connection closes.

This is a real exception to the "server never sees plaintext credentials" promise we make for everything else. We document it in three places:

  1. The privacy policy. Clearly: "When you use the web terminal feature, your connection credentials are sent to our servers over an encrypted channel and held in memory for the duration of the session. They are never written to disk and are erased when the session ends."
  2. The dashboard UI. A small notice on the "Open in web terminal" button: "This sends your connection details to our server for the duration of the session. Use the desktop CLI or mobile app for a fully local experience."
  3. The marketing site security page. Same explanation.

10.3 Why we accept this

  • The alternative is no web terminal feature, which the market expects.
  • The alternative-alternative is a WASM SSH client in the browser. This is technically possible (see Mozilla's wasi-ssh, or the ssh2 library compiled to WASM) but the user experience is significantly worse — slow handshakes, no support for hardware tokens, no support for SSH agent forwarding, larger bundle size, more bugs. We may revisit this in the future.
  • The mitigation is that the credentials live only in process memory of one stateless backend instance for at most the session lifetime. A server compromise during an active session can intercept that one user. A server compromise at any other time gives an attacker nothing useful.
  • Users who care strongly about this can simply not use the web terminal feature and use the mobile app or swb CLI instead. Both are fully zero-knowledge.

10.4 Implementation details

  • The WebSocket endpoint runs only on Railway, never on Vercel. Vercel's serverless model cannot host long-lived WebSockets.
  • Each session is handled in a child process (child_process.spawn for ssh or a Worker thread for ssh2) so a hostile or runaway session cannot trivially exhaust the main backend process.
  • Idle timeout: 30 minutes of no terminal activity → server-side disconnect.
  • Session-level resource limit: max concurrent terminal sessions per user (10 by default) to prevent abuse.
  • Session bytes are visible only at trace log level, which is disabled in production; they are never stored or forwarded.
  • The Pino logger's redaction strips the initial auth.value field from any logged message at all levels.
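Beyond pino's path-based redaction, the terminal handler can defensively scrub the initial message before it reaches any logger. A minimal sketch over the message shape from §10.1 (the function name is illustrative):

```typescript
// Returns a copy of the terminal's initial message with the credential
// replaced, leaving the original object untouched for the SSH handshake.
function redactInitialMessage(msg: unknown): unknown {
  if (typeof msg !== "object" || msg === null) return msg;
  const copy: Record<string, unknown> = { ...(msg as Record<string, unknown>) };
  if (typeof copy.auth === "object" && copy.auth !== null) {
    copy.auth = { ...(copy.auth as Record<string, unknown>), value: "[REDACTED]" };
  }
  return copy;
}
```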

11. CLI tool design

11.1 Commands

swb login [--server <url>]              Authenticate. Stores refresh token in OS keychain.
swb logout                              Clears local credentials.
swb sync                                Pulls latest vault from cloud, decrypts locally.
swb list [--pack <name>]                Lists hosts.
swb connect <name|id>                   Decrypts host, execs system ssh.
swb export [--pack <name>] <file.swb>   Exports a vault snapshot or pack.
swb import <file.swb>                   Imports from a .swb file.
swb keys list
swb keys generate ed25519 --name <n>    Generates a keypair locally.
swb status                              Shows account status, sync state, subscription tier.
swb version

11.2 Local config

~/.config/swb/
├── config.json                   # server URL, last sync timestamp
├── vault.cache.enc               # encrypted vault cache (decrypted at command time)
└── refresh-token.enc             # additional fallback if OS keychain unavailable

config.json is plaintext (no secrets). vault.cache.enc is the same AES-256-GCM blob format as the cloud vault — the master key is derived from the user password each time a command runs that needs decryption (swb list, swb connect). This means commands prompt for the password unless SWB_PASSWORD is set in the environment (for scripting) or a session token cache is implemented (Phase 3 v2).

11.3 swb connect flow

$ swb connect prod-web-01
Password: ********
[swb] decrypted host: web1.example.com:22
[swb] decrypted key: ed25519 (work-key)
[swb] writing temp key to /run/user/1000/swb-tmp-Xa9k.key (mode 0600)
[swb] exec: ssh -i /run/user/1000/swb-tmp-Xa9k.key -p 22 deploy@web1.example.com
... interactive ssh session ...
[swb] removing temp key
$

The temp key lives in $XDG_RUNTIME_DIR (tmpfs on most Linux distros), is mode 0600, and is removed in a deferred cleanup when the SSH process exits. On macOS we use /tmp with the same mode. On Windows we use %LOCALAPPDATA%\Temp with restricted ACLs.

A v2 mode implements an SSH agent (SSH_AUTH_SOCK) so the key is never written to disk at all. Use golang.org/x/crypto/ssh/agent.

11.4 Cross-compatibility with Android

The CLI must read and write .swb files produced by the Android app and vice versa. The contract is:

  • Test vectors in shared/test-vectors/ contain .swb files produced by lib-vault-crypto JNI on Android, with a known plaintext (in *.plaintext.json) and a known password (in *.password.txt).
  • CI on the CLI repo decrypts each test vector and asserts equality with the plaintext.
  • CI on the Android app does the inverse: takes a .swb produced by the Go CLI, decrypts via JNI, asserts equality.
  • Both directions must pass on every commit. This is the contract that keeps the two implementations in sync.

The Argon2id parameters, AES-GCM nonce length, tag length, header byte layout, and JSON serialization of the plaintext payload are all fixed and documented in lib-vault-crypto/README.md as the canonical spec.


12. Web dashboard mobile experience

The web dashboard must be fully responsive and usable on mobile browsers. This is not optional, and it is not a nice-to-have to be added later. Admins will frequently need to manage teams from their phone when they are away from a desk: revoke an employee's access from the airport, add a host on a train, check who is connected during an incident. If the dashboard is desktop-only, the dashboard is broken.

12.1 Design rules

  • Mobile-first. Build every screen at the 390px viewport (iPhone 15) and scale up. Do not build a desktop layout and squeeze it down. The two layouts produce different decisions (sidebar vs bottom nav, multi-column vs stacked, hover affordances vs tap targets) and the squeeze-down approach always loses.
  • Bottom navigation bar on mobile, sidebar on desktop. The breakpoint is md (768px) in Tailwind. Below it, a fixed bottom nav with the four main sections (Vault, Packs, Team, Account). Above it, a sidebar.
  • Tap targets are at least 44×44 pt (the iOS HIG recommendation). shadcn/ui defaults are usually fine but icon-only buttons need explicit padding.
  • No hover-only affordances. Anything that reveals on hover must also reveal on tap.
  • Forms use the right input types. inputmode="email", inputmode="numeric", autocomplete="...", autocorrect="off" on credential fields. No tiny date pickers.
  • shadcn/ui components are responsive by default, but every critical flow must be explicitly tested at 390px during development. Storybook with viewport addons makes this enforceable.

12.2 Critical flows that must work perfectly on mobile

  1. Login. Including TOTP if enabled. Including biometric WebAuthn if enabled.
  2. View org members and their pack assignments. Read-only is the common case on mobile.
  3. Revoke a member's access. This is the urgent case (lost device, employee leaving). It must be reachable in two taps from the dashboard home, and it must complete in one screen with confirmation.
  4. Add or remove a host from a pack. Swipe-to-delete on the list, plus an explicit "Add host" button.
  5. Invite a new member by email. Single text field, role dropdown, send button.
  6. View active web terminal sessions (read-only on mobile is fine). For an admin to see "is anyone connected via web terminal right now?".
  7. Change own password. Including the full vault re-encryption flow, with a clear progress indicator because this can take a few seconds.
  8. Manage subscription / billing. Updating payment method via Stripe Checkout, viewing invoices, cancelling.

Each of these has an automated browser test running at 390px width as part of CI. A regression in mobile usability fails the build.

12.3 The one exception: visual keyboard layout editor

The visual keyboard layout editor is a complex drag-and-drop canvas. Building this for mobile is not worth the effort. On mobile we show:

"The keyboard layout editor needs more screen space than a phone has. Open this page on a desktop or tablet browser to edit your layouts."

…with a button to email a "Continue on desktop" magic link to the user's address. A simplified read-only list of saved layouts is acceptable for mobile (so the admin can see what layouts exist), but no editing.

This is the only screen where we accept "use desktop" as the answer. Every other screen must work on a phone.

12.4 Tooling

  • Tailwind responsive utilities (sm:, md:, lg:) for layout breakpoints.
  • shadcn/ui components for the building blocks (already responsive).
  • Headless UI for any custom interactive components we build (focus management, ARIA).
  • Playwright with mobile viewport configurations in CI.
  • Lighthouse mobile audit on every dashboard release, target Performance ≥85 and Accessibility ≥95.
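
One way to make the 390px rule mechanical in CI is to run every Playwright spec in both a mobile and a desktop project. A sketch of playwright.config.ts; the project names and device choices are assumptions, not settled decisions:

```typescript
// playwright.config.ts — sketch. Every spec runs in both projects, so a flow
// that regresses at 390px fails CI even if it still passes on desktop.
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      name: "mobile-390",
      // iPhone 12 preset already implies touch + mobile UA; pin the width explicitly.
      use: { ...devices["iPhone 12"], viewport: { width: 390, height: 844 } },
    },
    {
      name: "desktop",
      use: { ...devices["Desktop Chrome"] },
    },
  ],
});
```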

13. Development setup

13.1 Local environment

docker-compose.yml:
  postgres:    image postgres:16, port 5432, volume for data
  backend:     build ./backend, depends on postgres, port 3000
  web:         build ./web, port 3001 (uses backend at localhost:3000)
  mailhog:     image mailhog/mailhog, SMTP 1025, web UI 8025 (catches outbound dev email; in dev the mailer sends via SMTP to localhost instead of calling Resend)

pnpm install in backend/ and web/ (pnpm because it is faster and uses less disk than npm/yarn). go mod tidy in cli/.

docker compose up brings the stack up. Migrations run automatically on backend start (prisma migrate deploy).

13.2 Environment variables

.env.example is committed with placeholders. .env.local is gitignored. Backend reads from process env at startup, fails fast on any missing required variable.

DATABASE_URL=postgresql://swb:swb@localhost:5432/swb_dev
JWT_SECRET=<256-bit random>
GOOGLE_PLAY_SERVICE_ACCOUNT_JSON_BASE64=<...>
STRIPE_SECRET_KEY=sk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...
RESEND_API_KEY=re_...
R2_ACCOUNT_ID=...
R2_ACCESS_KEY_ID=...
R2_SECRET_ACCESS_KEY=...
R2_BUCKET=swb-vaults
WEB_ORIGIN=http://localhost:3001
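
Fail-fast loading is a few lines of plain TypeScript at startup. A sketch, assuming the variable names from .env.example above (a schema library such as zod would work equally well):

```typescript
// Required variables, mirroring .env.example. Checked once at process start.
const REQUIRED = [
  "DATABASE_URL",
  "JWT_SECRET",
  "STRIPE_SECRET_KEY",
  "STRIPE_WEBHOOK_SECRET",
  "RESEND_API_KEY",
  "R2_BUCKET",
  "WEB_ORIGIN",
] as const;

type EnvKey = (typeof REQUIRED)[number];

export function loadEnv(
  source: Record<string, string | undefined>,
): Record<EnvKey, string> {
  const missing = REQUIRED.filter((key) => !source[key]);
  if (missing.length > 0) {
    // Fail at startup, not at first use deep inside a request handler.
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  const out = {} as Record<EnvKey, string>;
  for (const key of REQUIRED) out[key] = source[key]!;
  return out;
}
```

Called once with process.env; anything downstream receives the typed, validated object rather than reading process.env directly.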

13.3 CI

GitHub Actions, three workflows:

  1. backend.yml: lint, typecheck, test, prisma validate, run on every PR touching backend/**.
  2. web.yml: lint, typecheck, test, Playwright mobile + desktop viewports, run on every PR touching web/**.
  3. cli.yml: go test, cross-compile to all five targets, run cross-compatibility tests against test vectors.

Branch protection on main: all workflows must pass, one approving review (or self-merge for solo dev with squash commits and signed commits required).

13.4 Deployment

  • Backend → Railway. GitHub integration, auto-deploys main branch. Run prisma migrate deploy as a release command. Health check at /health returning { status: "ok", version, uptime }.
  • Web frontend → Vercel. GitHub integration, auto-deploys main. Preview deployments on every PR.
  • CLI → GitHub Releases. Tag-triggered workflow builds all five binaries and uploads them with checksums and signatures.
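
The health check contract is small enough to pin down as a pure function that the Fastify route delegates to. A sketch; sourcing the version from an APP_VERSION env var is an assumption:

```typescript
// Builds the /health response body described above. Pure function so it is
// trivially testable; the Fastify route handler just calls it.
export function healthPayload(
  startedAtMs: number,
  nowMs: number,
  version: string,
): { status: "ok"; version: string; uptime: number } {
  return {
    status: "ok",
    version, // e.g. process.env.APP_VERSION ?? "dev" (assumed source)
    uptime: Math.floor((nowMs - startedAtMs) / 1000), // seconds since boot
  };
}
```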

13.5 Observability

  • Logs: Pino → stdout → Railway log retention (30 days). Structured JSON.
  • Metrics: Railway built-in CPU/RAM/request counts. Add Prometheus exporter when we outgrow it.
  • Errors: Sentry SDK in backend and web. Source maps uploaded on deploy. Alerts configured for error rate spikes.
  • Uptime: Better Stack (or any equivalent) pinging /health every minute. SMS alerts if down.
  • No APM (Datadog / New Relic) until we have real load. Premature.

14. What NOT to build yet

These are deliberately deferred. Each one is a tempting time sink. Resist.

  • Mobile push notifications. Nice-to-have. Requires APNs/FCM integration, certificate management, per-user device tokens, opt-in flow. Add when there is a concrete use case (e.g. "your subscription is about to expire" — but Stripe and Google Play already email this).
  • Audit logs (general). Add only in the Teams phase, where admins need to see "who did what". Solo users do not need this. Keep audit log scope narrow (admin actions only) to avoid storing user behavior.
  • SSO / SAML / SCIM. Enterprise-only. Adds significant complexity. Defer until a customer with real money asks for it. Then build it as part of an Enterprise tier with custom pricing.
  • Self-hosted backend / on-prem option. Strong differentiator on paper, massive support burden in practice ("it does not work on our weird network", "the upgrade broke our setup", "can you patch this bug we hit on Postgres 12"). Only consider this if a single customer pays enough to justify a dedicated support tier.
  • Native desktop apps (Mac/Windows/Linux GUI). The CLI plus the web terminal cover the desktop use case. Building Electron apps is a multi-month side project that competes with the web dashboard for engineering time.
  • A mobile app for iOS. SSH Workbench is Android-first. iOS is a second app, second store, second review process, second crypto stack, second UI codebase. Defer until the Android side is profitable enough to fund a native iOS rebuild (or until the team grows beyond solo).
  • End-to-end encrypted chat / collaboration features. Tempting to reason "since we already have crypto…", but no. Stay focused. SSH Workbench is not Slack.
  • AI features. No. Not yet. Maybe in two years if there is a compelling concrete use case.
  • Marketplace for shared keyboard layouts / packs. Interesting idea, lots of moderation overhead, no clear revenue. Defer.
  • Custom branded vault domains for enterprise. Defer until there is a real enterprise customer.
  • GraphQL. REST is enough. GraphQL would be a third API style alongside REST and the WebSocket. Resist.
  • Microservices. Single Node process until it actually cannot scale. That will be a long time.

The general principle: say no to everything that is not in the current phase. Write down the deferred ideas (this section is the deferred-ideas list). Revisit them when the current phase ships.


15. Risk and complexity — honest assessment

15.1 The X25519 team key wrapping is the most complex piece

The wrapping protocol in §5.4 is correct in the abstract but easy to subtly mis-implement. Specific risks:

  • Wrong HKDF info string → keys derive correctly today but are incompatible with a future client that uses a different string. Pin the info string as a constant in shared/ and use it in all three client implementations (the backend never derives keys itself, but the string appears in the cross-compatibility test vectors).
  • Nonce reuse in pack data key encryption → catastrophic confidentiality break for AES-GCM. We must use random 12-byte nonces per encryption, never derived nonces.
  • Forgetting to rotate pack_data_key on member removal → revoked member can decrypt new entries until rotation. Make rotation a server-side enforced step, not a client-side good-intention step: when the server processes a member removal, it bumps a data_key_version field on the pack, and the client refuses to upload new entries until it has performed the rotation.
  • Implementing X25519 from scratch → do not. Use libsodium on Node and Android. Use golang.org/x/crypto/curve25519 on the Go CLI. These are vetted.
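
Two of the rules above (the pinned info string and fresh random nonces) can be sketched with Node's built-in crypto. The constant value and function names here are illustrative, not the real protocol strings; the production implementations use libsodium and golang.org/x/crypto as noted:

```typescript
import { randomBytes, createCipheriv } from "node:crypto";

// Pinned protocol constant — must match byte-for-byte in the Android, CLI, and
// web implementations. (The value here is illustrative, not the real string.)
export const PACK_KEY_HKDF_INFO = "swb/pack-data-key/v1";

export function encryptPackEntry(packDataKey: Buffer, plaintext: Buffer) {
  // Fresh random 96-bit nonce for every single encryption. Never derive or
  // reuse a nonce: AES-GCM nonce reuse under one key is a catastrophic break.
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", packDataKey, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { nonce, ciphertext, tag: cipher.getAuthTag() };
}
```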

Mitigation: before shipping Phase 4 (Teams), pay for a security review of the wrapping protocol from a real cryptographer. Cure53, Trail of Bits, NCC Group all do small targeted reviews. Budget: $5-15k for a focused 1-week review of the wrapping protocol and its three implementations. This is the single most important QA spend on the entire backend.

15.2 The web SSH terminal is the one zero-knowledge exception

Documented in §10. The risk is not technical (the implementation is straightforward) but trust-related. Users who care about zero-knowledge will notice the gap. The mitigation is honesty: document it clearly in the privacy policy, the dashboard UI, and the marketing site. Do not hide it. Users who care can use the mobile app or CLI instead.

15.3 Google Play + Stripe billing sync is fiddly

Two billing systems writing to one Subscription table is a recipe for race conditions and double-charging. The mitigations:

  • Idempotent webhook handlers keyed on (provider, externalSubscriptionId, eventId). Process each event at most once.
  • Reconciliation cron nightly to catch missed events.
  • Test mode coverage: Stripe has a test mode and Google Play has a test track. Use both. Run test purchases and refunds end-to-end before every billing-related deploy.
  • Error budget: have a runbook for "user paid but does not have access" that triages within hours, not days. Manual override is acceptable while the bug is being fixed.
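
The first mitigation, at-most-once processing keyed on (provider, externalSubscriptionId, eventId), can be sketched as follows. The in-memory Set stands in for a Postgres table with a unique index; field names are assumptions, not the final schema:

```typescript
type Provider = "stripe" | "google_play";

interface BillingEvent {
  provider: Provider;
  externalSubscriptionId: string;
  eventId: string;
}

// Stands in for a table with UNIQUE(provider, external_subscription_id, event_id);
// in production the insert-or-conflict runs in the same transaction as the
// Subscription update, so a duplicate delivery cannot apply twice.
const processedEvents = new Set<string>();

export function handleBillingEvent(
  evt: BillingEvent,
  apply: (evt: BillingEvent) => void,
): boolean {
  const key = `${evt.provider}:${evt.externalSubscriptionId}:${evt.eventId}`;
  if (processedEvents.has(key)) return false; // duplicate delivery — skip
  processedEvents.add(key);
  apply(evt);
  return true; // processed exactly once
}
```

Both Stripe and Google Play redeliver webhooks on timeout, so duplicates are the normal case, not an edge case; the handler must return 200 for a duplicate without reapplying it.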

The real risk is not technical correctness; it is angry users when something goes wrong with their payment. Be responsive to support requests, refund cheerfully when in doubt, and keep the manual override path well-tested.

15.4 Solo developer bandwidth

Phases 1 and 2 are realistic for one senior developer working full time:

  • Phase 1: 6-8 weeks full time. Mostly backend skeleton, mobile app integration, deployment plumbing.
  • Phase 2: 8-10 weeks full time. The visual keyboard editor is the long pole — building a polished drag-and-drop interface that works on desktop and mobile (with the mobile fallback per §12) takes time.

Phases 3 and 4 are conditional on Phase 1 having real users:

  • Phase 3 (CLI): 4-6 weeks. Build only if there is feedback from Phase 1 users asking for it. The CLI is a differentiator, not a starting feature.
  • Phase 4 (Teams): 10-12 weeks plus the cryptographer review. Build only if there is concrete demand from companies (multiple inbound requests, willingness to pre-pay). Teams adds support burden, billing complexity, and a cryptographic protocol that has to be correct.

The hard rule: do not build Phase 2+ until Phase 1 has real users. If Phase 1 launches and nobody signs up, Phase 2 will not save it. The product hypothesis ("users want cloud sync of their SSH vault") has to validate before we invest more.

15.5 The biggest risk overall

The biggest risk is not technical. It is building all of this and then nobody using it. The mobile app is currently a free standalone client. Adding a backend means asking users to trust us with anything (even encrypted blobs), to create an account, to potentially pay. That is a much bigger ask than "install a free SSH client".

Mitigation:

  1. Ship the free mobile app first. Get to 10,000+ users on Play Store. That validates the product itself.
  2. Add the lifetime Pro purchase (~€6.99) for local features. If 1% of free users buy it, we have validated willingness to pay and earned a small budget.
  3. Only then build Phase 1 of the backend. A user base of paying customers is the prerequisite for the cloud product, not the consequence of it.

If steps 1 and 2 do not produce real revenue, do not build a backend. Keep the mobile app as a polished free product, and find a different business model (sponsorship, donations, consulting, or simply maintaining it as a hobby).


16. Summary

We are building a zero-knowledge cloud backend that adds vault sync, packs, team sharing, a web dashboard, and a CLI to the SSH Workbench mobile app. The mobile app remains the core product and works fully without the backend.

The stack is Node + TypeScript + Fastify + Postgres + Prisma on the backend, Next.js on the web, Go on the CLI. Hosted on Railway (backend + WebSocket terminal) and Vercel (frontend pages). All cryptography is client-side, reusing the existing lib-vault-crypto Argon2id + AES-256-GCM stack. Team key sharing uses X25519 wrapping built on the design already sketched in FUTURE.md Option C.

The plan is incremental. Phase 1 ships vault sync. Phase 2 ships the web dashboard with the visual keyboard editor — the headline pro feature. Phase 3 ships the CLI. Phase 4 ships Teams. Each phase only starts after the prior phase has real users.

The biggest risks are the X25519 wrapping protocol (mitigation: external cryptographer review before Phase 4), the web SSH terminal as the documented zero-knowledge exception (mitigation: honesty), and solo developer bandwidth (mitigation: do not build Phase 2+ until Phase 1 validates).

Build the free mobile app first, add a lifetime Pro purchase second, build the backend third — and only if the prior steps work.