# Xtablo backend

Go + HTMX + Postgres. Phase 1: Walking Skeleton.

This README is the contract for FOUND-05: a developer with the prerequisites below should be able to clone the repo, follow the Quickstart, and see the HTMX-driven page within ~5 minutes.

## Prerequisites

Install these on your dev machine before starting:

- **Go** ≥ 1.22 (this project's `go.mod` declares 1.26)
- **just**: task runner (`brew install just` on macOS, `cargo install just`, or see the just installation docs)
- **podman** with `podman compose` (preferred per D-11) **or** **docker** with `docker compose`
- **curl**
- **git**

You do **not** need to install `goose`, `templ`, `sqlc`, `air`, the Tailwind CLI, or `htmx.min.js`: `just bootstrap` installs the Go tools into `$GOBIN` and bootstrap-downloads the Tailwind binary and HTMX script into local, gitignored paths.

## Quickstart

Clone-to-running-page in ~5 minutes. Run from inside `backend/`.

```
cd backend
cp .env.example .env       # adjust DATABASE_URL if Postgres is not on localhost:5432
just bootstrap             # installs goose/templ/sqlc/air; bootstrap-downloads tailwindcss + htmx.min.js
just db-up                 # starts postgres via podman compose (see fallback below)
just migrate up            # applies migrations from ./migrations
just dev                   # terminal 1: brings up db, runs generate, then air on :8080
# in a SECOND terminal:
just styles-watch          # rebuilds static/tailwind.css on .templ / .go changes
# open http://localhost:8080
```

The page should render with a "Fetch server time" button. Clicking it swaps an ISO-8601 timestamp into the page via HTMX. If the page shows "No time fetched yet." and nothing happens on click, see Troubleshooting.

`bootstrap` is the slowest step (Go tool installs + two HTTP downloads). It only needs to run once per clone.

## docker compose fallback

`compose.yaml` is portable across podman and docker; the service definition is identical. If you don't have podman:

- Replace `podman compose` with `docker compose` mentally throughout this README.
- The `just db-up` / `just db-down` recipes call `podman compose` directly. Run `docker compose up -d postgres` / `docker compose down` instead, and continue with the rest of the Quickstart unchanged. (Decision D-11.)

## Project layout

```
backend/
  cmd/
    web/main.go           # HTTP server entry point
    worker/main.go        # background worker: river periodic jobs (Phase 6)
  internal/
    db/                   # pgxpool wiring + sqlc-generated queries
    web/                  # chi router, handlers, middleware, design-system
      ui/                 # custom templ component library (Button, Card, Badge)
    session/              # placeholder: Phase 2
    tablos/               # placeholder: Phase 3
    tasks/                # placeholder: Phase 4
    files/                # placeholder: Phase 5
  migrations/             # goose .sql migrations
  templates/              # .templ files (layout, index, fragments)
  static/
    htmx.min.js           # bootstrap-downloaded by `just bootstrap`; gitignored; no runtime CDN
    tailwind.css          # generated by the Tailwind standalone CLI
  bin/                    # gitignored: tailwindcss CLI binary, etc.
  .air.toml               # air live-reload config
  .env.example            # committed; copy to .env
  compose.yaml            # local Postgres
  go.mod / go.sum
  justfile                # task runner recipes; the source of truth for commands
  sqlc.yaml
  tailwind.input.css
  README.md
```

HTMX is served from `/static/htmx.min.js` at runtime, with no CDN. The justfile's bootstrap-time `unpkg.com` URL is the single authoritative version pin (D-10).

## Environment variables

`backend/.env` is gitignored; `backend/.env.example` is committed and lists the keys consumed by `cmd/web` and `cmd/worker`. Local Just recipes load `backend/.env` automatically, so `just dev` will pick up provider credentials such as `GOOGLE_CLIENT_ID`.

| Variable | Description | Default |
| --- | --- | --- |
| `DATABASE_URL` | Postgres DSN used by the web + worker binaries and by `just migrate` | `postgres://xtablo:xtablo@localhost:5432/xtablo?sslmode=disable` |
| `PORT` | HTTP port for `cmd/web` | `8080` |
| `ENV` | `development` enables slog's text handler; `production` switches to JSON | `development` |
| `GOOGLE_CLIENT_ID` | Google OAuth client ID | blank |
| `GOOGLE_CLIENT_SECRET` | Google OAuth client secret | blank |
| `GOOGLE_REDIRECT_URL` | Google callback URL, usually `/auth/google/callback` | `http://localhost:8080/auth/google/callback` |

Google config is optional in local development. When it is missing, the login and signup pages keep the Google button visible but disabled with a not-configured label. No real provider secrets should be committed to `.env.example`. Apple sign-in is disabled in the current product surface.

## Common commands

Every command in this table is a recipe in `backend/justfile`.

| Recipe | What it does | When to use |
| --- | --- | --- |
| `just bootstrap` | Installs Go CLI tools (`goose`, `templ`, `sqlc`, `air`); bootstrap-downloads `bin/tailwindcss` and `static/htmx.min.js` | Once per clone; re-run after deleting `bin/` or `static/htmx.min.js` |
| `just db-up` | Starts the local Postgres container | Before `just migrate up` / `just dev` if not already running |
| `just db-down` | Stops the local Postgres container | When you're done for the day |
| `just migrate up` / `migrate down` / `migrate status` | Applies / reverts / inspects goose migrations against `DATABASE_URL` | After `just db-up`, or any time you change `migrations/` |
| `just generate` | One-shot: `templ generate`, `sqlc generate`, Tailwind compile to `static/tailwind.css` | After editing `.templ`, query SQL, or `tailwind.input.css` |
| `just styles-watch` | Tailwind standalone CLI in `--watch` mode | In a second terminal alongside `just dev` (D-14) |
| `just dev` | Loads `backend/.env`, brings up Postgres, runs `just generate`, then runs `air` for Go live-reload on `:8080` | Main dev loop, terminal 1 |
| `just test` | `templ generate` then `go test ./...` | Before committing |
| `just lint` | `go vet ./...` and `gofmt -l` check | Before committing |
| `just build` | Generates assets, then builds `bin/web` and `bin/worker` | Producing release binaries locally |
| `just clean` | Removes `bin/`, `tmp/`, `static/htmx.min.js`, `static/tailwind.css`, and `*_templ.go` files | Reset to a fresh-clone state without dropping the Postgres volume |

## Running the Worker

`cmd/worker` is the background job processor. It runs river periodic jobs against the same Postgres as `cmd/web`.

Start it with:

```
just worker
```

This requires `just db-up` (handled automatically as a dependency) and MinIO running (used by the orphan-file cleanup job). If MinIO is not running, the worker will exit on startup with "file store init failed".

### What to expect

- Structured logs appear immediately at startup.
- A `"worker ready"` log line appears within a few seconds after `rivermigrate` and S3 init complete.
- A `"worker heartbeat"` log line appears almost immediately (the heartbeat job is configured with `RunOnStart: true`, so it fires on the first scheduler tick, which happens within seconds of startup).
- Subsequent heartbeat logs appear every ~1 minute.
- The orphan-file cleanup job runs every hour (no `RunOnStart`, so the first run is ~1 hour after startup).

### Single-worker constraint

**Run only one worker process at a time (v1).** River uses advisory locks for leader election and concurrent rivermigrate runs are unsafe. Do not run multiple worker instances against the same database in this version.

### Graceful shutdown

Send SIGINT (Ctrl+C) and observe:

```
{"level":"INFO","msg":"shutting down"}
{"level":"INFO","msg":"shutdown complete"}
```

The worker calls `riverClient.StopAndCancel` with a 10-second timeout, which cancels in-flight job contexts and waits for goroutines to exit before closing the pool.

### Observing failed job retries

River logs each failure via the `SlogErrorHandler`. A failed job produces a log line like:

```
{"level":"ERROR","msg":"job error","job_id":42,"job_kind":"heartbeat","attempt":1,"max_attempts":25,"err":"..."}
```

River retries up to 25 times with exponential backoff (`attempts^4` + jitter). After 25 failed attempts the job is moved to the discarded state in `river_job`.

## Troubleshooting

The three issues most likely to trip you up on a fresh clone:

- **"Fresh clone fails to build with `undefined: templates.Index`"**: Templ generates `*_templ.go` files from `.templ` sources, and those generated files are not committed.
  Run `just generate` (or `just dev`, which calls it) before invoking `go build` directly. (Pitfall 1.)
- **"First request to `/healthz` returns 503 right after `just db-up`"**: The Postgres container needs ~5–10 seconds to become healthy after `podman compose up -d` returns. Check `podman compose ps` (or `docker compose ps`) for the `healthy` status, or just wait and retry. Subsequent calls succeed. The 503 during warm-up is correct behavior, not a bug. (Pitfall 2.)
- **"Tailwind classes used in `.templ` files don't appear in the compiled CSS"**: Tailwind v4 only scans content paths declared via `@source` in `tailwind.input.css`. Confirm the file contains `@source "../templates/**/*.templ";` (and equivalent globs for `internal/web/**/*.go`). Re-run `just styles-watch` so the watcher picks up the config change. (Pitfall 3.)

If something else is wrong and you want a clean slate without dropping the Postgres volume:

```
just clean       # removes bin/, tmp/, static/htmx.min.js, static/tailwind.css, *_templ.go
just bootstrap   # re-download tools and assets
just dev         # back to a working state
```

Run `just db-down` first if you also want to drop the Postgres container.
## What Phase 1 ships (and doesn't)

**Ships:**

- Project scaffold (`go.mod`, justfile, `.air.toml`, `tailwind.input.css`, `sqlc.yaml`, `compose.yaml`)
- Local Postgres via `compose.yaml` (`pg_isready` healthcheck)
- goose migration pipeline (`migrations/0001_init.sql` is a no-op bootstrap)
- chi router with `/`, `/healthz`, `/demo/time`, `/static/*`
- slog-based structured logging with RequestID middleware
- Graceful HTTP shutdown
- pgxpool wiring exercised by `/healthz`
- templ + HTMX demo (root page + `hx-get` round-trip to a templ fragment)
- Custom templ design-system package at `internal/web/ui/` (Button, Card, Badge)
- Live-reload dev loop (`just dev` + `just styles-watch`)
- `cmd/worker` skeleton (boot, log, idle, shutdown)

**Does not ship (deferred):**

- Authentication, sessions, users → Phase 2
- Tablos CRUD → Phase 3
- Tasks / kanban → Phase 4
- File uploads + R2/S3 → Phase 5
- Real worker jobs → Phase 6
- Production deploy, Dockerfile, `/readyz` → Phase 7

## Deploy

The production host is a Hetzner VM running plain Docker Compose (D-01, D-02). No Kubernetes or managed orchestration is needed; `docker compose up -d` on the VM is the entire deployment mechanism. Postgres runs inside the compose stack (D-03); there is no external managed database.

### Prerequisites

Install on the production VM before first deploy:

- **Docker** ≥ 24 with the **Docker Compose** plugin (`docker compose`, not the standalone `docker-compose` binary)
- **git** (optional; useful for pulling the repo directly onto the VM)

No other runtimes are needed. Go, Node, and all build tooling run in the Dockerfile's multi-stage build and are not required on the VM.

### First-time setup

Run all commands on the VM via SSH unless noted otherwise.

1. **SSH to the VM.**

   ```
   ssh user@
   ```

2. **Copy the `backend/` directory to the VM** (or clone the repo).

   ```
   # Option A: rsync from local machine
   rsync -av --exclude '.git' backend/ user@:~/xtablo/
   # Option B: clone the repo directly on the VM
   git clone ~/xtablo && cd ~/xtablo/backend
   ```

3. **Create `.env.prod`** by copying `.env.example` and filling in real values.

   ```
   cp .env.example .env.prod
   chmod 600 .env.prod   # restrict read access; file contains secrets (T-07-10)
   ```

   Mandatory variables to set in `.env.prod`:

   | Variable | Value |
   |---|---|
   | `DATABASE_URL` | `postgres://xtablo:@postgres:5432/xtablo?sslmode=disable` (internal compose network; hostname is `postgres`) |
   | `POSTGRES_PASSWORD` | Strong random password (also used by the postgres service). Example: `openssl rand -hex 24` |
   | `POSTGRES_USER` | `xtablo` (or your custom user; must match `DATABASE_URL`) |
   | `POSTGRES_DB` | `xtablo` (or your custom db; must match `DATABASE_URL`) |
   | `SESSION_SECRET` | 32 random bytes hex-encoded. Generate with: `openssl rand -hex 32` |
   | `S3_ENDPOINT` | R2 endpoint URL: `https://.r2.cloudflarestorage.com` |
   | `S3_BUCKET` | R2 bucket name |
   | `S3_ACCESS_KEY` | R2 API token key ID |
   | `S3_SECRET_KEY` | R2 API token secret |
   | `S3_USE_PATH_STYLE` | `false` for Cloudflare R2 (virtual-hosted-style URLs) |
   | `S3_REGION` | `auto` or `us-east-1` (R2 accepts both) |
   | `MAX_UPLOAD_SIZE_MB` | `25` (or your preferred limit) |
   | `ENV` | `production` (activates JSON slog handler) |
   | `PORT` | `8080` |
   | `DOMAIN` | `app.yourdomain.com` (Caddy reads this for TLS) |

   Do **not** include `TEST_DATABASE_URL` in `.env.prod`; it is a dev/test-only variable and is not used by the runtime binaries.

4. **Build the Docker image** (from inside `backend/`, either locally or on the VM).

   ```
   # From inside backend/
   docker build -f Dockerfile -t ghcr.io/yourusername/xtablo:v0.1.0 .
   ```

   If building locally, push to a registry and pull on the VM:

   ```
   docker push ghcr.io/yourusername/xtablo:v0.1.0
   # On the VM:
   docker pull ghcr.io/yourusername/xtablo:v0.1.0
   ```

5.
   **Set image coordinates as environment variables** (used by `docker-compose.prod.yaml`).

   ```
   export IMAGE=ghcr.io/yourusername/xtablo
   export TAG=v0.1.0
   ```

6. **Start the stack.**

   ```
   docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d
   ```

   The postgres service must pass its healthcheck before web and worker start. Migrations run automatically at web startup via `goose.Up()` (D-10).

7. **Verify the deployment.**

   ```
   curl https://app.yourdomain.com/healthz   # → {"status":"ok"}
   curl https://app.yourdomain.com/readyz    # → {"status":"ok","db":"ok"}
   ```

   If the domain is not yet configured, use the VM's public IP temporarily with HTTP (Caddy will not yet have a certificate):

   ```
   curl http://:80/healthz
   ```

8. **Let's Encrypt staging (for initial TLS testing).** To avoid hitting Let's Encrypt production rate limits (5 duplicate certificates per week per domain) during initial setup, uncomment the staging global block in `deploy/Caddyfile`:

   ```
   {
       acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
   }
   ```

   Restart Caddy after editing (`docker compose -f docker-compose.prod.yaml restart caddy`), verify TLS works (browsers will show a staging cert warning; that is expected), then remove the global block and clear the `caddy_data` volume to issue a real production certificate.

### Deploying a new version

1. **Build and tag the new image** (same as first-time, with a new tag):

   ```
   docker build -f Dockerfile -t ghcr.io/yourusername/xtablo:v0.2.0 .
   docker push ghcr.io/yourusername/xtablo:v0.2.0   # if using a registry
   ```

2. **On the VM**, update `TAG` in `.env.prod`:

   ```
   # Edit .env.prod:
   TAG=v0.2.0
   ```

   Or pass it inline without editing the file:

   ```
   export TAG=v0.2.0
   ```

3. **Pull and recreate only the changed services:**

   ```
   docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d
   ```

   Compose recreates only the web and worker containers (their image tag changed). Postgres and Caddy are unaffected.
   Migrations run automatically at web startup (D-10); `goose.Up()` is idempotent and skips already-applied migrations.

## Rollback

Rollback means redeploying the previous image tag (D-11). No special tooling is required; it is the same as deploying a new version, but with an older tag.

1. **On the VM**, set `TAG` to the previous tag in `.env.prod` (or inline):

   ```
   export TAG=v0.1.0
   ```

2. **Redeploy:**

   ```
   docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d
   ```

   Compose recreates web and worker with the old image. The rollback is complete.

### Schema rollback (break-glass)

`goose.Up()` is idempotent; rolling back to a previous binary does not automatically run `goose down`. In most cases this is fine: the old binary ignores columns it does not know about.

If a migration introduced a schema change that is **incompatible** with the old binary (e.g. a NOT NULL column without a default that the old binary does not supply), run a manual goose down as a break-glass step:

1. Connect to Postgres inside the container:

   ```
   docker exec -it psql -U xtablo -d xtablo
   ```

   (Find the container name with `docker compose -f docker-compose.prod.yaml ps`.)

2. The production image is distroless, so the `goose` CLI is not inside the runtime container. Install the goose CLI separately on the VM or use the goose Docker image against the internal network:

   ```
   # Install goose CLI on the VM:
   go install github.com/pressly/goose/v3/cmd/goose@latest
   goose -dir ./migrations postgres "$DATABASE_URL" down
   ```

   Or use an ephemeral container on the same compose network:

   ```
   docker run --rm --network \
     -e GOOSE_DRIVER=postgres \
     -e GOOSE_DBSTRING="postgres://xtablo:@postgres:5432/xtablo?sslmode=disable" \
     -v $(pwd)/migrations:/migrations \
     ghcr.io/kukymbr/goose-docker:latest \
     goose -dir /migrations down
   ```

After reverting the migration, the old binary will start cleanly.

## Incident Runbook

### /readyz returns 503

`/readyz` pings Postgres.
A 503 means the web container cannot reach the database.

1. Check container status:

   ```
   docker compose -f docker-compose.prod.yaml ps
   ```

2. If `postgres` is down or unhealthy, restart it:

   ```
   docker compose -f docker-compose.prod.yaml up -d postgres
   ```

   Then restart web and worker (they will wait for postgres to be healthy):

   ```
   docker compose -f docker-compose.prod.yaml up -d
   ```

3. Check web logs for the actual error:

   ```
   docker compose -f docker-compose.prod.yaml logs web --tail=50
   ```

   All application logs are JSON when `ENV=production` is set. Look for `"level":"ERROR"` lines with a `"msg":"db ping failed"` or similar.

### Caddy TLS certificate errors

1. Check caddy logs:

   ```
   docker compose -f docker-compose.prod.yaml logs caddy --tail=50
   ```

2. If you see "too many certificates already issued for" (Let's Encrypt rate limit, RESEARCH Pitfall 4):

   - Caddy hit the 5 duplicate certificates per week limit for the domain.
   - Confirm the `caddy_data` named volume exists and is mounted. If the volume was accidentally deleted, Caddy cannot reuse the cached certificate and must re-issue on every restart, quickly exhausting the rate limit.
   - Recovery options:
     - Wait up to 1 week for the rate limit window to reset.
     - Switch to the Let's Encrypt staging endpoint temporarily (see "Let's Encrypt staging" in the First-time setup section above).
     - Restore from a `caddy_data` volume backup if available.

3.
   If the `caddy_data` volume was lost:

   ```
   # Verify the volume still exists:
   docker volume ls | grep caddy_data
   # If missing, the volume must be recreated (certificates will be re-issued):
   docker compose -f docker-compose.prod.yaml up -d caddy
   ```

### Checking logs

Follow logs for any service:

```
docker compose -f docker-compose.prod.yaml logs web --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs worker --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs caddy --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs postgres --tail=50
```

All application logs are JSON in production (`ENV=production` activates the slog JSON handler). Pipe through `jq` for readable output:

```
docker compose -f docker-compose.prod.yaml logs web --follow --no-log-prefix | jq .
```

### Debugging the distroless container

The runtime image (`gcr.io/distroless/static-debian12:nonroot`) has **no shell** (RESEARCH Pitfall 7). You cannot `docker exec -it sh`.

To debug network or filesystem issues, attach an ephemeral busybox container to the same network:

```
# Find the web container ID:
docker compose -f docker-compose.prod.yaml ps
# Attach busybox to the web container's network namespace:
docker run --rm -it --network container: busybox sh
```

From the busybox shell you can run `wget`, `nc`, `ping`, etc. to diagnose connectivity.

To inspect the compose network directly (e.g. reach `postgres:5432`):

```
docker run --rm -it \
  --network $(docker inspect --format '{{range .NetworkSettings.Networks}}{{.NetworkID}}{{end}}') \
  busybox sh
```