Arthur Belleville 2e8c9de24e chore(dev): watch static JS files in air — rebuild on discussion-sse.js changes Remove static/ from exclude_dir, add js to include_ext. Exclude static/tailwind.css via regex to prevent rebuild loop from the Tailwind output file triggering its own regeneration. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>		2026-05-17 12:55:14 +02:00
bin	feat(01-01): create directory skeleton and per-package doc.go placeholders	2026-05-14 17:53:55 +02:00
cmd	feat(12-03): add discussion SSE stream	2026-05-16 10:18:33 +02:00
deploy	feat(07-03): add docker-compose.prod.yaml and deploy/Caddyfile	2026-05-15 18:23:13 +02:00
internal	fix(17): skip own-user SSE messages in JS to eliminate left-then-right flash	2026-05-17 12:47:21 +02:00
migrations	feat(12-01): add discussion schema and queries	2026-05-16 10:07:12 +02:00
static	fix(17): skip own-user SSE messages in JS to eliminate left-then-right flash	2026-05-17 12:47:21 +02:00
templates	fix(17): skip own-user SSE messages in JS to eliminate left-then-right flash	2026-05-17 12:47:21 +02:00
.air-catalog.toml	fix(13-05): catalog air config watches css and rebuilds tailwind on change	2026-05-16 18:05:15 +02:00
.air.toml	chore(dev): watch static JS files in air — rebuild on discussion-sse.js changes	2026-05-17 12:55:14 +02:00
.env.example	feat(08): disable apple sign-in	2026-05-15 21:41:22 +02:00
.gitignore	feat(01-01): compose file, env example, gitignore, bootstrap migration	2026-05-14 17:54:18 +02:00
compose.yaml	test(05-01): add RED test scaffold for FILE-01..06 and MinIO in compose.yaml	2026-05-15 12:19:23 +02:00
docker-compose.prod.yaml	fix(07): replace minioadmin placeholder creds and add worker->web migration gate	2026-05-15 18:46:30 +02:00
Dockerfile	feat(07-02): multi-stage Dockerfile for web + worker binaries	2026-05-15 18:19:32 +02:00
embed.go	feat(07-01): embed.go + RunMigrations + HealthzHandler()/ReadyzHandler() split	2026-05-15 18:14:26 +02:00
go.mod	feat(08-02): add google social sign-in flow	2026-05-15 21:03:30 +02:00
go.sum	feat(08-02): add google social sign-in flow	2026-05-15 21:03:30 +02:00
justfile	fix(13-05): catalog air config watches css and rebuilds tailwind on change	2026-05-16 18:05:15 +02:00
README.md	feat(08): disable apple sign-in	2026-05-15 21:41:22 +02:00
sqlc.yaml	feat(02-01): add sqlc queries + citext/uuid overrides; generate bindings	2026-05-14 21:52:48 +02:00
tailwind.input.css	feat(15-02): port sidebar + project-card CSS into app.css and register in tailwind	2026-05-16 21:41:58 +02:00

README.md

Xtablo backend

Go + HTMX + Postgres. Phase 1: Walking Skeleton.

This README is the contract for FOUND-05: a developer with the prerequisites below should be able to clone the repo, follow the Quickstart, and see the HTMX-driven page within ~5 minutes.

Prerequisites

Install these on your dev machine before starting:

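Go ≥ 1.22 (this project's go.mod declares 1.26; with the default GOTOOLCHAIN=auto, an older go command downloads the declared toolchain automatically)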
just — task runner (brew install just on macOS, cargo install just, or see https://github.com/casey/just)
podman with podman compose (preferred per D-11) or docker with docker compose
curl
git

You do not need to install goose, templ, sqlc, air, the Tailwind CLI, or htmx.min.js — just bootstrap installs the Go tools into $GOBIN and bootstrap-downloads the Tailwind binary and HTMX script into local, gitignored paths.

Quickstart

Clone-to-running-page in ~5 minutes. Run from inside backend/.

cd backend
cp .env.example .env       # adjust DATABASE_URL if Postgres is not on localhost:5432
just bootstrap             # installs goose/templ/sqlc/air; bootstrap-downloads tailwindcss + htmx.min.js
just db-up                 # starts postgres via podman compose (see fallback below)
just migrate up            # applies migrations from ./migrations
just dev                   # terminal 1: brings up db, runs generate, then air on :8080

# in a SECOND terminal:
just styles-watch          # rebuilds static/tailwind.css on .templ / .go changes

# open http://localhost:8080

The page should render with a "Fetch server time" button. Clicking it swaps an ISO-8601 timestamp into the page via HTMX. If the page shows "No time fetched yet." and nothing happens on click, see Troubleshooting.

bootstrap is the slowest step (Go tool installs + two HTTP downloads). It only needs to run once per clone.

docker compose fallback

compose.yaml is portable across podman and docker — the service definition is identical. If you don't have podman:

Replace podman compose with docker compose mentally throughout this README.
The just db-up / just db-down recipes call podman compose directly. Run docker compose up -d postgres / docker compose down instead, and continue with the rest of the Quickstart unchanged.

(Decision D-11.)

Project layout

backend/
  cmd/
    web/main.go            # HTTP server entry point
    worker/main.go         # background worker — river periodic jobs (Phase 6)
  internal/
    db/                    # pgxpool wiring + sqlc-generated queries
    web/                   # chi router, handlers, middleware, design-system
      ui/                  # custom templ component library (Button, Card, Badge)
    session/               # placeholder — Phase 2
    tablos/                # placeholder — Phase 3
    tasks/                 # placeholder — Phase 4
    files/                 # placeholder — Phase 5
  migrations/              # goose .sql migrations
  templates/               # .templ files (layout, index, fragments)
  static/
    htmx.min.js            # bootstrap-downloaded by `just bootstrap`; gitignored; no runtime CDN
    tailwind.css           # generated by the Tailwind standalone CLI
  bin/                     # gitignored — tailwindcss CLI binary, etc.
  .air.toml                # air live-reload config
  .env.example             # committed; copy to .env
  compose.yaml             # local Postgres
  go.mod / go.sum
  justfile                 # task runner recipes — the source of truth for commands
  sqlc.yaml
  tailwind.input.css
  README.md

HTMX is served from /static/htmx.min.js at runtime — no CDN. The justfile's bootstrap-time unpkg.com URL is the single authoritative version pin (D-10).

Environment variables

backend/.env is gitignored; backend/.env.example is committed and lists the keys consumed by cmd/web and cmd/worker. Local Just recipes load backend/.env automatically, so just dev will pick up provider credentials such as GOOGLE_CLIENT_ID.

Variable	Description	Default
`DATABASE_URL`	Postgres DSN used by the web + worker binaries and by `just migrate`	`postgres://xtablo:xtablo@localhost:5432/xtablo?sslmode=disable`
`PORT`	HTTP port for `cmd/web`	`8080`
`ENV`	`development` enables slog's text handler; `production` switches to JSON	`development`
`GOOGLE_CLIENT_ID`	Google OAuth client ID	blank
`GOOGLE_CLIENT_SECRET`	Google OAuth client secret	blank
`GOOGLE_REDIRECT_URL`	Google callback URL; the path is usually `/auth/google/callback`	`http://localhost:8080/auth/google/callback`

Google config is optional in local development. When it is missing, the login and signup pages keep the Google button visible but disabled with a not-configured label. No real provider secrets should be committed to .env.example. Apple sign-in is disabled in the current product surface.

Common commands

Every command in this table is a recipe in backend/justfile.

Recipe	What it does	When to use
`just bootstrap`	Installs Go CLI tools (`goose`, `templ`, `sqlc`, `air`); bootstrap-downloads `bin/tailwindcss` and `static/htmx.min.js`	Once per clone; re-run after deleting `bin/` or `static/htmx.min.js`
`just db-up`	Starts the local Postgres container	Before `just migrate up` / `just dev` if not already running
`just db-down`	Stops the local Postgres container	When you're done for the day
`just migrate up` / `migrate down` / `migrate status`	Applies / reverts / inspects goose migrations against `DATABASE_URL`	After `just db-up`, or any time you change `migrations/`
`just generate`	One-shot: `templ generate`, `sqlc generate`, Tailwind compile to `static/tailwind.css`	After editing `.templ`, query SQL, or `tailwind.input.css`
`just styles-watch`	Tailwind standalone CLI in `--watch` mode	In a second terminal alongside `just dev` (D-14)
`just dev`	Loads `backend/.env`, brings up Postgres, runs `just generate`, then runs `air` for Go live-reload on `:8080`	Main dev loop, terminal 1
`just test`	`templ generate` then `go test ./...`	Before committing
`just lint`	`go vet ./...` and `gofmt -l` check	Before committing
`just build`	Generates assets, then builds `bin/web` and `bin/worker`	Producing release binaries locally
`just clean`	Removes `bin/`, `tmp/`, `static/htmx.min.js`, `static/tailwind.css`, and `*_templ.go` files	Reset to a fresh-clone state without dropping the Postgres volume

Running the Worker

cmd/worker is the background job processor. It runs river periodic jobs against the same Postgres as cmd/web. Start it with:

just worker

This requires just db-up (handled automatically as a dependency) and MinIO running (used by the orphan-file cleanup job). If MinIO is not running, the worker will exit on startup with "file store init failed".

What to expect

Structured logs appear immediately at startup.
A "worker ready" log line appears within a few seconds after rivermigrate and S3 init complete.
A "worker heartbeat" log line appears within seconds of startup: the heartbeat job is configured with RunOnStart: true, so it fires on the first scheduler tick.
Subsequent heartbeat logs appear every ~1 minute.
The orphan-file cleanup job runs every hour (no RunOnStart — first run is ~1 hour after startup).

Single-worker constraint

Run only one worker process at a time (v1). River uses advisory locks for leader election, and concurrent rivermigrate runs are unsafe, so do not run multiple worker instances against the same database in this version.

Graceful shutdown

Send SIGINT (Ctrl+C) and observe:

{"level":"INFO","msg":"shutting down"}
{"level":"INFO","msg":"shutdown complete"}

The worker calls riverClient.StopAndCancel with a 10-second timeout, which cancels in-flight job contexts and waits for goroutines to exit before closing the pool.

Observing failed job retries

River logs each failure via the SlogErrorHandler. A failed job produces a log line like:

{"level":"ERROR","msg":"job error","job_id":42,"job_kind":"heartbeat","attempt":1,"max_attempts":25,"err":"..."}

River makes up to 25 attempts per job, with exponential backoff between them (attempt^4 seconds plus jitter). After the 25th failed attempt the job is moved to the discarded state in river_job.

Troubleshooting

The three issues most likely to trip you up on a fresh clone:

"Fresh clone fails to build with undefined: templates.Index" — Templ generates *_templ.go files from .templ sources, and those generated files are not committed. Run just generate (or just dev, which calls it) before invoking go build directly. (Pitfall 1.)
"First request to /healthz returns 503 right after just db-up" — The Postgres container needs ~5–10 seconds to become healthy after podman compose up -d returns. Check podman compose ps (or docker compose ps) for the healthy status, or just wait and retry. Subsequent calls succeed. The 503 during warm-up is correct behavior, not a bug. (Pitfall 2.)
"Tailwind classes used in .templ files don't appear in the compiled CSS" — Tailwind v4 only scans content paths declared via @source in tailwind.input.css. Confirm the file contains @source "../templates/**/*.templ"; (and equivalent globs for internal/web/**/*.go). Re-run just styles-watch so the watcher picks up the config change. (Pitfall 3.)

If something else is wrong and you want a clean slate without dropping the Postgres volume:

just clean              # removes bin/, tmp/, static/htmx.min.js, static/tailwind.css, *_templ.go
just bootstrap          # re-download tools and assets
just dev                # back to a working state

Run just db-down first if you also want to drop the Postgres container.

What Phase 1 ships (and doesn't)

Ships:

Project scaffold (go.mod, justfile, .air.toml, tailwind.input.css, sqlc.yaml, compose.yaml)
Local Postgres via compose.yaml (pg_isready healthcheck)
goose migration pipeline (migrations/0001_init.sql is a no-op bootstrap)
chi router with /, /healthz, /demo/time, /static/*
slog-based structured logging with RequestID middleware
Graceful HTTP shutdown
pgxpool wiring exercised by /healthz
templ + HTMX demo (root page + hx-get round-trip to a templ fragment)
Custom templ design-system package at internal/web/ui/ (Button, Card, Badge)
Live-reload dev loop (just dev + just styles-watch)
cmd/worker skeleton (boot, log, idle, shutdown)

Does not ship — deferred:

Authentication, sessions, users → Phase 2
Tablos CRUD → Phase 3
Tasks / kanban → Phase 4
File uploads + R2/S3 → Phase 5
Real worker jobs → Phase 6
Production deploy, Dockerfile, /readyz → Phase 7

Deploy

The production host is a Hetzner VM running plain Docker Compose (D-01, D-02). No Kubernetes or managed orchestration is needed — docker compose up -d on the VM is the entire deployment mechanism. Postgres runs inside the compose stack (D-03); there is no external managed database.

Prerequisites

Install on the production VM before first deploy:

Docker ≥ 24 with the Docker Compose plugin (docker compose — not the standalone docker-compose binary)
git (optional — useful for pulling the repo directly onto the VM)

No other runtimes are needed. Go, Node, and all build tooling run in the Dockerfile's multi-stage build and are not required on the VM.

First-time setup

Run all commands on the VM via SSH unless noted otherwise.

SSH to the VM.
```
ssh user@<vm-ip>
```

Copy the backend/ directory to the VM (or clone the repo).

# Option A — rsync from local machine:
rsync -av --exclude '.git' backend/ user@<vm-ip>:~/xtablo/

# Option B — clone the repo directly on the VM:
git clone <repo-url> ~/xtablo && cd ~/xtablo/backend

Create .env.prod by copying .env.example and filling in real values.

cp .env.example .env.prod
chmod 600 .env.prod      # restrict read access — file contains secrets (T-07-10)

Mandatory variables to set in .env.prod:

Variable	Value
`DATABASE_URL`	`postgres://xtablo:<POSTGRES_PASSWORD>@postgres:5432/xtablo?sslmode=disable` (internal compose network — hostname is `postgres`)
`POSTGRES_PASSWORD`	Strong random password (also used by the postgres service). Example: `openssl rand -hex 24`
`POSTGRES_USER`	`xtablo` (or your custom user; must match `DATABASE_URL`)
`POSTGRES_DB`	`xtablo` (or your custom db; must match `DATABASE_URL`)
`SESSION_SECRET`	32 random bytes hex-encoded. Generate with: `openssl rand -hex 32`
`S3_ENDPOINT`	R2 endpoint URL: `https://<account-id>.r2.cloudflarestorage.com`
`S3_BUCKET`	R2 bucket name
`S3_ACCESS_KEY`	R2 API token key ID
`S3_SECRET_KEY`	R2 API token secret
`S3_USE_PATH_STYLE`	`false` for Cloudflare R2 (virtual-hosted-style URLs)
`S3_REGION`	`auto` or `us-east-1` (R2 accepts both)
`MAX_UPLOAD_SIZE_MB`	`25` (or your preferred limit)
`ENV`	`production` (activates JSON slog handler)
`PORT`	`8080`
`DOMAIN`	`app.yourdomain.com` (Caddy reads this for TLS)

Do not include TEST_DATABASE_URL in .env.prod — it is a dev/test-only variable and is not used by the runtime binaries.
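The SESSION_SECRET format (32 random bytes, hex-encoded) is easy to get wrong by pasting a short passphrase instead. As a rough sketch, the Go equivalent of `openssl rand -hex 32` and a matching format check might look like this; both function names are hypothetical, and whether cmd/web performs such a check at startup is not stated in this README.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newSessionSecret returns 32 random bytes, hex-encoded (64 characters).
func newSessionSecret() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// validSessionSecret reports whether s is hex that decodes to exactly 32 bytes.
func validSessionSecret(s string) bool {
	b, err := hex.DecodeString(s)
	return err == nil && len(b) == 32
}

func main() {
	secret, err := newSessionSecret()
	if err != nil {
		fmt.Println("generate:", err)
		return
	}
	// Print only the length and validity, never the secret itself.
	fmt.Println(len(secret), validSessionSecret(secret))
}
```

This is meant for a development machine; the production VM does not need Go installed, and `openssl rand -hex 32` remains the simplest option there.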

Build the Docker image (from inside backend/ — either locally or on the VM).

# From inside backend/
docker build -f Dockerfile -t ghcr.io/yourusername/xtablo:v0.1.0 .

If building locally, push to a registry and pull on the VM:

docker push ghcr.io/yourusername/xtablo:v0.1.0
# On the VM:
docker pull ghcr.io/yourusername/xtablo:v0.1.0

Set image coordinates as environment variables (used by docker-compose.prod.yaml).
```
export IMAGE=ghcr.io/yourusername/xtablo
export TAG=v0.1.0
```
Start the stack.
```
docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d
```
The postgres service must pass its healthcheck before web and worker start. Migrations run automatically at web startup via goose.Up() (D-10).

Verify the deployment.

curl https://app.yourdomain.com/healthz   # → {"status":"ok"}
curl https://app.yourdomain.com/readyz    # → {"status":"ok","db":"ok"}

If the domain is not yet configured, use the VM's public IP temporarily with HTTP (Caddy will not yet have a certificate):

curl http://<vm-ip>:80/healthz

Let's Encrypt staging (for initial TLS testing).

To avoid hitting Let's Encrypt production rate limits (5 duplicate certificates per week per domain) during initial setup, uncomment the staging global block in deploy/Caddyfile:
```
{
  acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
}
```
Restart Caddy after editing (docker compose -f docker-compose.prod.yaml restart caddy), verify TLS works (browsers will show a staging cert warning — that is expected), then remove the global block and clear the caddy_data volume to issue a real production certificate.

Deploying a new version

Build and tag the new image (same as first-time, with a new tag):

docker build -f Dockerfile -t ghcr.io/yourusername/xtablo:v0.2.0 .
docker push ghcr.io/yourusername/xtablo:v0.2.0   # if using a registry

On the VM — update TAG in .env.prod:
```
# Edit .env.prod:
TAG=v0.2.0
```
Or export it in the current shell without editing the file (a shell variable takes precedence over the same key in --env-file):
```
export TAG=v0.2.0
```
Pull and recreate only the changed services:
```
docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d
```
Compose recreates only the web and worker containers (their image tag changed). Postgres and Caddy are unaffected. Migrations run automatically at web startup (D-10) — goose.Up() is idempotent and skips already-applied migrations.

Rollback

Rollback means redeploying the previous image tag (D-11). No special tooling is required — it is the same as deploying a new version, but with an older tag.

On the VM, set TAG to the previous tag in .env.prod (or export it in the shell):
```
export TAG=v0.1.0
```
Redeploy:
```
docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d
```
Compose recreates web and worker with the old image. The rollback is complete.

Schema rollback (break-glass)

goose.Up() only applies pending migrations. Starting a previous binary does not run goose down, so the schema stays at the newer version. In most cases this is fine: the old binary ignores columns it does not know about.

If a migration introduced a schema change that is incompatible with the old binary (e.g. a NOT NULL column without a default that the old binary does not supply), run a manual goose down as a break-glass step:

Connect to Postgres inside the container:
```
docker exec -it <postgres-container-name> psql -U xtablo -d xtablo
```
(Find the container name with docker compose -f docker-compose.prod.yaml ps.)

The production image is distroless — the goose CLI is not inside the runtime container. Install the goose CLI separately on the VM or use the goose Docker image against the internal network:

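# Install goose CLI on the VM (requires a Go toolchain on the VM):
go install github.com/pressly/goose/v3/cmd/goose@latest
# DATABASE_URL must be reachable from the VM host; the compose-internal
# hostname "postgres" resolves only inside the compose network.
goose -dir ./migrations postgres "$DATABASE_URL" down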

Or use an ephemeral container on the same compose network:

docker run --rm --network <compose-network> \
  -e GOOSE_DRIVER=postgres \
  -e GOOSE_DBSTRING="postgres://xtablo:<password>@postgres:5432/xtablo?sslmode=disable" \
  -v "$(pwd)/migrations:/migrations" \
  ghcr.io/kukymbr/goose-docker:latest \
  goose -dir /migrations down

After reverting the migration, the old binary will start cleanly.

Incident Runbook

/readyz returns 503

/readyz pings Postgres. A 503 means the web container cannot reach the database.

Check container status:

docker compose -f docker-compose.prod.yaml ps

If postgres is down or unhealthy, restart it:
```
docker compose -f docker-compose.prod.yaml up -d postgres
```
Then restart web and worker (they will wait for postgres to be healthy):
```
docker compose -f docker-compose.prod.yaml up -d
```
Check web logs for the actual error:
```
docker compose -f docker-compose.prod.yaml logs web --tail=50
```
All application logs are JSON when ENV=production is set. Look for "level":"ERROR" lines with a "msg":"db ping failed" or similar.

Caddy TLS certificate errors

Check caddy logs:

docker compose -f docker-compose.prod.yaml logs caddy --tail=50

If you see "too many certificates already issued for" (Let's Encrypt rate limit, RESEARCH Pitfall 4):
- Caddy hit the 5 duplicate certificates per week limit for the domain.
- Confirm the caddy_data named volume exists and is mounted — if the volume was accidentally deleted, Caddy cannot reuse the cached certificate and must re-issue on every restart, quickly exhausting the rate limit.
- Recovery options:
  - Wait up to 1 week for the rate limit window to reset.
  - Switch to the Let's Encrypt staging endpoint temporarily (see "Let's Encrypt staging" in the First-time setup section above).
  - Restore from a caddy_data volume backup if available.

If the caddy_data volume was lost:

# Verify the volume still exists:
docker volume ls | grep caddy_data

# If missing, the volume must be recreated (certificates will be re-issued):
docker compose -f docker-compose.prod.yaml up -d caddy

Checking logs

Follow logs for any service:

docker compose -f docker-compose.prod.yaml logs web --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs worker --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs caddy --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs postgres --tail=50

All application logs are JSON in production (ENV=production activates the slog JSON handler). Pipe through jq for readable output:

docker compose -f docker-compose.prod.yaml logs web --follow --no-log-prefix | jq .

Debugging the distroless container

The runtime image (gcr.io/distroless/static-debian12:nonroot) has no shell (RESEARCH Pitfall 7). You cannot docker exec -it <web-container> sh.

To debug network or filesystem issues, attach an ephemeral busybox container to the same network:

# Find the web container ID:
docker compose -f docker-compose.prod.yaml ps

# Attach busybox to the web container's network namespace:
docker run --rm -it --network container:<web-container-id> busybox sh

From the busybox shell you can run wget, nc, ping, etc. to diagnose connectivity. To inspect the compose network directly (e.g. reach postgres:5432):

docker run --rm -it \
  --network $(docker inspect <web-container-id> --format '{{range .NetworkSettings.Networks}}{{.NetworkID}}{{end}}') \
  busybox sh
