Xtablo backend
Go + HTMX + Postgres. Phase 1: Walking Skeleton.
This README is the contract for FOUND-05: a developer with the prerequisites below should be able to clone the repo, follow the Quickstart, and see the HTMX-driven page within ~5 minutes.
Prerequisites
Install these on your dev machine before starting:
- Go ≥ 1.22 (this project's go.mod declares 1.26)
- just, the task runner (brew install just on macOS, cargo install just, or see https://github.com/casey/just)
- podman with podman compose (preferred per D-11) or docker with docker compose
- curl
- git
You do not need to install goose, templ, sqlc, air, the Tailwind CLI, or
htmx.min.js — just bootstrap installs the Go tools into $GOBIN and
bootstrap-downloads the Tailwind binary and HTMX script into local, gitignored
paths.
Quickstart
Clone-to-running-page in ~5 minutes. Run from inside backend/.
cd backend
cp .env.example .env # adjust DATABASE_URL if Postgres is not on localhost:5432
just bootstrap # installs goose/templ/sqlc/air; bootstrap-downloads tailwindcss + htmx.min.js
just db-up # starts postgres via podman compose (see fallback below)
just migrate up # applies migrations from ./migrations
just dev # terminal 1: brings up db, runs generate, then air on :8080
# in a SECOND terminal:
just styles-watch # rebuilds static/tailwind.css on .templ / .go changes
# open http://localhost:8080
The page should render with a "Fetch server time" button. Clicking it swaps an ISO-8601 timestamp into the page via HTMX. If the page shows "No time fetched yet." and nothing happens on click, see Troubleshooting.
bootstrap is the slowest step (Go tool installs + two HTTP downloads). It only
needs to run once per clone.
docker compose fallback
compose.yaml is portable across podman and docker — the service definition is
identical. If you don't have podman:
- Replace podman compose with docker compose mentally throughout this README.
- The just db-up / just db-down recipes call podman compose directly. Run docker compose up -d postgres / docker compose down instead, and continue with the rest of the Quickstart unchanged.
(Decision D-11.)
Project layout
backend/
cmd/
web/main.go # HTTP server entry point
worker/main.go # background worker — river periodic jobs (Phase 6)
internal/
db/ # pgxpool wiring + sqlc-generated queries
web/ # chi router, handlers, middleware, design-system
ui/ # custom templ component library (Button, Card, Badge)
session/ # placeholder — Phase 2
tablos/ # placeholder — Phase 3
tasks/ # placeholder — Phase 4
files/ # placeholder — Phase 5
migrations/ # goose .sql migrations
templates/ # .templ files (layout, index, fragments)
static/
htmx.min.js # bootstrap-downloaded by `just bootstrap`; gitignored; no runtime CDN
tailwind.css # generated by the Tailwind standalone CLI
bin/ # gitignored — tailwindcss CLI binary, etc.
.air.toml # air live-reload config
.env.example # committed; copy to .env
compose.yaml # local Postgres
go.mod / go.sum
justfile # task runner recipes — the source of truth for commands
sqlc.yaml
tailwind.input.css
README.md
HTMX is served from /static/htmx.min.js at runtime — no CDN. The justfile's
bootstrap-time unpkg.com URL is the single authoritative version pin (D-10).
Environment variables
backend/.env is gitignored; backend/.env.example is committed and lists the
keys consumed by cmd/web and cmd/worker. Local Just recipes load
backend/.env automatically, so just dev will pick up provider credentials
such as GOOGLE_CLIENT_ID.
| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | Postgres DSN used by the web + worker binaries and by just migrate | postgres://xtablo:xtablo@localhost:5432/xtablo?sslmode=disable |
| PORT | HTTP port for cmd/web | 8080 |
| ENV | development enables slog's text handler; production switches to JSON | development |
| GOOGLE_CLIENT_ID | Google OAuth client ID | blank |
| GOOGLE_CLIENT_SECRET | Google OAuth client secret | blank |
| GOOGLE_REDIRECT_URL | Google callback URL, usually /auth/google/callback | http://localhost:8080/auth/google/callback |
Google config is optional in local development. When it is missing, the login
and signup pages keep the Google button visible but disabled with a
not-configured label. No real provider secrets should be committed to
.env.example. Apple sign-in is disabled in the current product surface.
Common commands
Every command in this table is a recipe in backend/justfile.
| Recipe | What it does | When to use |
|---|---|---|
| just bootstrap | Installs Go CLI tools (goose, templ, sqlc, air); bootstrap-downloads bin/tailwindcss and static/htmx.min.js | Once per clone; re-run after deleting bin/ or static/htmx.min.js |
| just db-up | Starts the local Postgres container | Before just migrate up / just dev if not already running |
| just db-down | Stops the local Postgres container | When you're done for the day |
| just migrate up / migrate down / migrate status | Applies / reverts / inspects goose migrations against DATABASE_URL | After just db-up, or any time you change migrations/ |
| just generate | One-shot: templ generate, sqlc generate, Tailwind compile to static/tailwind.css | After editing .templ, query SQL, or tailwind.input.css |
| just styles-watch | Tailwind standalone CLI in --watch mode | In a second terminal alongside just dev (D-14) |
| just dev | Loads backend/.env, brings up Postgres, runs just generate, then runs air for Go live-reload on :8080 | Main dev loop, terminal 1 |
| just test | templ generate then go test ./... | Before committing |
| just lint | go vet ./... and gofmt -l check | Before committing |
| just build | Generates assets, then builds bin/web and bin/worker | Producing release binaries locally |
| just clean | Removes bin/, tmp/, static/htmx.min.js, static/tailwind.css, and *_templ.go files | Reset to a fresh-clone state without dropping the Postgres volume |
Running the Worker
cmd/worker is the background job processor. It runs river periodic jobs against
the same Postgres as cmd/web. Start it with:
just worker
This requires just db-up (handled automatically as a dependency) and MinIO
running (used by the orphan-file cleanup job). If MinIO is not running, the worker
will exit on startup with "file store init failed".
What to expect
- Structured logs appear immediately at startup.
- A "worker ready" log line appears within a few seconds after rivermigrate and S3 init complete.
- A "worker heartbeat" log line appears almost immediately (the heartbeat job is configured with RunOnStart: true, so it fires on the first scheduler tick, which happens within seconds of startup).
- Subsequent heartbeat logs appear every ~1 minute.
- The orphan-file cleanup job runs every hour (no RunOnStart, so the first run is ~1 hour after startup).
Single-worker constraint
Run only one worker process at a time (v1). River uses advisory locks for leader election, and concurrent rivermigrate runs are unsafe, so do not run multiple worker instances against the same database in this version.
Graceful shutdown
Send SIGINT (Ctrl+C) and observe:
{"level":"INFO","msg":"shutting down"}
{"level":"INFO","msg":"shutdown complete"}
The worker calls riverClient.StopAndCancel with a 10-second timeout, which
cancels in-flight job contexts and waits for goroutines to exit before closing
the pool.
Observing failed job retries
River logs each failure via the SlogErrorHandler. A failed job produces a log
line like:
{"level":"ERROR","msg":"job error","job_id":42,"job_kind":"heartbeat","attempt":1,"max_attempts":25,"err":"..."}
River retries up to 25 times with exponential backoff (attempts^4 + jitter).
After 25 failed attempts the job is moved to the discarded state in river_job.
Troubleshooting
The three issues most likely to trip you up on a fresh clone:
- "Fresh clone fails to build with undefined: templates.Index": Templ generates *_templ.go files from .templ sources, and those generated files are not committed. Run just generate (or just dev, which calls it) before invoking go build directly. (Pitfall 1.)
- "First request to /healthz returns 503 right after just db-up": The Postgres container needs ~5–10 seconds to become healthy after podman compose up -d returns. Check podman compose ps (or docker compose ps) for the healthy status, or just wait and retry. Subsequent calls succeed. The 503 during warm-up is correct behavior, not a bug. (Pitfall 2.)
- "Tailwind classes used in .templ files don't appear in the compiled CSS": Tailwind v4 only scans content paths declared via @source in tailwind.input.css. Confirm the file contains @source "../templates/**/*.templ"; (and equivalent globs for internal/web/**/*.go). Re-run just styles-watch so the watcher picks up the config change. (Pitfall 3.)
If something else is wrong and you want a clean slate without dropping the Postgres volume:
just clean # removes bin/, tmp/, static/htmx.min.js, static/tailwind.css, *_templ.go
just bootstrap # re-download tools and assets
just dev # back to a working state
Run just db-down first if you also want to drop the Postgres container.
What Phase 1 ships (and doesn't)
Ships:
- Project scaffold (go.mod, justfile, .air.toml, tailwind.input.css, sqlc.yaml, compose.yaml)
- Local Postgres via compose.yaml (pg_isready healthcheck)
- goose migration pipeline (migrations/0001_init.sql is a no-op bootstrap)
- chi router with /, /healthz, /demo/time, /static/*
- slog-based structured logging with RequestID middleware
- Graceful HTTP shutdown
- pgxpool wiring exercised by /healthz
- templ + HTMX demo (root page + hx-get round-trip to a templ fragment)
- Custom templ design-system package at internal/web/ui/ (Button, Card, Badge)
- Live-reload dev loop (just dev + just styles-watch)
- cmd/worker skeleton (boot, log, idle, shutdown)
Does not ship — deferred:
- Authentication, sessions, users → Phase 2
- Tablos CRUD → Phase 3
- Tasks / kanban → Phase 4
- File uploads + R2/S3 → Phase 5
- Real worker jobs → Phase 6
- Production deploy, Dockerfile, /readyz → Phase 7
Deploy
The production host is a Hetzner VM running plain Docker Compose (D-01, D-02). No
Kubernetes or managed orchestration is needed — docker compose up -d on the VM is
the entire deployment mechanism. Postgres runs inside the compose stack (D-03); there
is no external managed database.
Prerequisites
Install on the production VM before first deploy:
- Docker ≥ 24 with the Docker Compose plugin (docker compose, not the standalone docker-compose binary)
- git (optional; useful for pulling the repo directly onto the VM)
No other runtimes are needed. Go, Node, and all build tooling run in the Dockerfile's multi-stage build and are not required on the VM.
First-time setup
Run all commands on the VM via SSH unless noted otherwise.
- SSH to the VM.

      ssh user@<vm-ip>

- Copy the backend/ directory to the VM (or clone the repo).

      # Option A: rsync from local machine:
      rsync -av --exclude '.git' backend/ user@<vm-ip>:~/xtablo/

      # Option B: clone the repo directly on the VM:
      git clone <repo-url> ~/xtablo && cd ~/xtablo/backend

- Create .env.prod by copying .env.example and filling in real values.

      cp .env.example .env.prod
      chmod 600 .env.prod # restrict read access; the file contains secrets (T-07-10)

  Mandatory variables to set in .env.prod:

  | Variable | Value |
  |---|---|
  | DATABASE_URL | postgres://xtablo:<POSTGRES_PASSWORD>@postgres:5432/xtablo?sslmode=disable (internal compose network; hostname is postgres) |
  | POSTGRES_PASSWORD | Strong random password (also used by the postgres service). Example: openssl rand -hex 24 |
  | POSTGRES_USER | xtablo (or your custom user; must match DATABASE_URL) |
  | POSTGRES_DB | xtablo (or your custom db; must match DATABASE_URL) |
  | SESSION_SECRET | 32 random bytes hex-encoded. Generate with: openssl rand -hex 32 |
  | S3_ENDPOINT | R2 endpoint URL: https://<account-id>.r2.cloudflarestorage.com |
  | S3_BUCKET | R2 bucket name |
  | S3_ACCESS_KEY | R2 API token key ID |
  | S3_SECRET_KEY | R2 API token secret |
  | S3_USE_PATH_STYLE | false for Cloudflare R2 (virtual-hosted-style URLs) |
  | S3_REGION | auto or us-east-1 (R2 accepts both) |
  | MAX_UPLOAD_SIZE_MB | 25 (or your preferred limit) |
  | ENV | production (activates JSON slog handler) |
  | PORT | 8080 |
  | DOMAIN | app.yourdomain.com (Caddy reads this for TLS) |

  Do not include TEST_DATABASE_URL in .env.prod. It is a dev/test-only variable and is not used by the runtime binaries.

- Build the Docker image (from inside backend/, either locally or on the VM).

      # From inside backend/
      docker build -f Dockerfile -t ghcr.io/yourusername/xtablo:v0.1.0 .

  If building locally, push to a registry and pull on the VM:

      docker push ghcr.io/yourusername/xtablo:v0.1.0
      # On the VM:
      docker pull ghcr.io/yourusername/xtablo:v0.1.0

- Set image coordinates as environment variables (used by docker-compose.prod.yaml).

      export IMAGE=ghcr.io/yourusername/xtablo
      export TAG=v0.1.0

- Start the stack.

      docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d

  The postgres service must pass its healthcheck before web and worker start. Migrations run automatically at web startup via goose.Up() (D-10).

- Verify the deployment.

      curl https://app.yourdomain.com/healthz # → {"status":"ok"}
      curl https://app.yourdomain.com/readyz # → {"status":"ok","db":"ok"}

  If the domain is not yet configured, use the VM's public IP temporarily with HTTP (Caddy will not yet have a certificate):

      curl http://<vm-ip>:80/healthz

- Let's Encrypt staging (for initial TLS testing).

  To avoid hitting Let's Encrypt production rate limits (5 duplicate certificates per week per domain) during initial setup, uncomment the staging global block in deploy/Caddyfile:

      {
          acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
      }

  Restart Caddy after editing (docker compose -f docker-compose.prod.yaml restart caddy), verify TLS works (browsers will show a staging cert warning; that is expected), then remove the global block and clear the caddy_data volume to issue a real production certificate.
Deploying a new version
- Build and tag the new image (same as first-time, with a new tag):

      docker build -f Dockerfile -t ghcr.io/yourusername/xtablo:v0.2.0 .
      docker push ghcr.io/yourusername/xtablo:v0.2.0 # if using a registry

- On the VM, update TAG in .env.prod:

      # Edit .env.prod:
      TAG=v0.2.0

  Or pass it inline without editing the file:

      export TAG=v0.2.0

- Pull and recreate only the changed services:

      docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d

  Compose recreates only the web and worker containers (their image tag changed). Postgres and Caddy are unaffected. Migrations run automatically at web startup (D-10); goose.Up() is idempotent and skips already-applied migrations.
Rollback
Rollback means redeploying the previous image tag (D-11). No special tooling is required — it is the same as deploying a new version, but with an older tag.
- On the VM, set TAG to the previous tag in .env.prod (or inline):

      export TAG=v0.1.0

- Redeploy:

      docker compose -f docker-compose.prod.yaml --env-file .env.prod up -d

  Compose recreates web and worker with the old image. The rollback is complete.
Schema rollback (break-glass)
goose.Up() is idempotent — rolling back to a previous binary does not automatically
run goose down. In most cases this is fine: the old binary ignores columns it does
not know about.
If a migration introduced a schema change that is incompatible with the old binary (e.g. a NOT NULL column without a default that the old binary does not supply), run a manual goose down as a break-glass step:
- Connect to Postgres inside the container:

      docker exec -it <postgres-container-name> psql -U xtablo -d xtablo

  (Find the container name with docker compose -f docker-compose.prod.yaml ps.)

- The production image is distroless, so the goose CLI is not inside the runtime container. Install the goose CLI separately on the VM (this requires a Go toolchain there) or use the goose Docker image against the internal network:

      # Install goose CLI on the VM:
      go install github.com/pressly/goose/v3/cmd/goose@latest
      goose -dir ./migrations postgres "$DATABASE_URL" down

  Or use an ephemeral container on the same compose network:

      docker run --rm --network <compose-network> \
        -e GOOSE_DRIVER=postgres \
        -e GOOSE_DBSTRING="postgres://xtablo:<password>@postgres:5432/xtablo?sslmode=disable" \
        -v $(pwd)/migrations:/migrations \
        ghcr.io/kukymbr/goose-docker:latest \
        goose -dir /migrations down

  After reverting the migration, the old binary will start cleanly.
Incident Runbook
/readyz returns 503
/readyz pings Postgres. A 503 means the web container cannot reach the database.
- Check container status:

      docker compose -f docker-compose.prod.yaml ps

- If postgres is down or unhealthy, restart it:

      docker compose -f docker-compose.prod.yaml up -d postgres

  Then restart web and worker (they will wait for postgres to be healthy):

      docker compose -f docker-compose.prod.yaml up -d

- Check web logs for the actual error:

      docker compose -f docker-compose.prod.yaml logs web --tail=50

  All application logs are JSON when ENV=production is set. Look for "level":"ERROR" lines with a "msg":"db ping failed" or similar.
Caddy TLS certificate errors
- Check caddy logs:

      docker compose -f docker-compose.prod.yaml logs caddy --tail=50

- If you see "too many certificates already issued for" (Let's Encrypt rate limit, RESEARCH Pitfall 4):
  - Caddy hit the 5 duplicate certificates per week limit for the domain.
  - Confirm the caddy_data named volume exists and is mounted. If the volume was accidentally deleted, Caddy cannot reuse the cached certificate and must re-issue on every restart, quickly exhausting the rate limit.
  - Recovery options:
    - Wait up to 1 week for the rate limit window to reset.
    - Switch to the Let's Encrypt staging endpoint temporarily (see "Let's Encrypt staging" in the First-time setup section above).
    - Restore from a caddy_data volume backup if available.

- If the caddy_data volume was lost:

      # Verify the volume still exists:
      docker volume ls | grep caddy_data
      # If missing, the volume must be recreated (certificates will be re-issued):
      docker compose -f docker-compose.prod.yaml up -d caddy
Checking logs
Follow logs for any service:
docker compose -f docker-compose.prod.yaml logs web --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs worker --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs caddy --tail=100 --follow
docker compose -f docker-compose.prod.yaml logs postgres --tail=50
All application logs are JSON in production (ENV=production activates the slog
JSON handler). Pipe through jq for readable output:
docker compose -f docker-compose.prod.yaml logs web --follow --no-log-prefix | jq .
Debugging the distroless container
The runtime image (gcr.io/distroless/static-debian12:nonroot) has no shell
(RESEARCH Pitfall 7). You cannot docker exec -it <web-container> sh.
To debug network or filesystem issues, attach an ephemeral busybox container to the same network:
# Find the web container ID:
docker compose -f docker-compose.prod.yaml ps
# Attach busybox to the web container's network namespace:
docker run --rm -it --network container:<web-container-id> busybox sh
From the busybox shell you can run wget, nc, ping, etc. to diagnose
connectivity. To inspect the compose network directly (e.g. reach postgres:5432):
docker run --rm -it \
--network $(docker inspect <web-container-id> --format '{{range .NetworkSettings.Networks}}{{.NetworkID}}{{end}}') \
busybox sh