docs(07): research phase domain
parent
e14fd36fdc
commit
588c03dae2
1 changed files with 660 additions and 0 deletions
660
.planning/phases/07-deploy-v1/07-RESEARCH.md
Normal file
@ -0,0 +1,660 @@
|
|||
# Phase 7: Deploy v1 - Research
|
||||
|
||||
**Researched:** 2026-05-15
|
||||
**Domain:** Go Docker multi-stage build, docker compose, Caddy reverse proxy, goose programmatic migrations, go:embed static assets, health checks
|
||||
**Confidence:** HIGH
|
||||
|
||||
## Summary
|
||||
|
||||
Phase 7 packages the existing Go backend into a production-ready Docker image, deploys it to a Hetzner VM via plain Docker Compose, and wires Caddy as a TLS-terminating reverse proxy. The phase has five distinct work areas: (1) convert static asset serving from on-disk paths to `go:embed`, (2) add programmatic `goose.Up()` migration call in `cmd/web` startup, (3) build a multi-stage Dockerfile producing `/app/web` and `/app/worker` in a single image, (4) split the existing `/healthz` handler into a liveness route (no DB ping) and a new `/readyz` route (DB ping), and (5) write `docker-compose.prod.yaml`, `deploy/Caddyfile`, and the `backend/README.md` runbook.
|
||||
|
||||
The codebase is well-prepared: `cmd/worker/main.go` already demonstrates the exact programmatic migration pattern (rivermigrate); `cmd/web/main.go` already reads all config from env vars; `signal.NotifyContext` graceful shutdown is in both binaries. The static files are currently served from `./static` on disk via `http.Dir`; they must be switched to `http.FS(embed.FS)` so the final container has zero runtime file dependencies. The existing `HealthzHandler` does a DB ping — that behavior must move to `/readyz`; `/healthz` becomes a pure liveness check.
|
||||
|
||||
**Primary recommendation:** Build in this wave order — Wave 0 (go:embed + `/readyz` split), Wave 1 (goose.Up startup migration), Wave 2 (Dockerfile), Wave 3 (compose + Caddy + env docs), Wave 4 (README runbook). Each wave is independently testable.
|
||||
|
||||
<user_constraints>
|
||||
## User Constraints (from CONTEXT.md)
|
||||
|
||||
### Locked Decisions
|
||||
|
||||
- **D-01:** Production host is a Hetzner VM running Docker Compose. No PaaS, no Kubernetes.
|
||||
- **D-02:** The full stack runs via plain `docker compose` — no Dokploy or Swarm mode in v1.
|
||||
- **D-03:** Postgres runs on the VM inside the compose stack, volume-backed. No managed Postgres service for v1.
|
||||
- **D-04:** Caddy is a service in `docker-compose.prod.yaml`. It proxies to `web:8080` and handles TLS via Let's Encrypt. Config via a bind-mounted Caddyfile.
|
||||
- **D-05:** Production secrets (`SESSION_SECRET`, `DATABASE_URL`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_ENDPOINT_URL`, `AWS_BUCKET`, `PORT`, `ENV`) are stored in a `.env` file on the Hetzner host (gitignored). `docker compose --env-file .env.prod up` reads it. No SOPS, no Docker secrets API.
|
||||
- **D-06:** S3-compatible storage in production is Cloudflare R2. R2 credentials live in the host `.env` file. MinIO remains in `compose.yaml` for local dev only.
|
||||
- **D-07:** A single multi-stage Dockerfile produces one image containing two binaries: `/app/web` (from `cmd/web`) and `/app/worker` (from `cmd/worker`). Both compiled in the builder stage, copied to the final runtime stage.
|
||||
- **D-08:** `docker-compose.prod.yaml` runs the same image twice: one service with `command: /app/web`, one with `command: /app/worker`. No subcommand dispatcher needed in Go code.
|
||||
- **D-09:** All static assets (Tailwind-compiled CSS, HTMX JS, Sortable.js, templ-generated HTML) are embedded via `//go:embed` at build time. No volume mounts for assets.
|
||||
- **D-10:** Migrations run programmatically inside the `web` binary at startup: `web` calls `goose.Up()` via the goose library before binding the HTTP server.
|
||||
- **D-11:** Rollback strategy: redeploy the previous image tag. Normal rollback = update compose image tag + `docker compose up -d`. `goose down` is documented as a break-glass step only.
|
||||
- **D-12:** `/healthz` — liveness: returns 200 OK immediately if the server is up (no DB ping). Used by Caddy / uptime monitor.
|
||||
- **D-13:** `/readyz` — readiness: returns 200 OK only if the DB pool is reachable (one `db.Ping()` call). Returns 503 during startup until migrations complete and the pool is healthy. Worker does not expose HTTP.
|
||||
|
||||
### Claude's Discretion
|
||||
|
||||
- Exact Dockerfile base image for the builder stage (e.g., `golang:1.26-alpine` vs `golang:1.26`).
|
||||
- Final runtime base: `distroless/static` vs `alpine`.
|
||||
- Caddyfile content (reverse proxy config, TLS directive, HTTPS redirect).
|
||||
- Whether `docker-compose.prod.yaml` includes a `healthcheck:` directive for the Postgres service.
|
||||
- Exact docker compose version / syntax used (`compose.yaml` already uses v2 syntax).
|
||||
- Whether the `web` service in prod compose `depends_on` the `postgres` service with a health condition.
|
||||
|
||||
### Deferred Ideas (OUT OF SCOPE)
|
||||
|
||||
- Dokploy layer
|
||||
- CI/CD pipeline
|
||||
- pg_dump backup cron
|
||||
- MinIO for prod
|
||||
</user_constraints>
|
||||
|
||||
<phase_requirements>
|
||||
## Phase Requirements
|
||||
|
||||
| ID | Description | Research Support |
|----|-------------|------------------|
| DEPLOY-01 | Both binaries build into a single multi-stage Docker image | Multi-stage Dockerfile pattern: builder copies both `cmd/web` and `cmd/worker`, runtime stage copies both binaries |
| DEPLOY-02 | Image runs on a single VPS with env-injected config (no Supabase, no GCP) | `docker-compose.prod.yaml` with `env_file` directive; all env vars already read from `os.Getenv` in both binaries |
| DEPLOY-03 | Migrations run on deploy without manual intervention | `goose.Up()` with `embed.FS` inside `cmd/web` startup, mirrors rivermigrate pattern in `cmd/worker` |
| DEPLOY-04 | Health checks (`/healthz`, `/readyz`) and structured logs | `/healthz` already exists (needs liveness-only refactor); `/readyz` is new; JSON slog already live on `ENV=production` |
| DEPLOY-05 | Documented runbook in `backend/README.md` covering local dev, deploy, rollback | Extends existing `backend/README.md`; adds deploy, rollback, incident sections |
|
||||
</phase_requirements>
|
||||
|
||||
## Architectural Responsibility Map
|
||||
|
||||
| Capability | Primary Tier | Secondary Tier | Rationale |
|------------|-------------|----------------|-----------|
| TLS termination | Caddy (compose service) | n/a | Caddy owns ACME/Let's Encrypt; web binary speaks plain HTTP on internal network |
| Static asset serving | Web binary (go:embed) | n/a | D-09: embedded at build time; no runtime file mounts |
| Database migrations | Web binary startup | n/a | D-10: goose.Up() before HTTP server binds |
| Liveness check (/healthz) | Web binary | n/a | D-12: no DB dependency; fast 200 as long as server process is up |
| Readiness check (/readyz) | Web binary | n/a | D-13: DB ping; Caddy or uptime monitor checks this before routing traffic |
| Background jobs | Worker binary | n/a | Separate container in compose, same image, command: /app/worker |
| Secret injection | Docker Compose env_file | Host .env.prod file | D-05: no secrets API needed |
| Postgres persistence | Postgres compose service | Volume | D-03: volume-backed, not managed |
|
||||
|
||||
## Standard Stack
|
||||
|
||||
### Core
|
||||
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `github.com/pressly/goose/v3` | v3.27.1 | Programmatic migrations with embed.FS | Already in go.mod; `SetBaseFS` + `Up` is the idiomatic pattern for embedded migrations [VERIFIED: go.mod] |
| `embed` (stdlib) | Go 1.16+ | Embed static/ and migrations/ into binary | No dependency, available since Go 1.16; project uses Go 1.26 [VERIFIED: go.mod] |
| `io/fs` (stdlib) | Go 1.16+ | `fs.Sub` to strip directory prefix for http.FileServer | Required companion to embed.FS for serving static files [VERIFIED: Go stdlib] |
| `gcr.io/distroless/static-debian12` | nonroot tag | Final container runtime base | Smallest option (~2MiB); no shell; correct for CGO_ENABLED=0 Go binaries [VERIFIED: GoogleContainerTools/distroless GitHub] |
| `golang:1.26-alpine` | current | Builder stage base | Matches go.mod version; alpine keeps layer small and avoids glibc issues for distroless final [ASSUMED] |
| `caddy:2-alpine` | current | Reverse proxy + automatic TLS | Official image; Let's Encrypt auto-cert; simple Caddyfile syntax [CITED: caddyserver.com/docs] |
|
||||
|
||||
### Supporting
|
||||
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| `github.com/jackc/pgx/v5/stdlib` | v5.9.2 | Bridge pgxpool → database/sql for goose | `goose.Up` requires `*sql.DB`; `pgx/v5/stdlib` registers a `database/sql` driver, so the pool's connection string can be passed to `sql.Open` |
|
||||
|
||||
**Installation: no new top-level dependencies needed.** `pgx/v5/stdlib` is a package inside the `github.com/jackc/pgx/v5` module, which is already in go.mod, so it is not a separate dependency. It only needs to be imported in the migration helper.
|
||||
|
||||
**Version verification:**
|
||||
```
github.com/pressly/goose/v3 v3.27.1 [VERIFIED: go.mod]
github.com/jackc/pgx/v5 v5.9.2 [VERIFIED: go.mod]
```
|
||||
|
||||
## Architecture Patterns
|
||||
|
||||
### System Architecture Diagram
|
||||
|
||||
```
Internet
    │
    ▼ :80/:443
[Caddy] ─── ACME/Let's Encrypt cert management
    │
    │ :8080 (internal Docker network)
    ▼
[web container] ─── cmd/web binary (go:embed: static/ + migrations/)
    │ startup: goose.Up() ──► [postgres:5432]
    │ /healthz → 200 always (liveness)
    │ /readyz → 200 if DB ping ok (readiness)
    │ /static/* → http.FS(embed.FS)
    │ /tablos/* → HTMX handlers → [postgres:5432]
    │
[worker container] ─── cmd/worker binary (same image, command: /app/worker)
    │ startup: rivermigrate.Up() ──► [postgres:5432]
    │ river periodic jobs ──► [postgres:5432]
    │ orphan cleanup ──────► [Cloudflare R2]
    │
[postgres container] ─── postgres:16-alpine, volume: postgres_data
[caddy_data volume] ─── TLS certificate persistence

Host .env.prod ──► docker compose --env-file .env.prod
```
|
||||
|
||||
### Recommended Project Structure
|
||||
```
backend/
├── cmd/
│   ├── web/main.go            # add goose.Up() before http.ListenAndServe
│   └── worker/main.go         # unchanged (rivermigrate already wired)
├── deploy/
│   └── Caddyfile              # bind-mounted into caddy container at runtime
├── migrations/                # existing SQL files
├── static/                    # generated at build time; embedded via go:embed
├── Dockerfile                 # new: multi-stage, produces /app/web + /app/worker
├── docker-compose.prod.yaml   # new: postgres + web + worker + caddy
├── .env.example               # update: add R2 vars, DOMAIN, remove TEST_DATABASE_URL note
└── README.md                  # update: add Deploy, Rollback, Incident sections
```
|
||||
|
||||
### Pattern 1: go:embed for Static Assets
|
||||
|
||||
**What:** Replace `http.Dir(staticDir)` with an embedded `embed.FS`. The `staticDir string` parameter in `NewRouter` becomes an `fs.FS` parameter.
|
||||
|
||||
**When to use:** Production — binary has zero runtime file dependencies.
|
||||
|
||||
**Two options for NewRouter signature change:**
|
||||
|
||||
Option A (recommended — backward compatible for tests): Accept `fs.FS` instead of `string`:
|
||||
```go
// Source: Go stdlib io/fs + embed docs
//go:embed static
var StaticFiles embed.FS

// In NewRouter, change staticDir string → staticFS fs.FS
staticSub, _ := fs.Sub(staticFS, "static")
fileHandler := http.FileServer(http.FS(staticSub))
r.Get("/static/*", http.StripPrefix("/static/", fileHandler).ServeHTTP)
```
|
||||
|
||||
In `cmd/web/main.go` (this form compiles only if `static/` sits in the same directory as `main.go`; see the constraint that follows):

```go
//go:embed static
var staticFiles embed.FS

// ...
router := web.NewRouter(pool, staticFiles, ...)
```

In tests: because `NewRouter` calls `fs.Sub(staticFS, "static")`, pass a filesystem that contains a `static/` directory, for example an in-memory `fstest.MapFS` or an `os.DirFS` rooted at the directory that holds `static/`. Passing `os.DirFS("./static")` would make the handler look under `static/static/` and return 404s.

**Constraint:** `//go:embed` patterns are resolved relative to the directory of the Go file that contains the directive, and they may not contain `.` or `..` path elements. A directive can therefore only embed files in its own package directory or below it. `static/` lives at `backend/static/`, so `cmd/web/main.go` cannot embed it; `../../static` is rejected by the compiler. The cleanest approach is a small package whose directory contains `static/`. The sketch below assumes the generated assets are written to `backend/assets/static/`:

```go
// backend/assets/assets.go (requires backend/assets/static/ to exist at build time)
package assets

import "embed"

//go:embed static
var Static embed.FS
```

Then `cmd/web/main.go` imports `backend/assets` and passes `assets.Static` to `NewRouter`.

> NOTE: If `static/` has to stay at `backend/static/`, put the embedding Go file directly in `backend/` or inside `backend/static/` itself. Verify during implementation that the embed directive and the Tailwind/JS output paths agree.
|
||||
|
||||
### Pattern 2: goose.Up() at Web Startup
|
||||
|
||||
**What:** Before binding the HTTP server, call goose programmatic migrations using the embedded SQL files and a `*sql.DB` derived from the existing pgxpool connection string.
|
||||
|
||||
**Source:** [pressly/goose embed docs](https://pressly.github.io/goose/blog/2021/embed-sql-migrations/) [CITED]
|
||||
|
||||
```go
// backend/internal/db/migrate.go (new file)
package db

import (
	"context"
	"database/sql"
	"io/fs"

	"github.com/jackc/pgx/v5/pgxpool"
	_ "github.com/jackc/pgx/v5/stdlib" // registers the "pgx/v5" database/sql driver
	"github.com/pressly/goose/v3"
)

// RunMigrations opens a sql.DB from the pool's DSN and runs all pending
// goose migrations found at the root of migrationsFS.
//
// migrationsFS is expected to be an embed.FS declared next to the SQL files:
// go:embed patterns cannot contain "..", so it cannot be declared in this file.
func RunMigrations(ctx context.Context, pool *pgxpool.Pool, migrationsFS fs.FS) error {
	dsn := pool.Config().ConnConfig.ConnString()
	db, err := sql.Open("pgx/v5", dsn)
	if err != nil {
		return err
	}
	defer db.Close()

	goose.SetBaseFS(migrationsFS)
	if err := goose.SetDialect("postgres"); err != nil {
		return err
	}
	// "." because the SQL files sit at the root of migrationsFS.
	return goose.UpContext(ctx, db, ".")
}
```

Called in `cmd/web/main.go` after pool creation and before router/server setup (`migrations.FS` is a hypothetical name for the embedded filesystem):

```go
if err := db.RunMigrations(ctx, pool, migrations.FS); err != nil {
	slog.Error("migrations failed", "err", err)
	os.Exit(1)
}
```

> IMPORTANT: `//go:embed ../../migrations/*.sql` does not compile from `backend/internal/db/`, because embed patterns may not contain `..`. Declare the embedded filesystem in a small Go file inside `backend/migrations/` (for example `//go:embed *.sql` on an exported `embed.FS` variable) and pass it to `RunMigrations`. Alternative: place the directive in a Go file directly in `backend/`; the files then keep their `migrations/` prefix, so pass `"migrations"` instead of `"."` to goose.
|
||||
|
||||
**Idempotency:** `goose.Up()` is idempotent — already-applied versions are skipped via the `goose_db_version` table. Safe to call on every startup.
|
||||
|
||||
### Pattern 3: Multi-Stage Dockerfile
|
||||
|
||||
**What:** Single Dockerfile, builder compiles both binaries with CGO_ENABLED=0, distroless runtime copies both.
|
||||
|
||||
```dockerfile
# Source: GoogleContainerTools/distroless README + Go multi-stage build docs [CITED]

# ── Stage 1: Generate assets ──────────────────────────────────────────────────
FROM node:20-alpine AS assets
WORKDIR /app
# Download Tailwind standalone CLI (pinned version from justfile).
# NOTE: Alpine uses musl. If the binary below does not run there, switch to a
# musl build of the CLI or a Debian-based image; verify at implementation time.
# static/ must exist before curl writes into it; -f makes HTTP errors fail the build.
RUN apk add --no-cache curl && \
    mkdir -p static && \
    curl -fsSL -o /usr/local/bin/tailwindcss \
      "https://github.com/tailwindlabs/tailwindcss/releases/download/v4.0.0/tailwindcss-linux-x64" && \
    chmod +x /usr/local/bin/tailwindcss && \
    curl -fsSL -o static/htmx.min.js "https://unpkg.com/htmx.org@2/dist/htmx.min.js" && \
    curl -fsSL -o static/sortable.min.js "https://cdn.jsdelivr.net/npm/sortablejs@1.15.7/Sortable.min.js"
COPY tailwind.input.css .
COPY templates/ templates/
RUN tailwindcss -i tailwind.input.css -o static/tailwind.css --minify

# ── Stage 2: Build Go binaries ────────────────────────────────────────────────
FROM golang:1.26-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
COPY --from=assets /app/static ./static

# templ generate must run before go build (templates compile to .go files)
RUN go install github.com/a-h/templ/cmd/templ@v0.3.1020 && templ generate

RUN CGO_ENABLED=0 GOOS=linux \
    go build -ldflags="-s -w" -trimpath -o /app/web ./cmd/web
RUN CGO_ENABLED=0 GOOS=linux \
    go build -ldflags="-s -w" -trimpath -o /app/worker ./cmd/worker

# ── Stage 3: Runtime ──────────────────────────────────────────────────────────
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /app/web /app/web
COPY --from=builder /app/worker /app/worker
EXPOSE 8080
# No CMD or ENTRYPOINT: compose overrides with `command: /app/web` or `/app/worker`
```
|
||||
|
||||
**Planner note on Dockerfile stages:** The assets stage (Tailwind build + JS downloads) could be merged into the Go builder stage to reduce complexity, at the cost of a heavier builder image. Two dedicated stages is cleaner but either approach is valid.
|
||||
|
||||
### Pattern 4: /healthz and /readyz Split
|
||||
|
||||
**What:** Current `HealthzHandler` pings the DB and is registered at `/healthz`. D-12 requires `/healthz` to be a pure liveness check (no DB ping); D-13 requires `/readyz` to do the DB ping.
|
||||
|
||||
**Existing code:** `HealthzHandler(pinger Pinger)` in `handlers.go` — it already uses the `Pinger` interface. Simply:
|
||||
1. Rename `HealthzHandler` → `ReadyzHandler` (or keep the name and change behavior — see below)
|
||||
2. Add a new `HealthzHandler` that returns 200 unconditionally
|
||||
3. Register `/healthz` → new liveness handler, `/readyz` → DB-pinging handler
|
||||
|
||||
```go
// Liveness: no dependencies
func HealthzHandler() http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte(`{"status":"ok"}`))
	}
}

// Readiness: DB ping with a 2-second timeout
func ReadyzHandler(pinger Pinger) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		w.Header().Set("Content-Type", "application/json")
		if err := pinger.Ping(ctx); err != nil {
			w.WriteHeader(http.StatusServiceUnavailable)
			_, _ = w.Write([]byte(`{"status":"degraded","db":"down"}`))
			return
		}
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte(`{"status":"ok","db":"ok"}`))
	}
}
```
|
||||
|
||||
**Existing tests:** `TestHealthz_OK` and `TestHealthz_Down` in `handlers_test.go` test the current DB-pinging behavior. These must be updated to test the split: one test for the new liveness `HealthzHandler`, two tests for `ReadyzHandler`.
|
||||
|
||||
### Pattern 5: docker-compose.prod.yaml
|
||||
|
||||
```yaml
# Source: D-02 through D-09; v2 compose syntax matching existing compose.yaml

services:
  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: ${POSTGRES_DB:-xtablo}
      POSTGRES_USER: ${POSTGRES_USER:-xtablo}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-xtablo}"]
      interval: 10s
      timeout: 5s
      retries: 10
    # No ports published; only reachable within the compose network

  web:
    image: ${IMAGE:-ghcr.io/yourusername/xtablo}:${TAG:-latest}
    command: /app/web
    restart: unless-stopped
    env_file: .env.prod
    depends_on:
      postgres:
        condition: service_healthy
    expose:
      - "8080"
    # No ports published; Caddy handles external traffic

  worker:
    image: ${IMAGE:-ghcr.io/yourusername/xtablo}:${TAG:-latest}
    command: /app/worker
    restart: unless-stopped
    env_file: .env.prod
    depends_on:
      postgres:
        condition: service_healthy

  caddy:
    image: caddy:2-alpine
    restart: unless-stopped
    environment:
      # Caddy resolves {$DOMAIN} in the Caddyfile from its own container
      # environment, so the value must be passed in explicitly.
      DOMAIN: ${DOMAIN}
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp" # HTTP/3
    volumes:
      - ./deploy/Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config

volumes:
  postgres_data:
  caddy_data:
  caddy_config:
```
|
||||
|
||||
### Pattern 6: Caddyfile
|
||||
|
||||
```caddyfile
# Source: caddyserver.com/docs/caddyfile [CITED]
# Place at: backend/deploy/Caddyfile
# Caddy automatically provisions and renews TLS via Let's Encrypt.
# Domain is read from env via {$DOMAIN} interpolation.

{$DOMAIN} {
	reverse_proxy web:8080
}
```
|
||||
|
||||
For HTTPS redirect (HTTP → HTTPS), Caddy handles this automatically when a domain name is specified — no explicit redirect directive is needed. [CITED: caddyserver.com/docs/automatic-https]
|
||||
|
||||
### Anti-Patterns to Avoid
|
||||
|
||||
- **Volume-mounting static/ at runtime:** D-09 prohibits this. Assets must be embedded. A volume mount for assets would break the self-contained binary requirement.
|
||||
- **Separate goose CLI binary in the image:** D-10 prohibits this. Migrations run inside the web binary via `goose.Up()`.
|
||||
- **CMD /app/web in Dockerfile:** D-08 says compose overrides the command; having a default CMD is fine as documentation but the planner should use `command:` in compose to make the intent explicit. Prefer no CMD in the Dockerfile so the compose `command:` is the single source of truth.
|
||||
- **Exposing postgres port to host:** Postgres should only be reachable inside the compose network. Bind a host port only for break-glass debug access, not permanently.
|
||||
- **Single large env_file commit:** The `.env.prod` on the host is gitignored. The repo only contains `.env.example` updated with new R2 vars.
|
||||
- **CGO_ENABLED=1 with distroless/static:** distroless/static has no C libraries. CGO must be disabled.
|
||||
- **Missing `caddy_data` volume:** Without a persistent volume for Caddy's `/data`, TLS certificates are re-issued on every container restart, which will hit Let's Encrypt rate limits.
|
||||
|
||||
## Don't Hand-Roll
|
||||
|
||||
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| TLS certificate lifecycle | Custom ACME client | Caddy automatic HTTPS | ACME, renewal, stapling, redirects all handled transparently |
| Database migration versioning | Custom version table | goose.Up() | Race conditions, rollback tracking, idempotency already solved |
| Static file embedding | Custom asset bundler | `//go:embed` + `http.FS` | Stdlib; zero dependencies; correct path resolution |
| Let's Encrypt rate limit management | Manual cert issuance | Caddy + persistent `caddy_data` volume | Caddy manages staging/prod issuance and renewal automatically |
|
||||
|
||||
**Key insight:** Every custom solution in this domain (cert management, migration versioning, asset bundling) replicates work the standard tools already do correctly with far fewer failure modes.
|
||||
|
||||
## Common Pitfalls
|
||||
|
||||
### Pitfall 1: embed.FS path relative to Go file, not working directory
|
||||
**What goes wrong:** `//go:embed ../../static` fails to compile: embed patterns may not contain `.` or `..` path elements, so a directive can never reach a parent or sibling directory.
**Why it happens:** `go:embed` patterns are resolved relative to the directory of the Go source file and can only match files in that directory or its subdirectories.
**How to avoid:** Put the embed directive in a Go file whose directory contains `static/`. Since `static/` is at `backend/static/` and `cmd/web` is at `backend/cmd/web/`, neither `cmd/web/main.go` nor a sibling `backend/assets/` package can reach it; `//go:embed ../static` is rejected as well. Either embed from a Go file directly in `backend/` (or inside `backend/static/`), or write the generated assets under the embedding package (e.g. `backend/assets/static/`). Verify the path during implementation.
**Warning signs:** Build error "pattern ../static: invalid pattern syntax".
|
||||
|
||||
### Pitfall 2: goose needs *sql.DB, not *pgxpool.Pool
|
||||
**What goes wrong:** Attempting to pass pgxpool.Pool directly to goose.Up() fails to compile — goose's API requires `*database/sql.DB`.
|
||||
**Why it happens:** goose predates the pgx native pool API and abstracts over `database/sql`.
|
||||
**How to avoid:** Extract the connection string via `pool.Config().ConnConfig.ConnString()` and open a `*sql.DB` with `sql.Open("pgx/v5", connStr)` after importing `_ "github.com/jackc/pgx/v5/stdlib"`. Close the sql.DB after migrations complete — the pool remains open for application use.
|
||||
**Warning signs:** Compile error along the lines of "cannot use pool (variable of type *pgxpool.Pool) as *sql.DB value in argument to goose.Up".
|
||||
|
||||
### Pitfall 3: goose_db_version table collision with test schema
|
||||
**What goes wrong:** Integration tests that create isolated schemas via `goose.SetTableName` in dev continue to work, but in production the goose_db_version table name must remain the default `goose_db_version` in the `public` schema.
|
||||
**Why it happens:** Tests use `goose.SetTableName("schema.goose_db_version")` to namespace the version table. Production must use the plain default.
|
||||
**How to avoid:** The migration helper (Pattern 2 above) calls `goose.SetDialect` and `goose.Up` without `SetTableName`. The global goose state is shared — if tests run in the same process as migrations, order matters. Keep the migration helper stateless and only set the base FS and dialect.
|
||||
**Warning signs:** Production migrations running twice or out of sync with what tests created.
|
||||
|
||||
### Pitfall 4: Missing caddy_data volume = Let's Encrypt rate limit
|
||||
**What goes wrong:** Restarting Caddy without a persistent `/data` volume triggers a new ACME certificate request on every restart. Let's Encrypt allows 50 certificates per registered domain per week, and far fewer (5 per week at the time of writing) for the same exact set of hostnames; repeated restarts during setup exhaust that smaller quota quickly.
|
||||
**Why it happens:** Caddy stores issued certificates in `/data`. Without a named volume, the directory is ephemeral.
|
||||
**How to avoid:** Always mount a named volume at `/data` and `/config` for the caddy service (see Pattern 5). Test with Let's Encrypt staging (`tls { ca https://acme-staging-v02.api.letsencrypt.org/directory }`) before switching to production.
|
||||
**Warning signs:** Caddy logs "too many certificates already issued for this domain" or certificate errors after restart.
|
||||
|
||||
### Pitfall 5: web service starts before postgres is ready
|
||||
**What goes wrong:** The web binary attempts `goose.Up()` immediately at startup; if postgres is still initializing, the DB connection fails and the process exits.
|
||||
**Why it happens:** Docker Compose `depends_on: service_started` (the default) only waits for the container to start, not for postgres to accept connections.
|
||||
**How to avoid:** Use `depends_on: postgres: condition: service_healthy` in `docker-compose.prod.yaml`. This requires the postgres service to have a `healthcheck:` directive (see Pattern 5 above). The existing `compose.yaml` already uses this pattern — mirror it exactly.
|
||||
**Warning signs:** web container exits at startup with "db connect failed" or "migrations failed"; postgres logs show "database system is starting up".
|
||||
|
||||
### Pitfall 6: templ-generated .go files not committed = Docker build fails
|
||||
**What goes wrong:** `go build ./cmd/web` inside Docker fails because `*_templ.go` files are in `.gitignore` and `COPY . .` does not include them.
|
||||
**Why it happens:** `templ generate` is a dev-time step; the generated files are gitignored per project convention (STATE.md).
|
||||
**How to avoid:** Run `templ generate` inside the Dockerfile builder stage before `go build`. Install the templ CLI in the builder image at the pinned version from justfile (`v0.3.1020`).
|
||||
**Warning signs:** Build error "undefined: templates.TablosPage" or similar undefined references to templ-generated component functions.
|
||||
|
||||
### Pitfall 7: distroless has no shell — debugging requires :debug tag
|
||||
**What goes wrong:** `docker exec -it web sh` fails because distroless/static has no shell.
|
||||
**Why it happens:** distroless deliberately removes all OS tooling to minimize attack surface.
|
||||
**How to avoid:** Use the `:debug` tag variant during initial setup: `gcr.io/distroless/static-debian12:debug`. Switch to `:nonroot` for production. Document in runbook how to use an ephemeral debug container (`docker run --rm -it --network container:<id> busybox sh`) when debugging production.
|
||||
**Warning signs:** `docker exec` returns "OCI runtime exec failed: exec: 'sh': executable file not found".
|
||||
|
||||
### Pitfall 8: /healthz currently pings DB — tests must be updated
|
||||
**What goes wrong:** After splitting `/healthz` (liveness) from `/readyz` (readiness), the existing `TestHealthz_OK` and `TestHealthz_Down` tests fail because they expect the DB-pinging behavior on `/healthz`.
|
||||
**Why it happens:** The current `HealthzHandler` does both jobs. D-12/D-13 require them to be separate routes with separate handlers.
|
||||
**How to avoid:** Update `handlers_test.go` in the same plan that refactors the handlers. Add `TestReadyz_OK` and `TestReadyz_Down` mirroring the current test structure; update `TestHealthz_OK` to verify 200 with no pinger dependency.
|
||||
**Warning signs:** Test failures after the handler refactor: "status = 503; want 200" for the new liveness check.
|
||||
|
||||
## Code Examples
|
||||
|
||||
### go:embed with fs.Sub for static files
|
||||
```go
// Source: Go stdlib docs (embed + io/fs) [CITED: pkg.go.dev/embed]
// In backend/assets/assets.go. static/ must live inside this package's
// directory, because embed patterns cannot contain "..":
package assets

import "embed"

//go:embed static
var Static embed.FS

// In internal/web/router.go, NewRouter now accepts fs.FS:
import "io/fs"

func NewRouter(pinger Pinger, staticFS fs.FS, ...) http.Handler {
	// ...
	sub, err := fs.Sub(staticFS, "static")
	if err != nil {
		panic("static embed sub failed: " + err.Error())
	}
	r.Get("/static/*", http.StripPrefix("/static/",
		http.FileServer(http.FS(sub))).ServeHTTP)
	// ...
	return r
}
```
|
||||
|
||||
### goose programmatic migration with pgxpool bridge
|
||||
```go
// Source: pressly/goose embed docs + pgx/v5/stdlib [CITED: pressly.github.io/goose]
import (
	"database/sql"
	"embed"

	"github.com/jackc/pgx/v5/pgxpool"
	_ "github.com/jackc/pgx/v5/stdlib" // registers the "pgx/v5" driver
	"github.com/pressly/goose/v3"
)

// RunMigrations applies all pending migrations. migrationsFS must contain a
// top-level "migrations" directory holding the SQL files, i.e. it was
// embedded from the directory that contains migrations/.
func RunMigrations(pool *pgxpool.Pool, migrationsFS embed.FS) error {
	dsn := pool.Config().ConnConfig.ConnString()
	db, err := sql.Open("pgx/v5", dsn)
	if err != nil {
		return err
	}
	defer db.Close()

	goose.SetBaseFS(migrationsFS)
	if err := goose.SetDialect("postgres"); err != nil {
		return err
	}
	return goose.Up(db, "migrations")
}
```
|
||||
|
||||
## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Separate goose CLI binary | `goose.SetBaseFS` + programmatic `goose.Up` with `embed.FS` | goose v3 (2021) | Binary is self-contained; no CLI tool needed in final image |
| `http.Dir(staticDir)` | `http.FS(embed.FS)` via `fs.Sub` | Go 1.16 (2021) | Binary has no runtime file dependencies |
| Separate migration service in compose | Migration in app startup | n/a | Fewer moving parts; migrations run as part of app start |

**Deprecated/outdated:**
- `gcr.io/distroless/static` (without Debian variant suffix): The versioned tag `gcr.io/distroless/static-debian12` is preferred. The unversioned `static` tag still works, but `static-debian12:nonroot` is more explicit about the base OS and the security posture.

## Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|-------|---------|---------------|
| A1 | `golang:1.26-alpine` is the correct builder base (go.mod says `go 1.26.1`) | Standard Stack | If Alpine's musl causes subtle issues, switch to `golang:1.26` (debian); distroless is still compatible with CGO_ENABLED=0 |
| A2 | `//go:embed` can reference `static/` from an `assets` package at `backend/assets/` | Pattern 1 | Holds only if `static/` lives inside `backend/assets/`, because embed patterns cannot contain `..`. Otherwise, place the static/ directory inside cmd/web/ or use a different package layout |
| A3 | Tailwind standalone binary download in Docker builder is reliable during CI/CD | Pattern 3 (Dockerfile) | If the external download is flaky, bake the Tailwind binary into the builder image or add it to the repo as a committed artifact |
| A4 | `pool.Config().ConnConfig.ConnString()` reconstructs a DSN compatible with `sql.Open("pgx/v5", ...)` | Pattern 2 (goose bridge) | If ConnString() omits sslmode or other params, pass the original DATABASE_URL env var directly to sql.Open instead |

## Open Questions

1. **embed.FS path for migrations/ in internal/db/migrate.go**
   - What we know: `//go:embed` patterns are resolved relative to the directory of the Go source file and cannot contain `.` or `..` path elements, so a directive can only reach files in its own package directory or below.
   - What's unclear: Where the embed directive should live. The `migrations/` directory is at `backend/migrations/`, and `internal/db/` is 2 levels deep, so `backend/internal/db/migrate.go` would need `../../migrations/*.sql`, which the rule above rejects.
   - Recommendation: Confirm during Wave 1 implementation. Put the embed directive in a Go file whose directory contains the SQL files (for example a small package inside `backend/migrations/` using `//go:embed *.sql`, in which case the goose directory argument becomes `"."`), and pass the resulting `embed.FS` into `RunMigrations`. An `assets` package at `backend/assets/` or `cmd/web/main.go` only works if `migrations/` is moved beneath that directory.

2. **templ generate in Docker build**
   - What we know: `*_templ.go` files are gitignored (STATE.md); the build fails without them.
   - What's unclear: Whether `RUN go install github.com/a-h/templ/cmd/templ@v0.3.1020 && templ generate` in the builder stage is fast enough or needs caching.
   - Recommendation: Use `--mount=type=cache,target=/root/.cache/go-build` on the `go build` steps; the `templ generate` step itself is fast (it is `.templ` → Go code generation, with no compilation).

3. **go:embed and files starting with `.` or `_`**
   - What we know: When a pattern names a directory, `//go:embed` by default excludes files/dirs inside it whose names start with `.` or `_`.
   - What's unclear: Whether `static/` contains any such files (e.g., `.gitkeep`).
   - Recommendation: Check `ls -la backend/static/` during implementation. If such files exist and must ship, use `//go:embed all:static`.

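One way to keep the first question from blocking Wave 1 is to have the migration code depend on `fs.FS` plus a directory name rather than on a specific embed location. The sketch below is an assumption-laden illustration: `listMigrations`, `fakeMigrations`, and the migration file names are hypothetical, and `fstest.MapFS` stands in for the real embedded tree. A helper like this could also serve as a startup sanity check that the embed actually picked up the SQL files.

```go
package main

import (
	"io/fs"
	"sort"
	"testing/fstest"
)

// listMigrations returns the paths of all .sql files directly under dir in
// fsys, sorted by name. Any fs.FS works, so the caller decides where the
// embed directive lives.
func listMigrations(fsys fs.FS, dir string) ([]string, error) {
	names, err := fs.Glob(fsys, dir+"/*.sql")
	if err != nil {
		return nil, err
	}
	sort.Strings(names)
	return names, nil
}

// fakeMigrations imitates an embedded migrations tree; the file names are
// invented for this example. The dotfile shows that non-SQL entries are
// simply ignored by the glob.
func fakeMigrations() fs.FS {
	return fstest.MapFS{
		"migrations/00002_add_users.sql": &fstest.MapFile{Data: []byte("-- +goose Up")},
		"migrations/00001_init.sql":      &fstest.MapFile{Data: []byte("-- +goose Up")},
		"migrations/.gitkeep":            &fstest.MapFile{},
	}
}
```

If the embed directive ends up next to the SQL files, the same helper would be called with `"."` as the directory; an empty result at startup would indicate that the embed pattern matched nothing useful.
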
## Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| Docker / docker compose | Image build, compose stack | [ASSUMED: yes on Hetzner VM] | n/a | podman compose (used in dev) |
| Go 1.26 | Dockerfile builder stage | ✓ (pulled from registry) | golang:1.26-alpine | n/a |
| Caddy 2 | TLS proxy | ✓ (pulled from registry) | caddy:2-alpine | n/a |
| postgres:16-alpine | DB service | ✓ (pulled from registry) | 16-alpine | n/a |
| gcr.io/distroless/static-debian12 | Final image | ✓ (pulled from registry) | nonroot | alpine (has shell, larger) |

**Missing dependencies with no fallback:** None identified.

**Missing dependencies with fallback:** If the Hetzner VM has only `podman`, use `podman compose`; the `docker-compose.prod.yaml` syntax is identical and podman compose supports it.

## Validation Architecture

### Test Framework
| Property | Value |
|----------|-------|
| Framework | Go test (stdlib) + httptest |
| Config file | none |
| Quick run command | `cd backend && go test ./internal/web/... -run TestHealthz -v` |
| Full suite command | `cd backend && go test ./...` |

### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| DEPLOY-01 | Docker image builds with both binaries | smoke | `docker build -f backend/Dockerfile backend/ --target builder` | ❌ Wave 2 |
| DEPLOY-02 | web binary reads all config from env | unit | `go test ./cmd/web/... -run TestEnvConfig` | ❌ Wave 3 (optional; main.go logic is simple) |
| DEPLOY-03 | goose.Up() runs on startup without error | unit | `go test ./internal/db/... -run TestRunMigrations` | ❌ Wave 1 |
| DEPLOY-04 | /healthz returns 200 (no pinger) | unit | `go test ./internal/web/... -run TestHealthz` | ✅ (needs refactor) |
| DEPLOY-04 | /readyz returns 200 when DB ok | unit | `go test ./internal/web/... -run TestReadyz_OK` | ❌ Wave 0 |
| DEPLOY-04 | /readyz returns 503 when DB down | unit | `go test ./internal/web/... -run TestReadyz_Down` | ❌ Wave 0 |
| DEPLOY-05 | README runbook exists and covers all sections | manual | read backend/README.md | ❌ Wave 4 |

### Sampling Rate
- **Per task commit:** `cd backend && go test ./internal/web/... -count=1`
- **Per wave merge:** `cd backend && go test ./... -count=1`
- **Phase gate:** Full suite green before `/gsd-verify-work`

### Wave 0 Gaps
- [ ] `backend/internal/web/handlers.go`: refactor HealthzHandler (liveness) + add ReadyzHandler
- [ ] `backend/internal/web/handlers_test.go`: update TestHealthz_* + add TestReadyz_*
- [ ] `backend/internal/web/router.go`: add `/readyz` route; update `/healthz` to use new liveness handler

## Security Domain

### Applicable ASVS Categories

| ASVS Category | Applies | Standard Control |
|---------------|---------|-----------------|
| V2 Authentication | no | (sessions already implemented in Phase 2) |
| V3 Session Management | no | (already implemented in Phase 2) |
| V4 Access Control | no | (health endpoints are public by design; no auth on /healthz or /readyz) |
| V5 Input Validation | no | (no new user input in this phase) |
| V6 Cryptography | no | (SESSION_SECRET already handled; TLS delegated to Caddy) |
| V14 Configuration | yes | Secrets in host .env file (D-05); file must be chmod 600; not committed to git |

### Known Threat Patterns for deploy phase

| Pattern | STRIDE | Standard Mitigation |
|---------|--------|---------------------|
| Secrets in Docker image layers | Information Disclosure | Never COPY .env into image; use `env_file:` at runtime in compose |
| .env.prod committed to git | Information Disclosure | .env.prod in .gitignore; .env.example has no real values |
| Health endpoint information leakage | Information Disclosure | /healthz and /readyz return minimal JSON; no version strings, no stack traces |
| Postgres exposed to internet | Elevation of Privilege | No `ports:` directive on postgres service; only accessible within compose network |
| Caddy data volume not backed up | Denial of Service | Document in runbook that caddy_data volume loss requires waiting for cert re-issuance (or restoring from backup) |

## Sources

### Primary (HIGH confidence)
- `backend/go.mod`: verified goose v3.27.1, pgx/v5 v5.9.2, Go 1.26.1
- `backend/cmd/web/main.go`: verified env var reading, pgxpool, static file serving pattern
- `backend/cmd/worker/main.go`: verified rivermigrate startup pattern (model for goose.Up)
- `backend/internal/web/router.go` + `handlers.go`: verified current healthz handler
- `backend/compose.yaml`: verified postgres healthcheck pattern to mirror in prod compose
- [pressly/goose embed docs](https://pressly.github.io/goose/blog/2021/embed-sql-migrations/): programmatic Up with embed.FS
- [pkg.go.dev/embed](https://pkg.go.dev/embed): embed.FS stdlib documentation

### Secondary (MEDIUM confidence)
- [GoogleContainerTools/distroless GitHub](https://github.com/GoogleContainerTools/distroless): distroless/static-debian12 verified, CGO_ENABLED=0 requirement
- [caddyserver.com/docs/caddyfile/patterns](https://caddyserver.com/docs/caddyfile/patterns): reverse proxy Caddyfile pattern

### Tertiary (LOW confidence)
- WebSearch results on Caddy + Docker Compose: consistent with official docs; using official Caddyfile reference as primary

## Metadata

**Confidence breakdown:**
- Standard stack: HIGH. All libraries are already in go.mod; only embed.FS and fs.Sub are new patterns.
- Architecture: HIGH. Compose + Caddy + distroless is well-established; the codebase already has all the pieces.
- Pitfalls: HIGH. Derived from codebase inspection (embed path constraints, goose/pgx bridge, templ codegen, existing healthz tests).

**Research date:** 2026-05-15
**Valid until:** 2026-08-15 (stable ecosystem: go:embed, goose v3, Caddy 2 are stable)