diff --git a/.planning/phases/05-files/05-RESEARCH.md b/.planning/phases/05-files/05-RESEARCH.md new file mode 100644 index 0000000..a351f12 --- /dev/null +++ b/.planning/phases/05-files/05-RESEARCH.md @@ -0,0 +1,805 @@ +# Phase 5: Files - Research + +**Researched:** 2026-05-15 +**Domain:** Go multipart file upload, S3-compatible storage (aws-sdk-go-v2), HTMX tab navigation, sqlc/goose migration +**Confidence:** HIGH + +## Summary + +Phase 5 introduces file attach/list/download/delete on the tablo detail page, backed by S3-compatible object storage. The implementation follows three parallel tracks: (1) a new `internal/files` package implementing S3 operations via `aws-sdk-go-v2`, (2) a new `tablo_files` migration + sqlc queries, and (3) a restructured tablo detail page with 3-tab navigation (Overview / Tasks / Files). + +The S3 client must be configured for path-style URLs because MinIO (local dev) does not support virtual-host-style bucket addressing. The `aws-sdk-go-v2` SDK supports this via `o.UsePathStyle = true` on the client options, and custom endpoints via `o.BaseEndpoint`. The same client code works against Cloudflare R2 in production by pointing `S3_ENDPOINT` at the R2 S3 API URL. + +Content-type sniffing via `net/http.DetectContentType` reads the first 512 bytes of the multipart file part. Because those bytes must also be streamed to S3, the handler uses `io.MultiReader(sniffBuf, fileRemaining)` to reconstruct the full body before calling `PutObject`. Size enforcement uses `http.MaxBytesReader` wrapping the request body before any parsing begins; `errors.As(err, &(*http.MaxBytesError)(nil))` distinguishes size violations from other errors. + +**Primary recommendation:** Implement the `internal/files` package as a thin S3 wrapper (client constructor + Upload/Delete/PresignDownload methods), inject it as a `*files.Store` into a new `FilesDeps` struct, wire routes inside the existing `RequireAuth` group, and restructure `templates/tablos/detail.templ` into a tab layout that owns Overview, Tasks, and Files tabs. + + +## User Constraints (from CONTEXT.md) + +### Locked Decisions +- **D-01:** Server-proxied upload — client submits multipart POST to Go server; Go streams to S3. No presigned PUT for v1. +- **D-02:** Storage fully configurable via `S3_ENDPOINT`, `S3_BUCKET`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`. Same code works with R2 (prod) or MinIO (local dev). +- **D-03:** Add MinIO service to `compose.yaml` with an `mc` init container that creates the bucket on first start. +- **D-04:** S3 key format: `files/{tablo_id}/{uuid}`. +- **D-05:** Content-type server-detected from first 512 bytes via `net/http.DetectContentType`. Browser MIME ignored. Stored in `tablo_files.content_type`. +- **D-06:** No dedup — each upload gets its own UUID, DB row, S3 object. +- **D-07:** Three tabs: Overview / Tasks / Files on tablo detail page. +- **D-08:** Tab switching via HTMX + `hx-push-url`. Each tab is a server-rendered sub-route. Full page for direct load; fragment for HTMX request. +- **D-09:** Default max 25 MB (`MAX_UPLOAD_SIZE_MB` env var). Enforced via `http.MaxBytesReader`. Friendly error above upload form. +- **D-10:** All file types accepted; no allowlist for v1. + +### Claude's Discretion +- Exact Tailwind styling for file list rows. +- Tab active/inactive visual treatment. +- Which tab is default on `/tablos/{id}` visit (Overview is natural default). +- Whether upload form is always visible or collapsed behind a button. +- HTTP verb for file upload: `POST /tablos/{id}/files`. +- Whether file delete uses inline confirmation or modal. +- Signed URL TTL (suggestion: 5 minutes). +- Whether download is 302 redirect or link with `target="_blank"`. + +### Deferred Ideas (OUT OF SCOPE) +- Self-hosted MinIO in production. +- File type restriction / allowlist. +- Per-user storage quota. +- File preview / thumbnail generation. +- Orphan file cleanup job (Phase 6). + + + +## Phase Requirements + +| ID | Description | Research Support | +|----|-------------|------------------| +| FILE-01 | Upload a file to a tablo; metadata in Postgres, bytes in S3 | `PutObject` via aws-sdk-go-v2 + `InsertTabloFile` sqlc query after sniffing content-type | +| FILE-02 | Server-proxied upload (locked by D-01) | Multipart POST → Go → S3; `r.FormFile("file")` + `io.MultiReader` for sniff-and-forward | +| FILE-03 | List files with original filename and size | `ListFilesByTablo` sqlc query; display via templ component in Files tab | +| FILE-04 | Download via signed time-limited URL | `PresignGetObject` from `s3.NewPresignClient`; 302 redirect | +| FILE-05 | Delete file (DB row + S3 object) | `DeleteObject` then `DeleteTabloFile`; log partial failures | +| FILE-06 | Authorization: only tablo owner can upload/list/download/delete | `loadOwnedTablo` preamble (already enforces ownership via `user_id` filter) | + + +## Architectural Responsibility Map + +| Capability | Primary Tier | Secondary Tier | Rationale | +|------------|-------------|----------------|-----------| +| File upload (bytes) | API / Backend | — | Server proxies to S3; no client-side S3 creds | +| File upload (metadata) | Database | — | `tablo_files` row inserted after S3 put succeeds | +| Content-type detection | API / Backend | — | `net/http.DetectContentType` runs in handler before S3 write | +| Size enforcement | API / Backend | — | `http.MaxBytesReader` wraps request body in handler | +| File listing | API / Backend | Database | Query by tablo_id; rendered as HTML fragment | +| Download URL generation | API / Backend | — | `PresignGetObject` in handler; client gets 302 | +| File deletion | API / Backend | Database | Delete S3 object then DB row; log if S3 fails | +| Tab navigation | Browser / Client | Frontend Server | HTMX `hx-get` + `hx-push-url`; server renders fragment or full page | +| Auth / ownership | API / Backend | — | `loadOwnedTablo` with `user_id` filter; 404 on non-owner | + +## Standard Stack + +### Core (already in go.mod) +| Library | Version | Purpose | Why Standard | +|---------|---------|---------|--------------| +| `github.com/aws/aws-sdk-go-v2` | v1.41.7 | AWS SDK core types | Required by all aws-sdk-go-v2 service modules | +| `github.com/aws/aws-sdk-go-v2/service/s3` | v1.101.0 | S3 API: PutObject, DeleteObject, PresignGetObject | Official AWS Go SDK; works with R2 and MinIO via custom endpoint | +| `github.com/aws/aws-sdk-go-v2/config` | v1.32.17 | SDK config loader | Wires region + credentials into client | +| `github.com/aws/aws-sdk-go-v2/credentials` | v1.19.16 | Static credentials provider | Used for MinIO (access key + secret) | +| `net/http` (stdlib) | Go 1.26.1 | `DetectContentType`, `MaxBytesReader`, `MaxBytesError` | No external dep needed | +| `io` (stdlib) | Go 1.26.1 | `MultiReader` for sniff-and-forward | No external dep needed | +| `github.com/google/uuid` | v1.6.0 (already pinned) | UUID for S3 object keys | Already used in codebase | +| `github.com/pressly/goose/v3` | v3.27.1 (already pinned) | Migration `0005_files.sql` | Already in toolchain | +| `sqlc` | v1.31.1 (CLI, already pinned) | Generate files query methods | Already in toolchain | + +**Versions verified:** [VERIFIED: `go list -m` against Go proxy, 2026-05-15] + +### New modules to add (not yet in go.mod) +```bash +cd backend +go get github.com/aws/aws-sdk-go-v2@latest +go get github.com/aws/aws-sdk-go-v2/config@latest +go get github.com/aws/aws-sdk-go-v2/credentials@latest +go get github.com/aws/aws-sdk-go-v2/service/s3@latest +``` + +These four modules are the minimal set. The `feature/s3/manager` module is NOT needed — direct `PutObject` (streaming body) is correct for server-proxy uploads. + +### Alternatives Considered +| Instead of | Could Use | Tradeoff | +|------------|-----------|----------| +| aws-sdk-go-v2 | minio-go client | minio-go is MinIO-only; aws-sdk-go-v2 works with R2 + MinIO with one config change — use the standard SDK | +| `r.FormFile` + io.MultiReader sniff | Parse entire file into memory first | Memory-first approach breaks at 25 MB under load; stream-and-sniff is correct | +| `http.MaxBytesReader` | Manual `r.ContentLength` check | `ContentLength` is attacker-controlled; `MaxBytesReader` enforces at the read level | + +## Architecture Patterns + +### System Architecture Diagram + +``` +Browser (multipart POST /tablos/{id}/files) + │ + ▼ +Go HTTP Handler (FileUploadHandler) + │ + ├─ http.MaxBytesReader wraps r.Body (size gate) + │ + ├─ r.FormFile("file") → file, header + │ + ├─ io.ReadFull(file, sniffBuf[0:512]) → DetectContentType + │ + ├─ io.MultiReader(sniffBuf, file) → reconstructed body + │ + ├─ uuid.New() → object key: files/{tablo_id}/{uuid} + │ + ├─ s3.PutObject(bucket, key, body, contentType) ─────────► S3 / MinIO / R2 + │ + └─ sqlc.InsertTabloFile(tablo_id, key, filename, size, content_type) + │ + ▼ + Postgres (tablo_files table) + +Browser (GET /tablos/{id}/files/{file_id}/download) + │ + ▼ +Go HTTP Handler (FileDownloadHandler) + │ + ├─ loadOwnedTablo + load file row (ownership verified) + │ + ├─ s3.PresignGetObject(key, TTL=5min) → presignedURL + │ + └─ http.Redirect(w, r, presignedURL, 302) + │ + ▼ + Browser fetches directly from S3/R2/MinIO + +Browser (POST /tablos/{id}/files/{file_id}/delete) + │ + ▼ +Go HTTP Handler (FileDeleteHandler) + │ + ├─ loadOwnedTablo + load file row + │ + ├─ s3.DeleteObject(bucket, key) ──────────────────────────► S3 + │ + └─ sqlc.DeleteTabloFile(file_id, tablo_id) + │ + ▼ + DB row removed + +Browser (GET /tablos/{id}/files — HTMX) + │ + ▼ +Go HTTP Handler (TabloFilesTabHandler) + │ + ├─ HX-Request: true → render files tab fragment only + └─ HX-Request: false → render full Layout with files tab active +``` + +### Recommended Project Structure + +New files only (existing structure unchanged): + +``` +backend/ +├── migrations/ +│ └── 0005_files.sql # tablo_files table +├── internal/ +│ ├── db/ +│ │ ├── queries/ +│ │ │ └── files.sql # sqlc queries for tablo_files +│ │ └── sqlc/ # (regenerated after sqlc generate) +│ │ ├── files.sql.go +│ │ └── models.go # TabloFile struct added +│ ├── files/ +│ │ ├── doc.go # already exists (placeholder) +│ │ └── store.go # S3 client constructor + Upload/Delete/PresignDownload +│ └── web/ +│ ├── handlers_files.go # FilesDeps, upload/list/download/delete handlers +│ └── handlers_files_test.go # RED scaffold (Wave 0) +└── templates/ + ├── tablos.templ # TabloDetailPage restructured into tab layout + └── files.templ # FilesTab, FileRow, UploadForm templ components +``` + +### Pattern 1: S3 Client Construction (with custom endpoint for MinIO) + +```go +// Source: aws-sdk-go-v2 docs + verified WebSearch 2026-05-15 +// internal/files/store.go + +package files + +import ( + "context" + "time" + + "github.com/aws/aws-sdk-go-v2/aws" + "github.com/aws/aws-sdk-go-v2/config" + "github.com/aws/aws-sdk-go-v2/credentials" + "github.com/aws/aws-sdk-go-v2/service/s3" +) + +type Store struct { + client *s3.Client + bucket string +} + +// NewStore constructs a Store pointed at an S3-compatible endpoint. +// endpoint: e.g. "http://localhost:9000" (MinIO) or "https://.r2.cloudflarestorage.com" (R2) +// usePathStyle: true for MinIO (required), false for R2 +func NewStore(ctx context.Context, endpoint, bucket, region, accessKey, secretKey string, usePathStyle bool) (*Store, error) { + cfg, err := config.LoadDefaultConfig(ctx, + config.WithRegion(region), + config.WithCredentialsProvider( + credentials.NewStaticCredentialsProvider(accessKey, secretKey, ""), + ), + ) + if err != nil { + return nil, err + } + + client := s3.NewFromConfig(cfg, func(o *s3.Options) { + o.BaseEndpoint = aws.String(endpoint) + o.UsePathStyle = usePathStyle + }) + + return &Store{client: client, bucket: bucket}, nil +} +``` + +### Pattern 2: Content-Type Sniff + Stream Upload + +```go +// Source: net/http stdlib docs [VERIFIED: go doc net/http, 2026-05-15] +// The key: read first 512 bytes, reconstruct body via io.MultiReader, then stream to S3 + +func (s *Store) Upload(ctx context.Context, key string, file io.Reader) (string, error) { + // Sniff content type from first 512 bytes + var sniffBuf [512]byte + n, _ := io.ReadFull(file, sniffBuf[:]) + contentType := http.DetectContentType(sniffBuf[:n]) + + // Reconstruct full body: sniffed bytes + remaining reader + body := io.MultiReader(bytes.NewReader(sniffBuf[:n]), file) + + _, err := s.client.PutObject(ctx, &s3.PutObjectInput{ + Bucket: aws.String(s.bucket), + Key: aws.String(key), + Body: body, + ContentType: aws.String(contentType), + }) + return contentType, err +} +``` + +### Pattern 3: Presigned Download URL (5-minute TTL) + +```go +// Source: aws-sdk-go-v2 docs [VERIFIED: go list, Context7, 2026-05-15] + +func (s *Store) PresignDownload(ctx context.Context, key string) (string, error) { + presignClient := s3.NewPresignClient(s.client) + req, err := presignClient.PresignGetObject(ctx, &s3.GetObjectInput{ + Bucket: aws.String(s.bucket), + Key: aws.String(key), + }, func(o *s3.PresignOptions) { + o.Expires = 5 * time.Minute + }) + if err != nil { + return "", err + } + return req.URL, nil +} +``` + +### Pattern 4: MaxBytesReader Size Enforcement in Handler + +```go +// Source: net/http stdlib [VERIFIED: go doc net/http, 2026-05-15] + +func FileUploadHandler(deps FilesDeps) http.HandlerFunc { + return func(w http.ResponseWriter, r *http.Request) { + maxBytes := int64(deps.MaxUploadMB) * 1024 * 1024 + r.Body = http.MaxBytesReader(w, r.Body, maxBytes) + + // ParseMultipartForm memory limit (< maxBytes so overflow goes to disk) + if err := r.ParseMultipartForm(2 << 20); err != nil { + var mbErr *http.MaxBytesError + if errors.As(err, &mbErr) { + // Render friendly error fragment above upload form + renderUploadError(w, r, "File too large (max 25 MB).", deps) + return + } + http.Error(w, "bad request", http.StatusBadRequest) + return + } + + file, header, err := r.FormFile("file") + // ... rest of handler + } +} +``` + +### Pattern 5: Tab Navigation in HTMX (hx-push-url) + +```go +// Source: HTMX docs [ASSUMED] — pattern mirrors existing HX-Request detection +// in handlers_tablos.go + +func TabloFilesTabHandler(deps FilesDeps) http.HandlerFunc { + return func(w http.ResponseWriter, r *http.Request) { + tablo, user, ok := loadOwnedTablo(w, r, TablosDeps{Queries: deps.Queries}) + if !ok { + return + } + files, _ := deps.Queries.ListFilesByTablo(r.Context(), tablo.ID) + + w.Header().Set("Content-Type", "text/html; charset=utf-8") + if r.Header.Get("HX-Request") == "true" { + // Fragment only — tab content area + _ = templates.FilesTabFragment(tablo, files, csrf.Token(r)).Render(r.Context(), w) + return + } + // Full page — initial navigation or browser refresh + _ = templates.TabloDetailPage(user, csrf.Token(r), tablo, nil, files, "files").Render(r.Context(), w) + } +} +``` + +In templates, the tab bar uses `hx-push-url` so the browser URL updates on tab switch: +```html + +Files +``` + +### Pattern 6: Delete Handler (S3 + DB in sequence) + +```go +// Source: codebase pattern + aws-sdk-go-v2 [VERIFIED] + +func FileDeleteHandler(deps FilesDeps) http.HandlerFunc { + return func(w http.ResponseWriter, r *http.Request) { + tablo, _, ok := loadOwnedTablo(w, r, TablosDeps{Queries: deps.Queries}) + if !ok { return } + + fileID := // parse chi.URLParam "file_id" + + file, err := deps.Queries.GetTabloFileByID(r.Context(), sqlc.GetTabloFileByIDParams{ + ID: fileID, TabloID: tablo.ID, + }) + if err != nil { /* 404 or 500 */ } + + // Delete from S3 first; log error but don't abort DB delete + if err := deps.Files.Delete(r.Context(), file.S3Key); err != nil { + slog.Default().Error("file delete: S3 delete failed", "key", file.S3Key, "err", err) + } + // Always delete DB row (orphan S3 objects cleaned up in Phase 6 worker) + if err := deps.Queries.DeleteTabloFile(r.Context(), sqlc.DeleteTabloFileParams{ + ID: fileID, TabloID: tablo.ID, + }); err != nil { /* 500 */ } + + if r.Header.Get("HX-Request") == "true" { + // OOB remove the file row from the list + w.WriteHeader(http.StatusOK) + return + } + http.Redirect(w, r, "/tablos/"+tablo.ID.String()+"/files", http.StatusSeeOther) + } +} +``` + +### Pattern 7: compose.yaml MinIO Service + +```yaml +# Source: MinIO Docker Hub docs [VERIFIED: hub.docker.com/r/minio/minio, 2026-05-15] +# Add to backend/compose.yaml alongside existing postgres service + + minio: + image: minio/minio:latest + container_name: xtablo-backend-minio + restart: unless-stopped + environment: + MINIO_ROOT_USER: minioadmin + MINIO_ROOT_PASSWORD: minioadmin + ports: + - "9000:9000" # S3 API + - "9001:9001" # Console UI + command: server /data --console-address ":9001" + volumes: + - minio_data:/data + healthcheck: + test: ["CMD", "mc", "ready", "local"] + interval: 5s + timeout: 5s + retries: 10 + + minio-init: + image: minio/mc:latest + depends_on: + minio: + condition: service_healthy + entrypoint: > + /bin/sh -c " + mc alias set local http://minio:9000 minioadmin minioadmin; + mc mb --ignore-existing local/xtablo-dev; + echo 'bucket ready'; + " + restart: "no" + +volumes: + minio_data: +``` + +### Pattern 8: TabloDetailPage Refactored Signature + +The existing `TabloDetailPage(user, csrfToken, tablo, tasks)` must be extended to carry the active tab and optionally the files list. The cleanest approach is adding an `activeTab string` parameter and passing files when needed: + +```go +// templates/tablos.templ — updated signature +templ TabloDetailPage(user *auth.User, csrfToken string, tablo sqlc.Tablo, tasks []sqlc.Task, files []sqlc.TabloFile, activeTab string) +``` + +The tab bar renders inside Layout, and the active content area dispatches on `activeTab`. Callers that don't need files pass `nil`. + +### Anti-Patterns to Avoid + +- **Storing the original filename as the S3 key:** S3 keys must be path-safe. Use `files/{tablo_id}/{uuid}` only. Store original filename in DB. +- **Reading `r.ContentLength` to enforce size:** ContentLength is set by the client and can be spoofed. Always use `http.MaxBytesReader` at the read level. +- **Deleting DB row before S3 object on cleanup failure:** Delete S3 first, log failures, always delete the DB row. Orphan S3 objects are cleaned in Phase 6. +- **Virtual-host-style URLs with MinIO:** MinIO requires `UsePathStyle: true`. Without it, the SDK produces URLs like `http://xtablo-dev.localhost:9000/...` which MinIO cannot serve. +- **Calling `PutObject` with `io.TeeReader` to sniff and stream simultaneously:** This approach is correct but requires the body to still be readable after sniffing. Use `io.MultiReader(sniffBytes, remainingBody)` instead — it is simpler and avoids the tee pipe complexity. +- **Registering `/tablos/{id}/files` (static) after `/tablos/{id}/files/{file_id}` (parametric):** chi v5 requires static segments BEFORE parametric (Pitfall 1 in existing codebase). Route registration order: `GET /files` → `POST /files` → `GET /files/{file_id}/download` → `POST /files/{file_id}/delete`. +- **Not wrapping `r.Body` before calling `ParseMultipartForm`:** `MaxBytesReader` must wrap the body BEFORE `ParseMultipartForm` is called. Calling `ParseMultipartForm` first then checking `ContentLength` does not enforce the limit at the read level. + +## Don't Hand-Roll + +| Problem | Don't Build | Use Instead | Why | +|---------|-------------|-------------|-----| +| S3 request signing | Custom HMAC-SHA256 AWS SigV4 signer | `aws-sdk-go-v2/service/s3` | SigV4 has 30+ edge cases; presigned URL logic is subtle | +| Content-type detection | Custom magic-byte table | `net/http.DetectContentType` (stdlib) | Implements the WHATWG MIME-sniffing spec; covers 200+ types | +| File streaming to S3 | Manual HTTP PUT with net/http | `s3.PutObject` with `io.Reader` body | Handles chunked encoding, retries, error wrapping | +| UUID generation | rand-based ID | `github.com/google/uuid` (already in go.mod) | Already a dep; collision-free | + +**Key insight:** The S3 API and request signing are non-trivial. The aws-sdk-go-v2 SDK handles SigV4, retries, error normalization, and presigned URL construction. Do not attempt to replicate any of this. + +## Common Pitfalls + +### Pitfall 1: Virtual-host-style URLs break MinIO +**What goes wrong:** `aws-sdk-go-v2` defaults to virtual-host-style URLs (`bucket.localhost:9000/key`). MinIO does not support virtual hosting by default and returns a "bucket not found" or DNS failure. +**Why it happens:** The SDK transforms the URL to move the bucket into the hostname unless `UsePathStyle: true` is set. +**How to avoid:** Always set `o.UsePathStyle = true` when `S3_ENDPOINT` is set and the host is not `*.amazonaws.com` or `*.r2.cloudflarestorage.com`. Simplest: always set it for local dev via env; R2 supports both styles. +**Warning signs:** Upload returns 404 or "NoSuchBucket" despite the bucket existing in MinIO console. + +### Pitfall 2: MaxBytesReader must wrap r.Body before ParseMultipartForm +**What goes wrong:** Calling `r.ParseMultipartForm(maxMemory)` before wrapping `r.Body` in `MaxBytesReader` means the size limit is not enforced — Go's multipart parser reads as much as it can. +**Why it happens:** `ParseMultipartForm`'s `maxMemory` argument only controls how much goes to memory vs. temp file, NOT the total size read from the body. +**How to avoid:** `r.Body = http.MaxBytesReader(w, r.Body, maxBytes)` MUST be the first operation in the handler, before any form parsing. +**Warning signs:** Files larger than 25 MB upload successfully in tests. + +### Pitfall 3: io.ReadFull returns io.ErrUnexpectedEOF for files < 512 bytes +**What goes wrong:** `io.ReadFull(file, buf[0:512])` returns `io.ErrUnexpectedEOF` if the file is smaller than 512 bytes. If the handler treats this as a fatal error, uploads of small files fail. +**Why it happens:** `io.ReadFull` returns `io.ErrUnexpectedEOF` when fewer bytes than requested are read before EOF — this is expected for small files. +**How to avoid:** Accept both `nil` and `io.ErrUnexpectedEOF` as non-fatal: `if err != nil && !errors.Is(err, io.ErrUnexpectedEOF) { return err }`. Pass `sniffBuf[:n]` (not `sniffBuf[:512]`) to `DetectContentType`. +**Warning signs:** Uploading a file < 512 bytes returns 500. + +### Pitfall 4: TabloDetailPage signature change breaks existing tests +**What goes wrong:** Changing `TabloDetailPage` from `(user, csrfToken, tablo, tasks)` to `(user, csrfToken, tablo, tasks, files, activeTab)` breaks every call site and test that uses the old signature. +**Why it happens:** Go requires all arguments; there are no optional params. +**How to avoid:** Update all call sites in the same PR. Add `files []sqlc.TabloFile` and `activeTab string` as new trailing parameters. Callers on Overview/Tasks tabs pass `nil, "overview"` or `nil, "tasks"`. +**Warning signs:** Compile error across `handlers_tablos.go`, `handlers_tablos_test.go`, `handlers_tasks_test.go`. + +### Pitfall 5: Delete S3 object but not DB row (or vice versa) on partial failure +**What goes wrong:** S3 delete succeeds but DB delete fails → orphan S3 object; DB delete succeeds but S3 delete fails → "ghost" file in UI. +**Why it happens:** Two external systems involved; no distributed transaction. +**How to avoid:** Delete S3 first, log the error but do not abort. Always attempt DB delete. Orphan S3 objects are cleaned by Phase 6 worker. If DB delete fails, return 500 (the file remains visible; user can retry). +**Warning signs:** File row appears in list after delete, or vice versa. + +### Pitfall 6: chi route registration order for file sub-routes +**What goes wrong:** Registering `GET /tablos/{id}/files/{file_id}/download` before `GET /tablos/{id}/files` causes chi to shadow the static route. +**Why it happens:** chi v5 selects the first matching route. Parametric routes consume any segment including static ones if registered first. +**How to avoid:** Register in this order inside the RequireAuth group: `GET /tablos/{id}/files` → `POST /tablos/{id}/files` → `GET /tablos/{id}/files/{file_id}/download` → `GET /tablos/{id}/files/{file_id}/delete-confirm` → `POST /tablos/{id}/files/{file_id}/delete`. Static first, parametric last (identical to existing task route ordering). +**Warning signs:** `GET /tablos/{id}/files` returns 404 or wrong handler. + +### Pitfall 7: minio-init container exits and podman restarts it +**What goes wrong:** `minio-init` finishes its `mc mb` command and exits with code 0. If `restart: unless-stopped` is set, podman/docker restarts it in a loop. +**Why it happens:** Init containers are supposed to run once. +**How to avoid:** Set `restart: "no"` on the `minio-init` service (shown in Pattern 7 above). [VERIFIED: standard compose pattern] + +### Pitfall 8: multipart.FileHeader.Size is 0 until file is fully read +**What goes wrong:** `header.Size` may be 0 (not set by all browsers/clients) if the file size was not sent in the multipart header. Using `header.Size` as the stored file size results in `0` bytes recorded in the DB. +**Why it happens:** The `Content-Length` on a multipart file part is optional. +**How to avoid:** After streaming to S3, use the number of bytes actually written (track with an `io.CountingWriter` or equivalent) as the authoritative size. Alternatively, store `header.Size` as a fallback only when streaming is complete. The `PutObjectOutput` does not return byte count; wrap the body reader with a counter before passing to `PutObject`. +**Warning signs:** `tablo_files.size_bytes` is always 0 in the DB. + +## Code Examples + +### S3 Client Construction for MinIO (local dev) +```go +// Source: aws-sdk-go-v2 docs [VERIFIED: Context7 + WebSearch, 2026-05-15] +s3Client := s3.NewFromConfig(cfg, func(o *s3.Options) { + o.BaseEndpoint = aws.String("http://localhost:9000") + o.UsePathStyle = true // required for MinIO +}) +``` + +### DeleteObject +```go +// Source: aws-sdk-go-v2/service/s3 [VERIFIED: go list, 2026-05-15] +_, err := s.client.DeleteObject(ctx, &s3.DeleteObjectInput{ + Bucket: aws.String(s.bucket), + Key: aws.String(objectKey), +}) +``` + +### PresignGetObject with 5-minute TTL +```go +// Source: aws-sdk-go-v2 PresignClient [VERIFIED: Context7, 2026-05-15] +presignClient := s3.NewPresignClient(s.client) +req, err := presignClient.PresignGetObject(ctx, &s3.GetObjectInput{ + Bucket: aws.String(s.bucket), + Key: aws.String(objectKey), +}, func(o *s3.PresignOptions) { + o.Expires = 5 * time.Minute +}) +// req.URL is the signed URL +``` + +### Migration 0005_files.sql +```sql +-- Source: migration style from 0004_tasks.sql [VERIFIED: codebase] +-- migrations/0005_files.sql + +-- +goose Up +CREATE TABLE tablo_files ( + id uuid PRIMARY KEY DEFAULT gen_random_uuid(), + tablo_id uuid NOT NULL REFERENCES tablos(id) ON DELETE CASCADE, + s3_key text NOT NULL, + filename text NOT NULL, + content_type text NOT NULL DEFAULT 'application/octet-stream', + size_bytes bigint NOT NULL DEFAULT 0, + created_at timestamptz NOT NULL DEFAULT now() +); + +CREATE INDEX tablo_files_tablo_id_idx ON tablo_files(tablo_id, created_at DESC); + +-- +goose Down +DROP TABLE IF EXISTS tablo_files; +``` + +Notes on schema decisions: +- No `updated_at` — files are immutable (upload once, delete only). +- `size_bytes bigint` — not `integer`; 25 MB fits in int32 but bigint future-proofs for larger limits. +- `ON DELETE CASCADE` — if the tablo is deleted, file rows are cleaned up. S3 orphan cleanup is Phase 6. +- `s3_key` stored for S3 operations; `filename` stored for display. + +### sqlc queries (files.sql) +```sql +-- Source: tasks.sql style [VERIFIED: codebase] +-- internal/db/queries/files.sql + +-- name: InsertTabloFile :one +INSERT INTO tablo_files (tablo_id, s3_key, filename, content_type, size_bytes) +VALUES ($1, $2, $3, $4, $5) +RETURNING id, tablo_id, s3_key, filename, content_type, size_bytes, created_at; + +-- name: ListFilesByTablo :many +SELECT id, tablo_id, s3_key, filename, content_type, size_bytes, created_at +FROM tablo_files +WHERE tablo_id = $1 +ORDER BY created_at DESC; + +-- name: GetTabloFileByID :one +SELECT id, tablo_id, s3_key, filename, content_type, size_bytes, created_at +FROM tablo_files +WHERE id = $1 AND tablo_id = $2; + +-- name: DeleteTabloFile :exec +DELETE FROM tablo_files WHERE id = $1 AND tablo_id = $2; +``` + +### FilesDeps Struct (mirrors TasksDeps) +```go +// Source: TasksDeps pattern [VERIFIED: codebase handlers_tasks.go] +type FilesDeps struct { + Queries *sqlc.Queries + Files *files.Store + MaxUploadMB int // parsed from MAX_UPLOAD_SIZE_MB env var (default 25) +} +``` + +### NewRouter updated signature +```go +// Source: router.go [VERIFIED: codebase] +func NewRouter(pinger Pinger, staticDir string, deps AuthDeps, tabloDeps TablosDeps, taskDeps TasksDeps, fileDeps FilesDeps, csrfKey []byte, env string, trustedOrigins ...string) http.Handler +``` + +### Route additions inside RequireAuth group +```go +// Source: router.go pattern [VERIFIED: codebase] +// Static before parametric (Pitfall 6 / existing Pitfall 1): +r.Get("/tablos/{id}/files", TabloFilesTabHandler(fileDeps)) +r.Post("/tablos/{id}/files", FileUploadHandler(fileDeps)) +// Parametric file routes: +r.Get("/tablos/{id}/files/{file_id}/download", FileDownloadHandler(fileDeps)) +r.Get("/tablos/{id}/files/{file_id}/delete-confirm", FileDeleteConfirmHandler(fileDeps)) +r.Post("/tablos/{id}/files/{file_id}/delete", FileDeleteHandler(fileDeps)) +``` + +### Tab routes for Overview and Tasks +Phase 5 introduces tab sub-routes. The existing `GET /tablos/{id}` becomes the Overview tab. New routes needed: + +```go +// Overview (existing handler, now also serves Overview tab content) +r.Get("/tablos/{id}", TabloDetailHandler(tabloDeps)) // already registered + +// Tasks tab — new route, re-uses existing task data fetching +r.Get("/tablos/{id}/tasks", TabloTasksTabHandler(taskDeps)) // new handler + +// Files tab +r.Get("/tablos/{id}/files", TabloFilesTabHandler(fileDeps)) +``` + +The `/tablos/{id}/tasks` GET route did not exist in Phase 4 (tasks were always embedded in the full detail page). Phase 5 adds it as a tab entry point. + +### .env.example additions +```bash +# S3-compatible object storage +S3_ENDPOINT=http://localhost:9000 +S3_BUCKET=xtablo-dev +S3_ACCESS_KEY=minioadmin +S3_SECRET_KEY=minioadmin +S3_REGION=us-east-1 +S3_USE_PATH_STYLE=true # true for MinIO; false or omit for R2 + +# Max upload size in megabytes (default: 25) +MAX_UPLOAD_SIZE_MB=25 +``` + +## State of the Art + +| Old Approach | Current Approach | When Changed | Impact | +|--------------|------------------|--------------|--------| +| `aws-sdk-go-v1` (single module) | `aws-sdk-go-v2` (modular) | 2021 | Each service is a separate Go module; import only what you use | +| `EndpointResolver` (deprecated) | `BaseEndpoint` option on client | v2 SDK 2023+ | Simpler custom endpoint config for MinIO / local dev | +| Multipart file stored entirely in memory | `http.MaxBytesReader` + streaming to S3 | Go 1.16+ | Memory-safe for large uploads | + +**Not deprecated but worth noting:** +- `feature/s3/manager.Uploader` is for multipart *uploads* (> 5 MB chunks, parallel). For server-proxy where we're just forwarding a user's file, direct `PutObject` with an `io.Reader` body is correct and simpler. + +## Assumptions Log + +| # | Claim | Section | Risk if Wrong | +|---|-------|---------|---------------| +| A1 | HTMX `hx-push-url` works correctly with tab switching; fragment swap targets `#tab-content` div | Patterns 5, tab navigation | Tab URL doesn't update on switch; planner should verify HTMX `hx-push-url` with `hx-target` != `body` | +| A2 | Cloudflare R2 does not require `UsePathStyle` (virtual-host-style works) | Pitfall 1 | R2 endpoints use `{bucket}.{account}.r2.cloudflarestorage.com` — virtual-host-style is the norm; path-style may also work | +| A3 | `multipart.FileHeader.Size` is unreliable (some clients send 0) | Pitfall 8, Code Examples | If reliable, a counting wrapper is unnecessary; but safest to always count bytes written | +| A4 | MinIO `latest` image includes `mc` binary (for healthcheck `mc ready local`) | Pattern 7 compose.yaml | `mc` may not be bundled in `minio/minio` image; separate `minio/mc` init container is safer (used in Pattern 7) | + +## Open Questions + +1. **Tab route for `/tablos/{id}/tasks` conflicts with existing `/tablos/{id}/tasks/*` child routes** + - What we know: `GET /tablos/{id}/tasks` is NOT currently registered. All task routes are `GET/POST /tablos/{id}/tasks/...` (with a trailing segment). + - What's unclear: chi v5's behavior when registering `GET /tablos/{id}/tasks` alongside `GET /tablos/{id}/tasks/new` and `POST /tablos/{id}/tasks/reorder`. Static + trailing segments should be fine, but the exact chi precedence should be verified. + - Recommendation: Add `r.Get("/tablos/{id}/tasks", TabloTasksTabHandler(...))` BEFORE the existing task sub-routes. Write a test that `GET /tablos/{id}/tasks` returns 200 with the kanban board, and `GET /tablos/{id}/tasks/new` still returns the new-task form. + +2. **Byte-counting for `size_bytes` column** + - What we know: `s3.PutObject` does not return the number of bytes written. `multipart.FileHeader.Size` may be 0. + - What's unclear: The cleanest counting approach (wrap body in a custom `io.Reader` that increments a counter, vs. use `header.Size` as-is and accept that it may be browser-dependent). + - Recommendation: Wrap the body with a simple `byteCountReader` before passing to `PutObject`. After upload, use the counter value for `InsertTabloFileParams.SizeBytes`. + +## Environment Availability + +| Dependency | Required By | Available | Version | Fallback | +|------------|------------|-----------|---------|----------| +| Podman | MinIO local dev via compose.yaml | ✓ | 5.8.2 | Docker (drop-in) | +| Docker | MinIO local dev (alternative) | ✗ | — | Podman (primary) | +| MinIO image | compose.yaml minio service | ✓ (via registry) | latest (2025-09-07) | — | +| aws-sdk-go-v2/service/s3 | files.Store | ✗ (not in go.mod yet) | v1.101.0 available | — | +| aws-sdk-go-v2/config | files.Store | ✗ (not in go.mod yet) | v1.32.17 available | — | +| aws-sdk-go-v2/credentials | files.Store | ✗ (not in go.mod yet) | v1.19.16 available | — | +| sqlc CLI | regenerate after files.sql | ✓ (pinned in justfile) | v1.31.1 | — | +| templ CLI | regenerate after .templ changes | ✓ (pinned in justfile) | v0.3.1020 | — | + +**Missing dependencies with no fallback:** +- `aws-sdk-go-v2/service/s3`, `aws-sdk-go-v2/config`, `aws-sdk-go-v2/credentials` — must be added via `go get` in Wave 0. + +**Missing dependencies with fallback:** +- None. + +## Validation Architecture + +### Test Framework +| Property | Value | +|----------|-------| +| Framework | Go testing (stdlib) + httptest | +| Config file | none (go test ./...) | +| Quick run command | `go test ./internal/web/ -run TestFile -v` | +| Full suite command | `go test ./... -count=1` | + +### Phase Requirements → Test Map + +| Req ID | Behavior | Test Type | Automated Command | File Exists? | +|--------|----------|-----------|-------------------|-------------| +| FILE-01 | Upload creates DB row + S3 object | integration | `go test ./internal/web/ -run TestFileUpload -v` | ❌ Wave 0 | +| FILE-02 | Server-proxied (no presigned PUT) | integration | (covered by FILE-01 test) | ❌ Wave 0 | +| FILE-03 | Files listed with filename + size, newest first | integration | `go test ./internal/web/ -run TestFilesList -v` | ❌ Wave 0 | +| FILE-04 | Download returns 302 to signed URL | integration | `go test ./internal/web/ -run TestFileDownload -v` | ❌ Wave 0 | +| FILE-05 | Delete removes DB row (S3 mocked in unit tests) | integration | `go test ./internal/web/ -run TestFileDelete -v` | ❌ Wave 0 | +| FILE-06 | Non-owner gets 404 on all file routes | integration | `go test ./internal/web/ -run TestFileOwnership -v` | ❌ Wave 0 | + +**S3 mocking strategy for tests:** The `files.Store` should be defined as an interface (`FileStorer`) so handler tests can inject a stub without requiring a real MinIO instance. Integration tests that need real S3 require MinIO running (skip if `S3_ENDPOINT` unset, following the `TEST_DATABASE_URL` skip pattern from auth tests). + +### Sampling Rate +- **Per task commit:** `go test ./internal/web/ -run TestFile -v` +- **Per wave merge:** `go test ./... -count=1` +- **Phase gate:** Full suite green before `/gsd-verify-work` + +### Wave 0 Gaps +- [ ] `backend/internal/web/handlers_files_test.go` — RED scaffold covering FILE-01..06 + IDOR test +- [ ] `backend/internal/files/store_test.go` — unit tests for Store methods (with mock S3 client or real MinIO) +- [ ] `backend/internal/files/store.go` — stub that compiles (placeholder methods) +- [ ] `go get` for aws-sdk-go-v2 modules — must run before any file code compiles + +## Security Domain + +### Applicable ASVS Categories + +| ASVS Category | Applies | Standard Control | +|---------------|---------|-----------------| +| V2 Authentication | no (RequireAuth already covers) | existing RequireAuth middleware | +| V3 Session Management | no | existing session middleware | +| V4 Access Control | yes | `loadOwnedTablo` + `tablo_id` filter in all file queries (FILE-06) | +| V5 Input Validation | yes | `http.MaxBytesReader` for size; `DetectContentType` for MIME; filename stored as display only (not used in key) | +| V6 Cryptography | yes | SigV4 signing via aws-sdk-go-v2 (never hand-rolled) | + +### Known Threat Patterns + +| Pattern | STRIDE | Standard Mitigation | +|---------|--------|---------------------| +| Path traversal via filename | Tampering | Filename stored in DB only; S3 key is `files/{uuid}` — filename never touches S3 key | +| MIME-type spoofing (client sends wrong Content-Type) | Spoofing | Server-detected via `DetectContentType` (D-05); browser MIME ignored | +| Upload flood / DoS | Denial of Service | `http.MaxBytesReader(maxBytes)` + configurable `MAX_UPLOAD_SIZE_MB` | +| IDOR — accessing another user's files | Elevation of Privilege | All file queries include `tablo_id` filter; `loadOwnedTablo` verifies `user_id` (FILE-06) | +| CSRF on upload/delete | Tampering | `gorilla/csrf` middleware already in stack; all state-changing POSTs require CSRF token | +| Presigned URL exposure | Information Disclosure | Short TTL (5 min); URL is generated per-request and not stored | + +## Sources + +### Primary (HIGH confidence) +- `backend/internal/web/handlers_tablos.go` — handler pattern, loadOwnedTablo, HX-Request detection +- `backend/internal/web/handlers_tasks.go` — TasksDeps pattern to mirror for FilesDeps +- `backend/internal/web/router.go` — route registration order, RequireAuth group +- `backend/migrations/0004_tasks.sql` — migration style reference +- `backend/internal/db/queries/tasks.sql` — sqlc query style +- `backend/compose.yaml` — existing service definition to extend +- `backend/go.mod` — pinned versions, no aws-sdk-go-v2 present +- Context7 `/aws/aws-sdk-go-v2` — S3 PutObject, GetObject, PresignGetObject, client construction +- Context7 `/websites/aws_amazon_sdk-for-go_v2_developer-guide` — BaseEndpoint, UsePathStyle for custom endpoints +- `go doc net/http DetectContentType` — [VERIFIED: stdlib] +- `go doc net/http MaxBytesReader` — [VERIFIED: stdlib] +- `go doc mime/multipart FileHeader` — [VERIFIED: stdlib] +- `go list -m github.com/aws/aws-sdk-go-v2/service/s3@latest` — v1.101.0 [VERIFIED] + +### Secondary (MEDIUM confidence) +- WebSearch: aws-sdk-go-v2 + MinIO + UsePathStyle + PresignGetObject — cross-referenced with Context7 docs +- MinIO Docker Hub: `minio/minio:latest` (2025-09-07), `minio/mc:latest` — [VERIFIED: hub.docker.com API] + +### Tertiary (LOW confidence — flagged as [ASSUMED]) +- HTMX `hx-push-url` behavior with scoped `hx-target` (not `body`) — see Assumptions Log A1 + +## Metadata + +**Confidence breakdown:** +- Standard stack: HIGH — all module versions verified via `go list -m` +- Architecture: HIGH — mirrors established codebase patterns exactly +- S3 patterns: HIGH — verified via Context7 + official AWS docs +- Pitfalls: HIGH — derived from stdlib docs + codebase conventions +- Compose / MinIO: MEDIUM — MinIO image version verified; init container pattern is standard but not run against the project's podman-compose setup +- HTMX tab navigation: MEDIUM — pattern is well-known; specific `hx-push-url` + fragment behavior marked LOW for one edge case + +**Research date:** 2026-05-15 +**Valid until:** 2026-06-15 (aws-sdk-go-v2 updates frequently; verify module versions before installation)