docs(05): research phase files domain

This commit is contained in:
Arthur Belleville 2026-05-15 11:46:09 +02:00
parent 740d367dc4
commit 261c85ae73
No known key found for this signature in database

View file

@ -0,0 +1,805 @@
# Phase 5: Files - Research
**Researched:** 2026-05-15
**Domain:** Go multipart file upload, S3-compatible storage (aws-sdk-go-v2), HTMX tab navigation, sqlc/goose migration
**Confidence:** HIGH
## Summary
Phase 5 introduces file attach/list/download/delete on the tablo detail page, backed by S3-compatible object storage. The implementation follows three parallel tracks: (1) a new `internal/files` package implementing S3 operations via `aws-sdk-go-v2`, (2) a new `tablo_files` migration + sqlc queries, and (3) a restructured tablo detail page with 3-tab navigation (Overview / Tasks / Files).
The S3 client must be configured for path-style URLs because MinIO (local dev) does not support virtual-host-style bucket addressing. The `aws-sdk-go-v2` SDK supports this via `o.UsePathStyle = true` on the client options, and custom endpoints via `o.BaseEndpoint`. The same client code works against Cloudflare R2 in production by pointing `S3_ENDPOINT` at the R2 S3 API URL.
Content-type sniffing via `net/http.DetectContentType` reads the first 512 bytes of the multipart file part. Because those bytes must also be streamed to S3, the handler uses `io.MultiReader(sniffBuf, fileRemaining)` to reconstruct the full body before calling `PutObject`. Size enforcement uses `http.MaxBytesReader` wrapping the request body before any parsing begins; `errors.As(err, &(*http.MaxBytesError)(nil))` distinguishes size violations from other errors.
**Primary recommendation:** Implement the `internal/files` package as a thin S3 wrapper (client constructor + Upload/Delete/PresignDownload methods), inject it as a `*files.Store` into a new `FilesDeps` struct, wire routes inside the existing `RequireAuth` group, and restructure `templates/tablos/detail.templ` into a tab layout that owns Overview, Tasks, and Files tabs.
<user_constraints>
## User Constraints (from CONTEXT.md)
### Locked Decisions
- **D-01:** Server-proxied upload — client submits multipart POST to Go server; Go streams to S3. No presigned PUT for v1.
- **D-02:** Storage fully configurable via `S3_ENDPOINT`, `S3_BUCKET`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`. Same code works with R2 (prod) or MinIO (local dev).
- **D-03:** Add MinIO service to `compose.yaml` with an `mc` init container that creates the bucket on first start.
- **D-04:** S3 key format: `files/{tablo_id}/{uuid}`.
- **D-05:** Content-type server-detected from first 512 bytes via `net/http.DetectContentType`. Browser MIME ignored. Stored in `tablo_files.content_type`.
- **D-06:** No dedup — each upload gets its own UUID, DB row, S3 object.
- **D-07:** Three tabs: Overview / Tasks / Files on tablo detail page.
- **D-08:** Tab switching via HTMX + `hx-push-url`. Each tab is a server-rendered sub-route. Full page for direct load; fragment for HTMX request.
- **D-09:** Default max 25 MB (`MAX_UPLOAD_SIZE_MB` env var). Enforced via `http.MaxBytesReader`. Friendly error above upload form.
- **D-10:** All file types accepted; no allowlist for v1.
### Claude's Discretion
- Exact Tailwind styling for file list rows.
- Tab active/inactive visual treatment.
- Which tab is default on `/tablos/{id}` visit (Overview is natural default).
- Whether upload form is always visible or collapsed behind a button.
- HTTP verb for file upload: `POST /tablos/{id}/files`.
- Whether file delete uses inline confirmation or modal.
- Signed URL TTL (suggestion: 5 minutes).
- Whether download is 302 redirect or link with `target="_blank"`.
### Deferred Ideas (OUT OF SCOPE)
- Self-hosted MinIO in production.
- File type restriction / allowlist.
- Per-user storage quota.
- File preview / thumbnail generation.
- Orphan file cleanup job (Phase 6).
</user_constraints>
<phase_requirements>
## Phase Requirements
| ID | Description | Research Support |
|----|-------------|------------------|
| FILE-01 | Upload a file to a tablo; metadata in Postgres, bytes in S3 | `PutObject` via aws-sdk-go-v2 + `InsertTabloFile` sqlc query after sniffing content-type |
| FILE-02 | Server-proxied upload (locked by D-01) | Multipart POST → Go → S3; `r.FormFile("file")` + `io.MultiReader` for sniff-and-forward |
| FILE-03 | List files with original filename and size | `ListFilesByTablo` sqlc query; display via templ component in Files tab |
| FILE-04 | Download via signed time-limited URL | `PresignGetObject` from `s3.NewPresignClient`; 302 redirect |
| FILE-05 | Delete file (DB row + S3 object) | `DeleteObject` then `DeleteTabloFile`; log partial failures |
| FILE-06 | Authorization: only tablo owner can upload/list/download/delete | `loadOwnedTablo` preamble (already enforces ownership via `user_id` filter) |
</phase_requirements>
## Architectural Responsibility Map
| Capability | Primary Tier | Secondary Tier | Rationale |
|------------|-------------|----------------|-----------|
| File upload (bytes) | API / Backend | — | Server proxies to S3; no client-side S3 creds |
| File upload (metadata) | Database | — | `tablo_files` row inserted after S3 put succeeds |
| Content-type detection | API / Backend | — | `net/http.DetectContentType` runs in handler before S3 write |
| Size enforcement | API / Backend | — | `http.MaxBytesReader` wraps request body in handler |
| File listing | API / Backend | Database | Query by tablo_id; rendered as HTML fragment |
| Download URL generation | API / Backend | — | `PresignGetObject` in handler; client gets 302 |
| File deletion | API / Backend | Database | Delete S3 object then DB row; log if S3 fails |
| Tab navigation | Browser / Client | Frontend Server | HTMX `hx-get` + `hx-push-url`; server renders fragment or full page |
| Auth / ownership | API / Backend | — | `loadOwnedTablo` with `user_id` filter; 404 on non-owner |
## Standard Stack
### Core (already in go.mod)
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `github.com/aws/aws-sdk-go-v2` | v1.41.7 | AWS SDK core types | Required by all aws-sdk-go-v2 service modules |
| `github.com/aws/aws-sdk-go-v2/service/s3` | v1.101.0 | S3 API: PutObject, DeleteObject, PresignGetObject | Official AWS Go SDK; works with R2 and MinIO via custom endpoint |
| `github.com/aws/aws-sdk-go-v2/config` | v1.32.17 | SDK config loader | Wires region + credentials into client |
| `github.com/aws/aws-sdk-go-v2/credentials` | v1.19.16 | Static credentials provider | Used for MinIO (access key + secret) |
| `net/http` (stdlib) | Go 1.26.1 | `DetectContentType`, `MaxBytesReader`, `MaxBytesError` | No external dep needed |
| `io` (stdlib) | Go 1.26.1 | `MultiReader` for sniff-and-forward | No external dep needed |
| `github.com/google/uuid` | v1.6.0 (already pinned) | UUID for S3 object keys | Already used in codebase |
| `github.com/pressly/goose/v3` | v3.27.1 (already pinned) | Migration `0005_files.sql` | Already in toolchain |
| `sqlc` | v1.31.1 (CLI, already pinned) | Generate files query methods | Already in toolchain |
**Versions verified:** [VERIFIED: `go list -m` against Go proxy, 2026-05-15]
### New modules to add (not yet in go.mod)
```bash
cd backend
go get github.com/aws/aws-sdk-go-v2@latest
go get github.com/aws/aws-sdk-go-v2/config@latest
go get github.com/aws/aws-sdk-go-v2/credentials@latest
go get github.com/aws/aws-sdk-go-v2/service/s3@latest
```
These four modules are the minimal set. The `feature/s3/manager` module is NOT needed — direct `PutObject` (streaming body) is correct for server-proxy uploads.
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| aws-sdk-go-v2 | minio-go client | minio-go is MinIO-only; aws-sdk-go-v2 works with R2 + MinIO with one config change — use the standard SDK |
| `r.FormFile` + io.MultiReader sniff | Parse entire file into memory first | Memory-first approach breaks at 25 MB under load; stream-and-sniff is correct |
| `http.MaxBytesReader` | Manual `r.ContentLength` check | `ContentLength` is attacker-controlled; `MaxBytesReader` enforces at the read level |
## Architecture Patterns
### System Architecture Diagram
```
Browser (multipart POST /tablos/{id}/files)
Go HTTP Handler (FileUploadHandler)
├─ http.MaxBytesReader wraps r.Body (size gate)
├─ r.FormFile("file") → file, header
├─ io.ReadFull(file, sniffBuf[0:512]) → DetectContentType
├─ io.MultiReader(sniffBuf, file) → reconstructed body
├─ uuid.New() → object key: files/{tablo_id}/{uuid}
├─ s3.PutObject(bucket, key, body, contentType) ─────────► S3 / MinIO / R2
└─ sqlc.InsertTabloFile(tablo_id, key, filename, size, content_type)
Postgres (tablo_files table)
Browser (GET /tablos/{id}/files/{file_id}/download)
Go HTTP Handler (FileDownloadHandler)
├─ loadOwnedTablo + load file row (ownership verified)
├─ s3.PresignGetObject(key, TTL=5min) → presignedURL
└─ http.Redirect(w, r, presignedURL, 302)
Browser fetches directly from S3/R2/MinIO
Browser (POST /tablos/{id}/files/{file_id}/delete)
Go HTTP Handler (FileDeleteHandler)
├─ loadOwnedTablo + load file row
├─ s3.DeleteObject(bucket, key) ──────────────────────────► S3
└─ sqlc.DeleteTabloFile(file_id, tablo_id)
DB row removed
Browser (GET /tablos/{id}/files — HTMX)
Go HTTP Handler (TabloFilesTabHandler)
├─ HX-Request: true → render files tab fragment only
└─ HX-Request: false → render full Layout with files tab active
```
### Recommended Project Structure
New files only (existing structure unchanged):
```
backend/
├── migrations/
│ └── 0005_files.sql # tablo_files table
├── internal/
│ ├── db/
│ │ ├── queries/
│ │ │ └── files.sql # sqlc queries for tablo_files
│ │ └── sqlc/ # (regenerated after sqlc generate)
│ │ ├── files.sql.go
│ │ └── models.go # TabloFile struct added
│ ├── files/
│ │ ├── doc.go # already exists (placeholder)
│ │ └── store.go # S3 client constructor + Upload/Delete/PresignDownload
│ └── web/
│ ├── handlers_files.go # FilesDeps, upload/list/download/delete handlers
│ └── handlers_files_test.go # RED scaffold (Wave 0)
└── templates/
├── tablos.templ # TabloDetailPage restructured into tab layout
└── files.templ # FilesTab, FileRow, UploadForm templ components
```
### Pattern 1: S3 Client Construction (with custom endpoint for MinIO)
```go
// Source: aws-sdk-go-v2 docs + verified WebSearch 2026-05-15
// internal/files/store.go
package files
import (
"context"
"time"
"github.com/aws/aws-sdk-go-v2/aws"
"github.com/aws/aws-sdk-go-v2/config"
"github.com/aws/aws-sdk-go-v2/credentials"
"github.com/aws/aws-sdk-go-v2/service/s3"
)
type Store struct {
client *s3.Client
bucket string
}
// NewStore constructs a Store pointed at an S3-compatible endpoint.
// endpoint: e.g. "http://localhost:9000" (MinIO) or "https://<account>.r2.cloudflarestorage.com" (R2)
// usePathStyle: true for MinIO (required), false for R2
func NewStore(ctx context.Context, endpoint, bucket, region, accessKey, secretKey string, usePathStyle bool) (*Store, error) {
cfg, err := config.LoadDefaultConfig(ctx,
config.WithRegion(region),
config.WithCredentialsProvider(
credentials.NewStaticCredentialsProvider(accessKey, secretKey, ""),
),
)
if err != nil {
return nil, err
}
client := s3.NewFromConfig(cfg, func(o *s3.Options) {
o.BaseEndpoint = aws.String(endpoint)
o.UsePathStyle = usePathStyle
})
return &Store{client: client, bucket: bucket}, nil
}
```
### Pattern 2: Content-Type Sniff + Stream Upload
```go
// Source: net/http stdlib docs [VERIFIED: go doc net/http, 2026-05-15]
// The key: read first 512 bytes, reconstruct body via io.MultiReader, then stream to S3
func (s *Store) Upload(ctx context.Context, key string, file io.Reader) (string, error) {
// Sniff content type from first 512 bytes
var sniffBuf [512]byte
n, _ := io.ReadFull(file, sniffBuf[:])
contentType := http.DetectContentType(sniffBuf[:n])
// Reconstruct full body: sniffed bytes + remaining reader
body := io.MultiReader(bytes.NewReader(sniffBuf[:n]), file)
_, err := s.client.PutObject(ctx, &s3.PutObjectInput{
Bucket: aws.String(s.bucket),
Key: aws.String(key),
Body: body,
ContentType: aws.String(contentType),
})
return contentType, err
}
```
### Pattern 3: Presigned Download URL (5-minute TTL)
```go
// Source: aws-sdk-go-v2 docs [VERIFIED: go list, Context7, 2026-05-15]
func (s *Store) PresignDownload(ctx context.Context, key string) (string, error) {
presignClient := s3.NewPresignClient(s.client)
req, err := presignClient.PresignGetObject(ctx, &s3.GetObjectInput{
Bucket: aws.String(s.bucket),
Key: aws.String(key),
}, func(o *s3.PresignOptions) {
o.Expires = 5 * time.Minute
})
if err != nil {
return "", err
}
return req.URL, nil
}
```
### Pattern 4: MaxBytesReader Size Enforcement in Handler
```go
// Source: net/http stdlib [VERIFIED: go doc net/http, 2026-05-15]
func FileUploadHandler(deps FilesDeps) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
maxBytes := int64(deps.MaxUploadMB) * 1024 * 1024
r.Body = http.MaxBytesReader(w, r.Body, maxBytes)
// ParseMultipartForm memory limit (< maxBytes so overflow goes to disk)
if err := r.ParseMultipartForm(2 << 20); err != nil {
var mbErr *http.MaxBytesError
if errors.As(err, &mbErr) {
// Render friendly error fragment above upload form
renderUploadError(w, r, "File too large (max 25 MB).", deps)
return
}
http.Error(w, "bad request", http.StatusBadRequest)
return
}
file, header, err := r.FormFile("file")
// ... rest of handler
}
}
```
### Pattern 5: Tab Navigation in HTMX (hx-push-url)
```go
// Source: HTMX docs [ASSUMED] — pattern mirrors existing HX-Request detection
// in handlers_tablos.go
func TabloFilesTabHandler(deps FilesDeps) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
tablo, user, ok := loadOwnedTablo(w, r, TablosDeps{Queries: deps.Queries})
if !ok {
return
}
files, _ := deps.Queries.ListFilesByTablo(r.Context(), tablo.ID)
w.Header().Set("Content-Type", "text/html; charset=utf-8")
if r.Header.Get("HX-Request") == "true" {
// Fragment only — tab content area
_ = templates.FilesTabFragment(tablo, files, csrf.Token(r)).Render(r.Context(), w)
return
}
// Full page — initial navigation or browser refresh
_ = templates.TabloDetailPage(user, csrf.Token(r), tablo, nil, files, "files").Render(r.Context(), w)
}
}
```
In templates, the tab bar uses `hx-push-url` so the browser URL updates on tab switch:
```html
<!-- Source: HTMX hx-push-url pattern [ASSUMED] -->
<a
hx-get="/tablos/{id}/files"
hx-target="#tab-content"
hx-swap="innerHTML"
hx-push-url="/tablos/{id}/files"
class="tab-link"
>Files</a>
```
### Pattern 6: Delete Handler (S3 + DB in sequence)
```go
// Source: codebase pattern + aws-sdk-go-v2 [VERIFIED]
func FileDeleteHandler(deps FilesDeps) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
tablo, _, ok := loadOwnedTablo(w, r, TablosDeps{Queries: deps.Queries})
if !ok { return }
fileID := // parse chi.URLParam "file_id"
file, err := deps.Queries.GetTabloFileByID(r.Context(), sqlc.GetTabloFileByIDParams{
ID: fileID, TabloID: tablo.ID,
})
if err != nil { /* 404 or 500 */ }
// Delete from S3 first; log error but don't abort DB delete
if err := deps.Files.Delete(r.Context(), file.S3Key); err != nil {
slog.Default().Error("file delete: S3 delete failed", "key", file.S3Key, "err", err)
}
// Always delete DB row (orphan S3 objects cleaned up in Phase 6 worker)
if err := deps.Queries.DeleteTabloFile(r.Context(), sqlc.DeleteTabloFileParams{
ID: fileID, TabloID: tablo.ID,
}); err != nil { /* 500 */ }
if r.Header.Get("HX-Request") == "true" {
// OOB remove the file row from the list
w.WriteHeader(http.StatusOK)
return
}
http.Redirect(w, r, "/tablos/"+tablo.ID.String()+"/files", http.StatusSeeOther)
}
}
```
### Pattern 7: compose.yaml MinIO Service
```yaml
# Source: MinIO Docker Hub docs [VERIFIED: hub.docker.com/r/minio/minio, 2026-05-15]
# Add to backend/compose.yaml alongside existing postgres service
minio:
image: minio/minio:latest
container_name: xtablo-backend-minio
restart: unless-stopped
environment:
MINIO_ROOT_USER: minioadmin
MINIO_ROOT_PASSWORD: minioadmin
ports:
- "9000:9000" # S3 API
- "9001:9001" # Console UI
command: server /data --console-address ":9001"
volumes:
- minio_data:/data
healthcheck:
test: ["CMD", "mc", "ready", "local"]
interval: 5s
timeout: 5s
retries: 10
minio-init:
image: minio/mc:latest
depends_on:
minio:
condition: service_healthy
entrypoint: >
/bin/sh -c "
mc alias set local http://minio:9000 minioadmin minioadmin;
mc mb --ignore-existing local/xtablo-dev;
echo 'bucket ready';
"
restart: "no"
volumes:
minio_data:
```
### Pattern 8: TabloDetailPage Refactored Signature
The existing `TabloDetailPage(user, csrfToken, tablo, tasks)` must be extended to carry the active tab and optionally the files list. The cleanest approach is adding an `activeTab string` parameter and passing files when needed:
```go
// templates/tablos.templ — updated signature
templ TabloDetailPage(user *auth.User, csrfToken string, tablo sqlc.Tablo, tasks []sqlc.Task, files []sqlc.TabloFile, activeTab string)
```
The tab bar renders inside Layout, and the active content area dispatches on `activeTab`. Callers that don't need files pass `nil`.
### Anti-Patterns to Avoid
- **Storing the original filename as the S3 key:** S3 keys must be path-safe. Use `files/{tablo_id}/{uuid}` only. Store original filename in DB.
- **Reading `r.ContentLength` to enforce size:** ContentLength is set by the client and can be spoofed. Always use `http.MaxBytesReader` at the read level.
- **Deleting DB row before S3 object on cleanup failure:** Delete S3 first, log failures, always delete the DB row. Orphan S3 objects are cleaned in Phase 6.
- **Virtual-host-style URLs with MinIO:** MinIO requires `UsePathStyle: true`. Without it, the SDK produces URLs like `http://xtablo-dev.localhost:9000/...` which MinIO cannot serve.
- **Calling `PutObject` with `io.TeeReader` to sniff and stream simultaneously:** This approach is correct but requires the body to still be readable after sniffing. Use `io.MultiReader(sniffBytes, remainingBody)` instead — it is simpler and avoids the tee pipe complexity.
- **Registering `/tablos/{id}/files` (static) after `/tablos/{id}/files/{file_id}` (parametric):** chi v5 requires static segments BEFORE parametric (Pitfall 1 in existing codebase). Route registration order: `GET /files``POST /files``GET /files/{file_id}/download``POST /files/{file_id}/delete`.
- **Not wrapping `r.Body` before calling `ParseMultipartForm`:** `MaxBytesReader` must wrap the body BEFORE `ParseMultipartForm` is called. Calling `ParseMultipartForm` first then checking `ContentLength` does not enforce the limit at the read level.
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| S3 request signing | Custom HMAC-SHA256 AWS SigV4 signer | `aws-sdk-go-v2/service/s3` | SigV4 has 30+ edge cases; presigned URL logic is subtle |
| Content-type detection | Custom magic-byte table | `net/http.DetectContentType` (stdlib) | Implements the WHATWG MIME-sniffing spec; covers 200+ types |
| File streaming to S3 | Manual HTTP PUT with net/http | `s3.PutObject` with `io.Reader` body | Handles chunked encoding, retries, error wrapping |
| UUID generation | rand-based ID | `github.com/google/uuid` (already in go.mod) | Already a dep; collision-free |
**Key insight:** The S3 API and request signing are non-trivial. The aws-sdk-go-v2 SDK handles SigV4, retries, error normalization, and presigned URL construction. Do not attempt to replicate any of this.
## Common Pitfalls
### Pitfall 1: Virtual-host-style URLs break MinIO
**What goes wrong:** `aws-sdk-go-v2` defaults to virtual-host-style URLs (`bucket.localhost:9000/key`). MinIO does not support virtual hosting by default and returns a "bucket not found" or DNS failure.
**Why it happens:** The SDK transforms the URL to move the bucket into the hostname unless `UsePathStyle: true` is set.
**How to avoid:** Always set `o.UsePathStyle = true` when `S3_ENDPOINT` is set and the host is not `*.amazonaws.com` or `*.r2.cloudflarestorage.com`. Simplest: always set it for local dev via env; R2 supports both styles.
**Warning signs:** Upload returns 404 or "NoSuchBucket" despite the bucket existing in MinIO console.
### Pitfall 2: MaxBytesReader must wrap r.Body before ParseMultipartForm
**What goes wrong:** Calling `r.ParseMultipartForm(maxMemory)` before wrapping `r.Body` in `MaxBytesReader` means the size limit is not enforced — Go's multipart parser reads as much as it can.
**Why it happens:** `ParseMultipartForm`'s `maxMemory` argument only controls how much goes to memory vs. temp file, NOT the total size read from the body.
**How to avoid:** `r.Body = http.MaxBytesReader(w, r.Body, maxBytes)` MUST be the first operation in the handler, before any form parsing.
**Warning signs:** Files larger than 25 MB upload successfully in tests.
### Pitfall 3: io.ReadFull returns io.ErrUnexpectedEOF for files < 512 bytes
**What goes wrong:** `io.ReadFull(file, buf[0:512])` returns `io.ErrUnexpectedEOF` if the file is smaller than 512 bytes. If the handler treats this as a fatal error, uploads of small files fail.
**Why it happens:** `io.ReadFull` returns `io.ErrUnexpectedEOF` when fewer bytes than requested are read before EOF — this is expected for small files.
**How to avoid:** Accept both `nil` and `io.ErrUnexpectedEOF` as non-fatal: `if err != nil && !errors.Is(err, io.ErrUnexpectedEOF) { return err }`. Pass `sniffBuf[:n]` (not `sniffBuf[:512]`) to `DetectContentType`.
**Warning signs:** Uploading a file < 512 bytes returns 500.
### Pitfall 4: TabloDetailPage signature change breaks existing tests
**What goes wrong:** Changing `TabloDetailPage` from `(user, csrfToken, tablo, tasks)` to `(user, csrfToken, tablo, tasks, files, activeTab)` breaks every call site and test that uses the old signature.
**Why it happens:** Go requires all arguments; there are no optional params.
**How to avoid:** Update all call sites in the same PR. Add `files []sqlc.TabloFile` and `activeTab string` as new trailing parameters. Callers on Overview/Tasks tabs pass `nil, "overview"` or `nil, "tasks"`.
**Warning signs:** Compile error across `handlers_tablos.go`, `handlers_tablos_test.go`, `handlers_tasks_test.go`.
### Pitfall 5: Delete S3 object but not DB row (or vice versa) on partial failure
**What goes wrong:** S3 delete succeeds but DB delete fails → orphan S3 object; DB delete succeeds but S3 delete fails → "ghost" file in UI.
**Why it happens:** Two external systems involved; no distributed transaction.
**How to avoid:** Delete S3 first, log the error but do not abort. Always attempt DB delete. Orphan S3 objects are cleaned by Phase 6 worker. If DB delete fails, return 500 (the file remains visible; user can retry).
**Warning signs:** File row appears in list after delete, or vice versa.
### Pitfall 6: chi route registration order for file sub-routes
**What goes wrong:** Registering `GET /tablos/{id}/files/{file_id}/download` before `GET /tablos/{id}/files` causes chi to shadow the static route.
**Why it happens:** chi v5 selects the first matching route. Parametric routes consume any segment including static ones if registered first.
**How to avoid:** Register in this order inside the RequireAuth group: `GET /tablos/{id}/files``POST /tablos/{id}/files``GET /tablos/{id}/files/{file_id}/download``GET /tablos/{id}/files/{file_id}/delete-confirm``POST /tablos/{id}/files/{file_id}/delete`. Static first, parametric last (identical to existing task route ordering).
**Warning signs:** `GET /tablos/{id}/files` returns 404 or wrong handler.
### Pitfall 7: minio-init container exits and podman restarts it
**What goes wrong:** `minio-init` finishes its `mc mb` command and exits with code 0. If `restart: unless-stopped` is set, podman/docker restarts it in a loop.
**Why it happens:** Init containers are supposed to run once.
**How to avoid:** Set `restart: "no"` on the `minio-init` service (shown in Pattern 7 above). [VERIFIED: standard compose pattern]
### Pitfall 8: multipart.FileHeader.Size is 0 until file is fully read
**What goes wrong:** `header.Size` may be 0 (not set by all browsers/clients) if the file size was not sent in the multipart header. Using `header.Size` as the stored file size results in `0` bytes recorded in the DB.
**Why it happens:** The `Content-Length` on a multipart file part is optional.
**How to avoid:** After streaming to S3, use the number of bytes actually written (track with an `io.CountingWriter` or equivalent) as the authoritative size. Alternatively, store `header.Size` as a fallback only when streaming is complete. The `PutObjectOutput` does not return byte count; wrap the body reader with a counter before passing to `PutObject`.
**Warning signs:** `tablo_files.size_bytes` is always 0 in the DB.
## Code Examples
### S3 Client Construction for MinIO (local dev)
```go
// Source: aws-sdk-go-v2 docs [VERIFIED: Context7 + WebSearch, 2026-05-15]
s3Client := s3.NewFromConfig(cfg, func(o *s3.Options) {
o.BaseEndpoint = aws.String("http://localhost:9000")
o.UsePathStyle = true // required for MinIO
})
```
### DeleteObject
```go
// Source: aws-sdk-go-v2/service/s3 [VERIFIED: go list, 2026-05-15]
_, err := s.client.DeleteObject(ctx, &s3.DeleteObjectInput{
Bucket: aws.String(s.bucket),
Key: aws.String(objectKey),
})
```
### PresignGetObject with 5-minute TTL
```go
// Source: aws-sdk-go-v2 PresignClient [VERIFIED: Context7, 2026-05-15]
presignClient := s3.NewPresignClient(s.client)
req, err := presignClient.PresignGetObject(ctx, &s3.GetObjectInput{
Bucket: aws.String(s.bucket),
Key: aws.String(objectKey),
}, func(o *s3.PresignOptions) {
o.Expires = 5 * time.Minute
})
// req.URL is the signed URL
```
### Migration 0005_files.sql
```sql
-- Source: migration style from 0004_tasks.sql [VERIFIED: codebase]
-- migrations/0005_files.sql
-- +goose Up
CREATE TABLE tablo_files (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
tablo_id uuid NOT NULL REFERENCES tablos(id) ON DELETE CASCADE,
s3_key text NOT NULL,
filename text NOT NULL,
content_type text NOT NULL DEFAULT 'application/octet-stream',
size_bytes bigint NOT NULL DEFAULT 0,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX tablo_files_tablo_id_idx ON tablo_files(tablo_id, created_at DESC);
-- +goose Down
DROP TABLE IF EXISTS tablo_files;
```
Notes on schema decisions:
- No `updated_at` — files are immutable (upload once, delete only).
- `size_bytes bigint` — not `integer`; 25 MB fits in int32 but bigint future-proofs for larger limits.
- `ON DELETE CASCADE` — if the tablo is deleted, file rows are cleaned up. S3 orphan cleanup is Phase 6.
- `s3_key` stored for S3 operations; `filename` stored for display.
### sqlc queries (files.sql)
```sql
-- Source: tasks.sql style [VERIFIED: codebase]
-- internal/db/queries/files.sql
-- name: InsertTabloFile :one
INSERT INTO tablo_files (tablo_id, s3_key, filename, content_type, size_bytes)
VALUES ($1, $2, $3, $4, $5)
RETURNING id, tablo_id, s3_key, filename, content_type, size_bytes, created_at;
-- name: ListFilesByTablo :many
SELECT id, tablo_id, s3_key, filename, content_type, size_bytes, created_at
FROM tablo_files
WHERE tablo_id = $1
ORDER BY created_at DESC;
-- name: GetTabloFileByID :one
SELECT id, tablo_id, s3_key, filename, content_type, size_bytes, created_at
FROM tablo_files
WHERE id = $1 AND tablo_id = $2;
-- name: DeleteTabloFile :exec
DELETE FROM tablo_files WHERE id = $1 AND tablo_id = $2;
```
### FilesDeps Struct (mirrors TasksDeps)
```go
// Source: TasksDeps pattern [VERIFIED: codebase handlers_tasks.go]
type FilesDeps struct {
Queries *sqlc.Queries
Files *files.Store
MaxUploadMB int // parsed from MAX_UPLOAD_SIZE_MB env var (default 25)
}
```
### NewRouter updated signature
```go
// Source: router.go [VERIFIED: codebase]
func NewRouter(pinger Pinger, staticDir string, deps AuthDeps, tabloDeps TablosDeps, taskDeps TasksDeps, fileDeps FilesDeps, csrfKey []byte, env string, trustedOrigins ...string) http.Handler
```
### Route additions inside RequireAuth group
```go
// Source: router.go pattern [VERIFIED: codebase]
// Static before parametric (Pitfall 6 / existing Pitfall 1):
r.Get("/tablos/{id}/files", TabloFilesTabHandler(fileDeps))
r.Post("/tablos/{id}/files", FileUploadHandler(fileDeps))
// Parametric file routes:
r.Get("/tablos/{id}/files/{file_id}/download", FileDownloadHandler(fileDeps))
r.Get("/tablos/{id}/files/{file_id}/delete-confirm", FileDeleteConfirmHandler(fileDeps))
r.Post("/tablos/{id}/files/{file_id}/delete", FileDeleteHandler(fileDeps))
```
### Tab routes for Overview and Tasks
Phase 5 introduces tab sub-routes. The existing `GET /tablos/{id}` becomes the Overview tab. New routes needed:
```go
// Overview (existing handler, now also serves Overview tab content)
r.Get("/tablos/{id}", TabloDetailHandler(tabloDeps)) // already registered
// Tasks tab — new route, re-uses existing task data fetching
r.Get("/tablos/{id}/tasks", TabloTasksTabHandler(taskDeps)) // new handler
// Files tab
r.Get("/tablos/{id}/files", TabloFilesTabHandler(fileDeps))
```
The `/tablos/{id}/tasks` GET route did not exist in Phase 4 (tasks were always embedded in the full detail page). Phase 5 adds it as a tab entry point.
### .env.example additions
```bash
# S3-compatible object storage
S3_ENDPOINT=http://localhost:9000
S3_BUCKET=xtablo-dev
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin
S3_REGION=us-east-1
S3_USE_PATH_STYLE=true # true for MinIO; false or omit for R2
# Max upload size in megabytes (default: 25)
MAX_UPLOAD_SIZE_MB=25
```
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| `aws-sdk-go-v1` (single module) | `aws-sdk-go-v2` (modular) | 2021 | Each service is a separate Go module; import only what you use |
| `EndpointResolver` (deprecated) | `BaseEndpoint` option on client | v2 SDK 2023+ | Simpler custom endpoint config for MinIO / local dev |
| Multipart file stored entirely in memory | `http.MaxBytesReader` + streaming to S3 | Go 1.16+ | Memory-safe for large uploads |
**Not deprecated but worth noting:**
- `feature/s3/manager.Uploader` is for multipart *uploads* (> 5 MB chunks, parallel). For server-proxy where we're just forwarding a user's file, direct `PutObject` with an `io.Reader` body is correct and simpler.
## Assumptions Log
| # | Claim | Section | Risk if Wrong |
|---|-------|---------|---------------|
| A1 | HTMX `hx-push-url` works correctly with tab switching; fragment swap targets `#tab-content` div | Patterns 5, tab navigation | Tab URL doesn't update on switch; planner should verify HTMX `hx-push-url` with `hx-target` != `body` |
| A2 | Cloudflare R2 does not require `UsePathStyle` (virtual-host-style works) | Pitfall 1 | R2 endpoints use `{bucket}.{account}.r2.cloudflarestorage.com` — virtual-host-style is the norm; path-style may also work |
| A3 | `multipart.FileHeader.Size` is unreliable (some clients send 0) | Pitfall 8, Code Examples | If reliable, a counting wrapper is unnecessary; but safest to always count bytes written |
| A4 | MinIO `latest` image includes `mc` binary (for healthcheck `mc ready local`) | Pattern 7 compose.yaml | `mc` may not be bundled in `minio/minio` image; separate `minio/mc` init container is safer (used in Pattern 7) |
## Open Questions
1. **Tab route for `/tablos/{id}/tasks` conflicts with existing `/tablos/{id}/tasks/*` child routes**
- What we know: `GET /tablos/{id}/tasks` is NOT currently registered. All task routes are `GET/POST /tablos/{id}/tasks/...` (with a trailing segment).
- What's unclear: chi v5's behavior when registering `GET /tablos/{id}/tasks` alongside `GET /tablos/{id}/tasks/new` and `POST /tablos/{id}/tasks/reorder`. Static + trailing segments should be fine, but the exact chi precedence should be verified.
- Recommendation: Add `r.Get("/tablos/{id}/tasks", TabloTasksTabHandler(...))` BEFORE the existing task sub-routes. Write a test that `GET /tablos/{id}/tasks` returns 200 with the kanban board, and `GET /tablos/{id}/tasks/new` still returns the new-task form.
2. **Byte-counting for `size_bytes` column**
- What we know: `s3.PutObject` does not return the number of bytes written. `multipart.FileHeader.Size` may be 0.
- What's unclear: The cleanest counting approach (wrap body in a custom `io.Reader` that increments a counter, vs. use `header.Size` as-is and accept that it may be browser-dependent).
- Recommendation: Wrap the body with a simple `byteCountReader` before passing to `PutObject`. After upload, use the counter value for `InsertTabloFileParams.SizeBytes`.
## Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| Podman | MinIO local dev via compose.yaml | ✓ | 5.8.2 | Docker (drop-in) |
| Docker | MinIO local dev (alternative) | ✗ | — | Podman (primary) |
| MinIO image | compose.yaml minio service | ✓ (via registry) | latest (2025-09-07) | — |
| aws-sdk-go-v2/service/s3 | files.Store | ✗ (not in go.mod yet) | v1.101.0 available | — |
| aws-sdk-go-v2/config | files.Store | ✗ (not in go.mod yet) | v1.32.17 available | — |
| aws-sdk-go-v2/credentials | files.Store | ✗ (not in go.mod yet) | v1.19.16 available | — |
| sqlc CLI | regenerate after files.sql | ✓ (pinned in justfile) | v1.31.1 | — |
| templ CLI | regenerate after .templ changes | ✓ (pinned in justfile) | v0.3.1020 | — |
**Missing dependencies with no fallback:**
- `aws-sdk-go-v2/service/s3`, `aws-sdk-go-v2/config`, `aws-sdk-go-v2/credentials` — must be added via `go get` in Wave 0.
**Missing dependencies with fallback:**
- None.
## Validation Architecture
### Test Framework
| Property | Value |
|----------|-------|
| Framework | Go testing (stdlib) + httptest |
| Config file | none (go test ./...) |
| Quick run command | `go test ./internal/web/ -run TestFile -v` |
| Full suite command | `go test ./... -count=1` |
### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| FILE-01 | Upload creates DB row + S3 object | integration | `go test ./internal/web/ -run TestFileUpload -v` | ❌ Wave 0 |
| FILE-02 | Server-proxied (no presigned PUT) | integration | (covered by FILE-01 test) | ❌ Wave 0 |
| FILE-03 | Files listed with filename + size, newest first | integration | `go test ./internal/web/ -run TestFilesList -v` | ❌ Wave 0 |
| FILE-04 | Download returns 302 to signed URL | integration | `go test ./internal/web/ -run TestFileDownload -v` | ❌ Wave 0 |
| FILE-05 | Delete removes DB row (S3 mocked in unit tests) | integration | `go test ./internal/web/ -run TestFileDelete -v` | ❌ Wave 0 |
| FILE-06 | Non-owner gets 404 on all file routes | integration | `go test ./internal/web/ -run TestFileOwnership -v` | ❌ Wave 0 |
**S3 mocking strategy for tests:** The `files.Store` should be defined as an interface (`FileStorer`) so handler tests can inject a stub without requiring a real MinIO instance. Integration tests that need real S3 require MinIO running (skip if `S3_ENDPOINT` unset, following the `TEST_DATABASE_URL` skip pattern from auth tests).
### Sampling Rate
- **Per task commit:** `go test ./internal/web/ -run TestFile -v`
- **Per wave merge:** `go test ./... -count=1`
- **Phase gate:** Full suite green before `/gsd-verify-work`
### Wave 0 Gaps
- [ ] `backend/internal/web/handlers_files_test.go` — RED scaffold covering FILE-01..06 + IDOR test
- [ ] `backend/internal/files/store_test.go` — unit tests for Store methods (with mock S3 client or real MinIO)
- [ ] `backend/internal/files/store.go` — stub that compiles (placeholder methods)
- [ ] `go get` for aws-sdk-go-v2 modules — must run before any file code compiles
## Security Domain
### Applicable ASVS Categories
| ASVS Category | Applies | Standard Control |
|---------------|---------|-----------------|
| V2 Authentication | no (RequireAuth already covers) | existing RequireAuth middleware |
| V3 Session Management | no | existing session middleware |
| V4 Access Control | yes | `loadOwnedTablo` + `tablo_id` filter in all file queries (FILE-06) |
| V5 Input Validation | yes | `http.MaxBytesReader` for size; `DetectContentType` for MIME; filename stored as display only (not used in key) |
| V6 Cryptography | yes | SigV4 signing via aws-sdk-go-v2 (never hand-rolled) |
### Known Threat Patterns
| Pattern | STRIDE | Standard Mitigation |
|---------|--------|---------------------|
| Path traversal via filename | Tampering | Filename stored in DB only; S3 key is `files/{uuid}` — filename never touches S3 key |
| MIME-type spoofing (client sends wrong Content-Type) | Spoofing | Server-detected via `DetectContentType` (D-05); browser MIME ignored |
| Upload flood / DoS | Denial of Service | `http.MaxBytesReader(maxBytes)` + configurable `MAX_UPLOAD_SIZE_MB` |
| IDOR — accessing another user's files | Elevation of Privilege | All file queries include `tablo_id` filter; `loadOwnedTablo` verifies `user_id` (FILE-06) |
| CSRF on upload/delete | Tampering | `gorilla/csrf` middleware already in stack; all state-changing POSTs require CSRF token |
| Presigned URL exposure | Information Disclosure | Short TTL (5 min); URL is generated per-request and not stored |
## Sources
### Primary (HIGH confidence)
- `backend/internal/web/handlers_tablos.go` — handler pattern, loadOwnedTablo, HX-Request detection
- `backend/internal/web/handlers_tasks.go` — TasksDeps pattern to mirror for FilesDeps
- `backend/internal/web/router.go` — route registration order, RequireAuth group
- `backend/migrations/0004_tasks.sql` — migration style reference
- `backend/internal/db/queries/tasks.sql` — sqlc query style
- `backend/compose.yaml` — existing service definition to extend
- `backend/go.mod` — pinned versions, no aws-sdk-go-v2 present
- Context7 `/aws/aws-sdk-go-v2` — S3 PutObject, GetObject, PresignGetObject, client construction
- Context7 `/websites/aws_amazon_sdk-for-go_v2_developer-guide` — BaseEndpoint, UsePathStyle for custom endpoints
- `go doc net/http DetectContentType` — [VERIFIED: stdlib]
- `go doc net/http MaxBytesReader` — [VERIFIED: stdlib]
- `go doc mime/multipart FileHeader` — [VERIFIED: stdlib]
- `go list -m github.com/aws/aws-sdk-go-v2/service/s3@latest` — v1.101.0 [VERIFIED]
### Secondary (MEDIUM confidence)
- WebSearch: aws-sdk-go-v2 + MinIO + UsePathStyle + PresignGetObject — cross-referenced with Context7 docs
- MinIO Docker Hub: `minio/minio:latest` (2025-09-07), `minio/mc:latest` — [VERIFIED: hub.docker.com API]
### Tertiary (LOW confidence — flagged as [ASSUMED])
- HTMX `hx-push-url` behavior with scoped `hx-target` (not `body`) — see Assumptions Log A1
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH — all module versions verified via `go list -m`
- Architecture: HIGH — mirrors established codebase patterns exactly
- S3 patterns: HIGH — verified via Context7 + official AWS docs
- Pitfalls: HIGH — derived from stdlib docs + codebase conventions
- Compose / MinIO: MEDIUM — MinIO image version verified; init container pattern is standard but not run against the project's podman-compose setup
- HTMX tab navigation: MEDIUM — pattern is well-known; specific `hx-push-url` + fragment behavior marked LOW for one edge case
**Research date:** 2026-05-15
**Valid until:** 2026-06-15 (aws-sdk-go-v2 updates frequently; verify module versions before installation)