docs(06): create phase plan
This commit is contained in:
parent
f242b7184f
commit
71d9c32e85
6 changed files with 40 additions and 31 deletions
|
|
@ -153,8 +153,13 @@ Plans:
|
|||
|
||||
**Plans:** 3 plans
|
||||
Plans:
|
||||
**Wave 1**
|
||||
- [ ] 06-01-PLAN.md — Wave 1: go get river + ListOrphanFiles sqlc query + internal/jobs/ package (HeartbeatWorker, OrphanCleanupWorker, SlogErrorHandler) + unit tests (WORK-01 partial, WORK-02, WORK-03, WORK-04)
|
||||
|
||||
**Wave 2** *(blocked on Wave 1 completion)*
|
||||
- [ ] 06-02-PLAN.md — Wave 2: replace cmd/worker/main.go with full river wiring (rivermigrate + Client + periodic jobs + graceful shutdown) + just worker target + README section (WORK-01, WORK-02, WORK-03)
|
||||
|
||||
**Wave 3** *(blocked on Wave 2 completion)*
|
||||
- [ ] 06-03-PLAN.md — Wave 3: Human-verify checkpoint — run just worker, observe heartbeat logs, verify graceful shutdown
|
||||
|
||||
### Phase 7: Deploy v1
|
||||
|
|
|
|||
|
|
@ -3,13 +3,13 @@ gsd_state_version: 1.0
|
|||
milestone: v1.0
|
||||
milestone_name: milestone
|
||||
status: ready_to_plan
|
||||
last_updated: "2026-05-15T13:13:11.604Z"
|
||||
last_updated: "2026-05-15T14:30:16.643Z"
|
||||
progress:
|
||||
total_phases: 7
|
||||
completed_phases: 5
|
||||
total_plans: 22
|
||||
total_plans: 25
|
||||
completed_plans: 22
|
||||
percent: 100
|
||||
percent: 88
|
||||
---
|
||||
|
||||
# STATE
|
||||
|
|
|
|||
|
|
@ -24,11 +24,11 @@ requirements:
|
|||
|
||||
must_haves:
|
||||
truths:
|
||||
- "river and riverpgxv5 are importable from backend/internal/jobs/ without build errors"
|
||||
- "HeartbeatWorker.Work() returns nil and logs 'worker heartbeat' with job_id and attempt fields"
|
||||
- "OrphanCleanupWorker.Work() iterates orphan rows, deletes S3 object then DB row, logs per-run summary"
|
||||
- "SlogErrorHandler implements river.ErrorHandler (both HandleError and HandlePanic compile)"
|
||||
- "ListOrphanFiles sqlc query is generated and returns []ListOrphanFilesRow with {ID, TabloID, S3Key}"
|
||||
- "D-01: river and riverpgxv5 are importable from backend/internal/jobs/ without build errors (go.mod contains github.com/riverqueue/river)"
|
||||
- "D-04: HeartbeatWorker.Work() returns nil and logs 'worker heartbeat' with job_id and attempt fields"
|
||||
- "D-04: OrphanCleanupWorker.Work() iterates orphan rows, deletes S3 object then DB row, logs per-run summary"
|
||||
- "D-06: SlogErrorHandler implements river.ErrorHandler (both HandleError and HandlePanic compile)"
|
||||
- "D-04: ListOrphanFiles sqlc query is generated and returns []ListOrphanFilesRow with {ID, TabloID, S3Key}"
|
||||
- "Unit tests for HeartbeatWorker, OrphanCleanupWorker, and SlogErrorHandler pass"
|
||||
artifacts:
|
||||
- path: "backend/go.mod"
|
||||
|
|
@ -173,8 +173,8 @@ From backend/internal/db/sqlc/files.sql.go (generated pattern — ListOrphanFile
|
|||
- OrphanCleanupWorker struct embeds river.WorkerDefaults[OrphanCleanupArgs]; fields: pool *pgxpool.Pool, store files.FileStorer
|
||||
- NewOrphanCleanupWorker(pool *pgxpool.Pool, store files.FileStorer) *OrphanCleanupWorker constructor
|
||||
- Work(ctx, job): calls sqlc.New(w.pool).ListOrphanFiles(ctx); loops over results; deletes S3 object first (w.store.Delete(ctx, f.S3Key)), then calls sqlc.New(w.pool).DeleteTabloFile(ctx, sqlc.DeleteTabloFileParams{ID: f.ID, TabloID: f.TabloID}) for each; increments deleted/errCount counters; logs slog.Info("orphan cleanup complete", "orphans_found", len(orphans), "deleted", deleted, "errors", errCount); returns nil (does not fail the job for partial errors — logs them)
|
||||
- TestOrphanCleanupWorker_NoOrphans: use mock store + real test DB (skip if no DB), confirm Work returns nil and logs "orphan cleanup complete" with orphans_found=0
|
||||
- TestOrphanCleanupWorker_DeletesOrphan: insert a tablo_files row with nonexistent tablo_id directly (bypassing FK), run Work, confirm the row is gone and s3 delete was called
|
||||
- TestOrphanCleanupWorker_NoOrphans: pure unit test using a mock DB querier (implement a mockQuerier that satisfies the sqlc Querier interface, returning an empty []ListOrphanFilesRow) and a mock FileStorer; confirm Work returns nil
|
||||
- TestOrphanCleanupWorker_DeletesOrphan: pure unit test using a mock DB querier that returns one fake ListOrphanFilesRow{ID: someUUID, TabloID: otherUUID, S3Key: "orphan-key"} and a mock FileStorer that records Delete calls; run Work and assert (a) mock FileStorer.Delete was called with "orphan-key", (b) mock querier.DeleteTabloFile was called with the matching ID and TabloID, (c) Work returns nil. Do NOT use a real test DB for this test — the FK constraint on tablo_files.tablo_id references tablos(id), making a real-DB orphan insert unreliable without complex setup. The mock approach gives deterministic coverage of the loop logic without DB dependency.
|
||||
|
||||
SlogErrorHandler:
|
||||
- SlogErrorHandler struct with no fields
|
||||
|
|
@ -191,7 +191,7 @@ From backend/internal/db/sqlc/files.sql.go (generated pattern — ListOrphanFile
|
|||
|
||||
error_handler.go: Define SlogErrorHandler struct (no fields). Implement HandleError and HandlePanic per the behavior spec. Import "context", "github.com/riverqueue/river", "github.com/riverqueue/river/rivertype", "log/slog".
|
||||
|
||||
For the test files: heartbeat_test.go and error_handler_test.go use pure unit tests (no DB needed). orphan_cleanup_test.go: copy the setupTestDB helper from backend/internal/auth/testdb_test.go (it reads TEST_DATABASE_URL or DATABASE_URL, creates an isolated schema, runs goose migrations, returns pool + cleanup); use a fake in-memory FileStorer mock for the S3 side. Integration test for orphan cleanup inserts a tablo_files row with a uuid that has no matching tablos row (insert directly with no FK enforcement by first creating the file row while tablo exists, then hard-deleting the tablo row using `DELETE FROM tablos WHERE id = $1`; this triggers cascade but that's fine — alternatively, use a nonexistent UUID that was never inserted into tablos, which won't violate the FK constraint since you're inserting into tablo_files with a foreign key — instead, use testDB.Exec("INSERT INTO tablo_files (id, tablo_id, s3_key, filename, content_type, size_bytes) VALUES (gen_random_uuid(), gen_random_uuid(), 'orphan-key', 'orphan.txt', 'text/plain', 0)") to bypass the FK at test time; if the FK check prevents this, wrap in a deferred constraint or skip and only test the no-orphans path).
|
||||
For the test files: heartbeat_test.go and error_handler_test.go use pure unit tests (no DB needed). orphan_cleanup_test.go uses pure unit tests with mock implementations — do NOT use a real test DB. Define a mockQuerier struct in the test file implementing the methods ListOrphanFiles(ctx) and DeleteTabloFile(ctx, params) used by OrphanCleanupWorker.Work; the mock records calls so tests can assert on them. Define a mockFileStorer struct implementing files.FileStorer with a Delete method that records the keys it was called with. TestOrphanCleanupWorker_NoOrphans: mockQuerier returns empty slice, assert Work returns nil. TestOrphanCleanupWorker_DeletesOrphan: mockQuerier returns one row, assert Delete was called with the correct S3 key and DeleteTabloFile was called with the correct params, assert Work returns nil. The verify command is: go test ./internal/jobs/... -run TestOrphanCleanupWorker -v -count=1.
|
||||
|
||||
After writing all files: cd backend && go test ./internal/jobs/...
|
||||
</action>
|
||||
|
|
@ -199,7 +199,7 @@ From backend/internal/db/sqlc/files.sql.go (generated pattern — ListOrphanFile
|
|||
<automated>cd /Users/arthur.belleville/Documents/perso/projects/xtablo-source/backend && go build ./internal/jobs/... && go test ./internal/jobs/... -v -count=1</automated>
|
||||
</verify>
|
||||
<done>
|
||||
go build ./internal/jobs/... exits 0. go test ./internal/jobs/... passes (or skips DB-dependent tests when no DB is available). All three files exist with correct package declarations and exported types.
|
||||
go build ./internal/jobs/... exits 0. go test ./internal/jobs/... passes with no skips required (all tests are pure unit tests with mocks). All three files exist with correct package declarations and exported types.
|
||||
</done>
|
||||
<acceptance_criteria>
|
||||
- `cd backend && go build ./internal/jobs/...` exits 0
|
||||
|
|
@ -210,7 +210,7 @@ From backend/internal/db/sqlc/files.sql.go (generated pattern — ListOrphanFile
|
|||
- `grep "SlogErrorHandler" backend/internal/jobs/error_handler.go` returns a line
|
||||
- `grep "HandleError" backend/internal/jobs/error_handler.go` returns a line
|
||||
- `grep "HandlePanic" backend/internal/jobs/error_handler.go` returns a line
|
||||
- `cd backend && go test ./internal/jobs/... -v -count=1` exits 0 (DB tests may skip, but must not fail)
|
||||
- `cd backend && go test ./internal/jobs/... -v -count=1` exits 0 (all tests pass — no DB required)
|
||||
- `cd backend && go build ./...` exits 0 (whole module compiles)
|
||||
</acceptance_criteria>
|
||||
</task>
|
||||
|
|
@ -238,7 +238,7 @@ From backend/internal/db/sqlc/files.sql.go (generated pattern — ListOrphanFile
|
|||
<verification>
|
||||
After Plan 01 is complete:
|
||||
- `cd backend && go build ./...` exits 0
|
||||
- `cd backend && go test ./internal/jobs/... -v` passes (skips if no DB, but does not error)
|
||||
- `cd backend && go test ./internal/jobs/... -v` passes (all unit tests with mocks — no DB required)
|
||||
- `grep "riverqueue/river" backend/go.mod` shows at least 2 lines (river + riverpgxv5)
|
||||
- `grep "ListOrphanFiles" backend/internal/db/sqlc/files.sql.go` returns at least 1 line
|
||||
- `grep "HeartbeatWorker" backend/internal/jobs/heartbeat.go` returns 1 line
|
||||
|
|
@ -249,7 +249,7 @@ After Plan 01 is complete:
|
|||
- river importable from internal/jobs/ without compile errors
|
||||
- HeartbeatWorker, OrphanCleanupWorker, SlogErrorHandler defined with correct signatures
|
||||
- ListOrphanFiles sqlc query generated with ListOrphanFilesRow type
|
||||
- Unit tests for all three job types pass
|
||||
- Unit tests for all three job types pass (pure mock-based, no DB dependency)
|
||||
- go build ./... exits 0 — whole module compiles clean
|
||||
</success_criteria>
|
||||
|
||||
|
|
|
|||
|
|
@ -17,15 +17,15 @@ requirements:
|
|||
|
||||
must_haves:
|
||||
truths:
|
||||
- "cmd/worker/main.go runs rivermigrate at startup before constructing river.Client"
|
||||
- "cmd/worker/main.go registers HeartbeatWorker and OrphanCleanupWorker via river.AddWorker"
|
||||
- "cmd/worker/main.go passes slog.Default() as river.Config.Logger"
|
||||
- "cmd/worker/main.go starts HeartbeatWorker as a PeriodicJob every 1 minute with RunOnStart: true"
|
||||
- "cmd/worker/main.go starts OrphanCleanupWorker as a PeriodicJob every 1 hour with RunOnStart: false"
|
||||
- "Graceful shutdown calls riverClient.StopAndCancel with a 10-second timeout context"
|
||||
- "D-02: cmd/worker/main.go runs rivermigrate at startup before constructing river.Client"
|
||||
- "D-05: cmd/worker/main.go registers HeartbeatWorker and OrphanCleanupWorker via river.AddWorker"
|
||||
- "D-07: cmd/worker/main.go passes slog.Default() as river.Config.Logger (single river.Client)"
|
||||
- "D-03: cmd/worker/main.go starts HeartbeatWorker as a PeriodicJob every 1 minute with RunOnStart: true (periodic jobs only)"
|
||||
- "D-03: cmd/worker/main.go starts OrphanCleanupWorker as a PeriodicJob every 1 hour with RunOnStart: false"
|
||||
- "D-07: Graceful shutdown calls riverClient.StopAndCancel with a 10-second timeout context"
|
||||
- "go build ./cmd/worker/... exits 0"
|
||||
- "just worker target exists in justfile and passes MinIO dev defaults as env vars"
|
||||
- "README.md documents how to run the worker locally"
|
||||
- "D-08: just worker target exists in justfile; README.md documents single-worker constraint"
|
||||
- "README.md documents how to run the worker locally alongside the web binary"
|
||||
artifacts:
|
||||
- path: "backend/cmd/worker/main.go"
|
||||
provides: "full river wiring — rivermigrate + river.Client + periodic jobs + graceful shutdown"
|
||||
|
|
@ -214,7 +214,7 @@ CRITICAL ORDERING CONSTRAINT (RESEARCH Pitfall 2 + PATTERNS.md):
|
|||
</done>
|
||||
<acceptance_criteria>
|
||||
- `grep "^worker:" backend/justfile` returns exactly 1 line
|
||||
- `grep "S3_ENDPOINT" backend/justfile` returns at least 2 lines (one in dev equivalent path, one in worker target)
|
||||
- `grep -A 20 "^worker:" backend/justfile | grep -c "S3_ENDPOINT"` returns 1 (S3_ENDPOINT confirmed inside the worker target block)
|
||||
- `grep "go run ./cmd/worker" backend/justfile` returns a line inside the worker target
|
||||
- `grep "just worker" backend/README.md` returns at least 1 line
|
||||
- `grep -i "single.*worker\|one worker\|multiple worker" backend/README.md` returns a line (D-08 documented)
|
||||
|
|
|
|||
|
|
@ -56,8 +56,10 @@ Output: Developer confirmation that WORK-01 through WORK-04 are satisfied in a l
|
|||
- `"worker ready"` — confirms binary started without errors
|
||||
- `"worker heartbeat"` — RunOnStart:true fires the heartbeat immediately on first tick; expect this within ~10 seconds
|
||||
|
||||
5. Wait ~60 seconds for a second heartbeat log line — confirms the periodic scheduler is running.
|
||||
Look for: `msg="worker heartbeat" job_id=... attempt=1`
|
||||
5. Wait until you have observed AT LEAST 2 separate `"worker heartbeat"` log lines separated by approximately 60 seconds.
|
||||
The first fires on startup (RunOnStart); the second confirms the PeriodicJob scheduler is running correctly and is not just a one-shot.
|
||||
Look for: `msg="worker heartbeat" job_id=... attempt=1` appearing twice with ~60 seconds between entries.
|
||||
Do not proceed to the next step until both heartbeats have been observed.
|
||||
|
||||
6. Send SIGINT: press Ctrl+C in the worker terminal.
|
||||
7. Confirm these log lines appear in order:
|
||||
|
|
@ -73,7 +75,8 @@ Output: Developer confirmation that WORK-01 through WORK-04 are satisfied in a l
|
|||
8. `cd backend && go test ./... && go build ./...` — must exit 0
|
||||
</how-to-verify>
|
||||
<resume-signal>
|
||||
Type "approved" if all log lines appeared and shutdown was clean.
|
||||
Do not approve until you have observed at least 2 "worker heartbeat" log lines separated by approximately 60 seconds — this confirms the PeriodicJob scheduler is running correctly, not just the RunOnStart fire.
|
||||
Type "approved" once both heartbeats are visible and shutdown was clean.
|
||||
Type "issues: [describe]" if any step failed or was unclear.
|
||||
</resume-signal>
|
||||
</task>
|
||||
|
|
@ -99,7 +102,7 @@ Manual verification only (this plan IS the verification step). Pass criteria:
|
|||
|
||||
1. `"worker ready"` appears in logs within 5 seconds of `just worker`
|
||||
2. `"worker heartbeat"` appears within 10 seconds (RunOnStart fires first heartbeat immediately)
|
||||
3. A second `"worker heartbeat"` appears ~60 seconds later (confirms scheduler is running)
|
||||
3. A second `"worker heartbeat"` appears ~60 seconds later (confirms scheduler is running — REQUIRED before approval)
|
||||
4. Ctrl+C produces `"shutting down"` then `"shutdown complete"` with no panic
|
||||
5. `cd backend && go test ./... && go build ./...` exits 0
|
||||
</verification>
|
||||
|
|
@ -107,6 +110,7 @@ Manual verification only (this plan IS the verification step). Pass criteria:
|
|||
<success_criteria>
|
||||
- Worker starts, connects to Postgres, applies river migrations, and logs "worker ready"
|
||||
- Heartbeat job fires within seconds of startup (RunOnStart: true)
|
||||
- Second heartbeat fires ~60 seconds later confirming the scheduler loop is active
|
||||
- Graceful shutdown works: SIGINT → "shutting down" → "shutdown complete"
|
||||
- Full test suite green: go test ./... && go build ./... exits 0
|
||||
- All WORK-01 through WORK-04 requirements satisfied as observable behaviors
|
||||
|
|
|
|||
|
|
@ -632,17 +632,17 @@ worker-logs:
|
|||
|
||||
---
|
||||
|
||||
## Open Questions
|
||||
## Open Questions (RESOLVED)
|
||||
|
||||
1. **S3 env var names for worker**
|
||||
- What we know: Phase 5 wired S3 env vars in `cmd/web`. Worker needs the same vars.
|
||||
- What's unclear: Whether the planner reuses the same env var names or introduces `WORKER_S3_*` variants.
|
||||
- Recommendation: Reuse the same env var names (`S3_ENDPOINT`, `S3_BUCKET`, etc.) — both binaries share the same `.env.example` pattern.
|
||||
- RESOLVED: Reuse the same env var names (`S3_ENDPOINT`, `S3_BUCKET`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`, `S3_REGION`, `S3_USE_PATH_STYLE`) — both binaries share the same `.env.example` and compose.yaml MinIO defaults.
|
||||
|
||||
2. **DeleteTabloFile query takes (id, tablo_id) — orphan rows have no owning tablo**
|
||||
- What we know: `DeleteTabloFile` has signature `WHERE id = $1 AND tablo_id = $2`.
|
||||
- What's unclear: For a true orphan (tablo deleted bypassing cascade), the `tablo_id` column still holds the old UUID even though the tablo row is gone. `$2` matches the stored value, so the DELETE still works.
|
||||
- Recommendation: No change needed — the query works correctly. Planner should note this in task comments.
|
||||
- RESOLVED: No change needed — the query works correctly for orphan rows. The `tablo_id` FK column retains the old UUID value; passing it as `$2` to `DeleteTabloFile` succeeds. Both fields (`ID`, `TabloID`) are available from `ListOrphanFilesRow`.
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
Loading…
Reference in a new issue