Pre-Implementation Blueprint
1. Document Role
This document translates the research-oriented specification into an implementation-ready build plan. It exists to reduce ambiguity before coding begins.
Implementation note:
- the repository now includes a working MVP
- this blueprint remains useful as the intended design baseline and gap-analysis reference
It focuses on:
- what must be fixed before implementation
- what v1 should contain
- what boundaries must remain clean
- what delivery order will reduce rework
2. Implementation Principles
The first implementation should optimize for:
- research usefulness over production polish
- reproducibility over raw throughput
- modularity over premature abstraction
- debuggability over hidden automation
- clear data contracts over convenience shortcuts
3. v1 Scope
3.1 In scope
- one local-user research workflow
- one FastAPI backend
- one simple HTML frontend
- one Stable Diffusion or SDXL wrapper via Diffusers
- one persistent store for experiments, sessions, rounds, and feedback
- one replay and export workflow
- low-dimensional steering mode
- fixed-per-round seed mode
- three samplers
- three feedback modes
- three updaters
3.2 Out of scope
- multi-user collaboration
- cloud-scale operations
- production authentication
- distributed job scheduling
- model fine-tuning
- advanced GPU optimization
- enterprise security controls
4. Assumptions to Lock Early
The project should proceed with these default assumptions unless deliberately changed:
- single-machine execution
- one active interactive session at a time in v1
- SQLite-backed local persistence for the current MVP, with PostgreSQL still a plausible next step
- filesystem storage for images and exports
- replay correctness is higher priority than generation speed
- prompts and critique text are stored as research data
- mock generation is available for tests
- the normal app runtime uses real GPU-backed Diffusers inference
- trace logging is a first-class debugging and auditability feature
5. Decisions That Should Be Fixed Before Coding
5.1 Default model baseline
Options:
- Stable Diffusion 1.5 or similar lightweight baseline
- SDXL baseline
Recommendation:
- start with the lighter baseline to reduce development and test friction
5.2 Default steering basis
Options:
- random orthonormal basis
- PCA basis
- prompt-difference basis
Recommendation:
- start with random orthonormal basis for simplicity and deterministic validation
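A random orthonormal basis can be built by QR-decomposing a seeded Gaussian draw. This is a minimal sketch, assuming the basis is represented as a matrix whose columns span the steering subspace; the function name and shapes are illustrative, not a fixed API.

```python
import numpy as np

def make_random_orthonormal_basis(latent_dim: int, steer_dim: int, seed: int) -> np.ndarray:
    """Return a (latent_dim, steer_dim) matrix with orthonormal columns.

    The fixed seed makes the basis reproducible, which matters for
    deterministic replay validation.
    """
    rng = np.random.default_rng(seed)
    gaussian = rng.standard_normal((latent_dim, steer_dim))
    # Reduced QR orthonormalizes the columns of the Gaussian draw.
    q, _ = np.linalg.qr(gaussian)
    return q
```

Because the seed is part of the basis configuration, the same basis can be regenerated at replay time instead of being persisted as a tensor.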
5.3 Default feedback mode
Options:
- scalar rating
- pairwise comparison
- top-k ranking
Recommendation:
- start with scalar rating for the simplest complete end-to-end flow
5.4 Default updater
Options:
- winner-copy
- winner-average
- linear preference updater
Recommendation:
- start with winner-average as the reference behavior
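One plausible reading of winner-average is interpolation between the incumbent steering state and the winning candidate's steering vector. The sketch below assumes that semantics; the function name and the blend parameter are illustrative.

```python
def winner_average_update(incumbent: list[float], winner: list[float], blend: float = 0.5) -> list[float]:
    """Move the steering state toward the winning candidate.

    blend=0.5 gives the plain midpoint; blend=1.0 reproduces
    winner-copy behavior, and blend=0.0 keeps the incumbent unchanged.
    """
    if len(incumbent) != len(winner):
        raise ValueError("steering vectors must have the same dimension")
    return [(1.0 - blend) * a + blend * b for a, b in zip(incumbent, winner)]
```

Expressing all three updaters on this shared signature keeps them swappable behind one interface.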
6. System Boundaries
6.1 Frontend boundary
Frontend responsibilities:
- collect prompt and configuration
- request round generation
- display candidates and metadata
- collect feedback
- show session progress and errors
Frontend non-responsibilities:
- preference inference
- steering calculations
- persistence rules
- replay reconstruction logic
6.2 Backend boundary
Backend responsibilities:
- prompt encoding
- steering state management
- candidate generation
- image rendering
- feedback normalization
- preference update logic
- persistence
- replay and export
6.3 Storage boundary
Storage must persist:
- experiment configs
- session state
- rounds
- candidates
- feedback events
- generated image paths
- trace event logs
- exports and manifests
Storage should avoid:
- arbitrary temporary tensors unless they are explicitly required for replay
7. Stable Contracts to Define Before Coding
7.1 Session contract
Every session state should contain:
- session ID
- experiment configuration snapshot
- prompt and negative prompt
- basis configuration
- current steering state
- current round index
- incumbent reference
- status
7.2 Candidate contract
Every candidate should contain:
- candidate ID
- round ID
- steering vector
- seed
- sampler role
- generation parameters
- image location
- predicted score if available
- predicted uncertainty if available
- render status
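The candidate contract can be pinned down as a frozen dataclass before any endpoint exists. Field names and types below are a sketch of one possible shape, not the agreed schema; the sampler-role comment is an assumption.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass(frozen=True)
class Candidate:
    candidate_id: str
    round_id: str
    steering_vector: tuple[float, ...]
    seed: int
    sampler_role: str              # e.g. "explore" vs "exploit"; roles are an assumption
    generation_params: dict
    image_path: Optional[str] = None
    predicted_score: Optional[float] = None
    predicted_uncertainty: Optional[float] = None
    render_status: str = "pending"
```

Freezing the dataclass enforces the rule that candidate records are immutable facts once persisted; `asdict` gives the serialization path for storage and API responses.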
7.3 Feedback contract
Every normalized feedback object should contain:
- feedback type
- involved candidate IDs
- ordered preference information if derivable
- raw payload
- optional critique text
- timestamp
7.4 Replay contract
Every replay export should contain enough information to reconstruct:
- experiment configuration
- session setup
- round order
- candidate metadata
- image references
- feedback events
- updater outputs
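One way to make the replay contract concrete is a single JSON manifest whose keys mirror the list above. The builder below is a sketch; the session dict shape and field names are assumptions, and the real export would pull these values from storage.

```python
import json

REPLAY_KEYS = (
    "experiment_config",
    "session_setup",
    "rounds",            # ordered list: candidate metadata, seeds, image references
    "feedback_events",
    "updater_outputs",
)

def build_replay_manifest(session: dict) -> str:
    """Serialize everything replay needs into one JSON string.

    Sorted keys and stable indentation keep manifests diffable across
    exports of the same session.
    """
    manifest = {key: session[key] for key in REPLAY_KEYS}
    return json.dumps(manifest, indent=2, sort_keys=True)
```

A missing key raises immediately, which is the desired failure mode: an export that cannot satisfy the replay contract should never be written.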
8. Delivery Order
Implementation should proceed in this order:
- define schemas and configuration models
- implement persistence and repositories
- implement prompt encoding and steering basis logic
- implement sampler interface and one baseline sampler
- implement generation wrapper
- implement orchestration
- implement feedback normalization
- implement updater interface and one baseline updater
- expose API endpoints
- build the minimal frontend
- add replay and export
- add remaining strategies
- harden deterministic replay
9. Minimal API Decisions
These rules should be fixed before coding:
- all write endpoints return persisted identifiers
- round-generation responses include candidate metadata and image references
- feedback submission returns updated incumbent state and summary
- errors return structured codes and readable messages
- session configuration snapshots are immutable after creation
10. Non-Functional Requirements
10.1 Reproducibility
The system must:
- persist full config snapshots
- persist seeds for all candidates
- version strategy implementations
- support deterministic replay with a controlled backend
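One deterministic-seeding scheme, sketched under the assumption that a single session seed is persisted: derive each candidate's seed by hashing its coordinates rather than consuming a shared RNG stream. The function name and scheme are illustrative.

```python
import hashlib

def candidate_seed(session_seed: int, round_index: int, candidate_index: int) -> int:
    """Derive a stable per-candidate seed from the persisted session seed.

    Hash-based derivation means replay can regenerate any single
    candidate in isolation, without rerunning earlier rounds in order.
    """
    key = f"{session_seed}:{round_index}:{candidate_index}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
```

Persisting only the session seed then suffices for the fixed-per-round seed mode, while the derived per-candidate seeds are still stored explicitly per the candidate contract.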
10.2 Debuggability
The system must:
- log round lifecycle events
- log request-level backend traces
- capture frontend interaction traces
- log per-candidate generation failures
- preserve raw feedback payloads
- allow inspection of session state from storage
10.3 Modularity
The system must:
- allow sampler swapping by configuration
- allow updater swapping by configuration
- keep generation logic separate from orchestration logic
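Sampler and updater swapping by configuration can be as simple as a name-to-callable registry that orchestration resolves at session start. This is a sketch; the registry shape, decorator, and sampler signature are all assumptions.

```python
from typing import Callable

SAMPLERS: dict[str, Callable] = {}

def register_sampler(name: str):
    """Decorator registering a sampler under a config-selectable name."""
    def wrap(fn: Callable) -> Callable:
        SAMPLERS[name] = fn
        return fn
    return wrap

@register_sampler("baseline")
def baseline_sampler(state, n_candidates):
    # Placeholder: a real sampler would perturb the steering state.
    return [state] * n_candidates

def make_sampler(config: dict) -> Callable:
    try:
        return SAMPLERS[config["sampler"]]
    except KeyError:
        raise ValueError(f"unknown sampler: {config['sampler']!r}") from None
```

An identical registry for updaters keeps both strategy families behind configuration, and the lookup failure surfaces as a structured error instead of a silent default.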
11. Design Risks to Address Early
11.1 Hidden coupling
Risk:
- sampler, updater, and persistence logic become entangled
Mitigation:
- define interfaces early and centralize orchestration
11.2 Replay drift
Risk:
- missing config or seed data breaks replay trust
Mitigation:
- treat replay metadata as first-class persisted state
11.3 Generation cost
Risk:
- development slows dramatically if every test requires real generation
Mitigation:
- define a mock generator from day one
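A day-one mock generator only needs to be deterministic and cheap. The sketch below stands in for the Diffusers wrapper in tests; the class name and the `generate(prompt, seed, **params)` signature are assumed shapes, not the final wrapper API.

```python
import hashlib

class MockGenerator:
    """Deterministic stand-in for real GPU-backed generation.

    Produces fake image bytes from prompt, seed, and parameters, so
    orchestration, persistence, and replay tests never touch a GPU
    while still exercising the seed-to-output contract.
    """
    def generate(self, prompt: str, seed: int, **params) -> bytes:
        key = f"{prompt}|{seed}|{sorted(params.items())}".encode()
        digest = hashlib.sha256(key).digest()
        # Repeat the digest to the size of a small fake image payload.
        return digest * 4
```

Because the same (prompt, seed, params) triple always yields the same bytes, deterministic-replay tests can assert byte equality end to end before real inference is wired in.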
11.4 UI overreach
Risk:
- supporting every mode too early makes the frontend brittle
Mitigation:
- build one simple, switchable feedback component first
12. Definition of Implementation Readiness
The project is ready to implement when:
- the baseline model is fixed
- the baseline steering basis is fixed
- the baseline sampler, feedback, and updater set is fixed
- the API payload schemas are agreed
- the persistence model is agreed
- replay requirements are agreed
- acceptance criteria are accepted from the test specification
13. Delivery Milestones
Recommended milestones:
- schema and storage foundation
- single-round generation path
- full session lifecycle
- replay and export
- logging, tracing, and diagnostics hardening
- strategy plug-in expansion
- test hardening and polish
14. Summary
This blueprint is the engineering handoff document. It exists to ensure that implementation starts from fixed assumptions, explicit contracts, and a realistic v1 boundary rather than from a broad but underspecified research description.