Proposal: Restructure test section into 5 categories
What
Replace the current 4-section test layout in the feature-request template with 5 categories ordered from high-level to low-level.
Why
The current structure mixes deterministic and non-deterministic tests in the E2E section. Some tests require AI judgment to verify results — these need a separate category. Ordering top-down (user-visible behavior first, implementation details last) makes the FR more readable as a spec.
Spec
Replace lines 128–148 of vault/_templates/feature-request.md with:
## Test
### Manual tests
| Test | Expected | Actual | Last |
|------|----------|--------|------|
| ... | ... | pending | - |
### AI-verified tests
| Scenario | Expected behavior | Verification method |
|----------|-------------------|---------------------|
| ... | ... | ... |
### E2E tests — `tests/test_<name>_e2e.py`
| Scenario | Assertion |
|----------|-----------|
| ... | ... |
### Integration tests — `tests/test_<name>.py::TestIntegration*`
| Component | Coverage |
|-----------|----------|
| ... | ... |
### Unit tests — `tests/test_<name>.py`
| Component | Tests | Coverage |
|-----------|-------|----------|
| ... | ... | ... |
Category definitions
| # | Category | Scope | Verification | CI-safe |
|---|---|---|---|---|
| 1 | Manual | Any | Human judgment | No |
| 2 | AI-verified | Full system | AI/LLM judges output | No |
| 3 | E2E | Full system | Deterministic (exit codes, string match) | Yes |
| 4 | Integration | Components together | Deterministic | Yes |
| 5 | Unit | Isolated function | Deterministic | Yes |
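As a rough illustration, the table above could be encoded as data so that tooling can pick out the CI-safe categories. This is only a sketch: the `Category` dataclass and the `ci_safe_categories` helper are hypothetical names, not part of the template or of any existing tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """One row of the category definitions table (hypothetical structure)."""
    order: int
    name: str
    scope: str
    verification: str
    ci_safe: bool


# Values copied from the category definitions table.
CATEGORIES = (
    Category(1, "Manual", "Any", "Human judgment", False),
    Category(2, "AI-verified", "Full system", "AI/LLM judges output", False),
    Category(3, "E2E", "Full system", "Deterministic (exit codes, string match)", True),
    Category(4, "Integration", "Components together", "Deterministic", True),
    Category(5, "Unit", "Isolated function", "Deterministic", True),
)


def ci_safe_categories(categories=CATEGORIES):
    """Return the names of categories that can run unattended in CI, in table order."""
    return [c.name for c in categories if c.ci_safe]
```

Calling `ci_safe_categories()` yields the names of categories 3 to 5, which matches the split described in this proposal.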
Key distinction
- E2E and AI-verified both test the full system. The difference: E2E has a programmatic assertion (`exit code == 2`, file exists, string contains X). AI-verified requires judgment to interpret whether the output is correct.
- Categories 3–5 are deterministic and CI-safe. Categories 1–2 are not.
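To make the distinction concrete, here is a minimal sketch of what a deterministic E2E test might look like under pytest. The `FAKE_CLI` script is a hypothetical stand-in for the real system under test, and `run_cli` and `test_missing_input_e2e` are illustrative names only; an actual FR would invoke its own entry point.

```python
import subprocess
import sys

# Hypothetical stand-in for the system under test: a one-line script that
# reports an error on stderr and exits with code 2.
FAKE_CLI = "import sys; sys.stderr.write('error: missing input\\n'); sys.exit(2)"


def run_cli():
    """Run the stand-in CLI as a separate process and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", FAKE_CLI],
        capture_output=True,
        text=True,
    )


def test_missing_input_e2e():
    result = run_cli()
    # Both checks are programmatic, so this test belongs in category 3 (E2E).
    assert result.returncode == 2
    assert "missing input" in result.stderr
```

A test whose pass/fail decision cannot be reduced to assertions like these (for example, "the summary reads naturally") would fall under AI-verified instead.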
Impact
- vault/_templates/feature-request.md: test section restructured
- All existing FRs: can adopt incrementally (no breaking change)
- FR-002 specifically: its 2 behavioral E2E tests move to AI-verified