Proposal: Restructure test section into 5 categories

What

Replace the current 4-section test layout in the feature-request template with 5 categories ordered from high-level to low-level.

Why

The current structure mixes deterministic and non-deterministic tests in the E2E section. Tests whose results can only be verified by AI judgment need a separate category. Ordering the categories top-down (user-visible behavior first, implementation details last) also makes the FR more readable as a spec.

Spec

Replace lines 128–148 of vault/_templates/feature-request.md with:

## Test
 
### Manual tests
| Test | Expected | Actual | Last |
|------|----------|--------|------|
| ...  | ...      | pending | -   |
 
### AI-verified tests
| Scenario | Expected behavior | Verification method |
|----------|-------------------|---------------------|
| ...      | ...               | ...                 |
 
### E2E tests — `tests/test_<name>_e2e.py`
| Scenario | Assertion |
|----------|-----------|
| ...      | ...       |
 
### Integration tests — `tests/test_<name>.py::TestIntegration*`
| Component | Coverage |
|-----------|----------|
| ...       | ...      |
 
### Unit tests — `tests/test_<name>.py`
| Component | Tests | Coverage |
|-----------|-------|----------|
| ...       | ...   | ...      |

Category definitions

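| # | Category    | Scope               | Verification                             | CI-safe |
|---|-------------|---------------------|------------------------------------------|---------|
| 1 | Manual      | Any                 | Human judgment                           | No      |
| 2 | AI-verified | Full system         | AI/LLM judges output                     | No      |
| 3 | E2E         | Full system         | Deterministic (exit codes, string match) | Yes     |
| 4 | Integration | Components together | Deterministic                            | Yes     |
| 5 | Unit        | Isolated function   | Deterministic                            | Yes     |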
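
As an illustration only, the category definitions above could be encoded as data so that tooling can pick out the CI-safe categories. This is a minimal sketch, not part of the proposal: the `Category` dataclass, the `CATEGORIES` tuple, and `ci_safe_categories` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """One row of the category definitions (hypothetical representation)."""
    order: int
    name: str
    scope: str
    verification: str
    ci_safe: bool


# Values copied from the category definitions, ordered high-level to low-level.
CATEGORIES = (
    Category(1, "Manual", "Any", "Human judgment", False),
    Category(2, "AI-verified", "Full system", "AI/LLM judges output", False),
    Category(3, "E2E", "Full system", "Deterministic (exit codes, string match)", True),
    Category(4, "Integration", "Components together", "Deterministic", True),
    Category(5, "Unit", "Isolated function", "Deterministic", True),
)


def ci_safe_categories(categories=CATEGORIES):
    """Return the names of categories that can run unattended in CI."""
    return [c.name for c in categories if c.ci_safe]
```

Calling `ci_safe_categories()` on this data yields the three deterministic categories (E2E, Integration, Unit), matching the CI-safe column.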

Key distinction

  • E2E and AI-verified tests both exercise the full system. The difference is how the result is checked: an E2E test makes a programmatic assertion (exit code == 2, file exists, string contains X), while an AI-verified test requires judgment to decide whether the output is correct.
  • Categories 3–5 are deterministic and CI-safe. Categories 1–2 are not.
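
To make the distinction concrete, here is a hedged sketch of what a deterministic E2E test might look like. The system under test is not specified in this proposal, so `FAKE_CLI` is a hypothetical stand-in: a one-line script that prints an error and exits with code 2. `run_cli` and the test name are likewise assumptions for illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for the real system under test: prints an error
# message to stderr and exits with code 2.
FAKE_CLI = (
    "import sys; "
    "print('error: missing argument', file=sys.stderr); "
    "sys.exit(2)"
)


def run_cli():
    """Run the stand-in CLI as a separate process and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", FAKE_CLI],
        capture_output=True,
        text=True,
    )


def test_missing_argument_exits_with_2():
    result = run_cli()
    # Programmatic assertions: no judgment is needed to decide pass or fail.
    assert result.returncode == 2
    assert "missing argument" in result.stderr
```

A test like this belongs in the E2E category because both checks are mechanical. If deciding whether the output is acceptable required reading and interpreting it, the scenario would belong in the AI-verified table instead.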

Impact

  • vault/_templates/feature-request.md — test section restructured
  • All existing FRs — can adopt incrementally (no breaking change)
  • FR-002 specifically: its 2 behavioral E2E tests move to AI-verified