Autonomous Coding Agency — Architecture Design

Overview

The autonomous coding agency is a system that runs 24/7: it picks up planned feature requests (FRs), implements them through a multi-agent pipeline, and delivers pull requests (PRs) with minimal human intervention.

                         ┌─────────────────────────────────────────┐
                         │            ORCHESTRATOR (FR-056)         │
                         │  Main loop: poll FRs → dispatch → track │
                         └────────────────┬────────────────────────┘
                                          │
                    ┌─────────────────────┼─────────────────────┐
                    │                     │                     │
             ┌──────▼──────┐    ┌────────▼────────┐   ┌───────▼───────┐
             │  COMPLEXITY  │    │   JOB REGISTRY   │   │   GIT WORKFLOW │
             │   ROUTING    │    │  & PRIORITY QUEUE │   │    (FR-058)   │
             │  (FR-045)    │    │    (FR-046)       │   │ branch/PR/merge│
             └──────┬──────┘    └────────┬────────┘   └───────┬───────┘
                    │                     │                     │
                    └─────────────────────┼─────────────────────┘
                                          │
                         ┌────────────────▼────────────────────┐
                         │         AGENT PIPELINE               │
                         │                                      │
                         │   ┌──────────┐    ┌──────────────┐  │
                         │   │ PLANNER  │───▶│   CODER      │  │
                         │   │ (FR-043) │    │   (FR-043)   │  │
                         │   └──────────┘    └──────┬───────┘  │
                         │                          │          │
                         │                   ┌──────▼───────┐  │
                         │                   │   TESTER     │  │
                         │                   │   (FR-043)   │  │
                         │                   └──────┬───────┘  │
                         │                          │          │
                         │                   ┌──────▼───────┐  │
                         │                   │  REVIEWER    │  │
                         │          ┌───fix──│  (FR-057)    │  │
                         │          │        └──────┬───────┘  │
                         │          ▼               │ pass     │
                         │       CODER              ▼          │
                         │                                      │
                         └──────────────────────────────────────┘
                                          │
                    ┌─────────────────────┼─────────────────────┐
                    │                     │                     │
             ┌──────▼──────┐    ┌────────▼────────┐   ┌───────▼───────┐
             │    LLM       │    │   SANDBOXED     │   │   GIT WORKFLOW │
             │  PROVIDER    │    │   EXECUTION     │   │    (FR-058)   │
             │  (FR-054)    │    │   (FR-055)      │   │   create PR   │
             └─────────────┘    └─────────────────┘   └───────────────┘

How Existing FRs Map to This Architecture

Layer           FR                                     Role                                Status
Foundation      FR-009 Python Project Scaffold         Package structure, deps, tooling    new
Foundation      FR-010 Testing Infrastructure          Test framework for all modules      new
Infrastructure  FR-054 LLM Provider Abstraction        Model-agnostic LLM calls            new (this batch)
Infrastructure  FR-055 Sandboxed Code Execution        Isolated agent environments         new (this batch)
Infrastructure  FR-018 GitHub Integration              API access for PRs, issues          in-progress
Orchestration   FR-045 Complexity Routing              Score FRs, pick execution strategy  new
Orchestration   FR-046 Job Registry & Priority Queue   Track and prioritize all work       new
Orchestration   FR-056 Autonomous Coding Orchestrator  Main 24/7 loop                      new (this batch)
Agents          FR-043 Custom Agents                   Planner, Coder, Tester definitions  new
Quality         FR-057 Code Review Pipeline            Automated review + fix loop         new (this batch)
Git             FR-058 Agent Git Workflow              Branch, commit, PR automation       new (this batch)
Deployment      FR-019 VPS Deployment                  Server to run the agency            new

New FRs Filling the Gaps

The existing FRs cover orchestration logic (FR-045, FR-046) and agent definitions (FR-043), but they left five gaps. Each new FR below is listed with the gap it fills:

  1. FR-054 — LLM Provider Abstraction: No way for agents to call LLMs without vendor lock-in.
  2. FR-055 — Sandboxed Code Execution: No safe environment for agents to run code/tests.
  3. FR-056 — Autonomous Coding Orchestrator: No main process tying everything together.
  4. FR-057 — Code Review Pipeline: No automated quality feedback loop.
  5. FR-058 — Agent Git Workflow: No automated branch/PR/merge management.

Build Order / Critical Path

Phase A: Foundation (must come first)
  FR-009 Python Project Scaffold
  FR-010 Testing Infrastructure

Phase B: Infrastructure (parallel, after Phase A)
  FR-054 LLM Provider Abstraction    ←── all agents need this
  FR-055 Sandboxed Code Execution    ←── agents need safe environments
  FR-058 Agent Git Workflow           ←── agents need branch management

Phase C: Agent Layer (after Phase B)
  FR-043 Custom Agents               ←── define planner, coder, tester, reviewer
  FR-057 Code Review Pipeline         ←── reviewer agent + feedback loop

Phase D: Orchestration (after Phase C)
  FR-045 Complexity Routing           ←── scoring + tier selection
  FR-046 Job Registry & Priority Queue ←── job tracking + dispatch

Phase E: Integration (after Phase D)
  FR-056 Autonomous Coding Orchestrator ←── main loop wiring everything together

Phase F: Deployment (after Phase E)
  FR-019 VPS Deployment               ←── run 24/7 on a server

Critical path: FR-009 → FR-054 → FR-043 → FR-045/FR-046 → FR-056
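The phase ordering above is a small dependency graph, so it can be checked mechanically. The sketch below encodes the phases as listed and uses the standard library's `graphlib.TopologicalSorter` to derive a valid build order. The names `PHASE_DEPS`, `PHASE_FRS`, `build_order`, and `ready` are illustrative, and treating a phase as startable only once every FR in its prerequisite phase is done is an assumption about how strictly the phases gate each other.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases that must finish before it starts.
PHASE_DEPS = {
    "A": set(),
    "B": {"A"},
    "C": {"B"},
    "D": {"C"},
    "E": {"D"},
    "F": {"E"},
}

# FRs per phase, copied from the build order above.
PHASE_FRS = {
    "A": ["FR-009", "FR-010"],
    "B": ["FR-054", "FR-055", "FR-058"],
    "C": ["FR-043", "FR-057"],
    "D": ["FR-045", "FR-046"],
    "E": ["FR-056"],
    "F": ["FR-019"],
}


def build_order():
    """Flatten the phases into one FR sequence that respects all dependencies."""
    order = []
    for phase in TopologicalSorter(PHASE_DEPS).static_order():
        order.extend(PHASE_FRS[phase])
    return order


def ready(done):
    """FRs that can start now, given the set of FR ids already finished."""
    out = []
    for phase, deps in PHASE_DEPS.items():
        # Assumption: a phase opens only when every FR of each prerequisite phase is done.
        if all(fr in done for dep in deps for fr in PHASE_FRS[dep]):
            out.extend(fr for fr in PHASE_FRS[phase] if fr not in done)
    return out
```

For example, `ready({"FR-009", "FR-010"})` returns the three Phase B FRs, which matches the note that Phase B items can be built in parallel.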

Key Principles

  • Model-agnostic: All LLM calls go through FR-054’s abstraction layer. No direct Claude API calls in agent code. Provider can be swapped via configuration.
  • Isolation: Each agent runs in its own git worktree (FR-055). No shared mutable state between agents.
  • Human-in-the-loop by default: Phase 1 of FR-056 requires human approval for PRs. Auto-merge is a later phase with confidence thresholds.
  • Incremental delivery: Each FR is independently useful. FR-054 enables manual multi-model usage even without the full orchestrator.
  • Vault as source of truth: FRs, agent definitions, and review rules all live in the vault. The orchestrator reads from the vault, not from hardcoded config.
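To make the model-agnostic principle concrete, here is a minimal sketch of what a provider abstraction could look like. It is not FR-054's actual interface: `LLMProvider`, `complete`, the `PROVIDERS` registry, the `"provider"` config key, and both stand-in providers are assumptions made for illustration. The stand-ins work offline so the example runs without any vendor SDK.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Hypothetical interface that agent code depends on instead of a vendor SDK."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Offline stand-in provider: returns the prompt with a prefix."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperProvider:
    """Second stand-in provider, to show that swapping needs no agent changes."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


# Assumption: providers are registered by name and selected via configuration.
PROVIDERS = {"echo": EchoProvider, "upper": UpperProvider}


def provider_from_config(config: dict) -> LLMProvider:
    name = config["provider"]
    if name not in PROVIDERS:
        raise ValueError(f"unknown LLM provider: {name!r}")
    return PROVIDERS[name]()


def plan_feature(llm: LLMProvider, fr_text: str) -> str:
    """Agent code: talks only to the abstraction, never to a concrete vendor."""
    return llm.complete(f"Plan: {fr_text}")
```

Swapping the provider is then a configuration change: `plan_feature(provider_from_config({"provider": "upper"}), "add login")` uses a different backend without touching `plan_feature`.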

Data Flow: FR → PR

  1. User marks FR as planned with phases defined
  2. Orchestrator (FR-056) polls vault/10_features/03_planned/ for work
  3. Complexity router (FR-045) scores the FR, selects execution strategy
  4. Job registry (FR-046) creates a job, assigns priority
  5. Orchestrator dispatches to agent pipeline:
    • Planner reads FR spec, outputs implementation plan
    • Coder executes plan in sandboxed worktree (FR-055), using LLM abstraction (FR-054)
    • Tester runs test suite in sandbox
    • Reviewer (FR-057) reads diff, produces structured feedback
    • If issues found → Coder fixes → Tester → Reviewer (loop until pass)
  6. Git workflow (FR-058) creates PR with description from FR + changes
  7. Human reviews PR (or auto-merge if confidence threshold met)
  8. FR status updated to done
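Step 5 of the flow above, including the fix loop, might be sketched as follows. The agents are passed in as plain callables so the control flow can be read and tested in isolation. `Review`, `run_pipeline`, the status strings, and the `max_rounds` cap are assumptions: the flow above says "loop until pass", and the cap is added here only so the sketch cannot loop forever.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    """Hypothetical structured feedback produced by the reviewer."""
    passed: bool
    issues: list = field(default_factory=list)


def run_pipeline(fr_spec, planner, coder, tester, reviewer, max_rounds=3):
    """Planner -> Coder -> Tester -> Reviewer, looping back to Coder on problems."""
    plan = planner(fr_spec)             # planner reads the FR spec once
    feedback = []                       # issues the coder should address next round
    for round_no in range(1, max_rounds + 1):
        diff = coder(plan, feedback)    # coder implements the plan or applies fixes
        if not tester(diff):            # failing tests go straight back to the coder
            feedback = ["tests failed"]
            continue
        review = reviewer(diff)
        if review.passed:               # pass: hand over to the git workflow for a PR
            return {"status": "ready_for_pr", "diff": diff, "rounds": round_no}
        feedback = review.issues        # fix: loop again with the reviewer's issues
    # Assumption: after max_rounds the job is escalated to a human instead of retried.
    return {"status": "needs_human", "diff": None, "rounds": max_rounds}
```

With a stub coder that only gets it right after receiving feedback, the pipeline returns `ready_for_pr` on the second round. Escalating to a human once the cap is reached is in keeping with the human-in-the-loop principle.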

Open Questions

References

  • FR-045 — Complexity Routing + Go Command
  • FR-046 — Job Registry & Priority Queue
  • FR-043 — Custom Agents
  • FR-056 — Autonomous Coding Orchestrator
  • Nexie (Sven Hennig) — inspiration for complexity routing and job management