Three coding agents, one DAG, zero rate-limit cliffs.

A VS Code extension that decomposes natural-language project requests into a typed DAG, scores each task against agent capabilities, and runs Claude, Copilot, and Codex in parallel.

PackAI
Period: Apr 2025 – Jun 2025
Role: Designer & sole engineer

Any single AI coding agent is bottlenecked by two things: rate limits and capability mismatch. Ask the wrong model for the wrong task and you burn quota for a worse answer. PackAI treats the three big assistants (Claude, Copilot, and Codex) as interchangeable workers behind a planner, and routes each unit of work to whichever one scores highest for it.

How it works

  1. Intent analysis. The user describes a project in natural language. An NLP layer extracts project type, target stack, feature list, and a complexity estimate.
  2. DAG planning. The planner expands the intent into a phased task graph (“scaffold → schema → routes → tests → review”), with explicit dependencies so independent tasks can run concurrently.
  3. Capability scoring. Each task is scored against each agent on signals like task type, language, complexity, and recent benchmark behavior. The highest score wins.
  4. Parallel execution. Independent tasks dispatch simultaneously. A rate-limit queue catches throttled agents and re-routes their work to a peer.
  5. Conflict resolution. When two agents touch the same file, the orchestrator stages outputs, runs validation gates, and merges with deterministic precedence rules.
  6. Validation gates. Every output passes through syntax, import-resolution, basic security, and style checks before it lands on disk.
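Steps 3 and 4 can be sketched roughly as a weighted score plus a fallback to the next-best agent when the winner is throttled. This is a minimal illustration, not PackAI's actual code: the type names, the three signals, the weights, and the profile numbers are all assumptions made up for the example.

```typescript
// Hypothetical types: none of these names or numbers come from PackAI itself.
type AgentId = "claude" | "copilot" | "codex";
type TaskKind = "scaffold" | "schema" | "routes" | "tests" | "review";

interface Task {
  id: string;
  kind: TaskKind;
  language: string;
  complexity: number; // 0 (trivial) to 1 (hard)
}

interface AgentProfile {
  id: AgentId;
  kindAffinity: Partial<Record<TaskKind, number>>; // 0..1 per task type
  languageAffinity: Partial<Record<string, number>>; // 0..1 per language
  complexityCeiling: number; // complexity the agent handles comfortably
}

interface Weights { kind: number; language: number; complexity: number; }

// Assumed defaults; a user override simply passes a different Weights object.
const DEFAULT_WEIGHTS: Weights = { kind: 0.5, language: 0.3, complexity: 0.2 };

function scoreTask(task: Task, agent: AgentProfile, w: Weights = DEFAULT_WEIGHTS): number {
  const kind = agent.kindAffinity[task.kind] ?? 0;
  const language = agent.languageAffinity[task.language] ?? 0;
  // Full marks at or below the agent's ceiling; the overshoot is subtracted otherwise.
  const overshoot = Math.max(0, task.complexity - agent.complexityCeiling);
  const complexity = Math.max(0, 1 - overshoot);
  return w.kind * kind + w.language * language + w.complexity * complexity;
}

// Rank agents best-first; ties are broken by id so routing stays deterministic.
function rankAgents(task: Task, agents: AgentProfile[], w?: Weights): AgentId[] {
  return agents
    .map((a) => ({ id: a.id, score: scoreTask(task, a, w) }))
    .sort((x, y) => y.score - x.score || x.id.localeCompare(y.id))
    .map((r) => r.id);
}

// Pick the highest-ranked agent that is not currently rate-limited.
function route(
  task: Task,
  agents: AgentProfile[],
  throttled: ReadonlySet<AgentId>,
  w?: Weights,
): AgentId | undefined {
  return rankAgents(task, agents, w).find((id) => !throttled.has(id));
}

// Invented profile numbers, purely for illustration (not measurements).
const PROFILES: AgentProfile[] = [
  { id: "claude", kindAffinity: { routes: 0.8 }, languageAffinity: { typescript: 0.9, swift: 0.6 }, complexityCeiling: 0.9 },
  { id: "codex", kindAffinity: { routes: 0.6 }, languageAffinity: { typescript: 0.8, swift: 0.9 }, complexityCeiling: 0.7 },
  { id: "copilot", kindAffinity: { routes: 0.6 }, languageAffinity: { typescript: 0.7 }, complexityCeiling: 0.5 },
];
```

Returning `undefined` when every agent is throttled leaves the caller free to queue the task and retry later instead of failing it outright.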

Why a DAG, not a chain

Most multi-agent systems either run agents sequentially (slow) or fully in parallel without coordination (chaotic). A typed DAG was the sweet spot: explicit about what depends on what, parallel by default for everything else, and trivial to visualize in the extension's live dashboard.
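To make "parallel by default" concrete, here is a small sketch of how a dependency graph can be layered into waves, where every task in a wave can run concurrently. It uses a Kahn-style topological layering. The `TaskNode` shape and the function names are assumptions for illustration, not the extension's real planner.

```typescript
// Hypothetical node shape: an id plus the ids it depends on.
interface TaskNode {
  id: string;
  dependsOn: string[];
}

// Kahn-style layering: each wave holds the tasks whose dependencies
// were all satisfied by earlier waves, so a wave can run in parallel.
function planWaves(nodes: TaskNode[]): string[][] {
  const remaining = new Map<string, Set<string>>();
  for (const n of nodes) remaining.set(n.id, new Set(n.dependsOn));

  for (const n of nodes) {
    for (const dep of n.dependsOn) {
      if (!remaining.has(dep)) throw new Error(`Unknown dependency: ${dep}`);
    }
  }

  const waves: string[][] = [];
  while (remaining.size > 0) {
    const ready: string[] = [];
    remaining.forEach((deps, id) => {
      if (deps.size === 0) ready.push(id);
    });
    // Nothing is ready but tasks remain: the graph has a cycle.
    if (ready.length === 0) throw new Error("Cycle detected in task graph");

    ready.sort(); // stable ordering inside a wave, for reproducible plans
    for (const id of ready) remaining.delete(id);
    remaining.forEach((deps) => {
      for (const id of ready) deps.delete(id);
    });
    waves.push(ready);
  }
  return waves;
}

// Run waves in order; tasks inside a wave are dispatched simultaneously.
async function runPlan(
  nodes: TaskNode[],
  runTask: (id: string) => Promise<void>,
): Promise<void> {
  for (const wave of planWaves(nodes)) {
    await Promise.all(wave.map((id) => runTask(id)));
  }
}

// Example graph loosely following the phases named earlier.
const EXAMPLE_PLAN: TaskNode[] = [
  { id: "scaffold", dependsOn: [] },
  { id: "schema", dependsOn: ["scaffold"] },
  { id: "styles", dependsOn: ["scaffold"] },
  { id: "routes", dependsOn: ["schema"] },
  { id: "tests", dependsOn: ["routes"] },
  { id: "review", dependsOn: ["tests", "styles"] },
];
```

For `EXAMPLE_PLAN`, `schema` and `styles` land in the same wave because both depend only on `scaffold`. Wave-by-wave execution is the simplest scheduling policy; a scheduler that starts each task the moment its own dependencies finish would extract more parallelism at the cost of more bookkeeping.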

  • 3 agents orchestrated: Claude · Copilot · Codex
  • 741+ Vitest tests across orchestrator + plan layers
  • 5 workflow templates: e-commerce, landing, dashboard, blog, generic
  • 0 hard rate-limit failures, thanks to fallback queueing

Engineering decisions worth flagging

  • State machine over framework. No LangGraph, no external agent library — the orchestrator is a plain TypeScript state machine. Easier to reason about, easier to test, and the 741+ tests reflect that.
  • Capability scores are configurable. Each user can override default weights, so an iOS engineer can bias toward the agent that knows Swift best without forking the extension.
  • Pause / resume / cancel at any node. Long-running sessions failed too often without granular control. The DAG made this almost free.
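A plain state machine for a single DAG node might look something like the sketch below. The state and event names are assumptions chosen for the example; the real orchestrator may model this differently. The point is that pause, resume, and cancel become ordinary table entries, which is also what makes the transitions easy to unit-test.

```typescript
// Hypothetical per-node lifecycle; names are illustrative only.
type NodeState = "pending" | "running" | "paused" | "done" | "failed" | "cancelled";
type NodeEvent = "start" | "pause" | "resume" | "succeed" | "fail" | "cancel";

// Transition table: anything not listed here is an illegal move.
const TRANSITIONS: Record<NodeState, Partial<Record<NodeEvent, NodeState>>> = {
  pending: { start: "running", cancel: "cancelled" },
  running: { pause: "paused", succeed: "done", fail: "failed", cancel: "cancelled" },
  paused: { resume: "running", cancel: "cancelled" },
  done: {},
  failed: {},
  cancelled: {},
};

// Pure function: same inputs always give the same next state.
function transition(state: NodeState, event: NodeEvent): NodeState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`Illegal transition: "${event}" from "${state}"`);
  }
  return next;
}

// Replay a sequence of events from the initial state.
function replay(events: NodeEvent[], initial: NodeState = "pending"): NodeState {
  return events.reduce<NodeState>((state, event) => transition(state, event), initial);
}
```

Because `transition` is pure and the table is data, each legal and illegal move can be covered by a one-line test without mocking any agent.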

What surprised me

The capability-scoring layer mattered less than the conflict-resolution layer. The agents were “good enough” that most routing decisions barely moved quality, but two agents editing the same file concurrently broke things constantly until the staging-and-merge layer was in place. The reliability win was in coordination, not selection.
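The staging-and-merge idea can be illustrated with a deliberately simplified sketch: outputs are staged in memory, each passes a validation gate, and when two agents produce the same path a fixed precedence list picks the winner. Whole-file precedence, the `StagedEdit` shape, and the `validate` callback are all assumptions made for this example; the actual merge rules are not spelled out here and may be finer-grained.

```typescript
// Hypothetical staged output: nothing is written to disk at this point.
interface StagedEdit {
  agent: string;
  path: string;
  content: string;
}

// Resolve conflicts per path. Assumed rule: agents earlier in `precedence`
// win; agents missing from the list rank last; on equal rank the edit
// staged first is kept. Edits that fail validation never land.
function mergeStaged(
  edits: StagedEdit[],
  precedence: string[],
  validate: (edit: StagedEdit) => boolean,
): Map<string, StagedEdit> {
  const rank = (agent: string): number => {
    const i = precedence.indexOf(agent);
    return i === -1 ? precedence.length : i;
  };

  const winners = new Map<string, StagedEdit>();
  for (const edit of edits) {
    if (!validate(edit)) continue; // failed a gate: drop it
    const current = winners.get(edit.path);
    if (current === undefined || rank(edit.agent) < rank(current.agent)) {
      winners.set(edit.path, edit);
    }
  }
  return winners;
}
```

The key property is determinism: given the same staged edits and the same precedence list, the merge always produces the same result, regardless of which agent happened to finish first.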