// case study · tooling

Plan once. Route each task to the right coding agent.

A published VS Code extension with agent selection, dependency-ordered DAG planning, multi-file coordination, dashboard controls, and reusable task playbooks.

Period: 2026
Role: Designer and engineer
Demo: Open
Repo: GitHub

Any single AI coding agent is bottlenecked by two things: rate limits and capability mismatch. Send a task to a model that is weak at it and you burn quota for a worse answer. PackAI treats the three big assistants (Claude, Copilot, and Codex) as interchangeable workers behind a planner and routes each unit of work to whichever one scores best for it.

How it works

Intent analysis. The user describes a project in natural language. An NLP layer extracts project type, target stack, feature list, and a complexity estimate.
DAG planning. The planner expands the intent into a phased task graph (“scaffold → schema → routes → tests → review”), with explicit dependencies so independent tasks can run concurrently.
Capability scoring. Each task is scored against each agent on signals like task type, language, complexity, and recent benchmark behavior. The highest score wins.
Parallel execution. Independent tasks dispatch simultaneously. A rate-limit queue catches throttled agents and re-routes their work to a peer.
Conflict resolution. When two agents touch the same file, the orchestrator stages outputs, runs validation gates, and merges with deterministic precedence rules.
Validation gates. Every output passes through syntax, import-resolution, basic security, and style checks before it lands on disk.
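As a rough illustration of the capability-scoring and re-routing steps, here is a minimal sketch in TypeScript. It is not PackAI's actual code: the type names, the affinity tables, the weight values, and the scoring formula are all assumptions made up for the example. Each agent gets a weighted score per task, throttled agents are skipped, and the best remaining agent comes first.

```typescript
// All names and numbers here are hypothetical, for illustration only.
type AgentId = "claude" | "copilot" | "codex";
type TaskKind = "scaffold" | "schema" | "routes" | "tests" | "review";

interface Task {
  id: string;
  kind: TaskKind;
  language: string;
  complexity: number; // assumed scale: 0 (trivial) to 1 (hard)
}

interface AgentProfile {
  kindAffinity: Partial<Record<TaskKind, number>>;
  languageAffinity: Partial<Record<string, number>>;
  complexityCeiling: number; // complexity the agent handles comfortably
}

interface Weights {
  kind: number;
  language: number;
  complexity: number;
}

const NEUTRAL = 0.5; // score used when a profile has no entry for a signal

function scoreTask(task: Task, profile: AgentProfile, w: Weights): number {
  const kind = profile.kindAffinity[task.kind] ?? NEUTRAL;
  const language = profile.languageAffinity[task.language] ?? NEUTRAL;
  // Penalize only the part of the complexity above the agent's ceiling.
  const overshoot = Math.max(0, task.complexity - profile.complexityCeiling);
  return w.kind * kind + w.language * language - w.complexity * overshoot;
}

// Rank non-throttled agents by score, best first. Ties break alphabetically
// so the same inputs always produce the same routing.
function rankAgents(
  task: Task,
  profiles: Record<AgentId, AgentProfile>,
  w: Weights,
  throttled: AgentId[] = [],
): AgentId[] {
  const all: AgentId[] = ["claude", "codex", "copilot"];
  return all
    .filter((agent) => throttled.indexOf(agent) === -1)
    .map((agent) => ({ agent, score: scoreTask(task, profiles[agent], w) }))
    .sort((a, b) => b.score - a.score || a.agent.localeCompare(b.agent))
    .map((entry) => entry.agent);
}

// Placeholder profiles; real values would come from configuration.
const profiles: Record<AgentId, AgentProfile> = {
  claude: {
    kindAffinity: { review: 0.9, schema: 0.8 },
    languageAffinity: { typescript: 0.8 },
    complexityCeiling: 0.9,
  },
  copilot: {
    kindAffinity: { scaffold: 0.9, routes: 0.7 },
    languageAffinity: { typescript: 0.9 },
    complexityCeiling: 0.5,
  },
  codex: {
    kindAffinity: { tests: 0.9, routes: 0.8 },
    languageAffinity: { typescript: 0.7 },
    complexityCeiling: 0.7,
  },
};

const weights: Weights = { kind: 1, language: 1, complexity: 2 };
const routesTask: Task = {
  id: "t3",
  kind: "routes",
  language: "typescript",
  complexity: 0.6,
};

// With these placeholder numbers, codex ranks first for the routes task.
const ranked = rankAgents(routesTask, profiles, weights);
// If codex is throttled, the same task falls through to the next-best peer.
const rerouted = rankAgents(routesTask, profiles, weights, ["codex"]);
```

Passing the weights in as a parameter, rather than hard-coding them, is what would let a user bias routing without touching the scoring function itself.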

Why a DAG, not a chain

Most multi-agent systems either run agents sequentially (slow) or fully in parallel without coordination (chaotic). A typed DAG was the sweet spot: explicit about what depends on what, parallel by default for everything else, and trivial to visualize in the extension's live dashboard.
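To make "parallel by default" concrete, the sketch below groups a task graph into waves using Kahn's topological-sort algorithm: every task in a wave has all of its dependencies satisfied by earlier waves, so the tasks within a wave can be dispatched together. The `TaskNode` shape, the `planWaves` name, and the extra `styles` task are assumptions for illustration, not PackAI's real planner.

```typescript
// Hypothetical node shape: an id plus the ids it depends on.
interface TaskNode {
  id: string;
  dependsOn: string[];
}

// Group tasks into waves with Kahn's algorithm. Tasks inside one wave have
// no dependencies on each other and can run concurrently.
function planWaves(nodes: TaskNode[]): string[][] {
  const dependents = new Map<string, string[]>(); // id -> tasks waiting on it
  const unmet = new Map<string, number>(); // id -> unfinished dependency count

  for (const node of nodes) {
    dependents.set(node.id, []);
    unmet.set(node.id, node.dependsOn.length);
  }
  for (const node of nodes) {
    for (const dep of node.dependsOn) {
      const waiting = dependents.get(dep);
      if (!waiting) throw new Error(`unknown dependency: ${dep}`);
      waiting.push(node.id);
    }
  }

  const waves: string[][] = [];
  let ready = nodes.filter((n) => n.dependsOn.length === 0).map((n) => n.id);
  let scheduled = 0;

  while (ready.length > 0) {
    waves.push(ready);
    scheduled += ready.length;
    const next: string[] = [];
    for (const id of ready) {
      for (const child of dependents.get(id) ?? []) {
        const left = (unmet.get(child) ?? 0) - 1;
        unmet.set(child, left);
        if (left === 0) next.push(child); // last dependency just finished
      }
    }
    ready = next;
  }

  // Anything left unscheduled sits on a cycle, so the graph is not a DAG.
  if (scheduled !== nodes.length) throw new Error("cycle detected");
  return waves;
}

// The phased graph from the text, plus a hypothetical "styles" task that
// only needs the scaffold and can therefore run alongside "schema".
const waves = planWaves([
  { id: "scaffold", dependsOn: [] },
  { id: "schema", dependsOn: ["scaffold"] },
  { id: "routes", dependsOn: ["schema"] },
  { id: "styles", dependsOn: ["scaffold"] },
  { id: "tests", dependsOn: ["routes"] },
  { id: "review", dependsOn: ["tests", "styles"] },
]);
// waves: [["scaffold"], ["schema", "styles"], ["routes"], ["tests"], ["review"]]
```

A pure chain would produce one task per wave, and uncoordinated parallelism would ignore the waves entirely. The wave list is also a natural shape to render as columns in a dashboard.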

Agents orchestrated: Claude · Copilot · Codex
Benchmark recovery: 68% of previously failed tasks completed
Workflow templates: e-commerce, landing, dashboard, blog, generic
Execution model: DAG, dependency-ordered task planning

Engineering decisions worth flagging

State machine over framework. No LangGraph, no external agent library. The orchestrator is a plain TypeScript state machine, which keeps planning and execution behavior easier to inspect and test.
Capability scores are configurable. Each user can override default weights, so an iOS engineer can bias toward the agent that knows Swift best without forking the extension.
Pause / resume / cancel at any node. Long-running sessions failed too often without granular control. The DAG made this almost free.
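One way to picture a plain TypeScript state machine with per-node pause, resume, and cancel is a transition table keyed by state and event. The sketch below is an assumption about what such a machine might look like. The state names, event names, and the `transition` and `replay` helpers are hypothetical and not taken from the extension.

```typescript
// Hypothetical per-node lifecycle; states and events are illustrative.
type NodeState =
  | "pending"
  | "running"
  | "paused"
  | "done"
  | "failed"
  | "cancelled";
type NodeEvent = "start" | "pause" | "resume" | "cancel" | "succeed" | "fail";

// Every legal move is listed here; anything absent is rejected.
const TRANSITIONS: Record<NodeState, Partial<Record<NodeEvent, NodeState>>> = {
  pending: { start: "running", cancel: "cancelled" },
  running: {
    pause: "paused",
    cancel: "cancelled",
    succeed: "done",
    fail: "failed",
  },
  paused: { resume: "running", cancel: "cancelled" },
  done: {},
  failed: {},
  cancelled: {},
};

function transition(state: NodeState, event: NodeEvent): NodeState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`illegal transition: ${event} while ${state}`);
  }
  return next;
}

// Fold a sequence of events over a starting state.
function replay(events: NodeEvent[], initial: NodeState = "pending"): NodeState {
  return events.reduce<NodeState>((state, event) => transition(state, event), initial);
}

const finished = replay(["start", "pause", "resume", "succeed"]); // "done"
const stopped = replay(["start", "pause", "cancel"]); // "cancelled"
```

Because the whole machine is a data table plus one pure function, each transition can be unit-tested without spinning up an agent, which is the kind of inspectability a framework-free design aims for.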

What surprised me

The capability-scoring layer mattered less than the conflict-resolution layer. The agents were “good enough” that most routing decisions barely moved quality, but two agents editing the same file concurrently broke things constantly until the staging-and-merge layer was in place. The reliability win came from coordination, not selection.
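A staging-and-merge layer of this kind could look roughly like the sketch below. The source does not spell out the precedence rules, so the rule used here (a fixed agent order, then the lower task sequence number) is purely an assumption, as are the `StagedWrite` shape and the two toy gates standing in for real syntax and import checks.

```typescript
// Hypothetical staged output: nothing touches disk until the merge decides.
interface StagedWrite {
  agent: string;
  path: string;
  content: string;
  seq: number; // position of the producing task in the plan (assumed field)
}

type Gate = (write: StagedWrite) => boolean;

// Toy gates standing in for real validation.
const nonEmpty: Gate = (w) => w.content.trim().length > 0;
const balancedBraces: Gate = (w) => {
  let depth = 0;
  for (let i = 0; i < w.content.length; i++) {
    if (w.content[i] === "{") depth++;
    if (w.content[i] === "}") depth--;
    if (depth < 0) return false;
  }
  return depth === 0;
};

// Assumed rule: earlier position in `precedence` wins; unknown agents go last.
function rank(agent: string, precedence: string[]): number {
  const index = precedence.indexOf(agent);
  return index === -1 ? precedence.length : index;
}

function mergeStaged(writes: StagedWrite[], gates: Gate[], precedence: string[]) {
  const accepted = new Map<string, StagedWrite>(); // path -> winning write
  const rejected: StagedWrite[] = [];

  for (const write of writes) {
    if (!gates.every((gate) => gate(write))) {
      rejected.push(write); // failed validation, never competes for the file
      continue;
    }
    const current = accepted.get(write.path);
    if (!current) {
      accepted.set(write.path, write);
      continue;
    }
    const a = rank(write.agent, precedence);
    const b = rank(current.agent, precedence);
    const wins = a < b || (a === b && write.seq < current.seq);
    rejected.push(wins ? current : write);
    if (wins) accepted.set(write.path, write);
  }
  return { accepted, rejected };
}

const result = mergeStaged(
  [
    { agent: "copilot", path: "src/routes.ts", content: "export const routes = {};", seq: 1 },
    { agent: "claude", path: "src/routes.ts", content: "export const routes = { home: '/' };", seq: 2 },
    { agent: "codex", path: "src/db.ts", content: "export function connect() {", seq: 3 },
  ],
  [nonEmpty, balancedBraces],
  ["claude", "codex", "copilot"],
);
// src/routes.ts goes to claude; the codex write fails the brace gate.
```

The point of the sketch is that the outcome depends only on the inputs and the precedence list, never on which agent happened to finish first, which is what makes concurrent edits to one file repeatable.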