░█▀▄░█▀▀░█░█░█░░░█▀█░█▀█░█▀█
░█░█░█▀▀░▀▄▀░█░░░█░█░█░█░█▀▀
░▀▀░░▀▀▀░░▀░░▀▀▀░▀▀▀░▀▀▀░▀░░

Spec-driven code/review loop that runs until all acceptance criteria are met and bugs fixed by agents.

$ curl -fsSL https://devloop.sh/install | bash

Requires Codex and Claude Code. Runs on macOS.

## WHY

Developers now spend most of their time in code review, assisted by coding agents. But the same model family can't both code and review. It favors its own outputs. Devloop makes Codex implement and Claude Code review adversarially (or vice versa), until all criteria are met and reviews addressed, so humans only come in where judgment pays off: at the spec and PR sign-off stages.

## HOW IT WORKS

  • Spec: /devloop-spec skill generates a short markdown spec with a clear problem, outcome, acceptance criteria, and what's in scope as much as what's out.
  • Code: Codex implements the spec in isolated git worktrees.
  • Review: Claude Code runs /devloop-review against the spec to ensure all acceptance gates along with engineering gates (security, maintainability, correctness, simplicity, etc.) are met.
  • Loop: Iterates on reject, stops on accept or max passes, and generates a PR ready for human review.

## WHEN (NOT) TO USE

When to use

  • The work can be pinned down in a clear spec through an interview with the coding agent.
  • The change is large or multi-step enough that adversarial review actually catches something.
  • You want the agents to iterate unattended and only pull you in for the spec and PR sign-off.

When not to use

  • The task is one-shotable and the scope is tiny: just ask the agent directly.
  • The work is exploratory or open-ended, where acceptance criteria can't be written up front.
  • You want hands-on control of every edit rather than a loop running on its own.

## FAQ

Will it touch my working tree?

No. It runs in isolated sibling worktrees and never pushes or opens a PR unless you pass --create-pr. Pass --in-place to work in your current checkout instead.

Which agent codes, which reviews?

Codex codes and Claude reviews by default. Swap either with --coder / --reviewer.

Can it run my tests?

Yes. Add a .devloop/verify hook with your build, lint, or test command. It gates acceptance and blocks the loop when it fails.

When does the loop stop?

On ACCEPT, on a stall, or when it runs out of passes (up to 5 by default, clamped 1-10). A 30-minute timeout caps each run.

Any telemetry?

None. One auditable Bash script; tracks and reports land under .devloop/.