// agent-native engineering workflow

File-backed sessions, not chat memory.

repo-harness turns Claude and Codex coding sessions into a repeatable, repo-local workflow. Hand the agent an approved plan or sprint — your loop is just review and next.


Works with: Claude · Codex · ChatGPT Pro


zsh — adopt

# 1. bootstrap the runtime once

 $ npx -y repo-harness init 

✓ host adapters · skills · CodeGraph configured

# 2. preview, then apply the repo-local contract

 $ repo-harness adopt --dry-run 

✓ Migration Report — 14 surfaces ready

// why file-backed

Chat memory forgets. The repo remembers.

The same coordination problem, two ways. repo-harness moves the source of truth out of the thread and into files every agent — and every human — can read.

Chat-memory coordination

Context evaporates when a session ends or hits its limit
Each session re-derives structure with grep-and-read loops
No durable record of what was decided, or why
Claude and Codex drift out of sync across threads
Review means re-reading the whole conversation

repo-harness

File-backed workflow

Resume the exact next step from .ai/harness/handoff/
A ~12KB stable root context plus a CodeGraph index
Plans, contracts, checks, and reviews live in the repo
Claude and Codex read the same source artifacts first
Review from one Human Review Card plus machine evidence

// what it does

The surface area is intentionally small.

Inspect a repo, install repo-local workflow files, route host events through hooks, and keep the workflow surfaces consistent across Claude, Codex, and humans.

File-backed sessions

Handoffs, plans, and resume packets live in the repo. A session can end mid-task; the next one resumes the exact next step, blockers, and changed files.

Token-lean by design

A ~12KB stable root context plus a CodeGraph index for structural queries — instead of grep-and-read loops that re-scan the repo every session.

Hooks that guard & trace

Eight managed routes warn, block, trace, and hand off work. Edit gates hold until the plan is approved; done-claim gates verify file-backed evidence.

One Human Review Card

A single-screen decision surface per task: verdict, intended vs actual files, commands passed, residual risk, and rollback.

Isolated worktrees

Agents work in a linked branch or worktree, constrained to the contract’s allowed paths — unrelated dirty state stays protected.

Local & auditable

Durable truth is repo files, not chat history or hosted threads. Optional MCP sidecar exposes only workflow artifacts — no source writes, no shell.
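
As a rough illustration of what resuming from a file-backed handoff could look like, the sketch below validates a handoff packet and renders the briefing a new session would read first. The JSON format, the HandoffPacket shape, and the field names are assumptions made for this example; the actual files under .ai/harness/handoff/ may be structured differently.

```typescript
// Hypothetical packet shape: the page only says a handoff carries the
// next step, blockers, and changed files. Field names are assumed.
interface HandoffPacket {
  nextStep: string;
  blockers: string[];
  changedFiles: string[];
}

// Parse a JSON handoff and reject packets with missing or malformed
// fields, so a new session never resumes from a half-written handoff.
function parseHandoff(raw: string): HandoffPacket {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("handoff must be a JSON object");
  }
  const { nextStep, blockers, changedFiles } = data as Record<string, unknown>;
  const isStringArray = (v: unknown): v is string[] =>
    Array.isArray(v) && v.every((item) => typeof item === "string");
  if (typeof nextStep !== "string" || nextStep.length === 0) {
    throw new Error("handoff is missing nextStep");
  }
  if (!isStringArray(blockers) || !isStringArray(changedFiles)) {
    throw new Error("blockers and changedFiles must be string arrays");
  }
  return { nextStep, blockers, changedFiles };
}

// Render the short briefing a resuming session would read first.
function resumeBriefing(packet: HandoffPacket): string {
  const blockers =
    packet.blockers.length > 0 ? packet.blockers.join("; ") : "none";
  return [
    `Next step: ${packet.nextStep}`,
    `Blockers: ${blockers}`,
    `Changed files: ${packet.changedFiles.join(", ")}`,
  ].join("\n");
}
```

Rejecting an incomplete packet up front is the design choice worth noting: a session that cannot name its next step should stop and ask rather than guess.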

// the whole system at a glance

Local, auditable, agent-native.

Eight layers, one source of truth. The CLI orchestrates; contracts and the filing system hold durable state; the verification layer proves the work — and ChatGPT Pro plans locally while Claude or Codex executes.

Interfaces · Orchestrator · Contracts & State · Filing System · Delegation · Verification · Integrations · Outcomes

Figure: repo-harness architecture. Eight layers run from user/agent interfaces through the orchestrator CLI, contracts and state, filing system, delegation fabric, verification and review, and external integrations to outcomes, with a ChatGPT Pro local planner comparison alongside.

// plan to closeout

One layered chain, all the way down.

The planning chain is intentionally tiered. Each step writes a decision-complete artifact the next agent reads first — chat is never the source of truth.

Every task opens with structured due diligence — the Geju P1/P2/P3 discipline.

Map

Map the territory: the files, surfaces, and prior art the change touches.

Trace

Trace impact: callers, callees, and the cause chain behind it.

Decision

Decision & rationale: the chosen approach, and why — recorded for the next agent.

$ repo-harness prd

Product intent

A guided direction pass, then an upper-layer PRD under plans/prds/.

$ repo-harness sprint

Ordered backlog

PRD becomes a sprint with machine-checkable acceptance lines.

$ repo-harness goal

Claude or Codex executes

A bounded /goal prompt runs each sprint slice through the loop on either agent.

shipped

human review path

Accept only when the review recommends pass, the card verdict is pass, and external acceptance is pass, not_required, or an explicit manual override. Then inspect the contract, latest trace, and changed files.

checks/latest.json

 $ repo-harness mcp prepare-goal \
   --prd plans/prds/auth.prd.md \
   --sprint plans/sprints/auth.sprint.md

› .ai/harness/handoff/codex-goal.md

✓ goal handoff ready

// chatgpt pro as a planner

Plan with ChatGPT Pro. Execute with Claude or Codex.

The optional repo-harness mcp sidecar exposes only workflow artifacts to MCP clients. ChatGPT Pro plans against real repo state and moves an idea through PRD → Sprint → Goal handoff — then your existing Claude or Codex session executes the file-backed Sprint.

1. Read repo state

ChatGPT reads workflow files through the MCP sidecar: plans, contracts, checks, and handoffs.

2. Write the PRD

write_prd_from_idea drafts a decision-complete PRD under plans/prds/.

3. Write the Sprint

write_checklist_sprint turns it into an ordered backlog with machine-checkable acceptance.

4. Prepare the handoff

prepare_codex_goal_from_sprint writes .ai/harness/handoff/codex-goal.md.

5. Claude or Codex executes

Your existing Claude or Codex session runs the host-native /goal prompt and stages each completed Sprint phase.
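
The ordering of these steps can be sketched as a small decision function: given which artifacts already exist, pick the next planner action. This is only a sketch of the sequencing. The PlannerState flags and the "execute_goal" label are hypothetical names for this example, and the three tool names are used purely as labels, with no claim about their real signatures.

```typescript
// Which artifacts already exist in the repo. How they are detected
// (for example by checking plans/prds/) is outside this sketch.
interface PlannerState {
  hasPrd: boolean;
  hasSprint: boolean;
  hasGoalHandoff: boolean;
}

// The three tool names come from the page; "execute_goal" is a
// hypothetical label for the final step run by Claude or Codex.
type PlannerStep =
  | "write_prd_from_idea"
  | "write_checklist_sprint"
  | "prepare_codex_goal_from_sprint"
  | "execute_goal";

// Each stage reads the previous stage's artifact, so the first
// missing artifact decides what happens next.
function nextPlannerStep(state: PlannerState): PlannerStep {
  if (!state.hasPrd) return "write_prd_from_idea";
  if (!state.hasSprint) return "write_checklist_sprint";
  if (!state.hasGoalHandoff) return "prepare_codex_goal_from_sprint";
  return "execute_goal";
}
```

Because the decision depends only on files, any agent or human can work out where the chain stands without reading a chat thread.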

mcp — planner profile

# expose only workflow artifacts to ChatGPT

 $ repo-harness mcp setup chatgpt --repo . 

 $ repo-harness mcp serve --transport http \
   --host 127.0.0.1 --port 8765 --profile planner

› connect the /mcp URL as a ChatGPT Connector

✓ planner online — read-only over workflow files

Safety boundary

No source-code writes · no arbitrary shell · no default runner · separate from API quota
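
One way to picture an "only workflow artifacts" boundary is a path allowlist that is checked before anything is served. The sketch below is not the sidecar's implementation. The WORKFLOW_PREFIXES list is an assumption assembled from paths mentioned on this page, and isWorkflowArtifact is a hypothetical helper.

```typescript
// Hypothetical allowlist built from directories this page mentions.
// The real sidecar may expose a different set.
const WORKFLOW_PREFIXES = ["plans/", "tasks/", "checks/", ".ai/harness/"];

// Collapse "." and ".." segments; return null if the path would
// climb above the repo root.
function normalize(path: string): string | null {
  const parts: string[] = [];
  for (const segment of path.replace(/\\/g, "/").split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) return null; // escapes the repo root
      parts.pop();
      continue;
    }
    parts.push(segment);
  }
  return parts.join("/");
}

// True only for repo-relative paths that stay inside an allowed
// workflow directory after normalization.
function isWorkflowArtifact(path: string): boolean {
  if (path.startsWith("/")) return false; // absolute paths are never served
  const clean = normalize(path);
  if (clean === null) return false;
  return WORKFLOW_PREFIXES.some((prefix) => clean.startsWith(prefix));
}
```

Normalizing before matching matters: a request such as plans/../src/auth/index.ts begins with an allowed prefix but resolves to source code, so it is refused.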

// guards, traces & handoffs

Eight managed hook routes.

The installed adapter owns eight routes that warn, block, trace, and hand off work across sessions. The tuple event + routeId + matcher is the stable contract.

Route · Matcher · Function

SessionStart .default · all sessions · Injects the prior handoff, sprint status, and read-only config-security findings before work starts.

PreToolUse .edit · Edit | Write · Enforces worktree policy and plan/contract readiness before any implementation edit.

PreToolUse .subagent · Task | Agent · Keeps delegated work returning through the parent session instead of leaking completion claims.

PostToolUse .edit · Edit | Write · Records edit traces, refreshes handoff and task status, queues architecture drift.

PostToolUse .bash · Bash · Observes command results and captures verification evidence without replacing the runner.

PostToolUse .always · all tools · Low-noise always-on trace and runtime observation; stale copies soft-skip with a refresh hint.

UserPromptSubmit .default · all prompts · Classifies prompt intent, routes planning/check/hunt hints, renders host-safe guidance.

Stop .default · session stop · Finalizes the handoff and guards against ending with unresolved draft-plan or evidence gaps.

Guard routes fail closed: required gates block when their scripts are missing.
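
The route tuple and the fail-closed rule can be sketched as data plus one small function. Everything beyond the event, routeId, and matcher values listed above is an assumption for illustration: the HookRoute type, the routeKey format, the script registry, and the required flag are hypothetical and not the adapter's real API.

```typescript
type HookEvent =
  | "SessionStart"
  | "PreToolUse"
  | "PostToolUse"
  | "UserPromptSubmit"
  | "Stop";

// The stable contract named on the page: event + routeId + matcher.
interface HookRoute {
  event: HookEvent;
  routeId: string;
  matcher: string;
}

// The eight managed routes, transcribed from the route list above.
const ROUTES: HookRoute[] = [
  { event: "SessionStart", routeId: "default", matcher: "all sessions" },
  { event: "PreToolUse", routeId: "edit", matcher: "Edit | Write" },
  { event: "PreToolUse", routeId: "subagent", matcher: "Task | Agent" },
  { event: "PostToolUse", routeId: "edit", matcher: "Edit | Write" },
  { event: "PostToolUse", routeId: "bash", matcher: "Bash" },
  { event: "PostToolUse", routeId: "always", matcher: "all tools" },
  { event: "UserPromptSubmit", routeId: "default", matcher: "all prompts" },
  { event: "Stop", routeId: "default", matcher: "session stop" },
];

// Hypothetical key format used only to look routes up in this sketch.
function routeKey(route: HookRoute): string {
  return `${route.event}.${route.routeId}:${route.matcher}`;
}

type GateDecision = "allow" | "block";

// Fail closed: a required gate whose script is missing blocks the
// action; an optional route with no script is simply skipped.
function evaluateGate(
  route: HookRoute,
  scripts: Map<string, () => boolean>,
  required: boolean,
): GateDecision {
  const script = scripts.get(routeKey(route));
  if (script === undefined) return required ? "block" : "allow";
  return script() ? "allow" : "block";
}
```

Failing closed trades convenience for safety: a broken install stops edits until it is repaired, rather than letting work proceed unguarded.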

// the human review path

Accept or reject from one screen.

Every task writes a Human Review Card — the one-screen decision surface. See what changed, why it was in scope, what verified it, what risk remains, and how to roll it back.

Accept only when the review recommends pass, the card verdict is pass, and external acceptance is pass, not_required, or an explicit manual override.
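
The acceptance rule above reduces to a three-way conjunction, sketched here as a predicate. The type names and the "fail" and "manual_override" spellings are assumptions made for this example. Only pass and not_required appear verbatim on this page.

```typescript
type Verdict = "pass" | "fail";

// "manual_override" stands in for "an explicit manual override";
// the exact stored value is an assumption.
type ExternalAcceptance = "pass" | "fail" | "not_required" | "manual_override";

interface ReviewSignals {
  reviewRecommendation: Verdict;
  cardVerdict: Verdict;
  externalAcceptance: ExternalAcceptance;
}

// Accept only when all three signals agree. External acceptance has
// three acceptable states; the other two signals must be "pass".
function canAccept(signals: ReviewSignals): boolean {
  const externalOk =
    signals.externalAcceptance === "pass" ||
    signals.externalAcceptance === "not_required" ||
    signals.externalAcceptance === "manual_override";
  return (
    signals.reviewRecommendation === "pass" &&
    signals.cardVerdict === "pass" &&
    externalOk
  );
}
```

For instance, a card with verdict pass and a manual override on external acceptance is acceptable only if the review also recommends pass.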

tasks/reviews/20260618-1042-add-oauth.review.md

# Human Review Card verdict · pass

Change type

Feature — add OAuth device flow

Commands passed

bun test · check-task-workflow --strict

Intended vs actual files

src/auth/device-flow.ts

src/auth/index.ts

tests/auth/device-flow.test.ts

External acceptance

manual override

Residual risk

Token refresh path is covered by tests; rate-limit backoff is advisory only.

Reviewer action

Rollback

git revert codex/add-oauth
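
The "intended vs actual files" row of the card can be thought of as a set comparison. The sketch below assumes both lists are repo-relative paths. FileScopeReport and compareFiles are hypothetical names for this example, not part of repo-harness.

```typescript
interface FileScopeReport {
  unexpected: string[]; // touched but not in the plan
  missing: string[]; // planned but never touched
  inScope: boolean; // true when nothing unexpected was touched
}

// Compare the files a contract intended to change with the files
// that actually changed, ignoring order and duplicates.
function compareFiles(intended: string[], actual: string[]): FileScopeReport {
  const intendedSet = new Set(intended);
  const actualSet = new Set(actual);
  const unexpected = Array.from(actualSet)
    .filter((file) => !intendedSet.has(file))
    .sort();
  const missing = Array.from(intendedSet)
    .filter((file) => !actualSet.has(file))
    .sort();
  return { unexpected, missing, inScope: unexpected.length === 0 };
}
```

In this sketch a planned file that was never touched is reported but does not break scope, since a narrower change than planned is usually safe. An unexpected file always does.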

// first 5 minutes

Bun-powered. One command to bootstrap.

The default installer runs on Bun — no Node setup needed, and it installs Bun for you if it’s missing. Already on Node? It works too. Bootstrap the runtime once, then preview the repo-local contract with a dry run before anything is applied.

Preview before you apply

Run repo-harness adopt --dry-run from the repo root. It reports every file that would be created or refreshed — apply only when the report looks right.


install

# macOS / Linux — installs Bun if missing

 $ curl -fsSL https://raw.githubusercontent.com/Ancienttwo/repo-harness/main/install.sh | sh

 $ repo-harness init 

✓ host adapters · skills · CodeGraph configured

// acknowledgements

Built on good work by others.

These skills, repos, and runtimes shaped the workflow contract while repo-harness was designed and shipped. They are acknowledged here as influences — not ordinary bundled dependencies.

Hylarucoder · Geju

Methodology

The P1/P2/P3 due-diligence method and Geju practice behind the planning, tracing, and decision-rationale discipline.

Waza · TW93

Skills CLI

think, hunt, check, and health skills for daily planning, bug hunts, verification, and Codex-first skill sync.

gstack · gbrain · Garry Tan

Operator workflow

Product discovery, plan and design review, post-ship doc hygiene, knowledge sync, and long-form repo memory.

Mermaid

Runtime skill

Human-readable architecture and system-flow diagrams.

CodeGraph

Dev dependency

Symbol-aware navigation, impact tracing, and readiness checks.

OpenAI Codex

Execution agent

Primary execution agent for repo-local implementation and verification.

When Codex materially contributes to a commit, it carries Co-authored-by: codex <codex@openai.com> — opt-in and visible per commit.

// first 5 minutes

Make your next agent session resumable.

Bootstrap the runtime, preview the contract with a dry run, then prove the workflow. Free and open source under MIT.

$ bun add -g repo-harness
