HeadMaster

Open-source ADLC (AI-Driven Delivery Lifecycle) built on Claude Code.
Turns a single conversation into a PRD, a TDD, Jira stories, code, tests, and a reviewed PR.


Why HeadMaster exists

AI coding tools have collapsed the time it takes to write code. The bottleneck moved upstream: deciding what to build, how to build it, and proving it works. Most teams now have agents that can write a function in seconds but still spend days arguing about requirements, scoping tickets, reviewing PRs, and untangling the changes that nobody designed.

HeadMaster is the orchestration layer that closes that loop. It treats software delivery as a pipeline of artifacts — requirements, design, tickets, code, tests, review — and drives a feature through every stage with the right specialist agent at each step. You stay in the loop at every gate. Nothing merges without you saying yes.

The pipeline

Five stages, five gates, one terminal. Each stage emits a reviewable artifact; the next stage refuses to start until you approve the previous one.

1. init-feature: tier + route detected
2. plan: PRD
3. design: SYSTEM_DESIGN + TDD
4. breakdown: Jira epic + stories
5. execute: code, tests, PR

Stages auto-skip based on feature size. An XS bug fix runs execute only. An L feature runs all five. Tier classification is mechanical (LOC + risk indicators), not vibes.
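As a rough illustration of what "mechanical" classification could look like, here is a minimal Python sketch. The thresholds, the `ChangeEstimate` fields, and the stage lists for the S and M tiers are assumptions made for this example; only the XS and L rows follow the description above. HeadMaster's actual rules live in its workflow files and may differ.

```python
from dataclasses import dataclass

STAGES = ["init-feature", "plan", "design", "breakdown", "execute"]

# XS and L follow the text above; the S and M rows are assumptions.
STAGES_BY_TIER = {
    "XS": ["execute"],
    "S": ["plan", "execute"],
    "M": ["plan", "design", "breakdown", "execute"],
    "L": list(STAGES),
}


@dataclass
class ChangeEstimate:
    loc: int         # estimated lines of code touched
    risk_flags: int  # count of risk indicators (hypothetical: schema change, auth code, ...)


def classify_tier(est: ChangeEstimate) -> str:
    """Map size and risk to a tier using fixed, hypothetical thresholds."""
    if est.loc <= 20 and est.risk_flags == 0:
        return "XS"
    if est.loc <= 150 and est.risk_flags <= 1:
        return "S"
    if est.loc <= 600 and est.risk_flags <= 2:
        return "M"
    return "L"


def stages_for(est: ChangeEstimate) -> list:
    """Return the stages that run for a change of this size and risk."""
    return STAGES_BY_TIER[classify_tier(est)]


print(stages_for(ChangeEstimate(loc=5, risk_flags=0)))     # ['execute']
print(stages_for(ChangeEstimate(loc=2000, risk_flags=3)))  # all five stages
```

Because the rules are plain comparisons, the same estimate always lands in the same tier, which is the property the paragraph above describes.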

What you actually get

Every stage produces a markdown artifact in docs/features/{project}/{slug}/. They're not throwaway — downstream agents read them, retrospectives reference them, and you can hand them to a teammate who has never seen the feature.
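For orientation, the directory convention can be sketched in a few lines of Python. The slug rule (lowercase, hyphen-separated) and the project name `billing` are assumptions for illustration; the source only specifies the docs/features/{project}/{slug}/ layout.

```python
import re
from pathlib import PurePosixPath


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def artifact_dir(project: str, title: str) -> PurePosixPath:
    """Build docs/features/{project}/{slug} for a feature title."""
    return PurePosixPath("docs/features") / project / slugify(title)


print(artifact_dir("billing", "Add PDF invoice export"))
# docs/features/billing/add-pdf-invoice-export
```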

PRD (from /plan)

# PDF Invoice Export — PRD

## Problem
Finance team manually re-keys invoices from the admin UI into
their accounting system. ~4 hours/week of error-prone copy work.

## Acceptance Criteria
- AC1: User clicks "Export PDF" on an invoice; downloads within 3s
- AC2: PDF contains: header, line items, totals, terms, footer
- AC3: Bulk export for filtered list (max 100 invoices, zipped)
- AC4: PDF is rejected if invoice is in DRAFT state

## Out of scope
- Custom PDF templates per customer (v2)
- Localized currency formatting (v2)

TDD (from /design)

## S3. Module: invoices/export

Interface:
  POST /api/invoices/{id}/pdf      → 200 application/pdf
  POST /api/invoices/export        → 202 + job_id (bulk)

Dependencies: ReportLab 4.x (already in project)
Failure modes: invoice not found → 404; draft → 409

Delivery slices:
  Slice 1: single-invoice endpoint, no styling
  Slice 2: header/footer template + line items
  Slice 3: bulk endpoint + zip packaging

Jira stories (from /breakdown)

HM-142  Add /api/invoices/{id}/pdf endpoint            [3pt]
HM-143  Build PDF template (header, line items, footer) [5pt]
HM-144  Bulk export endpoint with job queue             [5pt]
HM-145  Reject DRAFT invoices with 409                  [1pt]
HM-146  Integration tests for AC1-AC4                   [3pt]

What's in the box

13 specialized agents

They include requirements, PRD author, solutions architect, TDD author, developer, code reviewer, QA engineer, release, and retrospective agents, plus a codebase analyst and a web researcher.

19 skills

init-feature, plan, design, breakdown, execute, implement, review-code, qa-integration, security-scan, jira-ops, retrospect, and more.

Tier auto-classification

XS / S / M / L. A typo fix skips planning; a multi-service refactor runs the whole pipeline. Per-tier workflows live in .claude/workflows/.

Subagent isolation

Reviewers, QA, and TDD-reviewers never see implementation context — only the diff and the acceptance criteria. Enforced by a pre-spawn hook.
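The idea behind the hook can be sketched as a filter that runs before a subagent starts. This is not HeadMaster's hook code: the role names, the context keys, and the `pre_spawn` function are hypothetical, chosen only to show the allow-list pattern.

```python
# Hypothetical role names and context keys, for illustration only.
ISOLATED_ROLES = {"code-reviewer", "qa-engineer", "tdd-reviewer"}
ALLOWED_KEYS = {"diff", "acceptance_criteria"}


def pre_spawn(role: str, context: dict) -> dict:
    """Return the context a subagent is allowed to see.

    Isolated roles receive only the diff and the acceptance criteria;
    every other key (implementation notes, chat history, ...) is dropped.
    """
    if role not in ISOLATED_ROLES:
        return dict(context)
    missing = ALLOWED_KEYS - set(context)
    if missing:
        raise ValueError(f"cannot spawn {role}: missing {sorted(missing)}")
    return {key: context[key] for key in ALLOWED_KEYS}


full_context = {
    "diff": "+ def export_pdf(): ...",
    "acceptance_criteria": ["AC1", "AC4"],
    "implementation_notes": "tried ReportLab canvas first",
}
print(sorted(pre_spawn("code-reviewer", full_context)))
# ['acceptance_criteria', 'diff']
```

An allow-list is used rather than a block-list so that any new context field stays hidden from reviewers until someone explicitly permits it.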

Failure ledger

Per-story retry log. When a test fails twice the same way, the executor changes approach instead of re-running the broken one.
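A minimal sketch of such a ledger, assuming a failure counts as "the same way" when the story, the test name, and the first line of the error message all match. The class name, the fingerprint rule, and the return values are illustrative assumptions, not HeadMaster's implementation.

```python
from collections import Counter


def fingerprint(error: str) -> str:
    """Use the first non-empty line of the error text as its fingerprint (assumed rule)."""
    for line in error.splitlines():
        if line.strip():
            return line.strip()
    return ""


class FailureLedger:
    """Per-story retry log that flags repeated identical failures."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self._counts = Counter()

    def record(self, story: str, test: str, error: str) -> str:
        """Log a failure and return 'retry' or 'change_approach'."""
        signature = (story, test, fingerprint(error))
        self._counts[signature] += 1
        if self._counts[signature] >= self.max_repeats:
            return "change_approach"
        return "retry"


ledger = FailureLedger()
print(ledger.record("HM-142", "test_export_pdf", "AssertionError: 500 != 200"))  # retry
print(ledger.record("HM-142", "test_export_pdf", "AssertionError: 500 != 200"))  # change_approach
```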

Self-learning retrospective

After every feature, /retrospect mines the run for patterns and writes them into the right agent's MEMORY.md. Config-level proposals get routed to a human queue.

Built-in safety

Unconditional human gates: Breakdown's Jira push, PR merge, and any edit to .claude/ or pipeline scripts always require explicit approval, regardless of autonomous mode.
Security scan in every execution phase: secret detection, SAST, and a CVE check run on the branch diff before code review even starts.
External-data sandboxing: content fetched from the web or external systems is wrapped in delimiter markers; agents are instructed never to execute instructions within them.
No autonomous credential access: HeadMaster uses provider default credential chains, with no hardcoded keys; secrets stay in the user's secret manager.
Destructive git ops blocked: force-push, reset --hard, and clean -f are in the deny list.
Atomic commits per story: every Jira story maps to a tagged commit; reverting a story is one command.
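To make the deny-list item concrete, here is a small Python sketch of a command check for the three operations named above. The `is_denied` function and its matching rules are assumptions for illustration; HeadMaster's real deny list is configuration, and its exact patterns are not shown here.

```python
import shlex


def is_denied(command: str) -> bool:
    """Return True for force-push, reset --hard, and clean -f style commands."""
    tokens = shlex.split(command)
    if len(tokens) < 2 or tokens[0] != "git":
        return False
    subcommand, args = tokens[1], tokens[2:]
    if subcommand == "push":
        return any(arg in ("-f", "--force") for arg in args)
    if subcommand == "reset":
        return "--hard" in args
    if subcommand == "clean":
        # Catch -f on its own and inside combined short flags such as -fd.
        return "--force" in args or any(
            arg.startswith("-") and not arg.startswith("--") and "f" in arg[1:]
            for arg in args
        )
    return False


for cmd in ("git push --force origin main", "git clean -fd", "git status"):
    print(cmd, "->", "blocked" if is_denied(cmd) else "allowed")
```

This sketch only inspects the token directly after `git`, so a form like `git -C repo push --force` would slip past it; a production check would need to handle global options as well.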

How it's different

	Raw Claude Code	Devin / Cursor agents	HeadMaster
Generates code	Yes	Yes	Yes
Produces a PRD before coding	No	No	Yes
Produces a TDD / design doc	No	No	Yes
Decomposes into Jira tickets	No	No	Yes
Mandatory human review gates	No	Optional	Yes (unconditional)
Subagent context isolation	No	No	Enforced by hook
Self-learning retrospective	No	No	Yes
Runs on your own Claude key	Yes	No (hosted)	Yes

HeadMaster isn't a replacement for Claude Code — it uses Claude Code as the runtime and adds the pipeline discipline that turns ad-hoc agent output into reviewable software delivery.

Get started

git clone https://github.com/munna-chauhan/HeadMaster.git && cd HeadMaster
cp config.yml.example config.yml
python3 scripts/setup_projects.py
claude
/init-feature "Add PDF invoice export"

Requirements: Claude Code, Python 3.9+, Node.js 18+, an Anthropic API key. Optional: Atlassian credentials for Jira/Confluence push, GitHub CLI for PR creation.

Full Quick Start →

FAQ

Is HeadMaster just a wrapper around Claude Code?

No. Claude Code is the runtime — it's how the agents execute. HeadMaster is the orchestration on top: 13 specialist agents with separate memories and models, 19 skills that drive the stage transitions, hooks that enforce isolation between subagents, scripts that manage state, and workflows that define which stages run for which feature size.

Does it replace developers?

No, and that's by design. Every stage has a human gate. The pipeline produces artifacts (PRD, TDD, stories, PR) that are explicitly designed to be reviewed by a person before the next stage starts. HeadMaster shifts your work from writing boilerplate to reviewing and steering — closer to a tech lead than an IC.

What if I don't use Jira?

The jira-ops skill is gated by a jira_push flag in config.yml. Disable it and Breakdown writes stories to a local markdown file (JIRA_BREAKDOWN.md) that the executor reads instead. Linear / GitHub Projects adapters are on the roadmap.
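The switch described above amounts to choosing a destination for the stories. A minimal sketch, assuming `jira_push` is a top-level boolean that defaults to off and that the fallback file is a simple markdown list; the function names, the callbacks, and the file layout are hypothetical.

```python
from typing import Callable


def render_breakdown(stories: list) -> str:
    """Render (key, title, points) tuples as a markdown list (assumed layout)."""
    lines = ["# Breakdown", ""]
    for key, title, points in stories:
        lines.append(f"- {key}: {title} [{points}pt]")
    return "\n".join(lines) + "\n"


def publish_stories(
    config: dict,
    stories: list,
    push_to_jira: Callable[[list], None],
    write_file: Callable[[str, str], None],
) -> str:
    """Send stories to Jira when jira_push is on, else to JIRA_BREAKDOWN.md."""
    if config.get("jira_push", False):
        push_to_jira(stories)
        return "jira"
    write_file("JIRA_BREAKDOWN.md", render_breakdown(stories))
    return "JIRA_BREAKDOWN.md"


files = {}
target = publish_stories(
    {"jira_push": False},
    [("HM-142", "Add /api/invoices/{id}/pdf endpoint", 3)],
    push_to_jira=lambda stories: None,
    write_file=files.__setitem__,
)
print(target)
print(files[target])
```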

How much does it cost to run?

HeadMaster itself is Apache 2.0 — free. The cost is your Anthropic API usage. A small feature (XS/S tier) typically runs $0.50–$3 in Claude tokens; a large multi-service feature (L tier) can be $10–$30 depending on codebase size. Prompt caching across stages cuts this significantly.

Can I trust the code it produces?

You don't have to trust it — you review it. Every story produces an atomic commit, every PR has a structured description with traceability back to the Jira story and TDD section, security scans run before review, and the diff is small (story-sized, not feature-sized). Treat agent output the same way you treat a junior engineer's PR.

What languages does it support?

Any language Claude Code can read and write — Python, JavaScript/TypeScript, Go, Rust, Java, C#, Ruby, etc. The pipeline is language-agnostic; only the setup-env skill needs to detect your stack (it reads from repo-registry.yml).

Is the output deterministic?

Agent output isn't deterministic, but the pipeline structure is. The same input produces the same set of artifacts, with the same gates, in the same order. The content within each artifact varies — which is why human review at each gate is mandatory, not optional.

Can I use this at work?

Apache 2.0 license, so yes. The bigger questions are: does your org allow Anthropic API access on your codebase, and do you have approval for AI-assisted commits in your PR process? Most orgs that already permit Claude Code or Copilot will be fine; check with security first if you're in a regulated industry.

Roadmap

Shipped: Five-stage pipeline (init-feature → plan → design → breakdown → execute)
Shipped: 13 specialist agents with isolated memory
Shipped: Tier-based workflow routing (XS/S/M/L)
Shipped: Subagent context isolation hook
Shipped: Self-learning retrospective with memory write-back
Shipped: Jira and Confluence MCP integration
In progress: Diagram-as-code skill (Mermaid + draw.io)
In progress: Cost telemetry per feature run
Planned: Linear and GitHub Projects adapters
Planned: Multi-repo orchestration (one feature spanning N services)
Planned: Plugin API for custom agents and skills

Full roadmap →

Get involved

HeadMaster is in active development and feedback shapes priorities directly.

Try it on a small internal project, then open a GitHub Discussion with what worked and what didn't.
Report issues — bugs, surprising agent behavior, missing stages, anything — at GitHub Issues.
Contribute — agent improvements, new skills, workflow tiers, language-specific setup detection. See CLAUDE.md for the contribution rules.
Share — if HeadMaster saves you time, post about it. Word of mouth is how OSS projects survive.