Why This Document Exists

Over the past weeks, I've built something that doesn't have a standard name yet: an AI development team. Not a single AI assistant that helps you code, but a coordinated system of specialized AI agents that take a Jira ticket, design the UI, implement the frontend and backend in parallel, review their own code for security issues, run tests, generate reports, and update Jira with the results — all from a single command.

This document explains how it works. Not the marketing version — the actual architecture, the design decisions, the things that worked and the things that didn't. It's written for engineers and technical leads who want to understand what's possible with agent orchestration today, and what the practical constraints are.

What this is not

This is not a product. It's a working prototype that runs against real Jira tickets and real codebases. The commands are Markdown files. The agents are Claude instances with specialized prompts. The "team" is a set of conventions for how those instances coordinate through files and tool calls. There is no custom infrastructure — it runs entirely within Claude Code's existing agent framework.

The Core Idea: Markdown Files That Become AI Teams

Claude Code has a feature called custom slash commands. You write a Markdown file in .claude/commands/, and it becomes a command you can run in the terminal. The Markdown file is the prompt — it tells Claude what role to play, what steps to follow, what tools to use, and when to spawn sub-agents.

The insight that makes this work: a sufficiently detailed Markdown prompt is an executable specification. If you describe a 5-phase pipeline with decision points, agent handoffs, and file-based communication — Claude will execute it. The Markdown file is simultaneously the documentation and the implementation.

```markdown
# Example: .claude/commands/jira.md (simplified)

You are the **Jira Ticket Orchestrator**.
Fetch the ticket, analyze it, plan the work,
coordinate implementation agents, and update Jira.

## PHASE 1: Fetch & Analyze
Fetch the ticket via REST API...
Ask the user for project paths...

## PHASE 2: Design (if UI changes)
Launch a UI Designer agent in Paper.design...
Present designs to user for approval...

## PHASE 3: Implementation
Launch Frontend + Backend agents in parallel...

## PHASE 4: Report & Update Jira
Generate HTML report, post to Jira...
```

That's it. No framework, no SDK, no deployment pipeline. A Markdown file in a folder. When a user types /jira FO-2847, Claude reads the Markdown, and the pipeline begins.

Architecture Overview

The system has three layers:

| Layer | What It Is | Examples |
|---|---|---|
| Commands | Markdown files that define pipelines. Each command is a self-contained workflow with phases, decision points, and agent specifications. | /jira, /new-feature, /full-pipeline, /unit-test, /deps |
| Agents | Claude instances spawned by commands. Each agent has a specialized role, receives a detailed prompt, and works independently. Agents have no memory of each other — they communicate through files. | Security Auditor, Frontend Developer, UI Designer, Test Engineer, Code Analyst |
| Integrations | External systems that agents interact with via APIs, MCP servers, or CLI tools. | Jira REST API, JAM MCP, Paper MCP, Playwright, Docker, Git |
System Architecture

```
User (types a command)
  └─ Orchestrator: reads the .md command; plans, coordinates, asks the user
       ├─ spawns agents:
       │     UI Designer (Paper MCP) · Frontend Dev (components, routes)
       │     Backend Dev (APIs, DB, logic) · Code Analyst (security + quality)
       │     Test Engineer (unit + E2E tests) · Doc Lead (HTML reports)
       └─ agents communicate via:
             reports/*.md (file-based comms) · Jira API (tickets, comments)
             JAM MCP (bug recordings) · Paper MCP (design tool)
```

The Agent Team: Roles and Responsibilities

Each agent is a Claude instance with a specialized prompt. When the orchestrator spawns an agent, it passes the full context that agent needs: the ticket details, the stack profile, the file paths, and the specific task. Agents don't know about each other — they each do their job and write their output to a report file.

The Orchestrator (You)

The orchestrator isn't a separate agent — it's the main Claude instance running the command. It's the only one that can interact with the user. It reads the command Markdown, executes the phases, spawns sub-agents, and presents results for user approval.

Responsibilities: read the command Markdown and execute its phases in order; spawn sub-agents with the full context each one needs; handle all user interaction (questions, approvals, gate decisions); read agent report files and pass relevant findings downstream; update external systems such as Jira.

Why the orchestrator asks the user, not the agents

Sub-agents cannot interact with the user. Only the main Claude instance (the orchestrator) can ask questions, present options, and get approvals. This is a Claude Code constraint, not a design choice — but it's actually a good constraint. It means all user-facing decisions flow through a single point, which makes the pipeline predictable and auditable.

UI Designer (Design Agent)

Creates visual mockups in Paper.design using MCP tools. Reads the feature plan, existing code patterns, and any screenshots of the current app. Produces artboards with HTML/CSS designs and writes a design specification.

Tools used: mcp__paper__create_artboard, mcp__paper__write_html, mcp__paper__get_screenshot, mcp__paper__get_jsx

Output: Paper artboards + reports/feature-ui-plan.md

Key constraint: The user MUST review and approve designs before implementation begins. This is a hard gate — no code is written until designs are approved.

Frontend Developer (Implementation Agent)

Implements UI changes based on the approved design spec and feature plan. Creates components, pages, routes, hooks. Updates navigation. Follows the existing project's patterns and the API contract exactly.

Input: Feature plan + UI design spec + existing codebase patterns

Output: reports/feature-ui-implementation.md

Runs in parallel with: Backend Developer (when scope is full-stack)

Backend Developer (Implementation Agent)

Implements API endpoints, database schema changes, middleware. Uses parameterized SQL, proper auth, input validation. Follows the API contract from the feature plan exactly.

Input: Feature plan + existing codebase patterns

Output: reports/feature-backend-implementation.md

Also used for: Fixing issues found by Security Auditor and Quality Engineer in the dev-team loop

Security Auditor (Review Agent)

Scans every source file for OWASP Top 10 vulnerabilities: injection, broken auth, data exposure, XSS, access control, misconfiguration. Categorizes findings as CRITICAL, WARNING, or INFO.

Output: reports/security-audit.md

Runs in parallel with: Code Quality Engineer

Code Quality Engineer (Review Agent)

Scans for code smells, bugs, anti-patterns: type safety issues, error handling gaps, performance problems, dead code, API design issues. Same severity categorization.

Output: reports/quality-audit.md

Test Engineer (Testing Agent)

Writes unit tests for every fix. Covers edge cases and security regression scenarios. Runs all tests (existing + new) and iterates until 100% pass. Supports 9 stacks, including Java/JUnit, C#/xUnit, JS/Jest, Python/pytest, Go, Rust, PHP, and Ruby.

Output: reports/test-report.md or reports/unit-test-report.html

Code Analyst (Review Agent)

Reviews ONLY the git diff — not the entire codebase. Checks new code for SQL injection, XSS, command injection, hardcoded secrets, null refs, race conditions, missing auth. Auto-fixes critical issues.

Output: reports/code-analysis.md

Key distinction from Security Auditor: The Security Auditor scans the whole codebase. The Code Analyst reviews only what changed. One is for audits, the other is for pull request-style review.
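Scoping a review to the diff is mechanically simple. A sketch of how that scope can be computed (the helper names are illustrative, not part of the command files):

```python
import subprocess

def parse_name_only(diff_output):
    """Turn `git diff --name-only` output into a list of changed paths."""
    return [line for line in diff_output.splitlines() if line.strip()]

def changed_files(base="HEAD"):
    """Files touched relative to `base`: the Code Analyst's review scope."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return parse_name_only(result.stdout)
```

Everything outside that file list is out of scope for the Code Analyst, which is what makes pull-request-style review so much cheaper than a full audit.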

Documentation Lead (Report Agent)

Compiles all agent reports into a single, polished HTML report. Dark theme, collapsible sections, severity badges, code snippets, testing checklists. The report is the artifact that gets attached to the Jira ticket.

Input: All reports/*.md files from other agents

Output: reports/master-report.html or reports/feature-report.html or reports/jira-{KEY}-report.html

How Agents Communicate

Agents don't talk to each other directly. They communicate through three mechanisms:

1. Report Files (Primary)

Every agent writes its output to a file in reports/. The orchestrator reads these files and passes relevant content to downstream agents. For example:

  1. Security Auditor writes findings to reports/security-audit.md
  2. Quality Engineer writes findings to reports/quality-audit.md
  3. Orchestrator spawns Backend Developer, telling it: "Read the findings from both reports and fix everything"
  4. Backend Developer writes what it changed to reports/fixes-applied.md
  5. Orchestrator spawns Test Engineer, telling it: "Read the fixes and write tests for each one"

This is effectively a message-passing system where the messages are Markdown files. It's simple, inspectable (you can read the files), and robust (files don't disappear if an agent crashes).
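A minimal sketch of that handoff, with illustrative file and function names (the real orchestration happens in prompts, not Python):

```python
from pathlib import Path

def build_fix_prompt(reports_dir):
    """Collect audit findings from report files and inject them into the
    prompt for a downstream fixer agent (illustrative helper)."""
    sections = []
    for name in ("security-audit.md", "quality-audit.md"):
        path = Path(reports_dir) / name
        if path.exists():  # a missing report simply contributes nothing
            sections.append(f"## Findings from {name}\n{path.read_text()}")
    return (
        "You are the Backend Developer agent.\n"
        "Fix every issue below, Critical first, then Warning, then Info.\n\n"
        + "\n\n".join(sections)
    )
```

The files stay on disk afterward, so a human (or the Documentation Lead) can replay exactly what each agent saw.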

2. Prompt Injection (Secondary)

When the orchestrator spawns an agent, it includes all necessary context directly in the prompt. The agent receives: the Jira ticket details, the stack profile, the file paths, the design spec, and whatever else it needs. This is redundant with the file system, but it ensures the agent has everything without needing to read files first.

3. Tool Calls (External Systems)

Agents interact with external systems via tool calls: Jira REST API (via curl), Paper MCP tools, JAM MCP tools, Docker CLI, Git commands, Playwright. The orchestrator handles Jira authentication centrally (credentials are in .env), and agents inherit this through sourcing the same file.
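As a sketch, the Jira side reduces to authenticated REST calls built from .env values. The environment variable names here are assumptions; /rest/api/2/issue/{key} is the standard Jira REST endpoint:

```python
import base64
import json
import os
import urllib.request

def jira_request(method, path, body=None):
    """Build an authenticated Jira REST request from .env-style variables.
    JIRA_BASE_URL, JIRA_EMAIL, and JIRA_API_TOKEN are illustrative names;
    match them to whatever keys your .env actually defines."""
    token = base64.b64encode(
        f"{os.environ['JIRA_EMAIL']}:{os.environ['JIRA_API_TOKEN']}".encode()
    ).decode()
    return urllib.request.Request(
        os.environ["JIRA_BASE_URL"].rstrip("/") + path,
        method=method,
        data=json.dumps(body).encode() if body is not None else None,
        headers={"Authorization": f"Basic {token}",
                 "Content-Type": "application/json"},
    )

# Fetching a ticket is then a single call:
# with urllib.request.urlopen(jira_request("GET", "/rest/api/2/issue/FO-2847")) as r:
#     ticket = json.load(r)
```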

Communication Flow (Jira Pipeline Example)

```
User: /jira FO-2847
  │
  ▼
Orchestrator ──── curl ────▶ Jira REST API (fetch ticket)
  │                           │
  │ ◄── ticket JSON ──────────┘
  │
  ├── Detect JAM links? ──── mcp ────▶ JAM MCP (video analysis)
  │
  ├── Ask user for project paths (AskUserQuestion)
  │
  ├── UI changes? ──── spawn ────▶ UI Designer Agent ──── mcp ────▶ Paper MCP
  │                                   │
  │ ◄── reports/feature-ui-plan.md ───┘
  │
  ├── User approves designs? (AskUserQuestion)
  │
  ├── spawn (parallel) ──────▶ Frontend Dev Agent ──▶ reports/feature-ui-implementation.md
  │                    └─────▶ Backend Dev Agent  ──▶ reports/feature-backend-implementation.md
  │
  ├── spawn ──────────────────▶ Code Analyst ──▶ reports/code-analysis.md
  │
  ├── spawn ──────────────────▶ Doc Lead ──▶ reports/jira-FO-2847-report.html
  │
  ├── curl ────▶ Jira (upload report + screenshots + comment)
  │
  └── curl ────▶ Jira (transition status + log time)
```

The Command System: 28 Slash Commands

Each command is a Markdown file in .claude/commands/. Some are standalone (single agent, single task), others are orchestrators that spawn multiple agents across phases. Here's the full inventory:

Jira Pipeline (Project Management)

| Command | What It Does | Phases |
|---|---|---|
| /jira FO-2847 | Full ticket-to-resolution pipeline. Fetches the ticket, detects JAM recordings, classifies scope, asks for project paths, designs in Paper (if UI), implements, reviews code, generates a report, and updates Jira (comment + attachments + status + time). | 5 phases |
| /jira sprint | Batch mode. Fetches available Scrum teams dynamically, the user picks a team, fetches sprint tickets, the user selects which to process, then runs each through the full pipeline sequentially with continue/skip/stop controls. | 5 phases per ticket |
| /jira teams | Lists all available Scrum teams from Jira. Instant lookup, no processing. | Instant |
| /jam {url} | Analyzes JAM bug recordings via MCP: video analysis, console logs, network requests, user events. Accepts a URL, JAM ID, or Jira ticket key. | MCP-based |
| /tempo | Time logging: /tempo addTime FO-2847 2h "Bug fix". Also supports getTime, getWeek, getMonth for viewing logged time. | Instant |

Mandatory user interaction: project paths

The Jira pipeline always asks the user which project directories to work in. It never guesses. This was a deliberate decision after the AI incorrectly routed work to the wrong codebase. The current working directory (d:\Kunder\247\AIComp) is the orchestration project — it contains the commands and configs, not the actual code. The real codebases are at separate paths like D:\Kunder\247\Finago\control-backend-api.

Feature Development Pipeline (Design + Implementation)

| Command | What It Does | Agents Involved |
|---|---|---|
| /new-feature | 6-phase pipeline: plan the feature, capture existing UI screenshots, design in Paper (user approval gate), parallel frontend + backend implementation, code analysis, master report. Optional handoff to /full-pipeline for testing. | Orchestrator, UI Designer, Frontend Dev, Backend Dev, Code Analyst, Doc Lead |
| /code-analysis | Reviews only the git diff. Checks for security issues, logic errors, quality problems. Auto-fixes critical issues. | Code Analyst (single agent) |

/new-feature — Feature Development Pipeline

Phase 1: Feature Planning (the orchestrator does this directly)
- Ask for project paths: detect the stack (package.json, *.csproj, pom.xml), always ask the user for backend/frontend/database paths, and build a Stack Profile for all agents.
- Analyze the codebase: read project structure, routing, components, APIs, DB models; adapt to the detected stack.
- Draft the feature plan: summary, scope (UI/Backend/Full-stack), API contract, DB schema changes, UI components, risks. → reports/feature-plan.md
- User approval gate: present the plan. The user can Approve, Request changes (iterate), or Reject. No code is written until the plan is approved.
- Scope decision: Full-stack or UI → Phases 2-4; Backend-only → skip to Phase 4.

Gate: user approved the plan.

Phase 2: Visual Capture (UI changes only)
- Playwright screenshots: check whether the app is running (localhost:3000/5173/4200) and capture BEFORE screenshots of the relevant pages as a design reference. → reports/screenshots/*.png

Gate: screenshots captured.

Phase 3: UI Design in Paper (mandatory for any UI change)
- UI Designer agent: creates artboards in Paper.design via MCP, reading the feature plan, existing code patterns, and the screenshots; writes a design spec. → Paper artboards + reports/feature-ui-plan.md
- Design review gate: the user opens Paper to review. Accept (proceed), Change (re-launch the designer), or Reject (new direction). No code until approved.

Gate: user approved the designs.

Phase 4: Parallel Implementation (agents build simultaneously)
- Frontend Dev agent: components, pages, routes, hooks; follows the approved design and API contract. → reports/feature-ui-implementation.md
- Backend Dev agent: APIs, DB schema, middleware, validation; parameterized SQL, proper auth. → reports/feature-backend-implementation.md
- Code Analyst agent: reviews ONLY the git diff for SQL injection, XSS, secrets, null refs, race conditions; auto-fixes critical issues. → reports/code-analysis.md

Gate: implementation + code review complete.

Phase 5: Master Report
- Documentation Lead agent: compiles all reports (plan, design, frontend, backend, code analysis) into a polished HTML report with 8 sections. → reports/feature-report.html

Phase 6: Optional Testing (handoff to /full-pipeline)
- User decision: run the full testing pipeline (security, quality, unit tests, E2E)? If yes → /full-pipeline.

Feature development complete. Totals: 6 phases, 4 agents, 2 user gates, 5 reports.

Development Team Loop (Iterative Quality)

| Command | What It Does | Agents Involved |
|---|---|---|
| /dev-team | Iterative quality loop: parallel security + quality scan, fix all issues, write tests, re-scan. Repeats until zero new findings. Typically 2-3 rounds. | Security Auditor, Quality Engineer, Backend Developer, Test Engineer, Doc Lead |
| /security-audit | Standalone OWASP Top 10 scan. | Security Auditor |
| /quality-audit | Standalone code quality scan. | Quality Engineer |
| /fix-all | Fix all findings from audit reports. | Backend Developer |
| /test-all | Write tests for all fixes and run them. | Test Engineer |
| /master-report | Compile all reports into a master HTML report. | Documentation Lead |

Testing & Dependencies (Quality Assurance)

| Command | What It Does | Modes |
|---|---|---|
| /unit-test | Maps existing test coverage, identifies gaps, creates unit tests, runs and fixes them iteratively (up to 5 rounds). Supports 9 language stacks. | * (full scan), {file} (single file), --fix-ignored (rehabilitate disabled tests) |
| /playwright-test | Runs Playwright E2E browser tests. Analyzes failures, distinguishes app bugs from test bugs, writes missing tests, generates an HTML report. | Single mode |

Dependency Health (Security & Maintenance)

| Command | What It Does | Modes |
|---|---|---|
| /deps | Scans dependencies for CVEs (with exploitability assessment), outdated packages (staleness score), and license risks (GPL/copyleft detection). Produces a 0-100 health score with an A-F letter grade. Can auto-fix safe updates or export Dependabot/Renovate configs. | Full audit, --vuln-only, --outdated, --license |
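The score-to-grade step is the kind of decision criterion the command prompt spells out exactly. A plausible mapping (the cutoffs here are an assumption, not the command's actual thresholds):

```python
def letter_grade(score):
    """Map a 0-100 dependency health score to a letter grade A-F.
    Cutoffs are illustrative, not the /deps command's exact thresholds."""
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return grade
    return "F"
```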

Docker Pipeline DevOps

| Command | What It Does |
|---|---|
| /docker-build | Build the Docker image, validate it (size, non-root user, health check, no secrets), run a security scan. |
| /docker-deploy | Deploy with docker compose, wait for the health check, test endpoints, collect container info. |
| /docker-test | Full integration test suite against the live container. |
| /docker-teardown | Gracefully tear down containers and report freed resources. |
| /full-pipeline | All of the above: dev-team loop + Playwright + Docker build/deploy/test + master report. |

Universal Commands (Any Project)

| Command | What It Does |
|---|---|
| /create "description" | Context-aware feature creator. Reads ProjectType from .env, adapts its role (Game/App/SaaS Dev), designs in Paper, generates an HTML plan with mockup screenshots (auto-opens in the browser), implements, verifies with Playwright. |
| /create-project "description" | Full project creator from scratch. Asks clarifying questions, designs architecture + UI in Paper, generates an HTML plan, builds with the full agent team (backend + frontend + security + tests + Docker), delivers a running application. |
| /bug "description" | Context-aware bug fixer. Analyzes pasted screenshots, diagnoses the root cause, applies the minimum fix, verifies with Playwright, saves a timestamped report. |
| /verify | E2E verification with Playwright. Uses the project profile (.claude/project-profile.json) for login and navigation. Takes before/after screenshots and generates a self-contained HTML report with a clickable lightbox. Auto-checks profile completeness before running. |
| /changelog | Reads reports from .claude/unprocessed_reports/ (created by /create and /bug) and generates an HTML changelog with features (blue) and bug fixes (amber). Moves processed reports to prevent double-counting. |

Git Operations

| Command | What It Does |
|---|---|
| /git sync {branch} | Merge the latest from a branch into the current one. Pre-flight checks, fetch, merge. With --fix-merge-errors: AI-powered per-file conflict resolution. With --all: sync all projects. |
| /git status | Quick branch overview: ahead/behind, uncommitted changes, stashes, last commit. |

Repository Onboarding

| Command | What It Does |
|---|---|
| /repo-setup {url} | Clone, analyze the stack, install deps, configure env, build, test, start. Produces a 12-section HTML setup guide. Detects dependent repos and offers to set them up too. |
| /repo-setup {org_url} | Organization scan: fetches all repos via the GitHub API, maps relationships (depends-on, frontend-for, shared-library), calculates startup order, generates architecture documentation with CSS diagrams. |
| /repo-setup --auto-setup | Auto-clone, install, build, and test all repos without prompting. Compatible with the --search filter. |

/full-pipeline — End-to-End Delivery Pipeline

Phase 1: Code Quality Loop (iterative: scan → fix → test → verify until zero findings)
- Security Auditor agent: OWASP Top 10 scan (injection, XSS, broken auth, data exposure, misconfig). → reports/security-audit.md
- Quality Engineer agent: code smells, bugs, anti-patterns (type safety, error handling, dead code). → reports/quality-audit.md
- Backend Developer agent: reads both audit reports and fixes ALL issues by severity: Critical → Warning → Info. → reports/fixes-applied.md
- Test Engineer agent: writes unit tests for every fix (edge cases + security regression tests), runs all tests, and iterates until 100% pass. → reports/test-report.md
- Regression check: re-run the Security + Quality scans on the modified code. New issues → loop back; zero findings → proceed.

Gate: zero findings + all unit tests pass.

Phase 2: Playwright E2E Testing (browser-based end-to-end tests)
- Playwright E2E Engineer agent: full test suite covering auth flows, CRUD, user management, orders, and security (XSS, auth bypass, data leaks). Distinguishes app bugs from test bugs and writes missing tests for coverage gaps. → reports/playwright-report/ + reports/playwright-test-report.md
- Failure triage: app bug → fix the source, re-run unit + E2E; test bug → fix the test, re-run E2E; all pass → proceed.

Gate: all E2E tests pass.

Phase 3: Docker Pipeline (build, deploy, verify, integration test)
- Docker Build agent: docker compose build; validate image size, non-root user, health check, no secrets in layers. → reports/docker-build-report.md
- Docker Deploy agent: docker compose up -d; health check, test endpoints, collect container ID, IP, ports, network. → reports/docker-deploy-report.md
- Integration Tests agent: full test suite against the live container (auth, CRUD, security, edge cases); records every request/response. → reports/docker-integration-test-report.md
- Pass/fail: fail → fix, rebuild, redeploy, retest; all pass → the container stays running.

Gate: integration tests pass + container healthy.

Phase 4: Master Report
- Documentation Lead agent: compiles ALL agent reports into a single master HTML report: security audit, code quality, fixes, tests, Docker build, deployment info, integration tests, recommendations. → reports/master-report.html

Pipeline complete: 42 issues fixed, 187 unit tests, 20 E2E tests, 100% pass rate, 9+ agents.

External Integrations

| System | Protocol | What It's Used For |
|---|---|---|
| Jira | REST API (curl) | Fetch tickets, post comments, upload attachments, transition status, log time. Credentials in .env. |
| JAM (jam.dev) | MCP Server | Analyze bug recordings: video analysis, console logs, network requests, user events, screenshots. Auto-detected in Jira ticket descriptions and comments. |
| Paper (paper.design) | MCP Server | Create UI mockups: artboards, HTML/CSS designs, screenshots, JSX export. Used in both /jira (for UI-related tickets) and /new-feature. |
| Docker | CLI | Build images, deploy containers, run integration tests, teardown. |
| Playwright | CLI (npx) | Browser-based E2E testing; screenshot capture for before/after comparisons. |
| Git | CLI | Diff detection for code analysis; branch management. |

Key Design Decisions (and Why)

1. File-based communication, not memory

Decision: Agents communicate through reports/*.md files, not through shared memory or context.

Why: Sub-agents in Claude Code start with a fresh context. They have no memory of the parent or other agents. Files are the natural handoff mechanism — they're persistent, inspectable, and don't depend on context window management. An agent that writes to reports/security-audit.md produces an artifact that any other agent (or human) can read.

2. Always ask the user for project paths

Decision: The pipeline never auto-detects which codebase to work in. It always asks.

Why: The AI incorrectly assumed a ticket belonged to gateway-backend when it was actually control-backend-api. In a multi-project environment, the cost of working in the wrong codebase is catastrophic — you're modifying the wrong code. Asking takes 5 seconds. Fixing a wrong-codebase mistake takes much longer.

3. User approval gates before implementation

Decision: Two mandatory approval points: (1) the feature plan, and (2) the UI design. No code is written until both are approved.

Why: Agent work is cheap to redo at the planning stage but expensive at the implementation stage. A wrong plan means multiple agents building the wrong thing. Catching misunderstandings at the plan/design stage saves massive amounts of token spend and time.

4. Parallel agents where possible, sequential where necessary

Decision: Security + Quality scans run in parallel. Frontend + Backend implementation run in parallel. But design must complete before implementation, and implementation must complete before code review.

Why: True parallelism saves time (two agents working simultaneously is faster than two agents working sequentially). But dependencies must be respected: you can't implement a UI that hasn't been designed, and you can't review code that hasn't been written.

5. Stack-agnostic commands with runtime detection

Decision: Commands like /unit-test and /deps detect the project's tech stack at runtime and adapt their behavior. The same command works for Java/Maven, C#/.NET, JavaScript/npm, Python/pip, Go, Rust, PHP, and Ruby.

Why: We work across multiple stacks (Java backend, React frontend, .NET APIs). Having separate commands per stack would be unmaintainable. Runtime detection lets one command serve all projects.
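Runtime detection boils down to checking for marker files. A minimal sketch (the marker table is illustrative and incomplete; the real commands cover more stacks and tooling details):

```python
from pathlib import Path

# Marker file -> stack label (illustrative subset)
MARKERS = {
    "package.json": "JavaScript/npm",
    "pom.xml": "Java/Maven",
    "requirements.txt": "Python/pip",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
    "composer.json": "PHP",
    "Gemfile": "Ruby",
}

def detect_stack(project_dir):
    """Return the stacks detected in a project directory.
    *.csproj (C#/.NET) needs a glob rather than an exact filename."""
    root = Path(project_dir)
    stacks = [stack for marker, stack in MARKERS.items()
              if (root / marker).exists()]
    if any(root.glob("*.csproj")):
        stacks.append("C#/.NET")
    return stacks
```

The detected stack then shapes the agent prompt: a Java/Maven hit pulls JUnit conventions into the Test Engineer's instructions instead of Jest.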

6. Mandatory reports for every pipeline

Decision: Every pipeline generates an HTML report, regardless of whether the user asked for one.

Why: Reports serve three purposes: (1) they're the artifact that gets uploaded to Jira, (2) they're the communication mechanism between agents, and (3) they're the audit trail. If something goes wrong, the report tells you what each agent did and found.

Patterns That Work Well

The Iterative Loop Pattern

Used in /dev-team and /unit-test. Scan, fix, verify, re-scan. Continue until zero findings. This is the most reliable pattern because it's self-correcting: if a fix introduces a new issue, the next scan catches it.
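The control flow is a bounded fixed-point loop. Sketched in Python, with scan and fix standing in for agent spawns (the harness is illustrative):

```python
def quality_loop(scan, fix, max_rounds=5):
    """Scan -> fix -> re-scan until a scan comes back clean.
    `scan` returns a list of findings; `fix` consumes them.
    Returns the number of rounds used; the cap bounds token spend."""
    for round_no in range(1, max_rounds + 1):
        findings = scan()
        if not findings:
            return round_no  # clean scan: done
        fix(findings)
    return max_rounds  # cap reached; surface remaining findings to the user
```

The cap matters as much as the loop: without it, an unfixable finding would burn tokens forever.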

The Scope-Based Branching Pattern

Used in /jira and /new-feature. Classify the work as UI-only, backend-only, or full-stack, then branch the pipeline accordingly. UI-only skips backend agents. Backend-only skips design and frontend. Full-stack runs everything.

The Parallel-Then-Merge Pattern

Launch two or more agents simultaneously, wait for all to complete, then merge their outputs into the next step. Used for Security + Quality scanning, and for Frontend + Backend implementation.

The User Gate Pattern

Present results to the user with explicit options (Accept / Change / Reject). Block pipeline progression until the user responds. Used for plan approval, design review, and Jira update decisions.

The Project Name Resolution Pattern

Commands accept either a full path or a project name (e.g., control-backend-api). The system resolves names to paths based on known project directories. If resolution fails, it asks the user. This makes commands feel like CLI tools — short, memorable invocations.
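A sketch of the resolution rule (project_roots stands in for the configured project directories; the function name is illustrative):

```python
from pathlib import Path

def resolve_project(arg, project_roots):
    """Resolve a command argument to a project path.
    Full existing path -> use as-is; bare name -> search known roots;
    no match -> None, meaning: ask the user."""
    p = Path(arg)
    if p.is_absolute() and p.exists():
        return p
    for root in project_roots:
        candidate = Path(root) / arg
        if candidate.exists():
            return candidate
    return None  # unresolved: orchestrator falls back to asking the user
```

Returning None instead of guessing is the point: resolution failure is handed to the user, never papered over.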

Patterns That Don't Work

Letting agents guess project paths

This caused the AI to work in the wrong codebase. Never again. Always ask.

One giant agent that does everything

A single agent trying to scan, fix, test, and report in one go produces worse results than specialized agents. The context window gets polluted with too many concerns, and the agent loses focus. Specialization works.

Skipping the design phase "because it's a small change"

Even a one-line color change benefits from a Paper mockup. The user seeing the change before it's implemented catches misunderstandings that are trivial to fix at design time and expensive to fix after implementation.

Running all tests in one shot without iterating

Newly created tests frequently fail on the first run due to mock misconfiguration, wrong assertions, or compilation errors. The fix loop (run, analyze failures, fix, re-run, up to 5 iterations) is essential. Without it, you'd hand the user a pile of broken tests.

Token Efficiency: What Actually Matters

Agent orchestration is expensive in tokens. Every agent spawn is a fresh context. Every tool call costs tokens. Every report file that gets read costs tokens. Here's what we've learned about managing costs:

| Strategy | Impact | Example |
|---|---|---|
| Scope-based branching | High | Backend-only tickets skip UI Designer + Frontend Dev + Paper MCP — saves ~40% of tokens. |
| Parallel execution | Medium (time) | Security + Quality scans run simultaneously — wall-clock time cut in half, same token cost. |
| Targeted code review | High | The Code Analyst reviews only the git diff, not the entire codebase — 10x fewer tokens than a full scan. |
| Stack detection | Medium | Detecting "Java + Maven" means the agent prompt includes JUnit patterns, not Jest. No wasted exploration. |
| Fix loop cap | Safety | The unit test fix loop caps at 5 iterations, preventing infinite token burn on unfixable tests. |

Lessons Learned

1. The prompt IS the product

In a traditional software project, you write code and the code runs. In this system, you write prompts and the prompts run. The quality of the output is directly proportional to the quality of the prompt. A vague prompt produces vague results. A prompt that specifies exact file paths, exact output formats, and exact decision criteria produces reliable, reproducible results. We treat our .claude/commands/ Markdown files with the same rigor as production code.

2. Agents need more structure than humans

A human developer can be told "fix the bug" and figure out the rest. An agent needs: which project directory, what language, what test framework, what files to read first, what patterns to follow, where to write the output, and what to do when it encounters an edge case. The more structure you provide, the better the output.

3. File-based communication is surprisingly robust

We initially worried about agents writing to the wrong files, overwriting each other's output, or producing incompatible formats. In practice, this almost never happens. Agents are good at following naming conventions when you tell them explicitly what to write and where.

4. User gates save more time than they cost

Every user approval point adds 30-60 seconds of human time. But each gate prevents 5-15 minutes of wasted agent work when the plan or design is wrong. The math is clear: always gate before expensive operations.

5. The iterative loop is the most powerful pattern

Single-pass pipelines (scan once, fix once, done) miss things. The loop pattern (scan, fix, re-scan, repeat until clean) catches issues introduced by fixes, previously masked issues, and interaction effects between changes. It consistently produces cleaner output than single-pass.

6. JAM MCP integration changed bug analysis

Before JAM integration, bug tickets had text descriptions and maybe a screenshot. With JAM MCP, the agent can analyze the actual video recording: see what the user clicked, read the console errors, check the network requests. It turns a vague bug report into a structured, actionable analysis. The auto-detection of JAM links in Jira tickets means this happens automatically — no extra steps.

What's Next

| Priority | Feature | Why |
|---|---|---|
| High | /pr — Smart PR Creator | Closes the loop: Jira ticket → implement → PR → Jira update. Currently the pipeline stops before creating a PR; this is the missing piece. |
| Medium | Git integration in /jira | Auto-create feature branches per ticket, auto-commit with the ticket reference, link PRs to Jira. |
| Medium | Slack notifications | Post to a team channel when a ticket is resolved, with a link to the report and key stats. |
| Low | CI/CD integration | Trigger builds after implementation, verify deployment, link build status to Jira. |
| Low | Sprint retrospective (/retro) | Analyze the completed sprint: tickets resolved, time logged, code quality metrics, churn areas. Auto-generate a retro report. |

Full Command Reference

| Command | Category | Agents | Output |
|---|---|---|---|
| /jira {key} | Project Mgmt | Up to 6 | reports/jira-{KEY}-report.html |
| /jira sprint | Project Mgmt | Up to 6 per ticket | Sequential processing |
| /jira teams | Project Mgmt | 0 | Team list (terminal) |
| /jam {url} | Bug Analysis | 0 (MCP) | JAM analysis (terminal) |
| /tempo | Time Tracking | 0 | Jira worklog |
| /new-feature | Feature Dev | 4-5 | reports/feature-report.html |
| /dev-team | Code Quality | 5 (iterative) | reports/master-report.html |
| /full-pipeline | End-to-End | 9+ | reports/master-report.html + Docker |
| /unit-test | Testing | 1-3 | reports/unit-test-report.html |
| /playwright-test | Testing | 1 | reports/playwright-report/ |
| /deps | Security | 1-2 | reports/deps-audit-report.html |
| /security-audit | Security | 1 | reports/security-audit.md |
| /quality-audit | Quality | 1 | reports/quality-audit.md |
| /fix-all | Implementation | 1 | reports/fixes-applied.md |
| /test-all | Testing | 1 | reports/test-report.md |
| /code-analysis | Review | 1 | reports/code-analysis.md |
| /master-report | Documentation | 1 | reports/master-report.html |
| /docker-build | DevOps | 1 | reports/docker-build-report.md |
| /docker-deploy | DevOps | 1 | reports/docker-deploy-report.md |
| /docker-test | DevOps | 1 | reports/docker-integration-test-report.md |
| /docker-teardown | DevOps | 1 | reports/docker-teardown-report.md |
| /create "desc" | Universal | 0-1 | reports/feature-plan.html + .claude/unprocessed_reports/ |
| /create-project "desc" | Universal | 5+ | reports/project-plan.html + reports/project-delivery-report.html |
| /bug "desc" | Universal | 0-1 | .claude/unprocessed_reports/ |
| /verify | Testing | 0 | reports/verification-report.html |
| /changelog | Documentation | 0 | reports/changelog.html |
| /git sync {branch} | Git Ops | 0-1 | reports/git-sync-report.md |
| /repo-setup {url} | Onboarding | 1-3 | reports/repo-setup-guide.html |
| /report | Reporting | 0 | reports/change-report.html |
| /impact-scan "desc" | Analysis | 0 | reports/impact-scan-report.html |

Reproducing this setup

Everything described in this document runs on Claude Code (Opus 4.6) with no custom infrastructure. The entire system is 28 Markdown files in .claude/commands/, an .env file with Jira credentials, and two MCP server connections (JAM + Paper). There is no server, no database, no deployment pipeline. If you have Claude Code, you can copy the Markdown files and have the same team.