The Playbook was built for a very specific person: me, alone, building apps I own. I control the design. I control the scope. I decide when to ship. The only stakeholder who sends me annotated screenshots at 10am is also me.
Then I used it on a client project — rebuilding a marketing agency's landing page from Adobe XD mockups — and discovered exactly which parts hold up and which parts quietly fall apart when someone else is making the decisions.
## The Engagement
A Utah-based digital marketing agency needed their consultation landing page rebuilt to match a new XD design. WordPress theme (FoundationPress), custom CSS, jQuery. Their dev team was starting to use AI tools and wanted help leveling up.
Two deliverables: the landing page itself, and an AI development package (CLAUDE.md, custom skills, adapted process) their team could use on every future project.
The first deliverable was a project. The second was a product. I didn't realize the second one would become the more interesting piece until I was halfway through the first.
## Six Phases Is Too Many
The full Playbook has six phases: Concept, Pre-Production, Production, Testing & QA, Launch, Post-Launch. For a two-week client landing page, that's like putting on a tuxedo to walk to the mailbox.
I compressed to four:
| Full Playbook | Agency Version |
|---|---|
| Concept | Discovery (client goals, not my ideas) |
| Pre-Production | Design & Setup (extract tokens, scaffold, get approval) |
| Production + Testing | Build (sprints with visual QA baked in) |
| Launch | Launch (preflight, deploy, handoff docs) |
Testing didn't earn its own phase because the sections are visual, not functional. There's no API to integration-test. The QA is: does this match the design? That happens after every section build, not as a separate gate.
Post-Launch became an optional Phase 4 (Optimize) for retainer clients. Most agency projects end at launch and never get touched again. Sad but true.
## The Gate That Actually Matters
In the standard Playbook, phase gates are self-directed. "Am I confident the architecture is solid?" I ask. I answer. I move on.
Agency gates are different. The client decides. And the one that matters most — "has the client approved the design before you write code?" — is the one teams most want to skip.
We didn't skip it, but I watched how close it came to happening. The design exists. The developer is eager. The client is busy. The temptation to "just start building, they'll review it later" is enormous. But building against an unapproved design is building against a target that might move. The Phase 1 gate exists to prevent exactly this.
## Five Rounds of Screenshots
Here's the part that nearly broke me.
The XD design existed as static artboards. To verify the build matched the design, the client's team took screenshots of the live page, placed them next to the XD frames, annotated the differences in red, and sent them over. Five times.
- Round 1: Mobile header is black (should be white). Content order is wrong. Form is in the wrong place.
- Round 2: "Start" should be white, not yellow. Header button is empty.
- Round 3: Step cards need different sizing. Vertical line too thin. Category text too light.
- Round 4: Comparison table scrolls horizontally. Testimonial text clips when you swipe.
- Round 5: Desktop timeline not centered. Card backgrounds wrong shade of gray.
Every round, the issues were CSS values that could have been caught by comparing the build to the design at two breakpoints. The problem wasn't the code — it was the information pipeline. Annotated screenshots are a lossy format. They miss exact pixel values, font weights, border radii, and subtle color differences. The developer squints at a screenshot trying to tell if a background is #141414 or #2A2A2A.
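To make the #141414-versus-#2A2A2A problem concrete, here's a minimal stdlib-only Python sketch (the helper names are mine, not from the project) showing that the two grays differ by a fixed, machine-checkable amount even though they're nearly indistinguishable in a compressed screenshot:

```python
def hex_to_rgb(hex_color: str) -> tuple[int, int, int]:
    """Parse a #RRGGBB string into an (r, g, b) tuple."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in range(0, 6, 2))

def channel_delta(a: str, b: str) -> int:
    """Largest per-channel difference between two hex colors."""
    return max(abs(x - y) for x, y in zip(hex_to_rgb(a), hex_to_rgb(b)))

# The design token says #141414; the screenshot squint says maybe #2A2A2A.
# As data, they differ by 0x2A - 0x14 = 22 on every channel.
print(channel_delta("#141414", "#2A2A2A"))  # → 22
```

A structured token source (a design file read directly, or a recorded token list) hands the developer the left-hand value; a screenshot forces them to guess the right-hand one.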
Figma's MCP server would have eliminated most of this. Claude Code reads the design file directly — structured layout data, exact token values, component hierarchy. The AI generates code using the actual design values instead of human-transcribed approximations. I've added this as a recommendation (not a requirement) in the agency adaptation, because the design team needs to buy in separately from the dev team.
## What the Playbook Got Right
Tier 1.5 visual QA. The Playbook's three-tier testing model (build checks → visual smoke tests → full test suite) maps perfectly to agency work. The middle tier — open the page at mobile and desktop widths, compare to the design, fix mismatches — is the single most effective quality gate. If I'd been more disciplined about running it after every section instead of batching sections, rounds 1-3 of client feedback would have been one round.
Spec-as-context. The Playbook's emphasis on making specs consumable by AI carried over directly. The project CLAUDE.md with design tokens, CSS prefix, and architecture decisions meant every Claude Code session produced consistent output. Without it, every component would've used different colors and spacing.
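The project's actual CLAUDE.md isn't reproduced here, but a sketch of its shape looks something like this (the token values, breakpoints, and CSS prefix below are illustrative inventions, not the client's real ones; the stack details are from the engagement):

```markdown
# CLAUDE.md — project context (sketch; specific values are illustrative)

## Stack
- WordPress theme: FoundationPress
- JS: jQuery (no build step)

## Design tokens
- Dark section background: #141414  <!-- from the design file, not a screenshot guess -->
- Breakpoints: 375px mobile, 1280px desktop  <!-- hypothetical values -->

## Conventions
- Prefix all custom CSS classes with `lp-`  <!-- hypothetical prefix -->
- Build one section at a time; compare against the design at both breakpoints before moving on
```

The point is less the specific values than that they're written down once, so every session pulls from the same source instead of re-deriving them.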
Always shippable. Building section by section instead of "it'll come together at the end" meant the page looked presentable at every stage. When the client checked in mid-build, they saw a real page, not a construction zone.
## What I Changed Afterward
I added three things to the Playbook:
- Agency adaptation guide (`docs/adaptations/agency-consulting.md`) — the compressed 4-phase process with client-specific gates, scope management guidance, and a real engagement as a reference case study.
- Design-to-code pipeline reference (`phase-1-pre-production/references/design-to-code-pipeline.md`) — covering both the Figma MCP workflow and the XD fallback, with a comparison table showing why structured data beats screenshots.
- Agency transitions in `TRANSITIONS.md` — compressed phase gates with the client approval checkpoints and the specific risks of skipping them ("launching without explicit sign-off" is the agency equivalent of "shipping without tests").
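The compressed gates read like short checklists. A sketch of what the client-approval gate might look like — the wording is illustrative, not quoted from TRANSITIONS.md:

```markdown
## Gate: Design & Setup → Build (illustrative sketch)
- [ ] Client has explicitly approved the design (written sign-off, not "looks fine on the call")
- [ ] Design tokens extracted and recorded in the project CLAUDE.md
- [ ] Build scope matches the approved artboards — nothing added "while we're in there"

Risk of skipping: building against a target that might move.
```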
The Playbook was always meant to be adapted. The README says "adapt as needed." But "adapt as needed" without an example is just a permission slip. Now there's an actual adaptation, tested on a real engagement, with specific guidance on what to compress and what to keep.
## Lesson
A methodology isn't proven until it works on a project type it wasn't designed for. That's when you find out which parts are principles and which parts are assumptions.