Our homeowners product was about to launch with dozens of design-to-code gaps. I recorded a design review with my UI designer, then built a tool that paired our Figma designs with the live review app — and had AI analyze every screen to find inconsistencies we’d missed, identify systemic patterns, and generate fix-ready documentation that engineering could self-serve.
Obie’s homeowners insurance quoting system was weeks from launch. I’d designed a 9-step quote flow, and engineering had built it. But as implementation progressed, gaps appeared — wrong typography, missing component states, inconsistent spacing, colors that didn’t match the design system.
This is normal. Every product has a gap between Figma and production. What wasn’t normal was our process for closing it.
Engineers weren’t ignoring design fidelity — they didn’t have a good way to see it. The information was scattered across Figma comments, Slack threads, and Jira tickets. No one could answer the basic question: “How many issues are there, and how close are we to done?”
I realized this wasn’t a design handoff problem. It was a design operations problem. And the solution wasn’t more tickets — it was a tool.
Using Claude Code for AI-assisted development, I built a production-grade QA tool in a single day. Not a prototype. Not a slide deck. A live, deployed application that the team used daily through launch.
Figma design embedded side-by-side with the live review app for each of the 9 quote flow steps. No more switching between tabs.
Every gap documented with type (Text, Missing, Different, Extra), severity (High/Medium/Low), and fix tier (CSS-only, Component, Needs Dev).
Each issue includes “What to change,” “Where to look,” and a copy-paste Claude Code prompt so engineers can fix issues without a design handoff meeting.
Four states per issue (Open, In Progress, Done, Dismissed) with localStorage persistence. A progress bar shows resolution rate in real time.
Arrow keys and number keys for fast step switching. Built for engineers who live in the terminal.
Design review session notes embedded in the tool, plus 6 global themes (modal standardization, typography scale, component states) that cut across individual issues.
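The issue model, status tracking, and progress bar described above can be sketched roughly as follows. Everything here is an illustrative assumption, not the tool's actual code: the `Issue` shape, the `KVStore` interface, and the function names are hypothetical. The storage interface abstracts over `window.localStorage` so the same logic works in the browser or against an in-memory stub.

```typescript
// Hypothetical sketch of the QA tool's issue model and progress tracking.

type IssueType = "Text" | "Missing" | "Different" | "Extra";
type Severity = "High" | "Medium" | "Low";
type Tier = "CSS-only" | "Component" | "Needs Dev";
type Status = "Open" | "In Progress" | "Done" | "Dismissed";

interface Issue {
  id: string;
  step: number; // which of the 9 quote-flow steps the issue belongs to
  type: IssueType;
  severity: Severity;
  tier: Tier;
  status: Status;
}

// Resolution rate for the progress bar: Done and Dismissed both count
// as resolved, since neither needs further engineering attention.
function resolutionRate(issues: Issue[]): number {
  if (issues.length === 0) return 0;
  const resolved = issues.filter(
    (i) => i.status === "Done" || i.status === "Dismissed"
  ).length;
  return resolved / issues.length;
}

// Minimal storage interface matching the subset of window.localStorage
// the sketch needs, so persistence logic can run outside a browser.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "qa-issue-statuses";

// Persist only the per-issue statuses, keyed by issue id.
function saveStatuses(issues: Issue[], store: KVStore): void {
  const statuses: Record<string, Status> = {};
  for (const i of issues) statuses[i.id] = i.status;
  store.setItem(STORAGE_KEY, JSON.stringify(statuses));
}

// Re-apply saved statuses on load; issues without a saved entry keep
// whatever status they were initialized with.
function loadStatuses(issues: Issue[], store: KVStore): Issue[] {
  const raw = store.getItem(STORAGE_KEY);
  if (!raw) return issues;
  const statuses: Record<string, Status> = JSON.parse(raw);
  return issues.map((i) => ({ ...i, status: statuses[i.id] ?? i.status }));
}
```

In the browser, `window.localStorage` satisfies the `KVStore` interface directly, which is what gives the tool persistence across page reloads without a backend.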
The standard process would be filing each issue as a separate 1-point Jira ticket. That’s 52 tickets — each needing screenshots, Figma links, implementation guidance, grooming, assignment, and back-and-forth comments. At ~15–20 minutes of overhead per ticket (writing, triaging, context-switching), that’s 13–17 hours of ticket management alone — scattered across sprints — before a single pixel changes.
One day of building replaced all of it. No tickets. No grooming. No “which Figma frame?” comments. A single front-end engineer resolved all 52 issues between planned tickets over the course of one sprint, without pulling time from committed work. The tool made each fix so self-contained that an engineer could pick one up in a 15-minute gap, paste the code suggestion, and move on.
Not all design gaps are equal. A wrong hex color is a 30-second CSS fix. A missing component state might require new logic. Treating them the same wastes engineering time on triage.
Every issue got two dimensions: severity (High/Medium/Low — how visible is this to users?) and tier (CSS-only/Component/Needs Dev — how much engineering effort?). Engineers could filter to “show me all High-severity CSS-only issues” and knock out the highest-impact, lowest-effort work first.
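That two-dimension filter is simple to express in code. A minimal sketch, assuming a hypothetical `Issue` shape (the tool's real field names are unknown):

```typescript
// Hypothetical sketch of the severity/tier filter described above.

type Severity = "High" | "Medium" | "Low";
type Tier = "CSS-only" | "Component" | "Needs Dev";

interface Issue {
  id: string;
  severity: Severity;
  tier: Tier;
}

// Return issues matching any of the selected severities AND any of the
// selected tiers. An empty selection for a dimension means "no filter
// on that dimension", so the default call returns everything.
function filterIssues(
  issues: Issue[],
  severities: Severity[] = [],
  tiers: Tier[] = []
): Issue[] {
  return issues.filter(
    (i) =>
      (severities.length === 0 || severities.indexOf(i.severity) >= 0) &&
      (tiers.length === 0 || tiers.indexOf(i.tier) >= 0)
  );
}
```

The “highest-impact, lowest-effort first” view is then just `filterIssues(all, ["High"], ["CSS-only"])`.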
13 of the 52 issues were High severity. 11 were CSS-only. Engineers cleared all CSS-only issues in a single sprint, getting visible polish wins out the door fast while planning the deeper component work.
Traditional design QA hands over screenshots with annotations: “this should be 16px not 20px.” The engineer then has to find the right file, the right component, the right CSS rule. That translation step is where fixes stall.
Each issue card includes a collapsible code section with: what specifically to change, where in the codebase to look, and a paste-ready prompt for AI-assisted fixes. I didn’t just identify problems — I reduced the cost of fixing them to near zero.
I’m not an engineer. I don’t write production application code. But I’ve integrated AI-assisted development into my practice using Claude Code — and for this project, the AI wasn’t just the builder. It was the analyst.
My UI designer Pablo and I walked through the live review app screen by screen, talking through every element that felt off. We recorded the session so nothing was lost — our observations, Pablo’s gut reactions, specific callouts on spacing, color, and component misuse.
Using Claude Code, I built an interactive walkthrough that embedded the Figma design file and the Heroku review app side by side for each step of the quote flow. This wasn’t a static comparison — it was a live, navigable audit environment.
I had Claude Code analyze the Figma designs against the review app screen by screen. It identified inconsistencies that Pablo and I hadn’t caught — subtle spacing differences, wrong font weights, missing hover states, component variants that didn’t match the design system. Then I fed in our recorded review notes, and the AI applied our human judgment to rank and contextualize its findings.
The most powerful step: I had the AI analyze which inconsistencies were systemic rather than one-off. It identified recurring themes — like sizing and alignment being off across multiple screens because engineering had miscoded the typography scale. Instead of filing the same fix 15 times, we identified 6 global patterns where a single code change would cascade across the entire flow.
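The systemic-pattern pass can be approximated as a grouping step: tag each issue with a theme, then surface themes that recur across multiple screens. The sketch below is entirely hypothetical — `ThemedIssue`, `systemicThemes`, and the screen-count threshold are assumptions for illustration, not the actual analysis the AI performed.

```typescript
// Hypothetical sketch: separate systemic themes from one-off issues
// by counting how many distinct screens each theme appears on.

interface ThemedIssue {
  id: string;
  step: number;  // screen (quote-flow step) where the issue appears
  theme: string; // e.g. "typography-scale", "modal-standardization"
}

// Themes seen on more than `minScreens` distinct screens are candidates
// for a single global fix rather than one ticket per occurrence.
function systemicThemes(issues: ThemedIssue[], minScreens = 2): string[] {
  const screensByTheme: Record<string, Record<number, true>> = {};
  for (const i of issues) {
    if (!screensByTheme[i.theme]) screensByTheme[i.theme] = {};
    screensByTheme[i.theme][i.step] = true;
  }
  return Object.keys(screensByTheme).filter(
    (theme) => Object.keys(screensByTheme[theme]).length > minScreens
  );
}
```

A miscoded typography scale showing up on most of the 9 steps would surface here as one theme — one code change — instead of 15 near-identical tickets.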
No meeting. No handoff deck. Engineers opened the tool, filtered by severity and tier, and started fixing. By the next morning, issues were already being resolved.
This wasn’t “AI built a thing for me.” This was a designer orchestrating AI as an analytical tool — combining human judgment (Pablo’s eye, my design review notes) with machine analysis (screen-by-screen comparison, pattern recognition) to produce something neither could do alone. The AI found issues we missed. We provided the context to rank them. The tool made the fixes self-serve.
More UI fidelity work shipped pre-launch than any previous Obie release. The tool didn’t just track issues — it changed the team’s relationship with design quality.
The clearest signal: engineering proactively asked me to build one for the next feature release. Design QA went from an afterthought that happened in the last week — if there was time — to something the team expected and planned around.
52 issues that would normally be 52 individual 1-point Jira tickets, each requiring grooming, assignment, and context-switching. Instead: one front-end engineer fixed every issue between planned tickets in a single sprint, with zero disruption to the roadmap. The tool cost one day to build. The result: a product shipped with zero known design gaps, and a repeatable process that engineering now requests by default.
As the sole UX designer supporting four product teams, I can’t be in every room, on every PR, at every standup. The highest-leverage thing I can do isn’t design more screens — it’s build systems and tools that carry my design intent forward without me in the room.
The QA walkthrough is one example. The design system is another. The pattern is the same: invest upfront in the thing that scales, so the quality compounds instead of eroding.
The AI analysis wasn’t perfect. It flagged issues that weren’t actually problems — false positives from dynamic rendering states and responsive breakpoints that looked different from Figma but were intentional. I manually reviewed every finding before sharing with engineering, which added time but was essential for credibility. If the tool had shipped unvetted AI output, the team would have stopped trusting it after the first false alarm.
The first version was also tightly coupled to this specific feature — hardcoded steps, hardcoded Figma embeds. For the next release, I’m building a more modular version that any designer could populate without rebuilding the app from scratch. The goal is to make this a repeatable part of our release process, not a one-time heroic effort.