The Harness Issue

I've never spoken or read the word "harness" as much as I have in the last few months. I don't think that's going to change anytime soon. As models converge at the top, it is the harnesses that differentiate them.

Coding harnesses like Claude Code and Codex have brought the power of agentic engineering to the masses, supplying you with skills for subagents, code reviews, and more out of the box.

After playing around with Figma Make and Google Stitch in recent weeks, I don't think design harnesses are quite there yet. Both tools are still essentially coding harnesses under the hood, Figma Make even more so. At least Stitch gives you designs in a traditional artboard format.

Code is more deterministic. We can put tests and automations in place. Design is a bit more nuanced. We can follow basic rules like CRAP (contrast, repetition, alignment, and proximity) but that's harder for a model to judge than checking whether a page loads in under a second.
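To make the testable-versus-judgment split concrete, here is a minimal sketch of the one CRAP letter that does reduce to arithmetic: contrast. It assumes the WCAG 2.x relative luminance and contrast ratio formulas and the 4.5:1 AA threshold for normal text. The function names are my own, made up for illustration. Repetition, alignment, and proximity have no comparable one-line formula, which is the harder part.

```typescript
// Colors as [red, green, blue], each channel 0-255.
type RGB = [number, number, number];

// Relative luminance per the WCAG 2.x definition (sRGB channels linearized first).
function relativeLuminance([r, g, b]: RGB): number {
  const linear = (channel: number): number => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b);
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(a: RGB, b: RGB): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Hypothetical helper: true when the pair meets the 4.5:1 AA threshold for normal text.
function passesAA(text: RGB, background: RGB): boolean {
  return contrastRatio(text, background) >= 4.5;
}

const blackOnWhite = contrastRatio([0, 0, 0], [255, 255, 255]);
const lightGrayOnWhiteOk = passesAA([200, 200, 200], [255, 255, 255]);
```

A check like this can sit in a test suite next to the page-load assertion. It says nothing about whether the layout actually looks good.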

My design skills are a bit rusty. It's been a minute since my graphic design internship in NYC in the 2010s. Ok, it was 2010. At this moment I just don't see the design harnesses being able to start from scratch and keep design elements consistent across a project.

Of course, last November most developers were still reading every line of code their agents wrote (or at least claimed they were). It'll be interesting to see how design harnesses catch up to their coding brethren in the coming months.

📖 Read
Agent Harness Engineering
Addy Osmani
Addy Osmani breaks down in detail what a harness is and why harnesses matter in agentic engineering. A decent model with a great harness can beat the output of a great model with a bad harness.

🔧 Tool
Flue — The Agent Harness Framework
Fred Schott
When you're ready to start engineering your own harness, check out Flue, a TypeScript agent harness framework that bakes in the loop for you, with skills, tools, sandboxes, subagents, and MCP support out of the box. That way you can spend your time configuring the harness instead of building the agent loop from scratch.
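For a sense of what "the loop" means here, below is a rough sketch of a bare-bones agent loop. This is not Flue's API or any real SDK. Every type and function name is hypothetical, the model is a stub standing in for a real LLM call, and everything is synchronous to keep it short.

```typescript
// Hypothetical shapes for illustration only.
type ToolCall = { tool: string; input: string };
type ModelReply =
  | { kind: "tool"; call: ToolCall }
  | { kind: "final"; text: string };
type Model = (transcript: string[]) => ModelReply;
type Tools = Record<string, (input: string) => string>;

// Ask the model, run any tool it requests, feed the result back,
// and stop on a final answer or when the step budget runs out.
function runAgentLoop(model: Model, tools: Tools, task: string, maxSteps = 5): string {
  const transcript: string[] = [`task: ${task}`];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(transcript);
    if (reply.kind === "final") return reply.text;
    const tool = tools[reply.call.tool];
    const result = tool ? tool(reply.call.input) : `unknown tool: ${reply.call.tool}`;
    transcript.push(`tool ${reply.call.tool}(${reply.call.input}) -> ${result}`);
  }
  return "stopped: step limit reached";
}

// Stub model: requests one tool call, then finishes.
const stubModel: Model = (transcript) =>
  transcript.length === 1
    ? { kind: "tool", call: { tool: "add", input: "2,3" } }
    : { kind: "final", text: `done after ${transcript.length - 1} tool call(s)` };

const tools: Tools = {
  add: (input) => String(input.split(",").map(Number).reduce((a, b) => a + b, 0)),
};

const answer = runAgentLoop(stubModel, tools, "add 2 and 3");
```

Everything a real harness layers on top (sandboxes, subagents, MCP, retries) hangs off a loop shaped roughly like this, which is why having it prebuilt saves time.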

🔧 Tool
Sites – Codex
OpenAI
I've been really digging Codex lately. Their latest update builds on an idea that was rattling around in my head last week: using HTML for plans so they're easier to share with teammates. Codex takes it to the next level. You can now build sites and share them with your workspace directly in Codex, and it uses D1 and R2 from Cloudflare under the hood. It's still in preview and only on ChatGPT Business and Enterprise for now, so most of us can't poke at it just yet.

📖 Read
After Automation
Dan Shipper, Every
Dan Shipper makes the case that AI doesn't shrink the work, it moves it. We may be writing less code for the actual output, but we're picking up engineering work at the harness level — building the review queues, evals, repo rules, and instruction files that turn a flood of cheap output into something good. When the rare stuff gets cheap, the valuable work shifts to the humans who know what's actually worth building.

❝

The current generation of models only knows about work that has been done. Humans know about what needs to be done, right now, at this moment.

📖 Read
When AI Builds Itself
Anthropic
Meanwhile Anthropic's AI is basically writing itself now with Claude Code. 80% of code merged into their codebase is now authored by Claude. Cool, cool, cool.

Thanks for reading,
Jason
