March 10, 2026

AI can write the code if you build the right harness.

OpenAI said GPT-5.3 Codex was "the first model instrumental in creating itself."

Three weeks ago we rewrote our entire codebase from scratch; every line was written by Codex. It took us 10 days.

Of course, you can't just hand it a prompt and walk away. It will mirror whatever it finds (good patterns and bad patterns alike). If there's bad code, it multiplies it.

So the work changed. I don't really write code anymore. My time now goes into building what we call "harnesses" - the systems that let AI write code well.

Some of what's worked for us:

  • Never touch the code yourself. If the agent gets something wrong, fix the prompt, not the output. The moment you intervene manually, you break the loop it's learning from.
  • Write docs in plain markdown and point the agent to them. This ends up being faster than explaining things verbally or pointing to an external resource, and your team can reference them too.
  • Get the reference codebase clean early. The agent copies what it sees, so if sloppy patterns exist on day one, they'll be everywhere by day ten.
  • Write E2E tests that simulate real user sessions, and put unit tests and evals on everything.
  • Garbage collection: have the agent clean up after itself. We regularly ask it to refactor, remove dead code, simplify.
  • Strong typing end-to-end, no exceptions.
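To make the last two bullets concrete, here's a minimal sketch of what a strongly typed E2E test that replays a user session might look like. Everything here is illustrative - `Step`, `runSession`, and the toy `handle` function are hypothetical stand-ins, not our actual code:

```typescript
// A session is a typed list of steps: no stringly-typed test scripts.
type Step =
  | { kind: "visit"; path: string }
  | { kind: "submit"; form: string; fields: Record<string, string> }
  | { kind: "expect"; text: string };

interface SessionResult {
  passed: boolean;
  failures: string[];
}

// Toy stand-in for the real system under test.
function handle(step: Step, state: Map<string, string>): string {
  switch (step.kind) {
    case "visit":
      return `page:${step.path}`;
    case "submit":
      for (const [k, v] of Object.entries(step.fields)) state.set(k, v);
      return `submitted:${step.form}`;
    case "expect":
      return state.get("email") ?? "";
  }
}

// Replay a recorded session and collect every failed expectation.
function runSession(steps: Step[]): SessionResult {
  const state = new Map<string, string>();
  const failures: string[] = [];
  for (const step of steps) {
    const out = handle(step, state);
    if (step.kind === "expect" && !out.includes(step.text)) {
      failures.push(`expected "${step.text}", got "${out}"`);
    }
  }
  return { passed: failures.length === 0, failures };
}

// A session captured from a real user flow: sign up, then confirm.
const signupFlow: Step[] = [
  { kind: "visit", path: "/signup" },
  { kind: "submit", form: "signup", fields: { email: "a@example.com" } },
  { kind: "expect", text: "a@example.com" },
];

console.log(runSession(signupFlow).passed); // true
```

The typed `Step` union is the point: the agent can't emit a malformed test step without the compiler catching it, which is exactly the kind of guardrail the harness is supposed to provide.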

We run AI-native customer support at 14.ai, and the same harness thinking applies there.

The pattern is the same whether it's code or customer conversations:

You build guardrails around it, test it against real conversations, clean up edge cases as they come in, and let AI handle the volume while humans keep refining the system.
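That loop can be sketched in a few lines. This is a hypothetical illustration of the pattern, not 14.ai's actual pipeline - `guard`, `draft`, and the confidence threshold are all assumed names and numbers:

```typescript
// Hypothetical harness loop for support: guardrails gate the AI's reply,
// and edge cases get queued for humans to refine the system.

interface Reply {
  text: string;
  confidence: number;
}

// Edge cases humans review later to improve prompts, docs, and tests.
const edgeCases: string[] = [];

// Guardrail: only let confident, non-empty replies through.
function guard(message: string, reply: Reply): string {
  if (reply.confidence < 0.8 || reply.text.length === 0) {
    edgeCases.push(message);
    return "escalated to human";
  }
  return reply.text;
}

// Stand-in for the model call.
function draft(message: string): Reply {
  return message.includes("refund")
    ? { text: "", confidence: 0.3 } // model unsure -> guardrail catches it
    : { text: "Here's how to reset your password.", confidence: 0.95 };
}

const q1 = "How do I reset my password?";
console.log(guard(q1, draft(q1))); // the drafted answer goes out
const q2 = "I want a refund for last year";
console.log(guard(q2, draft(q2))); // "escalated to human"
```

AI handles the confident bulk; everything that trips a guardrail becomes training material for the next round of refinement.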

Anyone else have harness advice that works exceptionally well?