Agent harness design patterns
Off-the-shelf AI won’t build your edge.
cyborg.build — how high-agency teams design harnesses that actually ship.
The problem
Generic tools, generic results.
- Off-the-shelf tools deliver shallow sourcing — the generic signals everyone already sees.
- Centralized platforms have always struggled with adoption: bought, rarely used.
- “Data-driven VC” became a marketing sticker. It never really transformed the industry.
Why it’s hard
The hard part isn’t the technology.
- Culture and team DNA have to align with the technology — or it won’t stick.
- A harness must be tailored to its user. It is not standard software.
- Doing this well is genuinely hard: most harnesses never get past the demo.
Typical pitfalls
Most harnesses break before they ship.
- × Overbuilding orchestration
- × Weak evals
- × Context bloat
- × No recovery path
- × No observability
- × Low adoption: built, never used
- × No trust in the output
- × One-size-fits-all harnesses
Tech breaks some of them, but adoption, trust, and culture break more.
At Lunar
We don’t study this — we live in it.
- Each of us builds, extends, and lives in our own harness.
- Years building VC data and AI platforms.
- We invest in this frontier — and live on it.
Why we built cyborg.build
So you don’t start from scratch.
- Best practices and hard-won learnings, shared.
- A starter harness to build from.
- Opinionated tooling — skills, plugins, stacks, workflows.
- A way to stay ahead of the curve.
The patterns we’ll share
What a working harness actually looks like.
Minimum viable harness
If you do only three things.
- 01 Separate doer from checker: The same agent should not be solely responsible for planning, executing, and judging its own work.
- 02 Keep durable state: Put task state in files, checklists, logs, commits, tickets, or database records, not only in chat history.
- 03 Attach an evidence trail: Important claims should link to tests, traces, citations, replays, source data, or verifier findings.
The Magic Seven · 01 / 07
Input Shaping
What exactly are we asking the system to do?
Turn vague intent into a typed work package: goal, constraints, acceptance criteria, available tools, risks, and stop conditions.
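One way to make the work package concrete is a small typed record. The sketch below is an assumption, not a fixed schema: the class name `WorkPackage`, the field names, and the choice of which fields count as required are all hypothetical and only mirror the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkPackage:
    """Typed work package; field names mirror the list above (hypothetical schema)."""
    goal: str
    constraints: tuple[str, ...] = ()
    acceptance_criteria: tuple[str, ...] = ()
    available_tools: tuple[str, ...] = ()
    risks: tuple[str, ...] = ()
    stop_conditions: tuple[str, ...] = ()

    def missing_fields(self) -> list[str]:
        """Return the required fields that are still empty.

        Which fields are required is an assumption made for this sketch.
        """
        required = ("goal", "acceptance_criteria", "stop_conditions")
        return [name for name in required if not getattr(self, name)]


# A shaped request: every required field is filled in.
package = WorkPackage(
    goal="Summarize last quarter's inbound deal flow",
    constraints=("read-only access to the CRM",),
    acceptance_criteria=("every figure links to a source record",),
    available_tools=("crm_search",),
    risks=("stale CRM entries",),
    stop_conditions=("after 20 tool calls",),
)

# A vague request: only a goal, so the gaps show up before any agent runs.
vague = WorkPackage(goal="look into deals")
print(vague.missing_fields())
```

Rejecting a package while `missing_fields()` is non-empty keeps vague intent from reaching the agent at all.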
▸ full practices & artifacts behind login
The Magic Seven · 02 / 07
Role Architecture
Who does what?
Split cognitive roles so each agent or module has a clear job.
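A minimal sketch of split roles, assuming three of them: planner, executor, and checker. The names `Roles` and `run` are hypothetical, and the lambdas are stand-ins for whatever model or tool calls a real harness would wrap.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Roles:
    """Three hypothetical roles, each a plain function so it can be swapped out."""
    plan: Callable[[str], list[str]]      # goal -> ordered steps
    execute: Callable[[str], str]         # step -> result
    check: Callable[[str, str], bool]     # (step, result) -> accepted?


def run(goal: str, roles: Roles) -> list[tuple[str, str, bool]]:
    """Route each planned step through the executor, then a separate checker."""
    report = []
    for step in roles.plan(goal):
        result = roles.execute(step)
        report.append((step, result, roles.check(step, result)))
    return report


# Stand-in implementations for illustration only.
roles = Roles(
    plan=lambda goal: [f"{goal}: draft", f"{goal}: review"],
    execute=lambda step: "" if "review" in step else f"done: {step}",
    check=lambda step, result: bool(result),  # the checker rejects empty results
)
report = run("memo", roles)
for step, result, accepted in report:
    print(step, "->", "accepted" if accepted else "rejected")
```

Because the checker is a separate function, it can be replaced or tightened without touching the code that does the work.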
The Magic Seven · 03 / 07
State & Memory
What must persist so work can continue or be audited later?
Treat the filesystem, database, ticket system, or git history as the harness memory.
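A minimal sketch of filesystem-backed memory, assuming a single JSON file holds the task list. The file name `task_state.json` and the `done` / `pending` layout are hypothetical. The point is that a second session can resume from the file alone.

```python
import json
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Read task state from disk, or start fresh if no file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"done": [], "pending": []}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then rename it over the target.

    This makes a half-written state file less likely if the process dies mid-write.
    """
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def complete_next(path: Path) -> dict:
    """Move one pending item to done and persist the change immediately."""
    state = load_state(path)
    if state["pending"]:
        state["done"].append(state["pending"].pop(0))
        save_state(path, state)
    return state


# Simulate two separate sessions that share nothing but the file.
state_file = Path(tempfile.mkdtemp()) / "task_state.json"
save_state(state_file, {"done": [], "pending": ["collect", "draft", "verify"]})
complete_next(state_file)            # session 1 finishes "collect"
resumed = complete_next(state_file)  # session 2 picks up at "draft"
print(resumed)
```

The same shape works with a database row, a ticket, or a git commit in place of the JSON file.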
The Magic Seven · 04 / 07
Execution Control
How do we stop the agent from drifting, looping, or doing too much?
Bound the loop with phases, work-unit limits, permissions, budgets, and stop conditions.
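A bounded loop can be sketched in a few lines. Everything here is an assumption for illustration: the `Budget` fields, the cost units, and the rule that anything starting with `email` needs a human. The loop always returns a reason for stopping, so a halt is never silent.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Budget:
    """Hypothetical limits; tune the names and numbers to your own harness."""
    max_steps: int
    max_cost: float


def bounded_run(
    steps: Iterable[tuple[str, float]],
    budget: Budget,
    should_stop: Callable[[str], bool],
) -> tuple[list[str], str]:
    """Run (name, cost) steps until a limit or stop condition fires.

    Returns the names of the steps that ran and the reason the loop ended.
    """
    ran: list[str] = []
    spent = 0.0
    for name, cost in steps:
        if len(ran) >= budget.max_steps:
            return ran, "step limit reached"
        if spent + cost > budget.max_cost:
            return ran, "cost budget exhausted"
        if should_stop(name):
            return ran, f"stop condition hit at {name!r}"
        ran.append(name)  # a real harness would execute the step here
        spent += cost
    return ran, "all steps completed"


plan = [("search", 1.0), ("read", 2.0), ("summarize", 2.0), ("email_client", 0.5)]
needs_human = lambda name: name.startswith("email")

ran, reason = bounded_run(plan, Budget(max_steps=10, max_cost=4.0), needs_human)
print(ran, reason)
```

With a cost budget of 4.0 the run halts before `summarize`. Raising the budget lets it proceed until the stop condition fires at `email_client`.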
The Magic Seven · 05 / 07
Verification & Evaluation
How do we know if the output is good?
Evaluate outputs with independent checks before accepting them.
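As a sketch, independent checks can be plain functions that never see how the output was produced. The check names and the `[source:` marker below are hypothetical. Real checks might run a test suite or call a second model.

```python
from typing import Callable

Check = Callable[[str], bool]


def evaluate(output: str, checks: dict[str, Check]) -> tuple[bool, list[str]]:
    """Run every check; accept only if all pass.

    Returns (accepted, names of failed checks).
    """
    failed = [name for name, check in checks.items() if not check(output)]
    return (not failed, failed)


# Hypothetical checks for a short written summary.
checks: dict[str, Check] = {
    "non_empty": lambda text: bool(text.strip()),
    "under_50_words": lambda text: len(text.split()) <= 50,
    "cites_source": lambda text: "[source:" in text,
}

accepted, failed = evaluate("The draft is ready for review.", checks)
print(accepted, failed)
```

Returning the names of the failed checks, not just a pass or fail flag, gives the doer something specific to fix on the next attempt.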
The Magic Seven · 06 / 07
Evidence & Explainability
Why should a human believe this?
Pair important outputs with inspectable evidence.
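One possible shape for this pairing: every claim carries a list of evidence records, and anything without evidence gets flagged before a human sees it. The class names, the `kind` values, and the file path are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    kind: str     # e.g. "test", "trace", "citation"
    locator: str  # where a human can inspect it: a path, URL, or record id


@dataclass
class Claim:
    text: str
    evidence: list[Evidence] = field(default_factory=list)


def unsupported(claims: list[Claim]) -> list[str]:
    """Return the text of every claim that has no evidence attached."""
    return [claim.text for claim in claims if not claim.evidence]


claims = [
    Claim("All unit tests pass", [Evidence("test", "logs/test_run.txt")]),
    Claim("The migration is safe to deploy"),  # nothing attached
]
flagged = unsupported(claims)
print(flagged)
```

The second claim comes back flagged, which is the signal to either attach evidence or soften the claim.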
The Magic Seven · 07 / 07
Adaptation & Counterfactuals
What changes if assumptions, feedback, or context change?
Let the harness revise, branch, compare alternatives, and learn from failures.
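Comparing alternatives can start very small: rerun the same scoring under changed assumptions and rank the branches. This is a sketch only. The assumption names, the numbers, and the scoring rule are invented for illustration, and `compare_branches` is a hypothetical helper.

```python
from typing import Callable


def compare_branches(
    base: dict[str, float],
    variations: dict[str, dict[str, float]],
    score: Callable[[dict[str, float]], float],
) -> list[tuple[str, float]]:
    """Score the baseline and each what-if variation, best first."""
    results = [("baseline", score(base))]
    for name, overrides in variations.items():
        # Each branch is the baseline with a few assumptions overridden.
        results.append((name, score({**base, **overrides})))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical assumptions and scoring rule, for illustration only.
assumptions = {"budget": 10.0, "deadline_days": 5.0}
what_ifs = {
    "half_budget": {"budget": 5.0},
    "extra_week": {"deadline_days": 12.0},
}
ranking = compare_branches(
    assumptions, what_ifs, score=lambda a: a["budget"] + a["deadline_days"]
)
print(ranking)
```

Keeping the baseline in the ranking shows whether a proposed change actually beats doing nothing.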
Your turn
Build yours — with a headstart.
starter harness · best practices · opinionated tooling