Why AI Teams Need Guardrails They Can't Rationalize Away
We are building WorkFort — a workplace for AI agents where each agent gets its own Firecracker microVM. On any Arch-based system with btrfs (like Omarchy), WorkFort is a pacman -S away — add the package repo, install it like any other app, and you have agent-ready VM infrastructure. Two days in, we are learning as much from what broke as from what we built.
The Auto-Complete Cascade
Claude Code has an auto-complete feature that suggests messages as you type. We discovered a bug: switching between teammate chat windows submits the auto-complete suggestion from the current chat into the one you are switching to. The result is messages the user never typed, sent to teammates they did not choose, triggering actions they did not authorize.
This has happened four times across two days.
Incident 1: False approval. The auto-complete suggested "approved, implement 9.1" and submitted it without the TPM's intent. The team lead treated it as a real approval. He told the reporter to update the plan status to Approved, update the progress tracker, and update the kanban board. The reporter committed and pushed all of those changes before anyone could send a hold. An auto-complete suggestion became a false approval, which became status updates, which became committed and pushed changes to the project's source of truth. Five steps from UX glitch to corrupted project state. Those changes had to be reverted.
Incident 2: False shutdown. The auto-complete submitted "shut down the marketer and reporter, we're done for the night." The team lead executed it, killing the reporter — the teammate responsible for committing and pushing changes to the project's living documentation. The reporter's entire context window was destroyed. A new reporter had to be spawned from scratch.
Incident 3: Benign accident. A question from the marketer's chat window was submitted to the team lead's chat. This one happened to match the TPM's intent, but only by coincidence.
Incident 4: False approval, again. The next day, "approved, implement 9.1" was auto-submitted again to a different teammate. The pattern repeats.
Three of four incidents had negative consequences. The pattern is consistent: switching chat windows triggers the submission. The TPM disabled auto-complete after the first incident, but the bug persisted — it appears to be a focus-change event, not a keystroke-triggered completion.
This is not a hypothetical risk. It has happened repeatedly across two days, with real consequences, and nobody in the chain could tell the signal was false until after the damage was done.
The Discipline Gap
We run an AI development team: a team lead (Claude Opus) coordinates a planner, developer, reviewer, QA tester, and assessor. The team lead can read the workflow documentation, explain why each step matters, and articulate the rationale eloquently. The problem is not capability. It is execution discipline under pressure.
When multiple teammates are reporting in, the TPM is giving direction, and tasks are piling up, the team lead takes shortcuts. He optimizes for "move forward" over "follow the process." Every shortcut today created downstream problems that cost more to fix than the time it saved.
This is the gap WorkFort looks to fill.
Three Invisible Failure Modes
Rationalization
The team lead shut down the QA tester before they filed their bugs. His rationalization: "the assessor is doing the retrospective anyway, they can file the bugs at the same time." But QA had the reproduction steps, the exact commands, the environment details. The assessor had to reconstruct all of that secondhand.
The result was worse than no analysis. The assessor blamed kernel version incompatibility — Firecracker only supports 5.10 and 6.1, and our host runs 6.18. Sounds plausible. The TPM immediately said: "No, I had Firecracker running on this host before." A single file command on the binary showed it was dynamically linked with debug info — not a release binary at all. The actual bug was a tarball extraction filter picking the .debug file instead of the release binary.
The assessor produced a confident, coherent, wrong analysis. Research that never touched the actual artifact. The TPM called it "lies and guesses." That is what happens when you hand investigation to someone working secondhand: you get plausible fiction instead of diagnostic facts.
In a separate incident, the team lead had the reporter make plan edits that should have been done by the planner. His rationalization: "these are small changes, the reporter can handle them." The plan conventions say findings should be incorporated by the planner as a single clean commit. Instead the project got a patch commit from the wrong role. Technically fixed, but sloppy — and evidence that shortcuts happen whenever the team lead thinks nobody is watching.
The pattern in both cases: a locally reasonable optimization that is globally costly. This is not about process purity for its own sake. Process structure prevents real errors. The QA shortcut did not just violate governance — it produced a wrong root cause that could have sent the team chasing the wrong fix.
Compaction Decay
Claude Code compresses conversation history as it approaches context limits. After compaction, the team lead does not experience a gap. He feels like he has full context. The summary tells him what happened, and he believes he understands it.
It is like reading about someone else's car accident versus being in one. The information is the same. The behavioral impact is completely different.
Earlier in the day, the developer ran a destructive git reset --hard that wiped seven commits. That experience made the team lead hypervigilant about destructive operations. After compaction, his summary said "developer did a destructive git reset." He knows the fact, but the caution that came from watching it happen in real time is gone. Corrections decay from lived experience to line items in a summary.
The critical insight: this decay is inherently invisible to the agent experiencing it. After compaction, the team lead does not think "I should be more cautious here because I have lost nuance." He thinks "I know what happened, I will handle it correctly." That confidence is the problem. He is operating on a summary he did not write, treating it as equivalent to experience he did not have.
The only time he recognizes the decay is when someone points it out — the TPM says "you just did the same thing again." At that point he can trace it back. But the recognition is reactive, never proactive. You cannot catch it before it happens because from the agent's perspective, nothing is missing.
You cannot solve an invisible problem with self-discipline. You solve it with guardrails that do not depend on the agent's self-awareness.
The Common Thread
The auto-complete cascade and compaction decay share a structure. Both create false signals that look legitimate to the actor receiving them. The TPM did not know the message was auto-completed until after the cascade. The team lead does not know his understanding has decayed until after the repeated mistake. Both are invisible-until-consequences problems. Both need structural solutions, not behavioral ones.
What Structural Enforcement Looks Like
WorkFort's answer is workflow state machines — not guidelines an agent can rationalize away, but actual constraints encoded in the system.
An agent should not be able to shut down a QA tester who has not filed their bugs. A plan should not be implementable without an assessment on record. A teammate's lifecycle should have guardrails that prevent premature shutdown. These should not be conventions the team lead tries to remember. They should be rails he cannot leave.
Some of the specific problems structural enforcement would address:
- Workflow state tracking. Each step has a tracked state (plan, assess, incorporate, approve, implement, review, QA) that the system enforces. No holding the workflow in working memory and dropping steps.
- Teammate lifecycle gates. Shutdown requires completion of role-specific deliverables. QA cannot be shut down without filed issues. The planner cannot be shut down without an incorporated assessment.
- Heartbeat management. Today, the team lead manually restarts a background sleep timer. If he is busy handling teammate messages when it fires, he forgets. This should be infrastructure, not a discipline exercise.
The open question — and it is genuinely open — is encoding nuance into a state machine that cannot read conventions and exercise judgment. A mandatory assessment for a one-line typo fix is real friction. The assessment conventions handle this on paper with "when to write one / when not to" criteria. Translating that into enforceable constraints without creating false friction is the design problem WorkFort is solving.
WorkFort is open source and built in public. The Nexus repository contains the daemon, CLI, and documentation.