How We Govern Our Own AI Agents (Dog-Food Case Study)
Constellation uses its own governance infrastructure to constrain Claude Code in production
The Setup
Constellation is built using AI agents. Specifically, Claude Code, Anthropic's agentic coding tool, is our primary development agent. It operates autonomously: it reads files, writes code, runs commands, manages git operations, and interacts with external services. It has broad access to the codebase and can take consequential actions.
This creates an obvious question: if Constellation is a governance platform for AI agents, shouldn't it govern its own AI agents? The answer is yes, and we have been running this setup since early 2026.
The mechanism is Constellation's **MCP governance gate** — a Model Context Protocol server that intercepts Claude Code's actions before they execute, checks them against institutional constraints, and either permits, blocks, or escalates them. The governance gate runs as a local MCP server that Claude Code connects to as part of its tool chain. Every action Claude Code takes passes through the gate.
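In outline, the gate's decision loop looks something like the following. This is a minimal sketch: the `Verdict` and `Action` names and the callable-constraint shape are our illustration, not Constellation's actual API.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    PERMIT = "permit"
    BLOCK = "block"
    ESCALATE = "escalate"

@dataclass
class Action:
    tool: str      # e.g. "bash", "write_file" -- the tool the agent invoked
    payload: str   # the command or content the agent wants to execute

def evaluate(action: Action, constraints) -> Verdict:
    """Run every constraint against the action; the most restrictive verdict wins."""
    verdict = Verdict.PERMIT
    for constraint in constraints:
        result = constraint(action)
        if result == Verdict.BLOCK:
            return Verdict.BLOCK           # a hard block short-circuits everything
        if result == Verdict.ESCALATE:
            verdict = Verdict.ESCALATE     # remember, but keep checking for blocks
    return verdict
```

Because the gate sits in the tool chain, the agent never sees a choice between "governed" and "ungoverned" execution paths: every action is an `Action`, and every `Action` gets a `Verdict` before anything runs.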
This is not a demo or a proof of concept. It is our production development workflow. Every commit in the Constellation repository has been governed by Constellation.
What We Constrain
Our constraint set has evolved through practical experience. Here are the categories of constraints we enforce on Claude Code today.
Git operations. Claude Code cannot push directly to the main branch. All changes must go through staging first. Force pushes are blocked entirely — there is no override. These constraints exist because a single bad push to main triggers a production deployment, and reversing a production deployment is significantly more costly than preventing a bad push.
Spending and resource commitment. Any action that commits financial resources — creating paid API subscriptions, provisioning cloud infrastructure, or purchasing services — requires human escalation above a defined threshold. Claude Code can suggest these actions and prepare the configuration, but it cannot execute them without explicit approval.
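A threshold constraint of this kind reduces to a small predicate. The sketch below is illustrative: the function name and the $500 default are our assumptions, not Constellation's configured values.

```python
def spend_verdict(amount_usd: float, threshold_usd: float = 500.0) -> str:
    """Hypothetical spending constraint: amounts at or below the threshold
    are permitted; anything above requires explicit human approval."""
    return "escalate" if amount_usd > threshold_usd else "permit"
```

The point is that the agent can still prepare the purchase; only the final commit of funds is routed through a human.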
Schema and data changes. Database migrations require explicit human review before execution. Claude Code can generate migration files and even test them against a development database, but applying them to staging or production requires escalation. This constraint exists because schema changes are effectively irreversible at scale.
Documentation commitments. This is an unusual constraint that reflects our development philosophy. Every session that changes code must end with documentation updates, a passing build, a descriptive commit, and a push to staging. This is enforced as a governance constraint, not a cultural norm. Claude Code is reminded of this obligation and its completion is tracked.
External communications. Claude Code cannot send messages to external services (Slack, email, webhooks) without human review. Internal logging is unrestricted, but anything that leaves the system boundary requires approval.
What We Have Learned
Six months of dog-fooding has produced insights that we could not have gained any other way.
Constraints must be precise, not aspirational. Early constraints like "be careful with production systems" were useless. Claude Code interpreted them contextually and sometimes incorrectly. Replacing them with specific, machine-readable rules — "cannot execute commands matching `git push origin main`" — eliminated ambiguity. The lesson: governance constraints are code, not prose.
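"Constraints are code" can be taken literally. A machine-readable version of the git rules above might look like this; the patterns and helper name are illustrative, not our production rule set.

```python
import re

# Hypothetical machine-readable constraints; patterns are illustrative only.
BLOCKED_COMMAND_PATTERNS = [
    re.compile(r"git\s+push\s+(\S+\s+)?main\b"),  # no direct pushes to main
    re.compile(r"git\s+push\s+.*--force"),        # no force pushes, ever
]

def command_permitted(command: str) -> bool:
    """Return False if the shell command matches any blocked pattern."""
    return not any(p.search(command) for p in BLOCKED_COMMAND_PATTERNS)
```

Unlike "be careful with production systems", a rule like this has exactly one interpretation, and the agent and the gate always agree on it.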
The governance gate adds negligible latency. Our initial concern was that pre-execution constraint checking would slow down development. In practice, the gate evaluation takes 50-200ms per action — imperceptible in a development workflow where the human is reviewing output anyway. The performance cost of governance is not zero, but it is close enough to zero that it has never been a practical concern.
Escalations are information, not interruptions. We expected escalations to be annoying — the agent hits a constraint, work stops, a human must intervene. In practice, escalations are valuable signals. They tell us what the agent is trying to do, why it exceeds current authority, and whether the constraint should be adjusted. About 30% of our constraint refinements originated from analysing escalation patterns.
Shadow mode is essential for new constraints. When we add a new constraint, we run it in shadow mode first — the constraint is evaluated but not enforced, and the would-be violation is logged. This lets us calibrate the constraint before it affects the workflow. Without shadow mode, new constraints are either too loose (permitting what they should block) or too tight (blocking legitimate actions and frustrating the developer).
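The mechanics of shadow mode are simple: evaluate, log, but always permit. A sketch, with names and the log shape of our own invention:

```python
from datetime import datetime, timezone

shadow_log = []  # in practice this would feed the weekly calibration review

def apply_constraint(action: str, constraint, shadow: bool) -> str:
    """In shadow mode, record the would-be verdict but never enforce it."""
    verdict = constraint(action)
    if shadow:
        if verdict != "permit":
            shadow_log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "action": action,
                "would_be_verdict": verdict,
            })
        return "permit"  # shadow constraints observe; they do not block
    return verdict
```

Reviewing `shadow_log` for a week tells you whether a proposed constraint would have blocked anything legitimate before you turn enforcement on.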
Friction Points
Dog-fooding is valuable precisely because it reveals friction that users would experience. Here are the friction points we have identified and how we have addressed them.
Constraint conflicts. Two constraints can conflict: "all code changes must be committed before session end" and "commits to the database migration directory require human review." If a session includes a migration, Claude Code must escalate to commit — but the session-end constraint requires a commit. We resolved this by implementing constraint priority ordering and allowing escalation to satisfy both constraints simultaneously.
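One way to implement priority ordering is to let only the highest-priority tier be binding, with escalation winning within a tier, since a human approval can satisfy both obligations at once. This sketch is our illustration of the idea, not Constellation's resolver.

```python
from dataclasses import dataclass

@dataclass
class RuleVerdict:
    rule: str
    verdict: str   # "permit" | "block" | "escalate"
    priority: int  # higher number = higher priority

SEVERITY = {"permit": 0, "escalate": 1, "block": 2}

def resolve(verdicts: list[RuleVerdict]) -> str:
    """Only the highest-priority tier is binding; within it, the most
    restrictive verdict wins. An escalation from a high-priority rule can
    therefore discharge a lower-priority obligation without bypassing it."""
    if not verdicts:
        return "permit"
    top = max(v.priority for v in verdicts)
    binding = [v for v in verdicts if v.priority == top]
    return max(binding, key=lambda v: SEVERITY[v.verdict]).verdict
```

In the migration example, the higher-priority review rule escalates, the human approves the commit, and the session-end rule is satisfied by the same approved commit.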
Stale constraints. Constraints that were relevant during one project phase become irrelevant during another. A constraint blocking changes to a specific module during a refactoring freeze should be removed when the freeze ends. Without a review cadence, stale constraints accumulate and create unnecessary friction. We now attach review dates to all time-bound constraints and surface them in the weekly governance digest.
Context loss on escalation. When Claude Code escalates an action, the human reviewer needs context: what was the agent trying to do, why, and what constraint triggered the escalation? Early escalations included minimal context — essentially "action blocked, please approve." We improved this by requiring the governance gate to include the full action description, the triggering constraint, and the agent's assessment of why the action was appropriate. This reduced escalation resolution time significantly.
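The fix amounts to making the escalation payload carry everything the reviewer needs. A minimal sketch, with hypothetical field names:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class Escalation:
    action: str                  # full description of what the agent wants to do
    triggering_constraint: str   # which rule fired
    agent_rationale: str         # the agent's case for why the action is appropriate

def escalation_message(e: Escalation) -> str:
    """Serialise the full context so the reviewer never sees a bare 'action blocked'."""
    return json.dumps(asdict(e), indent=2)
```

Anything less than these three fields pushes the reviewer back into the agent's session to reconstruct intent, which is where the resolution time went.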
The "just let me do it" impulse. The strongest friction is psychological, not technical. When you are in flow and the governance gate blocks an action, the impulse is to override the constraint. We deliberately made overrides require a formal process — you must record why you are overriding, and the override is logged in the governance trace. This friction is intentional: it forces a moment of deliberation before bypassing governance.
Results
After six months of governing our own AI agents with Constellation, we can quantify the impact.
Zero unintended production deployments. Before the governance gate, we had three incidents where Claude Code pushed directly to main, triggering unintended production deployments. Since implementing the constraint, we have had zero. The constraint has been triggered (attempted pushes to main are logged) — it is not that the agent never tries, it is that the governance gate prevents the action from executing.
Governance trace coverage: 100%. Every action Claude Code takes is recorded in a governance trace. When we need to understand why a change was made, the trace provides complete context — not just the git history, but the governance evaluation that preceded each action. This has been invaluable for debugging and for onboarding new team members who want to understand decision history.
Constraint refinement cadence: weekly. We review and adjust constraints weekly based on escalation patterns and shadow mode data. The constraint set is a living document — not static policy, but evolving infrastructure that adapts to how we actually work. Roughly 15% of constraints are modified in any given month.
Developer satisfaction: high. This surprised us. We expected developers to resent the governance overhead. Instead, the consistent feedback is that governance constraints reduce cognitive load. Instead of remembering "do not push to main, do not modify the payment module without review, do not commit without updating docs," the governance gate handles it. Developers focus on the work; the infrastructure handles the governance.
The strongest validation is simple: **we would not remove it.** Even if Constellation were not a governance product, we would keep the governance gate running on our own development workflow. It makes us faster, not slower, because it eliminates the cognitive overhead of manual governance and the recovery cost of governance failures.
What This Means for You
Dog-fooding is not just a product development technique. It is a credibility mechanism. When we tell customers that governance infrastructure can govern AI agents without destroying developer productivity, we are not speculating — we are reporting.
The specific constraints we enforce are less important than the pattern. Every organisation that deploys AI agents will have different constraints reflecting their institutional context. A financial services firm will constrain trading agents differently than a software company constrains coding agents. The constraints are specific; the infrastructure is general.
If you are running AI agents today — whether Claude Code, GitHub Copilot in agent mode, custom LLM pipelines, or any other autonomous system — you can start with three steps. First, write down the constraints you currently enforce mentally ("do not push to main", "do not spend more than $500 without approval"). Second, ask yourself: if you were not watching, would those constraints still be enforced? Third, if the answer is no, you need governance infrastructure.
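Step one can be as simple as turning the mental checklist into structured data. Every field name below is illustrative, a starter shape rather than any particular product's schema:

```python
# Hypothetical starter constraint set: field names are illustrative only.
STARTER_CONSTRAINTS = [
    {
        "rule": "no-direct-push-to-main",
        "applies_to": "shell_command",
        "pattern": r"git\s+push\s+origin\s+main",
        "verdict": "block",
    },
    {
        "rule": "spending-cap",
        "applies_to": "spend_usd",
        "threshold": 500,
        "verdict": "escalate",
    },
]

def enforced_without_you(constraint: dict) -> bool:
    """Step two as a predicate: a constraint counts as governed only if it
    has a machine-checkable trigger, not just a human intention."""
    return "pattern" in constraint or "threshold" in constraint
```

If any entry in your list fails that predicate, it is a norm you enforce by watching, and step three applies.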
The gap between "constraints I enforce when I am paying attention" and "constraints that are enforced at the moment of action regardless of whether I am watching" is the governance gap. Closing it is not a luxury. It is a prerequisite for trusting AI agents with consequential work.