Progressive Trust
A framework for gradually increasing AI agent autonomy based on demonstrated compliance — from shadow (observe-only) through preview, active, and autonomous levels.
Progressive trust is the principle that AI agents should earn increasing autonomy through demonstrated compliance with institutional constraints, rather than being granted full autonomy immediately or restricted permanently.
The four trust levels are:

- **Shadow**: The agent observes but takes no autonomous action. All outputs go to a review queue.
- **Preview**: The agent suggests actions but requires human approval before execution.
- **Active**: The agent acts autonomously within established constraints, escalating when it encounters a boundary.
- **Autonomous**: The agent has full delegation for approved decision classes, operating independently within broad constraints.
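The ordering of the four levels can be sketched as an enum. This is a minimal illustration; the `TrustLevel` name and values are assumptions, not Constellation's actual API.

```python
from enum import IntEnum

class TrustLevel(IntEnum):
    """Hypothetical ordered encoding of the four trust levels."""
    SHADOW = 0      # observe only; all outputs go to a review queue
    PREVIEW = 1     # suggest actions; human approval before execution
    ACTIVE = 2      # act within constraints; escalate at boundaries
    AUTONOMOUS = 3  # full delegation for approved decision classes
```

Encoding the levels as an ordered type makes "increase trust" and "temporarily reduce trust" simple comparisons rather than string juggling.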
Trust levels can be adjusted per agent, per domain, or per action type. An agent might be Autonomous for code formatting but Shadow for external communications. Trust can also be temporarily reduced (e.g., during audit periods or after incidents).
This framework maps directly to how organisations manage human employees: new hires have limited authority, which increases as they demonstrate competence and reliability. Progressive trust applies the same logic to AI agents, with the advantage that compliance is structurally verified rather than subjectively assessed.
How Constellation handles this
Constellation implements progressive trust through the governance gate. Each agent has a trust level that determines which actions require approval, which proceed autonomously, and which are blocked entirely.
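The gate's dispatch logic might look like the following sketch. The disposition names and the `within_constraints` flag are assumptions for illustration; the source does not show Constellation's implementation.

```python
def gate(level: str, within_constraints: bool) -> str:
    """Map a trust level and a constraint check to a disposition.

    Shadow agents never act; Preview agents wait for approval;
    Active and Autonomous agents execute inside their constraints,
    with boundary violations escalated to a human.
    """
    if level == "shadow":
        return "review_queue"    # observe only, no autonomous action
    if level == "preview":
        return "needs_approval"  # human must approve before execution
    if not within_constraints:
        return "escalate"        # boundary encountered: hand to a human
    return "execute"             # active/autonomous within constraints
```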
Frequently Asked Questions
How does an AI agent move from Shadow to Autonomous?
Trust level changes are explicit governance decisions — they require a human to evaluate the agent's track record and deliberately increase or decrease trust. This is recorded as a governance trace with the reason for the change.
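A governance trace for a trust-level change could be recorded as a structured record like the sketch below. The field names, agent name, and reason text are hypothetical examples, not Constellation's actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class TrustChangeTrace:
    """Hypothetical governance-trace record for a trust-level change."""
    agent: str
    old_level: str
    new_level: str
    reason: str        # the human rationale required by the framework
    approved_by: str   # the person making the governance decision
    timestamp: str

trace = TrustChangeTrace(
    agent="formatter-bot",
    old_level="preview",
    new_level="active",
    reason="90 days of suggestions approved without modification",
    approved_by="governance-lead",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
```

Capturing the reason and approver alongside the change is what makes the decision auditable later, consistent with the track-record evaluation described above.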