Uniform governance across AI agents will lead to their failure

Applying uniform governance to all AI agents, regardless of their autonomy level and scope, can cause enterprise AI agents to fail, says Gartner – and failures are most likely when organisations do not distinguish between an agent’s ability to act and the scope of access it is granted.

Gartner predicts that by 2027, 40% of enterprises will demote or decommission autonomous AI agents due to governance gaps identified only after production incidents occur.

“Enterprises are treating AI agent governance as binary – either locked down or fully trusted – and that is the root cause of failure,” says Shiva Varma, senior director analyst at Gartner. “Agents operate at different autonomy levels and across different trust boundaries. When the same controls are applied indiscriminately, organisations encounter two common failure modes: over-restriction of simple agents, which slows delivery and drives shadow development; or under-restriction of more autonomous agents, which increases operational, security and compliance risk.”

To mitigate these risks, Gartner recommends applying a proportional governance approach that classifies AI agents across distinct autonomy levels, with each level representing a different trust boundary and corresponding governance requirements.

Figure: AI Agent Autonomy Levels. Source: Gartner (May 2026)

Level 1: Observe

At Level 1, observe agents are limited to read-only access to defined data sources, with outputs visible only to the requesting user. Common use cases include document summarisation, data or knowledge retrieval, and code explanation.

“At this level, governance should focus on baseline controls such as scoped data access, user authentication, usage logging, and basic functional and security testing,” says Varma. “Because risk is limited primarily to data exposure and output accuracy, controls should remain lightweight and targeted.”

Level 2: Advise

Advise agents generate recommendations, drafts or proposed actions, while humans review all outputs and execute actions manually. These agents remain read-only, with no write access to any system, and are commonly used for email drafting, report or code generation, and decision support.

Although humans execute decisions, advisory agents can anchor judgment, creating downstream risk when inaccurate outputs are trusted due to automation bias.

“Governance for advise agents should include all Level 1 controls and extend to addressing output quality and decision influence through accuracy and hallucination testing, domain-specific quality evaluations, and user training on appropriate reliance levels,” says Varma.
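The cumulative relationship between the first two levels can be sketched as a simple lookup. This is an illustrative sketch only, not a Gartner artefact: the names `LEVEL_1_CONTROLS`, `LEVEL_2_CONTROLS`, `REQUIRED_CONTROLS` and `missing_controls` are hypothetical, and the control labels merely paraphrase those Varma lists.

```python
# Hypothetical sketch: labels paraphrase the controls named in the article.
LEVEL_1_CONTROLS = frozenset({
    "scoped_data_access",
    "user_authentication",
    "usage_logging",
    "functional_and_security_testing",
})

# Level 2 includes every Level 1 control and adds output-quality controls.
LEVEL_2_CONTROLS = LEVEL_1_CONTROLS | frozenset({
    "accuracy_and_hallucination_testing",
    "domain_specific_quality_evaluation",
    "user_training_on_reliance",
})

REQUIRED_CONTROLS = {1: LEVEL_1_CONTROLS, 2: LEVEL_2_CONTROLS}


def missing_controls(level: int, implemented: set[str]) -> set[str]:
    """Return the controls required at `level` that are not yet in place."""
    return set(REQUIRED_CONTROLS[level]) - set(implemented)
```

Under these assumptions, an advise agent that has only the Level 1 controls in place would be reported as missing the three Level 2 additions.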

Level 3: Act with approval

At Level 3, agents can execute actions such as writing data, sending communications, or modifying configurations – but only after explicit human approval for every action.

“At this level, human review is effective only if it remains a meaningful control,” says Varma. “Without strong security testing, clear approval workflows with audit trails, and agent‑specific incident response procedures, approvals can degrade under time pressure or approval fatigue, creating a false sense of safety while expanding the attack surface.”
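A per-action approval step with an audit trail might look something like the following. It is a minimal sketch under stated assumptions: `ApprovalGate` and its fields are hypothetical names, the human decision is passed in as a plain boolean, and nothing here addresses approval fatigue itself, which is an organisational rather than a coding problem.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class ApprovalGate:
    """Runs an action only after an explicit human decision, logging every decision."""
    audit_log: list[dict] = field(default_factory=list)

    def execute(self, description: str, action: Callable[[], object],
                approver: str, approved: bool):
        # Record approvals and rejections alike so the audit trail is complete.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": description,
            "approver": approver,
            "approved": approved,
        })
        if not approved:
            return None  # the action is never run without approval
        return action()
```

In this sketch a rejected request still leaves an audit entry, so reviewers can later see how often approvals are being granted or refused, and by whom.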

Level 4: Act autonomously

At the highest autonomy level, agents execute actions independently within defined guardrails – with humans reviewing exceptions, audit logs and aggregated outcomes rather than individual decisions.

“When agents operate autonomously, actions are executed at a scale and speed that can outpace human oversight,” says Varma. “Because accountability for outcomes remains with the organisation, this level requires the most rigorous governance, including continuous monitoring, enforced guardrails, rapid rollback mechanisms, circuit breakers that halt agent operation on threshold violations, and clear ownership for agent behaviour.”
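The circuit breaker Varma mentions can be illustrated with a short sketch. The assumptions are loud ones: `GuardrailViolation` and `CircuitBreaker` are hypothetical names, the "threshold" is modelled as a simple count of guardrail violations, and a real deployment would likely track richer signals such as spend, error rates or time windows.

```python
class GuardrailViolation(Exception):
    """Raised by an action that breaches a defined guardrail (hypothetical)."""


class CircuitBreaker:
    """Halts an agent once violations reach a threshold, until a human resets it."""

    def __init__(self, max_violations: int):
        self.max_violations = max_violations
        self.violations = 0
        self.open = False  # open means the agent is halted

    def run(self, action):
        if self.open:
            raise RuntimeError("circuit open: agent halted pending human review")
        try:
            return action()
        except GuardrailViolation:
            self.violations += 1
            if self.violations >= self.max_violations:
                self.open = True  # trip the breaker on reaching the threshold
            raise

    def reset(self):
        """Intended to be called by the agent's owner after review."""
        self.violations = 0
        self.open = False
```

Once tripped, the breaker refuses every further action, including ones that would have succeeded, which reflects the idea of humans reviewing exceptions rather than individual decisions.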