Threat Intelligence 7 min read

AI Agents Are Running Loose. Governance Is Catching Up.

Kyanite Blue Labs, Threat Intelligence·3 April 2026

The Problem With Autonomous AI Agents

AI agent governance has become one of the most pressing security questions of 2025. Frameworks like LangChain, AutoGen, CrewAI, and Microsoft's own Azure AI Foundry Agent Service have made it straightforward to deploy agents that act independently — booking travel, executing transactions, managing cloud infrastructure, and writing and running code. The deployment friction is low. The governance friction has been almost non-existent. That asymmetry is the problem. When a human employee takes an action, there is a chain of accountability: authorisation, audit logs, access controls, and the ability to reverse a decision. When an autonomous agent takes the same action, none of those guardrails exist by default. The agent just acts. And it acts at machine speed, across multiple systems simultaneously, without pausing for a second opinion. This is not a theoretical concern. Researchers have already demonstrated prompt injection attacks against AI agents — where malicious content embedded in a webpage, document, or API response causes an agent to take actions its operator never intended. An agent with write access to a database, the ability to send emails, or permissions to modify cloud infrastructure is a meaningful attack surface. The more autonomy you grant, the larger that surface becomes.

What Microsoft's Agent Governance Toolkit Actually Does

Microsoft released the Agent Governance Toolkit as an open-source project to address the gap between what AI agents can do and what organisations can control. The toolkit is structured as seven packages, each targeting a specific aspect of agent behaviour and accountability. The core capabilities include policy enforcement for agent actions, permission scoping so agents only access what they need for a given task, audit logging that captures agent decisions and the reasoning behind them, and mechanisms for human-in-the-loop approval on higher-risk operations. There is also tooling to help developers define and test behavioural boundaries before an agent goes into production. Put simply: the toolkit gives developers and security teams a structured way to ask 'what is this agent allowed to do, under what conditions, and who gets notified when it acts?' Those are basic questions. The fact that the AI industry has been shipping agent frameworks without answering them first is the story here. The toolkit is available on GitHub and is designed to work across the major agent frameworks, not just Microsoft's own stack. That is a meaningful signal — Microsoft is not building a proprietary moat here, they are acknowledging that the governance problem affects the entire ecosystem.

  • Policy enforcement: define which actions agents are permitted to take
  • Permission scoping: limit agent access to the minimum required for each task
  • Audit logging: capture agent decisions and the context behind them
  • Human-in-the-loop controls: flag high-risk operations for approval before execution
  • Behavioural boundary testing: validate agent behaviour in pre-production environments

Why This Is a Security Issue, Not Just a Governance One

There is a tendency to frame AI governance as a compliance or ethics question — important, but separate from the day-to-day work of security teams. That framing is wrong, and it creates blind spots. An AI agent is software with credentials. It has access tokens, API keys, and permissions. It interacts with external services. It processes untrusted input. Every one of those characteristics maps directly onto the attack vectors that security teams already defend against. The difference is that agents introduce a new layer of complexity: the agent's behaviour is not fully deterministic, it can be influenced by the content it processes, and its actions can cascade across systems faster than any human operator can intervene. Consider a supply chain attack scenario. An attacker compromises a third-party data feed that an AI agent monitors. The malicious payload is not malware in the traditional sense — it is carefully crafted text designed to manipulate the agent's next action. The agent reads the feed, interprets the instruction, and executes it: deleting records, exfiltrating data, or modifying access controls. No exploit, no vulnerability in the traditional sense. Just an agent doing exactly what it was told, by someone who was not authorised to tell it. This is why attack surface management matters more, not less, when AI agents are in play. Every data source an agent reads, every API it calls, every service it can write to — all of it is part of your attack surface now.

What Organisations Should Do Right Now

The release of Microsoft's toolkit is a useful prompt to audit your own AI agent deployments, whether they are in production or being evaluated. Most organisations that have adopted AI tooling in the past 18 months have done so quickly, often under pressure from leadership to demonstrate AI capability. The governance review has not always kept pace. Here is a practical starting point: First, inventory every AI agent or automated workflow that has access to production systems, customer data, or external services. This sounds obvious, but many teams cannot answer it accurately. Shadow AI deployments — agents stood up by individual teams without formal IT involvement — are common. Second, apply least privilege to agent credentials the same way you would to a human user. An agent that reads customer records for reporting should not also have write access to those records. An agent that sends notifications should not have access to financial data. This is basic access hygiene, but it is routinely skipped in agent deployments because the frameworks make it easy to grant broad permissions by default. Third, treat the data sources your agents consume as part of your threat model. If an agent reads content from the web, from third-party APIs, or from user-submitted documents, those sources can deliver adversarial inputs. Your threat model should account for that. Fourth, define what 'high-risk' looks like for each agent and enforce human approval for those actions. The specific threshold will vary — an agent that can send marketing emails is lower risk than one that can modify DNS records — but the principle is the same. Autonomous does not have to mean unmonitored.

  • Inventory all AI agents with access to production systems or external services
  • Apply least-privilege credentials to agents, reviewed regularly
  • Include agent data sources in your threat model
  • Define and enforce human-in-the-loop thresholds for high-risk actions
  • Log agent decisions and review anomalies as you would any privileged account

The Bigger Pattern: Capability Outpacing Control

Microsoft's toolkit is valuable, but its existence also illustrates a pattern that security professionals have seen before. New capabilities get deployed at speed because the business value is clear and the risks are less visible. Governance follows later, once incidents have occurred or regulators have intervened. We saw this with cloud adoption. Shadow IT proliferated, misconfigured storage buckets became a breach vector of choice, and organisations spent years retrofitting controls that should have been built in from the start. We saw it with APIs — the attack surface expanded dramatically before the industry developed mature API security practices. The AI agent story is following the same arc, except the pace of deployment is faster and the potential for autonomous action is greater. The UK's National Cyber Security Centre (NCSC) has already published guidance on the security implications of large language models, noting that prompt injection is a top concern for LLM-integrated applications. The NCSC's 2024 report on AI cyber security highlighted that 'the security of AI systems depends heavily on the security of the data they are trained on and the inputs they receive at runtime.' That framing applies directly to AI agents. For resource-constrained security teams, the honest answer is that you cannot secure what you cannot see. Before evaluating any specific governance toolkit, you need visibility into what AI-powered processes are operating in your environment and what access they hold. That is an attack surface question before it is an AI governance question.

How Kyanite Blue Can Help You Govern Your AI Attack Surface

The governance challenge that Microsoft's toolkit addresses sits squarely within the broader problem of attack surface visibility. If you do not know which AI agents are operating in your environment, what they can access, and whether their behaviour is being monitored, you cannot manage the risk they introduce. Hadrian, Kyanite Blue's AI-powered attack surface management platform, provides continuous visibility across your external-facing assets — including APIs, cloud services, and third-party integrations that AI agents interact with. Hadrian maps your attack surface as it actually exists today, not as it was documented six months ago, and flags exposures before attackers find them. For organisations deploying AI agents that call external services or process third-party data, Hadrian gives security teams the visibility to identify where those agents create new exposure. You can find out more at /products/hadrian. For organisations concerned about what happens if an AI agent is manipulated into exfiltrating data — whether through prompt injection, compromised credentials, or a supply chain attack on a data source — BlackFog's anti data exfiltration technology provides a last line of defence. BlackFog monitors and blocks unauthorised data transfers at the device and network level, regardless of how the exfiltration attempt is initiated. An agent told to send a file somewhere unexpected will be stopped. See /products/blackfog for details. If you want to understand your organisation's current exposure before investing in specific controls, our team can walk you through a focused review of your AI-adjacent attack surface. Check your data exfiltration risk in two minutes at /data-exfiltration-risk, or contact us directly at /contact to talk through your environment.

Protect Your Business

The threats described in this article are real and ongoing. Kyanite Blue provides the security solutions that prevent these attacks — from endpoint protection to data exfiltration prevention.

Frequently Asked Questions

What is Microsoft's Agent Governance Toolkit and why does it matter for security?

Microsoft's Agent Governance Toolkit is an open-source, seven-package framework that gives developers and security teams controls over autonomous AI agent behaviour. It covers permission scoping, audit logging, policy enforcement, and human-in-the-loop approval for high-risk actions. It matters because AI agents with broad system access represent a real attack surface, and most organisations have deployed them without formal security controls in place.

What security risks do autonomous AI agents introduce?

Autonomous AI agents introduce several risks: they hold credentials and access tokens that can be stolen or misused, they process untrusted external inputs that can be used for prompt injection attacks, and they can take actions across multiple systems faster than humans can intervene. A compromised or manipulated agent can exfiltrate data, modify records, or escalate access without triggering traditional security alerts.

How should businesses apply least privilege to AI agents?

Treat AI agents like privileged service accounts. Grant each agent only the permissions required for its specific task, reviewed and scoped before deployment. An agent that reads data for reporting should not hold write permissions. Credentials should be rotated regularly, scoped to specific services, and monitored for anomalous use — the same controls applied to any human user with elevated access.

AI securityautonomous agentsAI governanceattack surfaceenterprise security

Want to discuss this with our team?

Book a free 20-minute call with David or Max.

Book a call