AI Agent Security: The Runtime Tooling View of Prompt Injection, Data Exfil, and Output Trust
The post-incident review I want to describe happened in Amsterdam in early January. A logistics company had shipped a customer-facing chatbot built on a third-party LLM, hooked into their order-management system through a tool-calling layer, and a customer had stumbled into the prompt that exfiltrated other customers’ shipping addresses. Not a deliberate attacker — a customer who had been frustrated, typed something rude, and triggered an unguarded code path that the QA team had never tested because the QA team thought of prompt injection as something the model provider handled. The model provider did handle some of it. The model provider had not handled the specific combination of polite preamble plus instruction reversal plus tool-call coercion that this customer had landed on by accident. Four hundred and twelve shipping addresses had been returned across a forty-minute window before the on-call engineer noticed the spike in tool-call volume and pulled the integration. The post-incident review took six weeks. The legal exposure took eighteen months to close. The company has since bought a runtime gateway-level defence and an application-level output validator. They had been planning to add both after the chatbot proved out in production. That decision had been made by people who did not know what they were exposing.
This is the operational view of AI agent security in 2026, and the reason this page exists separately from the governance hub and from the governance tooling piece on the same site. Governance is policy, evidence, deployment-gate procedure — the artefact layer the CISO uses to demonstrate that the model was approved and is being monitored. Agent security is the runtime layer that stops a hostile prompt from exfiltrating customer data through a tool call between the request and the response. Both are needed. The procurement is separate. Most enterprises that buy a governance platform discover, at month nine, that they still have no runtime defence, and the order should be reversed for any production-facing deployment. The cost of running blind on agent security while building out governance is the cost of the incident the runtime layer would have caught.
What follows is the four-threat-category model, the four procurement archetypes, the named-vendor scorings against an honest practitioner rubric, and the structural failure mode of treating runtime defence as a checkbox rather than a layered control. The OWASP LLM Top 10 is the shared starting vocabulary, the NIST AI RMF is the broader risk framework that maps over it, and the four-category split below is the one I use in actual procurement engagements because it maps cleaner to which vendor sits where in the pipeline.
The four threat categories that drive procurement
The OWASP LLM Top 10 enumerates ten risks. The four threat categories that actually drive vendor selection collapse some of them together and split others apart, because procurement is driven by which control surface the vendor sits on rather than by the academic completeness of the threat taxonomy. The four below are the ones I have seen real budget signed against in fractional CISO and CTO engagements through 2024–2026.
Prompt injection and jailbreak. The OWASP LLM01 category, and the threat most production-facing deployments materially under-defend. The taxonomy splits into direct injection (the user types the malicious prompt into the application’s input) and indirect injection (the malicious prompt is embedded in content the model retrieves — a webpage, a document, an email — that the model treats as instruction rather than data). The indirect variant is the one the agentic-architecture trend has made much more dangerous, because agents that browse, read documents, or process emails are processing untrusted content as part of their normal operation. The mitigation surface is detection-and-rejection at the input layer (gateway-level pattern detection) plus instruction-vs-data separation at the prompt-construction layer (application-level discipline) plus output validation at the response layer. Single-layer defences do not work; the layered defence is the only credible posture in 2026.
Data exfiltration via tool calls. The combination of OWASP LLM06 (sensitive information disclosure) and LLM08 (excessive agency), and the category that turns prompt injection from a misuse problem into a breach. Agents that hold credentials, have tool-calling permissions, and can invoke functions against the enterprise data plane are exactly the surface an injected prompt is trying to weaponise. The mitigation surface is tool-call permission scoping (the agent should only be able to invoke the functions its specific use case requires, with the narrowest possible data access), tool-call validation (the security layer inspects what the agent is about to invoke before allowing it), tool-call rate limiting (a sudden spike in volume is a runtime alarm regardless of whether each individual call looks legitimate), and output redaction (sensitive fields are stripped from tool responses before they re-enter the model’s context). The Amsterdam incident above was not just a prompt injection; it was a classic identity and access management failure. The agent was over-privileged, and the model was the mechanism that exposed the lack of least-privilege scoping.
Supply chain attack on model weights or prompts. The combination of OWASP LLM03 (training data poisoning), LLM05 (supply chain vulnerabilities), and LLM10 (model theft), and the category that pulls the threat model upstream of the runtime conversation entirely. Enterprises that fine-tune their own models inherit the supply-chain security problem; enterprises that consume third-party model weights from Hugging Face or other model registries inherit it too, even when they think they are only consuming an API. The mitigation surface is model-provenance scanning (verifying the integrity of model artefacts before they enter the deployment pipeline), prompt-template integrity (the same kind of integrity check applied to the system prompts that ship with the application), and dependency-graph monitoring against known compromised model versions. This is the category MLSecOps platforms own, and it is the category most application-engineering-led deployments under-defend because the threat surface looks adjacent to the work they think they are doing.
Output-trust manipulation. The combination of OWASP LLM02 (insecure output handling), LLM09 (overreliance), and the operational risk that the model returns a confident-sounding wrong answer that downstream systems treat as truth. This is the category where the line between security and quality blurs, and where the procurement teams most consistently underbuy. The mitigation surface is output validation (schema-checking, type-checking, sanity-checking against known-good ranges), provenance attribution (the model’s answer is annotated with the retrieved sources it drew from), confidence signalling (the application surfaces uncertainty when the model’s internal signals suggest it), and downstream-system isolation (the model’s output never directly drives a high-stakes action without an intermediate validation step). The Amsterdam incident again: the model’s response was not validated against the should this customer be allowed to see this data check before the tool-call response was returned. That check was a security responsibility that the engineering team had treated as a model-quality concern.
These four categories cover the real procurement conversation. The mapping to OWASP is dense enough that the CISO and the application-security team can have a shared vocabulary, and the split into four categories rather than ten is procurement-friendly enough that the vendor evaluation does not drown in academic completeness.
The four procurement archetypes
The vendor market splits into four archetypes, distinguished by where in the application-to-model pipeline the vendor’s control surface sits. Most production deployments need at least two of the four. Treating them as substitutes is the structural error that produces a vendor evaluation favouring whichever archetype the buyer talked to first.
Runtime guardrails proxy. The archetype that sits as a reverse proxy or sidecar between the application and the upstream model provider, intercepts every request and response, and applies detection-and-rejection rules in the inline path. Prompt Security and Lasso Security lead this archetype; Lakera Guard sits across this and the application-library archetype with both API and SDK modalities; the Cisco Robust Intelligence platform deploys as an inline gateway when configured that way. The procurement strength is that the integration is centralised — one chokepoint, all traffic visible, all enforcement uniform — and the CISO can demonstrate coverage with a single architectural diagram. The procurement weakness is that the proxy can only enforce what it sees, and request bodies that route around the proxy (developer-side test calls, alternative SDK paths, model providers added without security review) do not get the protection. The most common failure mode is incomplete deployment: the proxy covers 80% of traffic and the breach happens in the uncovered 20%.
Runtime monitoring and detection. The archetype that observes the request-response stream — sometimes inline, sometimes via mirrored traffic, sometimes via log ingestion — and produces real-time alerting, evaluation harnesses, and red-team automation against the production deployment. The line between this archetype and the guardrails-proxy archetype is whether the tool blocks malicious traffic in the request path (proxy) or detects and alerts on it after the fact (monitoring). HiddenLayer’s runtime-monitoring product sits here, as does the bulk of the Robust Intelligence platform, as does much of the LLM observability work the governance tooling piece covers from a different angle. The procurement strength is the detection breadth — these tools see novel attacks the inline pattern detection misses, because they are looking at behavioural signal rather than known-bad patterns. The procurement weakness is the response latency — detection without blocking means the attack has already succeeded by the time the alert fires, and the value is mostly in the post-incident understanding rather than the prevention.
MLSecOps platform. The archetype that covers the model supply chain and the training-and-deployment pipeline rather than the runtime request path. Protect AI is the largest of the named vendors in this archetype; HiddenLayer covers the model-artefact-scanning side; the Guardian product line from Robust Intelligence has historically covered the pre-deployment model-evaluation surface. The procurement strength is upstream coverage — these tools catch poisoned weights, vulnerable model dependencies, and compromised training data before the model reaches production, where none of the runtime archetypes can help. The procurement weakness is that the value is invisible until the threat materialises, which is the structural reason these tools are systematically underbought. The mitigation works in proportion to how much of your AI footprint is custom-trained or custom-fine-tuned; for enterprises consuming third-party APIs exclusively, the archetype is less load-bearing and the runtime archetypes are more so.
Application-layer libraries and SDKs. The archetype that lives inside the application code itself, providing output validation, type checking, schema enforcement, and policy logic at the boundary between the model’s response and the downstream system that will consume it. Guardrails AI is the open-source-led leader here; Credal sits in this archetype with the focus on enterprise data-access controls embedded in the application; Lakera’s SDK is the application-side complement to their gateway. The procurement strength is the depth of context — the library sees what the application is trying to do, can validate against the application’s specific schema, and can enforce policy that an inline proxy cannot reason about. The procurement weakness is the engineering effort required to deploy and maintain it; this archetype is meaningful only when the application engineering team owns the security posture in code, which is a cultural commitment the CISO cannot mandate from outside the engineering organisation.
The procurement signal that tells you which archetype to buy is not in the feature list. It is in which question your engineering organisation and security organisation are arguing about. If the argument is we cannot see what our agents are calling in production, you need runtime guardrails proxy plus runtime monitoring. If the argument is we fine-tune our own models and we do not know if the weights are clean, you need MLSecOps. If the argument is the model is occasionally producing outputs that break downstream systems, you need application-layer libraries. Mapping the argument to the archetype takes ten minutes and prevents the nine-month evaluation cycle.
Prompt Security and Lasso Security
These two are the most procurement-friendly entries in the runtime-guardrails-proxy archetype, and they have similar enough postures that they appear on the same shortlist more often than not.
Prompt Security ships as an inline reverse proxy between application and model, supports the major upstream model providers (OpenAI, Anthropic, Google, Azure OpenAI, AWS Bedrock), and produces a security dashboard the CISO can review during an audit. The strength is the breadth of the policy engine — the platform handles prompt-injection detection, sensitive-data redaction in both input and output, model-provider routing, and rate-limiting in a single control plane. The weakness is the detection-rate-on-novel-attacks question that applies to every runtime guardrails proxy in this market; published benchmarks look strong, realised detection on attacks that did not exist when the detector was trained is materially lower. The procurement signal Prompt Security answers cleanly is we need a single chokepoint for our LLM API traffic with security policy enforcement and an audit trail. For mid-to-large enterprises with multiple application teams consuming third-party LLM APIs, this is the cleanest single procurement available.
Lasso Security sits in the same archetype with a tighter focus on the data-loss-prevention surface — what is being sent to the model, what is being returned, and whether either side contains regulated content. The strength is the depth of the DLP integration, particularly for enterprises whose security organisation is already operating a broader DLP programme that the LLM traffic is bypassing. The weakness is the narrower breadth — Lasso is a stronger DLP layer for LLM traffic than Prompt Security, but a narrower general-purpose security gateway. The procurement signal Lasso answers cleanly is we have a regulated industry, a mature DLP posture, and a CISO who needs LLM traffic to be in scope. For finance, healthcare, and government deployments, this is often the right answer over the broader gateway play.
The decision between these two is rarely which vendor and often which procurement question. Prompt Security is the broader gateway; Lasso is the deeper DLP. Some enterprises run both and accept the seat-level overlap; most pick one based on whether the centre of gravity is general-purpose security gateway or DLP-extended-to-LLM.
HiddenLayer
HiddenLayer is the most established of the named-vendor security pure-plays oriented toward the MLSecOps side, and the platform has been broadening into the runtime-monitoring archetype through 2025 and 2026.
The strength on the MLSecOps side is the model-artefact-scanning capability — HiddenLayer can scan model files (Hugging Face artefacts, internal model registries, ONNX exports) for embedded malicious content, detect tampering against signed model versions, and produce supply-chain assurance for the model layer in the same way SAST and SCA tools produce assurance for the application code layer. For enterprises that fine-tune their own models, consume third-party model weights, or operate an internal model registry, this is the highest-leverage purchase in this archetype, and the procurement is often easier to defend than the runtime archetypes because the analogy to existing software-supply-chain security tooling is direct.
The strength on the runtime-monitoring side is the behavioural-detection surface — HiddenLayer’s Model Scanner and AISec Platform produce runtime alerting on anomalous model behaviour, with the framing closer to what attack signature did we just see than what policy was just violated. The procurement weakness is the same as for any monitoring-without-blocking tool; the value is in post-incident understanding rather than prevention, and the team consuming the alerts needs the operational muscle to act on them. The fit is strongest in enterprises that already operate a mature SOC and can absorb the alerts into existing security operations workflows.
The procurement signal HiddenLayer answers cleanly is we have a model supply chain we need to defend and a SOC team that can consume runtime detection alerts. For enterprises whose AI footprint is exclusively third-party API consumption, the platform is heavier than the use case requires; the supply-chain coverage is wasted and the runtime monitoring is better delivered by tools that sit closer to the request path.
Lakera and the Cisco Robust Intelligence position
Lakera is the prompt-injection-and-adversarial-defence specialist in the runtime-guardrails-proxy archetype with simultaneous deployment options as an inline gateway and as an application-side SDK. Lakera Guard’s production-tested track record on prompt-injection detection is the strongest among the named pure-plays I have evaluated, and the integration story is the cleanest for engineering teams that want to drop the protection into application code rather than route through a central proxy.
The procurement complication is the Cisco acquisition. Cisco acquired Robust Intelligence in 2024 and announced its acquisition of Lakera in 2025, consolidating both into the Cisco AI Defense platform. This is the most significant consolidation move in the agent-security category to date and changes the procurement conversation in three ways. First, the standalone roadmap for both Lakera and Robust Intelligence is now subordinated to the Cisco AI Defense product strategy; the pace of pure-play innovation has slowed visibly, and several customers I have talked to have flagged uncertainty about which capabilities will survive the integration. Second, the procurement is now bundled into Cisco’s broader security relationship for enterprises that already buy Cisco; the marginal pricing economics shift in Cisco-anchored shops. Third, the multi-cloud and provider-agnostic posture that the Lakera and Robust Intelligence pre-acquisition products had is being slowly tilted toward Cisco-stack integration, which is a procurement signal worth watching for enterprises whose security posture is explicitly multi-vendor.
The procurement signal Lakera (or Cisco AI Defense, depending on how you want to label the post-acquisition product) answers cleanly is we need the most production-tested prompt-injection defence in the market and our enterprise is comfortable with the Cisco-stack direction of travel. For Cisco-anchored enterprises, the answer is increasingly attractive. For enterprises actively de-risking from Cisco, the procurement conversation is harder and the pure-play alternatives (Prompt Security, Lasso) become more competitive.
The honest read on the Robust Intelligence side, separately, is that the pre-acquisition platform was strong on the model-evaluation-and-red-teaming surface — the Guardian product family — and the Cisco integration has slowed the standalone roadmap as noted. The platform is still credible; the procurement question is whether the eighteen-month forward roadmap will deliver what your evaluation needs, which is harder to answer in 2026 than it was in 2023.
Protect AI
Protect AI is the most complete MLSecOps platform in the named-vendor set, with coverage spanning model-supply-chain scanning, model-vulnerability assessment, MLOps pipeline security, and runtime monitoring. The platform was acquired by Palo Alto Networks in 2025, which is the second major consolidation move in this category after Cisco’s Lakera and Robust Intelligence absorptions.
The strength is the breadth of the MLSecOps coverage. Protect AI’s NB Defense for notebook security, the Recon platform for adversarial robustness testing, the Sightline platform for vulnerability scanning across the AI supply chain, and the Layer platform for MLOps security combined produce the most complete enterprise-ready MLSecOps suite in the market. For enterprises with substantial custom model development, internal model registries, and ML engineering teams operating CI/CD pipelines for models, the platform answers the question how do we secure the AI supply chain more completely than any single competitor.
The procurement complication is the same shape as for Lakera. The Palo Alto Networks acquisition is consolidating Protect AI into the broader Cortex security platform; the standalone roadmap is now subordinated to Palo Alto’s AI security strategy; the multi-vendor posture is being tilted toward Palo Alto-stack integration. For Palo Alto-anchored enterprises, this is favourable; for enterprises actively de-risking from Palo Alto, the procurement is harder.
The procurement signal Protect AI answers cleanly is we develop and fine-tune our own models, we operate ML CI/CD pipelines, and we need a complete supply-chain-to-runtime security posture. For enterprises whose AI footprint is exclusively third-party API consumption, the platform is heavier than the use case requires, and the runtime archetypes are the more relevant purchase.
Guardrails AI and Credal
These two sit in the application-layer-library archetype, with different commercial models and different centres of gravity.
Guardrails AI is the open-source-led leader in application-layer output validation. The library lets engineering teams instrument their LLM applications with type-checking, schema-enforcement, policy validation, and output sanitisation at the boundary between the model’s response and the downstream system. The commercial offering (Guardrails Hub and the enterprise tier) adds a marketplace of community-maintained validators and enterprise-grade support around the open-source core. The procurement strength is the depth of context — the validators run inside the application code, see the schema the application expects, and can enforce policy logic that a runtime gateway cannot reason about. The procurement weakness is the engineering ownership requirement; the value is real only when the application engineering team commits to building and maintaining the validators, which the security organisation cannot mandate from outside.
The procurement signal Guardrails AI answers cleanly is our application engineering teams want to enforce output policies in code and we need the open-source-led tooling to do it. For engineering-led organisations with mature application-security culture, this is the right complement to a runtime gateway. For organisations where the security posture is compliance-led and the engineering teams will not own application-layer security in code, the library will be deployed and quietly fall out of date.
Credal sits in a narrower slice of the application-layer archetype focused on enterprise data-access controls for LLM applications — specifically, the use case where an LLM application needs to query enterprise data sources on behalf of users with respect to those users’ permission models. The platform sits between the application and the underlying data sources and enforces the which user is allowed to see which data check at the boundary, which is the failure point the Amsterdam incident above exemplified. The procurement strength is the specific coverage of the tool-call data-access surface that the runtime gateways do not see at the necessary depth; the platform understands that the application is trying to query the order-management system as User X and enforces User X’s permissions against the response.
The procurement signal Credal answers cleanly is we are building LLM applications that act on enterprise data on behalf of users, and we need user-permission enforcement at the data-access boundary. For customer-facing or employee-facing LLM applications with tool-calling integration into enterprise systems, this is a specific procurement for a specific failure mode, and the value is high in proportion to how much your AI footprint is moving toward agentic-with-tool-calls architectures, which is the dominant direction in 2026.
The four-criterion scoring rubric
The procurement methodology I run against this market mirrors the structure of the governance tooling four-week PoC, with the criteria adjusted for the runtime-defence nature of the procurement.
Criterion one: coverage of the four threat categories against your stack. The test is not whether the vendor’s marketing claims coverage; it is whether the platform produces defensible enforcement against representative attacks in your actual deployment. Five is coverage across all four categories relevant to your architecture, demonstrated against a real red-team test set; one is coverage on the demo set, untested against your stack.
Criterion two: pipeline-position fit. Does the vendor’s control surface sit where your architecture needs it. The gateway-archetype vendors fit deployments routing through a centralised LLM-traffic chokepoint; the application-library vendors fit deployments where the engineering teams own the security posture in code; the MLSecOps vendors fit deployments with substantial custom-model development. Five is the vendor’s pipeline position aligns with your architecture and integration is straightforward; one is the vendor’s pipeline position requires architectural changes you are not committed to making.
Criterion three: detection-and-response latency and the kill-switch posture. Detection without blocking is post-incident analysis; blocking with inadequate kill-switch is operational risk. The test is whether the vendor’s inline enforcement has a defined kill-switch path that the on-call engineer can invoke without the vendor’s support involvement, and whether the runtime monitoring produces alerts at a latency that allows operational response. Five is blocking enforcement with a documented sub-minute kill switch and runtime monitoring at sub-second alert latency; one is detection only, with no operational response path the buyer’s team can execute.
Criterion four: three-year total cost against the consolidation prediction. The agent-security market is consolidating fast; the standalone vendors named above are a moving target. The test is whether the three-year total cost (licence plus integration plus the engineering and SOC-team time to operate) is justified against the likelihood that the platform will be acquired or pivoted during the contract term, and whether the data-portability and exit terms are honest. Five is cost at the low end of the archetype range with clean data-portability terms; one is high cost with five-year-commitment pricing and no defined exit path.
The full four-criterion scoring sheet is published under CC-BY-4.0 and linked from the governance tooling piece. The agent-security overlay is published alongside it.
Where I would buy what, by enterprise shape
A pragmatic short list, scoped to enterprise shape rather than to vendor preference.
For a mid-to-large enterprise consuming third-party LLM APIs from application code with no custom model training: runtime guardrails proxy as the primary purchase (Prompt Security or Lasso Security depending on whether the centre of gravity is general gateway or DLP-extended), application-layer libraries as the engineering-led complement (Guardrails AI for output validation, Credal if tool-calling against enterprise data is involved), and defer the MLSecOps procurement until custom model work materially expands.
For a Cisco-anchored security shop with an existing relationship at the platform level: Cisco AI Defense (the post-acquisition Lakera plus Robust Intelligence platform) as the primary purchase, with the procurement caveats above about the standalone-roadmap uncertainty. The integration economics inside the Cisco stack and the procurement leverage in the renewal cycle make this the default answer for Cisco-anchored enterprises.
For a Palo Alto Networks-anchored security shop with substantial custom model development: Protect AI (now part of Cortex) as the primary MLSecOps purchase, paired with a runtime guardrails proxy (Prompt Security, Lasso, or the Palo Alto-native equivalent as it matures) for the request-path coverage. The same procurement caveats about consolidation apply.
For a regulated-industry enterprise (finance, healthcare, government) with mature DLP and SOC operations: Lasso Security as the primary runtime gateway because the DLP integration is load-bearing, HiddenLayer for the model-supply-chain and runtime-monitoring surface, and Guardrails AI or Credal at the application layer depending on the engineering team’s ownership posture.
For an engineering-led organisation with mature application-security culture and a heterogeneous model footprint: Guardrails AI plus Credal at the application layer as the centre of gravity, with a lightweight runtime guardrails proxy (Prompt Security at the smaller tier) as the central chokepoint, and HiddenLayer or Protect AI for the supply-chain coverage proportional to the custom-model footprint.
For a small or mid-sized organisation with limited security operations capacity and a small AI footprint: a single runtime guardrails proxy (Prompt Security at the entry tier) plus Guardrails AI in the application code is the minimum viable posture, with the explicit acknowledgement that the coverage is partial and the gap is accepted as a known risk against the operational capacity available to manage anything heavier.
None of these recommendations come with a referral fee or a sponsorship. The honest signal of a working agent-security deployment is that the engineering teams accept the inline latency and the SOC team has the alert volume and quality to act on what the tools surface. The signal of a failing deployment is that the tools are deployed, the dashboards exist, and nobody is reading them; the attacks that succeed are the ones the dashboards would have caught if anybody had been looking.
The structural failure mode: checkbox instead of layered defence
The single most predictive variable for whether an enterprise’s agent-security posture will survive contact with a real attack is whether the security team treated the tooling procurement as a checkbox or as a layered defence. The checkbox posture buys one tool from one archetype, deploys it, marks the procurement complete, and discovers at the next incident that the tool was solving a different problem. The layered-defence posture accepts that no single archetype produces credible coverage, buys two or three across complementary archetypes, and accepts the operational complexity as the cost of the posture.
The mechanism is dull and recognisable. A runtime guardrails proxy that catches 90% of prompt-injection attempts at the gateway layer leaves 10% to reach the application; the application-layer output validation catches another 80% of what reaches it; the tool-call permission scoping prevents the remaining 2% from accessing data they should never have been able to query. None of the three tools alone is sufficient; the three together produce a posture that survives the kinds of attacks the Amsterdam logistics company did not. The procurement teams that buy one tool and call the work done are the ones who will be in the post-incident review six months later trying to explain why the one tool was not enough.
The OWASP LLM Top 10 lists prompt injection as item LLM01 and has done so since the first version of the list. The persistence of the ranking is the signal — this is the unsolved problem of the category, not the solved one — and the procurement posture that treats it as solved by any single vendor’s detector is the procurement posture that fails. The honest read in 2026 is that the runtime defence layer is necessary, the application-layer discipline is necessary, the tool-call permission scoping is necessary, and the MLSecOps coverage is necessary in proportion to the custom-model footprint. Buying one is the floor. Buying three or four across complementary archetypes is the working posture. Buying nothing while a production-facing agent ships is the procurement failure mode that produces the next incident.
The Brooks point applies inversely here: adding more layers to a defence-in-depth posture does not reduce its effectiveness the way adding more people to a late project does, because the layers are operationally independent in a way that engineering teams are not. The cost of the additional layers is real (licence, integration, operational tax) and the cost should be modelled against the cost of the incident the layers prevent. For enterprises whose production-facing AI is genuinely production-facing — handling customer data, executing transactions, returning responses that customers act on — the incident-prevention math favours the layered posture by a margin large enough to fund the procurement without further debate.
What I would do, if starting from scratch on agent security in mid-2026, is exactly this. Run the four-week procurement methodology that the governance tooling piece describes, with the criteria adjusted for the runtime-defence character of the work. Pick the archetype your stack requires most urgently — usually runtime guardrails proxy plus application-layer library for third-party API-consuming enterprises, plus MLSecOps when custom model work is substantial. Sign annual or two-year contracts only. Plan the consolidation review at month twelve. Treat the OWASP LLM Top 10 as the shared vocabulary with the broader security organisation and the four-category split as the procurement-driving model. Do not buy one tool and call the work done. The cost of the incident the second tool would have caught is the procurement justification for the second tool.
Sources
- OWASP Top 10 for Large Language Model Applications, 2025 — the shared vocabulary for the threat-category conversation
- NIST AI Risk Management Framework, v1.0, and Generative AI Profile — the broader risk framework the procurement archetypes map under
- EU AI Act, Regulation (EU) 2024/1689 — Article 15 robustness, accuracy, and cybersecurity obligations for high-risk systems
- Prompt Security and Lasso Security — runtime guardrails proxy archetype primary references
- Lakera Guard and Cisco AI Defense (post-acquisition platform combining Lakera and Robust Intelligence) — combined gateway and SDK archetype reference
- HiddenLayer AISec Platform — MLSecOps and runtime-monitoring archetype reference
- Protect AI (now part of Palo Alto Networks Cortex) — MLSecOps archetype primary reference
- Guardrails AI — application-layer library archetype primary reference
- Credal — application-layer data-access enforcement reference
- Related: capabilities hub, agentic AI architecture patterns, enterprise AI red teaming, governance hub, AI governance tools, CISO AI governance responsibilities
The four-threat-category model, the four-archetype split, and the four-criterion scoring sheet are CC-BY-4.0 and linked from the governance tooling piece. Methodology: vendor scorings drawn from fractional CTO, CIO, and CISO engagements (2023–2026) where the agent-security procurement either preceded a production-facing deployment and the deployment shipped without an incident worth naming, or did not and produced the incident that should have informed the original procurement. The pattern is consistent enough to publish.
