Capabilities and Adjacencies: What the Strategy Turns Into

The tactical work — AI-SRE, agentic orchestration, AI for engineering teams, context engineering, readiness assessment, the cost of failed projects. The capability layer that turns an approved strategy into running systems.

A platform team I worked with last year had built — by the standards of any reasonable architect — a clean agentic orchestration layer. Three model providers, a routing tier, a tool-calling layer, an evaluation harness. The diagram fit on one slide. The engineers were proud of it. Eight weeks after launch, the cost line had doubled, the latency P95 had tripled, and the product team was quietly routing around the orchestration layer to call models directly because the abstraction had become the bottleneck. The architecture was correct. The capability had stopped working.

That is what this hub is about. Not the strategy that approved the work, which lives at /, but the layer underneath — the architectures, the tooling, the readiness checks, the failure data, and the operational practices that turn a board-approved strategy into systems that run on a Tuesday afternoon in November without anyone paging the on-call.

The capability layer is where most AI programmes actually fail. The strategy gets signed. The capability does not get built, or gets built wrong, or gets built right and then drifts because nobody owns the maintenance. Survey data from 2025 and 2026 puts enterprise AI project failure rates between 60% and 85% depending on definition; my own engagement data, on a smaller sample, runs in roughly the same range. The mechanism is dull, but the friction is expensive. Strategies are written by people who never touch a terminal. Capabilities are operated by people who were not in the room when the strategy was signed. The bridge between them is this layer.

The capability/strategy split, named

Every page in this cluster sits on one side of a line. On one side: questions the board approved an answer to. On the other: questions an engineering or platform leader has to answer in the next quarter. The split matters because the same vendor will sell you content on both sides and treat the distinction as marketing rather than architecture. It is architecture.

A strategy answers: should we build agentic orchestration. A capability answers: which orchestration pattern, evaluated against which workloads, with what failure modes acknowledged. A strategy answers: should we adopt AI-SRE. A capability answers: which AI-SRE tool, integrated against which existing observability stack, owned by which on-call rotation. A strategy answers: are we ready. A capability answers: where specifically are we not ready, and what does it cost to close that gap.

The deep-dives under this hub each take one of those capability questions seriously, with named tooling where naming names is honest and decision-frameworks where the right answer is “it depends on the constraint you have not stated yet.”

Orchestration and agentic architecture

The orchestration cluster — /capabilities/orchestration/ for the architecture overview, /capabilities/orchestration/agents/ for the agentic patterns — covers the work that sits between your models and your products. In 2026 this is the most volatile capability area on the site. The reference architectures published by the hyperscalers in 2024 are already partially obsolete — they assumed rigid DAG-shaped tool routing and central orchestrators, both of which lost ground in 2025 to looser, more autonomous agent patterns. The agentic-frameworks landscape (LangGraph, AutoGen, CrewAI, the model-vendor-native equivalents) has consolidated faster than the analyst firms have updated their grids.

The position I take in those pages: the right orchestration architecture in 2026 is the smallest one that works, and “works” means “runs in production with an SLA,” not “demos at a board meeting.” Most enterprise teams over-build the orchestration layer in the first year and under-build the evaluation harness, and the result is a system that is impressive in the architecture diagram and brittle in production. The pattern to copy is the one Anthropic and OpenAI both quietly recommend in their cookbooks: fewer layers, more evals, ship the smallest agent that solves the workflow, and add complexity only when measured failure modes demand it.
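
One way to make “more evals” concrete is a harness small enough to run on every change. What follows is a minimal sketch, not a recommended implementation: the `agent` callable, the `EvalCase` structure, the substring pass criterion, and the 0.9 ship threshold are all illustrative assumptions rather than anything taken from the vendor cookbooks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # assumed pass criterion: a substring of the answer


def run_evals(agent: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose answer contains the expected text."""
    if not cases:
        raise ValueError("no eval cases supplied")
    passed = sum(
        1 for case in cases
        if case.must_contain.lower() in agent(case.prompt).lower()
    )
    return passed / len(cases)


def stub_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; swap in your own client.
    return "Refunds are processed within 5 business days."


CASES = [
    EvalCase("How long do refunds take?", "5 business days"),
    EvalCase("Which currency are refunds paid in?", "EUR"),
]

SHIP_THRESHOLD = 0.9  # assumed gate, tune to the workload
pass_rate = run_evals(stub_agent, CASES)
print(f"pass rate {pass_rate:.2f}, ship: {pass_rate >= SHIP_THRESHOLD}")
```

The design choice worth noting is that the harness knows nothing about the orchestration layer. It only sees a function from prompt to answer, so the same cases keep working when layers are added or removed.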

AI-SRE and operations

AI-SRE tooling is the highest-impact capability in this cluster for most enterprises, and it is the area where the product comparisons are most grounded, because the buying decision is concrete. The questions in this category are not strategic: they are which tool, integrated where, owned by which team.

The page treats AI-SRE as three discrete buying motions: the incident-triage tool (Bits AI, the FireHydrant AI features, the Sentry AI-SRE work), the alert-noise-reduction tool (PagerDuty AI, Datadog Watchdog at the higher tier), and the post-incident-analysis tool (incident.io’s AI features, the open-source equivalents). Treating them as one buying decision is the failure mode; they require different data, different integrations, and different on-call adjustments, and the vendors who sell you all three will conflate them. The page names which tools belong in which motion and where the cross-purchases actually make sense.

Three vendor deep-dives sit under this cluster: the full vendor comparison megapage covering all 10+ named vendors against the four-criterion matrix, the head-to-head Traversal vs Resolve AI for the most-searched specific comparison, and the procurement-shopper’s Resolve AI alternatives page. The LLM observability cluster sits adjacent — agentic systems are where AI-SRE and LLM observability converge most.

Engineering teams and the AI-for-engineering question

AI for engineering teams covers the question every VP Engineering has been asked twice this year — what does generative AI do to my engineering org, and what should I do about it. The answer that fits on a slide is “use Cursor or Copilot and measure throughput.” The answer that survives a year is more complicated. Throughput on individual code generation rises measurably; throughput on team-level shipping does not, because the bottleneck moves to code review, integration, and on-call. The page covers the operational implications, the tool selection trade-offs, and the hiring posture changes that follow.
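
The bottleneck argument can be shown with a toy pipeline model. This is a sketch under stated assumptions: the stage names and the weekly capacities (in pull requests per week) are hypothetical, and a real delivery pipeline is not strictly serial.

```python
from typing import Dict

# Hypothetical weekly capacities, in pull requests per week.
before: Dict[str, int] = {"authoring": 40, "review": 42, "integration": 45}

# Assume the coding assistant lifts authoring capacity by 50%.
after: Dict[str, int] = {**before, "authoring": 60}


def shipped_per_week(stages: Dict[str, int]) -> int:
    """A serial pipeline ships at the rate of its slowest stage."""
    return min(stages.values())


def bottleneck(stages: Dict[str, int]) -> str:
    """Name of the stage with the lowest capacity."""
    return min(stages, key=stages.get)


for label, stages in (("before", before), ("after", after)):
    print(label, shipped_per_week(stages), "PRs/week, limited by", bottleneck(stages))
```

With these assumed numbers, a 50% gain in authoring capacity moves shipped work from 40 to 42 per week, and the constraint shifts from authoring to review.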

Context engineering is the related but distinct practice — the design of the data and prompt context that flows into the model layer. The page exists because the executive question “do we need a context-engineering capability” is real and almost always answered wrong on the first pass. Most enterprises do need one. Most build it inside the wrong team.

Readiness, adoption, and the failure-mode work

The diagnostic side of this cluster lives in three pages. Enterprise AI readiness assessment is the four-page diagnostic described in the FAQ: what to check before approving the strategy, how to score it, where the cliffs are. Scalable AI adoption is the deployment-pattern work: how to move from one production workload to twenty without the marginal cost of the twenty-first becoming the budget-killer it usually is. Cost of failed AI projects is the failure-mode catalogue, with the numbers cited from the surveys that have them and the named engagement work where the published numbers are unreliable.

The failure-modes page is the one I would read first if the strategy on the table looks too confident. Failure-mode literacy is the single highest-leverage form of due diligence in enterprise AI, and the published material on it is biased toward the optimistic by a margin large enough to matter.

The named-vendor procurement deep-dives

The capability layer has matured to the point where vendor selection is its own work. Several procurement deep-dives sit under this cluster, each scoring the major named vendors against an honest practitioner rubric.

Productivity and copilots. Enterprise AI Copilots covers the procurement question every CIO is now answering: Microsoft 365 Copilot, Salesforce Agentforce, SAP Joule, ServiceNow AI Agents, Atlassian Rovo, Google Gemini for Workspace, IBM watsonx, plus the horizontal AI search archetype (Glean, Moveworks). The structural problem is that most enterprises end up paying for two or three of these simultaneously without a consolidation plan.

Data foundations. AI Data Readiness is the “before any AI programme survives, the data layer needs to look like this” piece — Alation, Atlan, Collibra, Informatica, Databricks Unity Catalog, Snowflake Cortex, and the four data-readiness checks that pre-date strategy approval.

Retrieval-augmented architecture. RAG Architecture covers the vector-database procurement work — Pinecone, Weaviate, Qdrant, Turbopuffer, Elasticsearch, AWS OpenSearch, Azure AI Search, GCP Vertex AI Vector Search, Algolia — and the four architectural choices (chunking, retrieve-and-rerank, embedding lifecycle, context-window strategy) that decide whether the RAG system pays back.
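
Two of those choices, chunking and retrieve-and-rerank, can be sketched without any vendor in the loop. The code below is a toy illustration only: both scorers are simple lexical stand-ins (in a real system stage one would typically be a vector search and stage two a heavier reranking model), and the function names and parameters are hypothetical.

```python
from typing import List, Set


def chunk_words(text: str, size: int, overlap: int) -> List[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def _tokens(text: str) -> Set[str]:
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(query: str, chunks: List[str], k: int) -> List[str]:
    """Stage 1: cheap, recall-oriented scoring (count of shared words)."""
    q = _tokens(query)
    ranked = sorted(chunks, key=lambda c: len(q & _tokens(c)), reverse=True)
    return ranked[:k]


def rerank(query: str, candidates: List[str], n: int) -> List[str]:
    """Stage 2: stricter, precision-oriented scoring (Jaccard similarity)."""
    q = _tokens(query)

    def jaccard(candidate: str) -> float:
        t = _tokens(candidate)
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)[:n]


doc = ("Invoices are archived after ninety days. Refunds are issued within "
       "five days. Archived invoices can be restored on request.")
chunks = chunk_words(doc, size=8, overlap=2)
query = "when are refunds issued"
print(rerank(query, retrieve(query, chunks, k=3), n=1))
```

The split matters for cost: the first stage runs over every chunk and has to be cheap, while the second stage runs only over the `k` survivors and can afford a more expensive comparison.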

Orchestration framework selection. AI Orchestration Frameworks is the framework-comparison spoke that pairs with the architecture-overview piece — LangChain, LangGraph, CrewAI, AutoGen, LlamaIndex, Haystack, Semantic Kernel, DSPy, MCP, and the honest read on when a framework earns its place over a direct model-vendor SDK call.

Real-time AI. Voice AI Platforms covers the four genuine procurement categories — contact-centre voice agents (PolyAI, Parloa, Vapi, Retell, Bland), meeting AI (Otter, Fathom, Granola, Read, Krisp), voice infrastructure (LiveKit, Twilio, Daily.co), and voice-model providers (ElevenLabs, Cartesia).

Operational AI. AIOps Platforms sits underneath the AI-SRE work above and covers alert correlation, anomaly detection, root-cause analysis, and predictive analytics. The vendors covered are BigPanda, Moogsoft, PagerDuty AIOps, Dynatrace Davis AI, Datadog Watchdog, ServiceNow ITOM, and OpsRamp.

Runtime AI security. AI Agent Security covers the operational runtime-tooling view — Prompt Security, Lasso Security, HiddenLayer, Lakera, Protect AI, Robust Intelligence, Guardrails AI, Credal. Distinct from the policy-level governance work.

Inference compute. GPU Inference Platforms is the CTO-procurement view of the GPU and inference market — hyperscaler-native, inference-API platforms (Together AI, Replicate, Anyscale), serverless GPU (Modal, RunPod), and specialised silicon (Groq, SambaNova, Cerebras).

ML platforms. ML Infrastructure Platforms covers the platform layer wrapping training, model registry, deployment, and evaluation — SageMaker, Vertex AI, Azure ML, MLflow, Weights & Biases, ClearML, Comet, Databricks.

Vertical AI. Legal AI Platforms covers the legal-vertical procurement decision that is increasingly arriving at the CIO’s desk — Harvey, Spellbook, Robin AI, Lexion, Casetext.

Red teaming and the security adjacency

Red teaming sits at the boundary between this cluster and the governance hub. The page covers the testing motion that has emerged as the EU AI Act high-risk requirements have firmed up — what to red-team, how often, by whom, against what threat model. The governance hub covers the policy side; this page covers the operational side. Both pages are read together by the small but growing number of enterprises that have figured out red-teaming is a capability question, not a compliance one.

How this hub fits the rest of the site

Read the root hub first if you have not. The capabilities cluster only makes sense in the context of an approved strategy, and the four-question diagnostic on the root page tells you which capabilities your strategy actually requires. Read the frameworks cluster for the document side and the roadmap cluster for the sequencing. Then come here for the implementation work.

If you are a VP Engineering or platform lead with no AI-strategy responsibility, read this hub first and the root hub second. The capability questions on this page are the ones you will be asked to answer in the next quarter regardless of whether your CEO has signed a strategy document; the strategy work is what you need so that the answers you give point in a consistent direction.

The order of the deep-dives matters. Start with readiness assessment, move to cost of failed AI projects, then to whichever of AI-SRE, orchestration, or engineering teams matches the workload you actually have. The remaining pages are reference material for the specific question you are facing when you face it.


Frequently asked questions

What is the difference between an AI strategy and an AI capability?
A strategy is the document your board approves. A capability is the thing your engineers can actually use on Monday morning. Strategies decide what you will build; capabilities are what you have built. The mistake worth naming: enterprises approve strategies that assume capabilities they do not have, and then spend the next eighteen months trying to assemble the capabilities behind a strategy that has already moved on. The capability layer should lead the strategy by one quarter, not lag it by four.
Where does AI-SRE actually pay off?
In two specific places: incident triage on services with high alert volume and a poor signal-to-noise ratio, and post-incident analysis where the slow step is human attention. AI-SRE pays back fastest at firms with 24/7 production-incident SLAs and an existing observability stack that the AI layer can read from. It pays back slowest where the underlying observability is poor; the AI layer cannot rescue an unobservable system, and pretending it can is the most expensive mistake in this category.
What is the actual failure rate of enterprise AI projects?
Between 60% and 85% depending on whose definition you use and which year the survey covers. The number is high because the published surveys count anything below stated business outcomes as a failure, and the gap between stated and realised outcomes on AI work is wide. The useful version of the number is per-cluster: customer-facing generative-AI deployments fail at roughly 70%, internal productivity deployments fail at roughly 40%, and AI-SRE-style infrastructure deployments fail at roughly 30%. The failure rate is not a property of AI; it is a property of how the workstream was scoped.
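
To see why the per-cluster number is more useful than the headline one, a short worked calculation helps. The three rates below are the approximate figures quoted in this answer; the two portfolio mixes are hypothetical and exist only to show how scoping moves the blended rate.

```python
from typing import Dict

# Approximate per-cluster failure rates quoted in the answer above.
FAILURE_RATE: Dict[str, float] = {
    "customer_facing_genai": 0.70,
    "internal_productivity": 0.40,
    "ai_sre_infrastructure": 0.30,
}


def blended_failure_rate(portfolio: Dict[str, int]) -> float:
    """Expected share of failed projects for a given mix of project counts."""
    total = sum(portfolio.values())
    if total == 0:
        raise ValueError("portfolio is empty")
    expected_failures = sum(FAILURE_RATE[c] * n for c, n in portfolio.items())
    return expected_failures / total


# Hypothetical mixes of 20 projects each.
customer_heavy = {"customer_facing_genai": 10, "internal_productivity": 6,
                  "ai_sre_infrastructure": 4}
infra_heavy = {"customer_facing_genai": 4, "internal_productivity": 6,
               "ai_sre_infrastructure": 10}

print(f"customer-heavy: {blended_failure_rate(customer_heavy):.0%}")
print(f"infra-heavy:    {blended_failure_rate(infra_heavy):.0%}")
```

Under these assumed mixes the blended rate comes out at roughly 53% for the customer-heavy portfolio and 41% for the infrastructure-heavy one, with no change in the underlying technology.
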
Do we need an AI readiness assessment before starting?
Yes, but not the kind most consultancies sell. The useful readiness assessment is a four-page document covering data accessibility, the operating-budget transition path, the existing governance maturity, and the engineering organisation's current capacity. The expensive readiness assessment is a 60-page deck that produces the same four answers a quarter later and costs €200k. Do the four-page version yourself or with one fractional advisor. The 60-page version is for filing, not for deciding.
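
The four-page version can be reduced to a very small scoring structure. This is a sketch only: the four dimensions come from the answer above, but the 0 to 3 scale, the `floor` threshold, and the rule that any single weak dimension blocks readiness are assumptions made for illustration.

```python
from typing import Dict, List

# The four dimensions named in the answer above.
DIMENSIONS = (
    "data_accessibility",
    "operating_budget_transition",
    "governance_maturity",
    "engineering_capacity",
)


def assess(scores: Dict[str, int], floor: int = 2) -> Dict[str, object]:
    """Flag every dimension scoring below `floor` on an assumed 0-3 scale.

    Readiness is gated on the weakest dimension rather than the average,
    so one strong area cannot hide a gap in another.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    gaps: List[str] = [d for d in DIMENSIONS if scores[d] < floor]
    return {"ready": not gaps, "gaps": gaps}


# Hypothetical scores for one organisation.
example = {
    "data_accessibility": 1,
    "operating_budget_transition": 2,
    "governance_maturity": 3,
    "engineering_capacity": 2,
}
print(assess(example))
```

Gating on the minimum rather than the mean is the deliberate choice here: the example averages 2.0, which would pass an average-based check, yet the data-accessibility gap is exactly the kind of finding the assessment exists to surface.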