GenAI TODAY NEWS

When AI Ambitions are Dictated by Cloud Matters

By Special Guest
Mike Hicks, Principal Solutions Analyst, Cisco ThousandEyes

Cloud operations today are, for the most part, mature. Enterprises have a comfort level with cloud: It has a defined role in an operational sense, and there’s enough support available, through a combination of architectural best practices, community, knowledge, visibility and automation, to optimally run most digital applications and workloads in public, private or hybrid cloud environments.

Moreover, cloud technology has become key to widespread access to AI. In years past, only a select few private companies would have had access to the high-performance compute capacity required to run generative AI workloads. Cloud is proving to be the great leveler, making this level of compute accessible, and the AI services that use it available, to anyone who wishes to use them.

But it’s coming at a cost. It’s not necessarily a financial one, although that’s a factor in decision-making. The bigger cost is to cloud optimization approaches. Put simply, widespread and intensive AI adoption is starting to push organizations beyond their comfort zones when it comes to cloud configurations. Targeted action is required to get comfortable with cloud again.

Understanding AI characteristics

To understand why established norms in cloud operations are being tested, one must first understand the nature of the AI workloads that cloud is now being asked to drive.

AI workloads are powerful, both in the sense of the value they can bring to enterprises and the amount of compute resources required to run them at scale.

This will only increase as Agentic AI becomes the dominant type of AI encountered in enterprise environments. Agentic AI signifies a tighter integration of AI technology into business processes, with autonomous or semi-autonomous software agents handling key processes, or parts of those processes, to meet specific goals. These systems can make rapid decisions, manage complex tasks, and adapt to changing conditions, provided the underlying systems perform as expected, but we’ll get to that.

What enterprises need to know is that Agentic AI is more interactive than other forms of AI, “talking” constantly to source systems, data repositories, external tools, databases, and APIs, which makes it a more latency-sensitive evolution of artificial intelligence technology. A cloud or connectivity disruption or failure could prevent an agent-led process from kicking off, or from achieving its intended outcome.
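To illustrate why that constant "talking" matters, here is a minimal sketch, using hypothetical step names and simulated latencies rather than any real agent framework, of an agent-led process that chains dependent calls: latency accumulates across every hop, and a single failed hop aborts the whole process.

```python
import random

def call_service(name, latency_ms, fail_rate=0.0):
    """Simulate one hop in an agent's service delivery chain.

    Returns the hop's latency in milliseconds, or raises
    ConnectionError to stand in for a cloud or connectivity failure.
    """
    if random.random() < fail_rate:
        raise ConnectionError(f"{name} unreachable")
    return latency_ms

def run_agent_process(steps):
    """Run dependent steps in sequence.

    Total latency is the sum of every hop; one failed hop
    means the entire agent-led process fails to complete.
    """
    total_ms = 0
    for name, latency_ms, fail_rate in steps:
        total_ms += call_service(name, latency_ms, fail_rate)
    return total_ms

# Hypothetical chain: source system -> vector store -> external tool -> LLM API
steps = [
    ("source-system", 20, 0.0),
    ("vector-store", 15, 0.0),
    ("external-tool", 30, 0.0),
    ("llm-api", 50, 0.0),
]
print(run_agent_process(steps))  # 115 (ms) when every hop succeeds
```

The point of the sketch is the failure mode: because the steps are dependent, the agent cannot route around a dead hop on its own, which is why the resilience of each link in the chain matters more here than for a one-shot generative AI request.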

The main thing to understand about AI workloads is that they have different characteristics from the workloads used to define cloud operational parameters today. That means past decisions made to run a digital application or workload optimally in the cloud are not always cross-applicable to AI. Today’s cloud setups were not designed to meet this very different set of requirements, nor were they ever intended to.

For enterprises, it’s clear that the same effort that went into optimizing cloud setups for a digital context must now be repeated to optimize cloud setups for AI.

The onus is on enterprises to understand and capture the characteristics of their different AI workloads, such that supporting cloud infrastructure can be architected and configured to meet evolving performance needs.

What this will look like in the cloud

For most enterprises, the reality is that AI and the source systems it taps into run in multiple clouds, in multiple data centers, and across a complex network of owned and unowned connectivity links.

Not all AI services will be available in a local region or zone, and that may be an overriding factor in an enterprise’s choice of AI model.

From an operational excellence perspective, enterprises need to determine where the infrastructure underpinning an AI service and the users of that service are based, to understand whether a cloud environment can support those requirements or if changes need to be made.

This includes understanding the extent of the AI’s exposure to “common” infrastructure, such as a large volume of traffic funneled over a single fiber link, or through a single aggregation point, such as a point-of-presence in a high-density data center with a high concentration of AI service providers present. Such concentration risk and single points of failure may exceed internal risk tolerances, given the increasingly critical role that AI plays.
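A rough sketch of the arithmetic behind that concentration risk, using illustrative availability figures rather than measured values: when everything rides one link, that link’s availability bounds the whole service, whereas two independent diverse routes only fail together when both are down at once.

```python
def availability_single(a):
    """Service availability when all traffic depends on one path."""
    return a

def availability_diverse(a1, a2):
    """Service availability with two independent diverse paths.

    An outage requires both paths to fail simultaneously,
    so unavailability is the product of the two failure rates.
    """
    return 1 - (1 - a1) * (1 - a2)

# Illustrative figures: each path independently available 99.9% of the time
single = availability_single(0.999)
diverse = availability_diverse(0.999, 0.999)
print(f"{single:.4%}")   # 99.9000%
print(f"{diverse:.6%}")  # 99.999900%
```

The independence assumption is the catch: two “diverse” routes that converge on the same fiber link or aggregation point fail together, which is exactly why mapping exposure to common infrastructure comes before any redundancy math.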

Enterprises need to understand how every provider or part of their AI service delivery chain operates. How does a provider prioritize traffic at certain transit or hand-off points? Do they perform their own load balancing? How will this impact AI service delivery? The answers to these questions may give enterprises cause to re-architect their cloud setups to diversify traffic routes and improve redundancy options.

Performance efficiency will be impacted by these decisions. A round-trip response time of 50ms might be acceptable for a basic generative AI application, such as a user asking a question and expecting a contextual response. But for a busy Agentic AI system, if every query response takes 50ms, that quickly adds up. Users may experience excessive transaction times, timeouts, or other congestion- and latency-related issues as a result.
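The way 50ms “quickly adds up” is easy to make concrete with illustrative numbers: a single round trip is imperceptible, but an agentic workflow issuing hundreds of sequential dependent calls turns the same per-call latency into user-visible delay.

```python
def total_latency_ms(round_trip_ms, sequential_calls):
    """Wall-clock time spent purely on round trips when
    calls are made one after another (no parallelism)."""
    return round_trip_ms * sequential_calls

# One user question, one round trip: barely noticeable
print(total_latency_ms(50, 1))    # 50 ms
# A hypothetical agentic workflow making 200 dependent calls:
# the same 50ms per hop becomes 10 full seconds of waiting
print(total_latency_ms(50, 200))  # 10000 ms
```

This is why shaving even a few milliseconds off the round trip, by re-architecting routes or moving workloads closer to their source systems, pays off disproportionately for agentic workloads compared with one-shot generative requests.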

Enterprises can improve performance efficiency by proactively identifying optimization opportunities for traffic and cloud resource usage.
 

About the author: Mike Hicks is a Principal Solutions Analyst at Cisco ThousandEyes. He is a recognized expert in network and application performance, with more than 30 years of industry experience supporting large, complex networks and working closely with infrastructure vendors on application profiling and management. He is the author of "Managing Distributed Applications: Troubleshooting in a Heterogeneous Environment" (Prentice Hall 2000) and "Optimising Applications on Cisco Networks."




Edited by Erik Linask