Cisco and Aible: Autonomous Long-Running Agents From Edge to Core

Enterprises want to transform their operations with autonomous agents, but are finding it challenging to do so safely. Solutions like OpenClaw have generated significant interest in long-running agents that can coordinate with other agents, but OpenClaw carries serious security risks, as Cisco highlighted in a blog post titled “Personal AI Agents like OpenClaw Are a Security Nightmare.” How can enterprises benefit from long-running agents that maintain state across sessions, and do so safely?

NVIDIA recently published a Multi-Agent Intelligent Warehouse blueprint that can help organizations optimize warehouse operations through intelligent automation, real-time monitoring, and natural language interaction. But what would be the impact of such agents on business KPIs, such as the revenue saved by avoiding stockouts versus the cost of expediting a shipment? How can agents at the edge detect the risk of a stockout and contact centralized agents that can optimize shipments across the supply chain? How do we secure such a network of agents with model vulnerability scanning, real-time guardrails, and similar controls? At NVIDIA GTC, Aible, Cisco, and Vaidio will demonstrate such a use case at the Cisco booth.

Secure Business-Optimized Agents at the Core

Here, an Aible long-running agent (similar to OpenClaw but designed for enterprise security and governance) ran on a Cisco Compute Blade Server with the NVIDIA Nemotron 3 model, powered by NVIDIA accelerated computing. The long-running agent leveraged an Aible business-optimized predictive agent to augment the NVIDIA blueprint. The predictive agent balanced the risk of a potential stockout against the cost of expediting a shipment, and showed business leaders the economic impact of running a fleet of such agents across their warehouses and stores. Business users can adjust the agent by chatting with it about business constraints, such as how much inventory they can afford to carry or how much they can spend on expedites, and then observe the agents automatically adjust their behavior to conform to those constraints. The long-running agent uses this predictive agent as a tool to decide when to request a pickup or an expedite for a specific SKU from the NVIDIA Multi-Agent Intelligent Warehouse.
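The trade-off described above can be illustrated with a minimal sketch. It assumes a simple expected-loss rule and a single expedite budget constraint; the `SkuSnapshot` fields, the `should_expedite` function, and all numbers are hypothetical and are not taken from the Aible predictive agent or the NVIDIA blueprint.

```python
from dataclasses import dataclass


@dataclass
class SkuSnapshot:
    """Hypothetical inputs a predictive agent might supply for one SKU."""
    sku: str
    stockout_probability: float      # chance of running out before the next regular delivery
    lost_revenue_if_stockout: float  # revenue at risk if the shelf goes empty
    expedite_cost: float             # cost of rushing a shipment


def should_expedite(snapshot: SkuSnapshot, remaining_expedite_budget: float) -> bool:
    """Expedite only when the expected loss exceeds the expedite cost
    and the business constraint (remaining budget) allows it."""
    expected_loss = snapshot.stockout_probability * snapshot.lost_revenue_if_stockout
    return (
        expected_loss > snapshot.expedite_cost
        and snapshot.expedite_cost <= remaining_expedite_budget
    )


coffee = SkuSnapshot("coffee", stockout_probability=0.6,
                     lost_revenue_if_stockout=500.0, expedite_cost=120.0)

# Expected loss is 0.6 * 500 = 300, which exceeds the 120 expedite cost.
print(should_expedite(coffee, remaining_expedite_budget=200.0))  # True
# Tightening the budget constraint changes the decision.
print(should_expedite(coffee, remaining_expedite_budget=100.0))  # False
```

Changing `remaining_expedite_budget` stands in for a business user telling the agent how much can be spent on expedites; a real system would likely weigh inventory carrying cost and other constraints as well.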

Creating, Configuring, and Interacting With the Warehouse Expedite (Core) Long-running Agent

Cisco provided several key capabilities to secure the long-running agent. First, the entire set of agents ran securely on Cisco Secure AI Factory with NVIDIA. Cisco’s AI Defense software was used to evaluate the Nemotron 3 open models running on that server across multiple security and alignment metrics. Moreover, as each agent leveraged the Nemotron 3 open models, Cisco’s AI Defense evaluated each interaction live and immediately shut down any insecure requests.

Securing the Aible Long-running Agent with Cisco AI Defense

The long-running agent from Aible also leveraged several built-in protections designed with IT governance in mind. OpenClaw can perform some incredibly flexible tasks, but it does so by generating code on the fly. This is a serious problem in the enterprise because such code is never consistent. Imagine two executives asking the same question and getting different answers. Which answer is correct? How do we figure out the nuances of the auto-generated code that led to the different answers? The Aible long-running agent doesn’t write arbitrary code; instead, it works through an extensive deterministic tools layer that can perform most kinds of data transformation, data preparation, data analysis, predictive modeling, and optimization tasks, and do them in an enterprise-governed way. For example, if a user query requires a new transform to be created, that transform is saved, and any subsequent user queries with a similar request use the same transform. Essentially, Aible forces long-running agents to follow the golden data, golden transform, and consistent reuse principles that are well established in enterprise IT. Moreover, all of the activities of the various agents are logged, and each agent is allowed to interact only with very specific datasets and predictive models. The agents never have write access to the underlying data. The core of Aible long-running agents predates OpenClaw and does not share a single line of code with that open source project.
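The transform-reuse principle described above can be sketched as a small registry. This is an illustration under stated assumptions, not Aible's implementation: the `TransformRegistry` class, its methods, and the example transform are all hypothetical, and "similar request" is reduced here to a case- and whitespace-insensitive match.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Transform = Callable[[List[Row]], List[Row]]


class TransformRegistry:
    """Hypothetical store of saved transforms, with an audit log of every lookup."""

    def __init__(self) -> None:
        self._transforms: Dict[str, Transform] = {}
        self.audit_log: List[str] = []

    @staticmethod
    def _key(request: str) -> str:
        # Simplifying assumption: requests match if they differ only in case or spacing.
        return " ".join(request.lower().split())

    def get_or_create(self, request: str, build: Callable[[], Transform]) -> Transform:
        key = self._key(request)
        if key not in self._transforms:
            # First time this request is seen: build the transform once and save it.
            self._transforms[key] = build()
            self.audit_log.append(f"created:{key}")
        else:
            # Later requests reuse the saved transform instead of generating new logic.
            self.audit_log.append(f"reused:{key}")
        return self._transforms[key]


def build_margin_transform() -> Transform:
    def add_margin(rows: List[Row]) -> List[Row]:
        # Returns new rows and never mutates the input, mirroring read-only data access.
        return [{**row, "margin": row["price"] - row["cost"]} for row in rows]
    return add_margin


registry = TransformRegistry()
first = registry.get_or_create("Add margin column", build_margin_transform)
second = registry.get_or_create("add  MARGIN column", build_margin_transform)

print(first is second)     # True: both requests resolve to the same saved transform
print(registry.audit_log)  # ['created:add margin column', 'reused:add margin column']
```

Because both executives' requests resolve to the same saved transform, they get the same answer, and the audit log records which transform served each request.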

Secure Agents Collaborating at the Edge

A different set of agents ran on Cisco Unified Edge at each store location (the edge), with NVIDIA accelerated computing. The first was a stock counter video agent from Vaidio. This agent continuously monitored video feeds at the store to detect when a SKU was removed from the shelf, whether through purchase, theft, or damage. Vaidio uses object models and prompt-based VLMs to accurately detect when a person takes an object off the shelf, and sends the appropriate SKU for further processing based on shelf location, even without a good camera view of the object.

The Vaidio agent, using the NVIDIA Metropolis application framework, then securely called an Aible agent running at the same edge location and passed on information such as “10 units of coffee left.” The language model used by the Aible agent, as well as each interaction with that agent, was secured by Cisco AI Defense exactly as for the agents at the core. In fact, enterprises can consolidate alerts from both the core and edge agents to conduct a comprehensive AI security evaluation of the entire system of agents across core and edge with Cisco AI Defense.
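As a rough illustration of the hand-off described above, the sketch below maps a detected removal at a shelf location to a SKU and produces the kind of message passed to the edge agent. It assumes a simple planogram lookup; the `ShelfCounter` class, the `(aisle, shelf)` location format, and the starting counts are hypothetical and do not reflect the Vaidio or NVIDIA Metropolis APIs.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

ShelfLocation = Tuple[int, int]  # hypothetical (aisle, shelf) coordinates


@dataclass
class ShelfCounter:
    """Hypothetical stock counter keyed by shelf location rather than by object appearance."""
    planogram: Dict[ShelfLocation, str]  # which SKU lives at which shelf location
    units_on_shelf: Dict[str, int]       # current count per SKU

    def record_removal(self, location: ShelfLocation, quantity: int = 1) -> str:
        # The SKU is inferred from where the pick happened, so a clear view
        # of the object itself is not required.
        sku = self.planogram[location]
        self.units_on_shelf[sku] = max(0, self.units_on_shelf[sku] - quantity)
        return f"{self.units_on_shelf[sku]} units of {sku} left"


counter = ShelfCounter(planogram={(3, 2): "coffee"}, units_on_shelf={"coffee": 11})
message = counter.record_removal((3, 2))
print(message)  # 10 units of coffee left
```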

Grocery Restock at Store (Edge) Agent: End-to-end Automation From Vaidio Object Detection to Automated Expedite via Aible Long-running Agent

The secure Aible agent at the edge enriches this information with additional context, such as details about the store and other recent purchases, and then periodically calls the central Aible long-running agent with a set of SKUs that may need replenishment. The long-running agent weighs requests from multiple stores, the inventory levels in the warehouse, the expedite shipment costs, and the impact of stockouts, and then decides whether to expedite a shipment or request that a SKU be picked up from the dock.
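The core agent's cross-store decision can be sketched as follows. This is a simplified illustration under stated assumptions: the `ReplenishmentRequest` type, the `aggregate_requests` and `plan_fulfilment` functions, the three outcomes ("pickup", "expedite", "wait"), and every number are hypothetical, and the real agent considers more factors than this rule does.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReplenishmentRequest:
    """Hypothetical message an edge agent sends to the core agent."""
    store_id: str
    sku: str
    units_needed: int


def aggregate_requests(requests: List[ReplenishmentRequest]) -> Dict[str, int]:
    """Sum requested units per SKU across all stores."""
    totals: Dict[str, int] = defaultdict(int)
    for request in requests:
        totals[request.sku] += request.units_needed
    return dict(totals)


def plan_fulfilment(
    demand: Dict[str, int],
    warehouse_stock: Dict[str, int],
    expedite_cost: Dict[str, float],
    stockout_impact: Dict[str, float],
) -> Dict[str, str]:
    """Pick up from the dock when the warehouse can cover demand; otherwise
    expedite only if the stockout impact outweighs the expedite cost."""
    plan: Dict[str, str] = {}
    for sku, units in demand.items():
        if warehouse_stock.get(sku, 0) >= units:
            plan[sku] = "pickup"
        elif stockout_impact.get(sku, 0.0) > expedite_cost.get(sku, float("inf")):
            plan[sku] = "expedite"
        else:
            plan[sku] = "wait"
    return plan


requests = [
    ReplenishmentRequest("store-a", "coffee", 20),
    ReplenishmentRequest("store-b", "coffee", 15),
    ReplenishmentRequest("store-a", "milk", 30),
    ReplenishmentRequest("store-b", "tea", 5),
]
demand = aggregate_requests(requests)
plan = plan_fulfilment(
    demand,
    warehouse_stock={"coffee": 50, "milk": 10, "tea": 0},
    expedite_cost={"milk": 100.0, "tea": 80.0},
    stockout_impact={"milk": 400.0, "tea": 30.0},
)
print(demand)  # {'coffee': 35, 'milk': 30, 'tea': 5}
print(plan)    # {'coffee': 'pickup', 'milk': 'expedite', 'tea': 'wait'}
```

Aggregating before deciding is the point of the central agent: a single store's request for coffee might look urgent in isolation, while the combined view shows the warehouse can cover both stores from existing stock.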

This level of automated coordination across stores and warehouses, especially while weighing the impact on overall business KPIs like stockout risk, inventory carrying cost, and expedite cost, is beyond what people can do manually today. That is even more true in turbulent times, when customer purchase behavior, supply chain costs, and inventory costs are constantly changing.

What makes this kind of interaction possible isn’t just better open AI models. It’s the long-running nature of the agents and where those open models run. Retail use cases like stockout detection depend on real-time decision-making. Latency is one part of the equation, but it’s not the only challenge retailers face. Reliability matters. When AI relies on constant round trips to a centralized cloud, even small delays can disrupt the experience. Bandwidth constraints, connectivity interruptions, and rising data movement costs can quickly turn promising use cases into operational headaches.

There’s also the question of data sovereignty. Much of the data generated inside the store (video feeds, customer interactions, operational signals) is sensitive by nature. Retailers increasingly want control over where the data is processed and how it’s handled, rather than pushing everything to a distant cloud or enterprise data center.

Here, different agents running at the edge coordinate to make the right decisions locally. They then call the centralized agent when a decision has to be made about whether to ship additional inventory. The autonomous long-running agents at the core consider multiple business factors across stores to make the right business decisions. Users can review the impact of such decisions on their business KPIs and adjust the agents as they wish simply by interacting with them. And the end-to-end system of agents is secured by Cisco Secure AI Factory with NVIDIA and Cisco AI Defense.

This fleet of agents leveraging technology from NVIDIA, Cisco, Aible and Vaidio can be seen at the Cisco booth at NVIDIA GTC. You can also try creating your own Aible autonomous long-running agent leveraging Cisco AI Defense at the Aible booth at NVIDIA GTC.

Aible on Cisco: Secure, Business-Optimized Autonomous Long-Running Agents From Edge to Core
