Empowering Fortune 500 companies and government organizations to realize business impact.

Business User Feedback with Reasoning Steps

Over the last several months, Aible worked with NVIDIA through the NVIDIA Inception program to build the Aible Intern Agent solution that is optimized for converged architectures, explains its reasoning steps, lets the user train it simply by providing feedback on those steps, and constantly adjusts to optimize for the specific use case.

Agents Can Run Serverless or on Servers on NVIDIA Superchips

We believed that NVIDIA’s superchip design, which combines CPUs and GPUs coherently over a high-speed interface, would accelerate the agents significantly. To test this, we placed the entire Aible stack on the Grace CPU: the user interface, the mechanisms for Retrieval Augmented Generation (RAG) over structured and unstructured data, the model coordination capabilities, and the automated post-training capabilities. We partitioned the Hopper GPU using techniques like MIG so that the multiple models the agent needs could run at the same time.
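As a rough illustration of the partitioning idea, the sketch below checks whether a plan for sharing one GPU between several models fits. It is a simplified sketch, not Aible's configuration: the model names are hypothetical, and it assumes only that MIG divides a GPU into at most seven compute slices with instance sizes of 1, 2, 3, 4, or 7 slices. Real MIG also has memory and placement rules that are not modeled here.

```python
# Assumption: MIG exposes at most 7 compute slices per GPU, and a single
# instance spans 1, 2, 3, 4, or 7 of them. Memory sizing and placement
# constraints are deliberately ignored in this sketch.
MAX_SLICES = 7
VALID_INSTANCE_SIZES = {1, 2, 3, 4, 7}


def spare_slices(plan: dict[str, int]) -> int:
    """Validate a {model_name: slice_count} plan and return the unused slices."""
    for model, size in plan.items():
        if size not in VALID_INSTANCE_SIZES:
            raise ValueError(f"{model}: {size} slices is not a valid instance size")
    used = sum(plan.values())
    if used > MAX_SLICES:
        raise ValueError(f"plan needs {used} slices, only {MAX_SLICES} available")
    return MAX_SLICES - used


# Hypothetical two-model agent: both models stay resident on the same GPU.
plan = {"retrieval_model": 3, "answer_model": 4}
print(spare_slices(plan))  # 0
```

A plan that asks for more than seven slices, or for an instance size such as 5, raises a `ValueError` instead of returning.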

Aible on Single NVIDIA Servers or Superchips Outperforms Cloud

Even with a very simple agent of just two models and three steps, the superchip was more than twice as fast as running the agent on a typical cloud architecture in which the different models each run optimally on separate servers. This is because the agent management code in the cloud has to work asynchronously with each of the models underlying the agent, while on the superchip the coordination can be synchronous. Moreover, we knew the precise performance characteristics of each individual model and could control their relative performance through how we allocated GPU resources, which concurrency settings we used for each model, and so on. That let us optimize the agent for end-to-end performance.
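To make the end-to-end tuning idea concrete, here is a toy sketch of a two-model, three-step agent. Everything in it is an assumption for illustration: the model names, the per-step work figures, the idea that a model's latency scales inversely with the share of GPU slices it gets, and a fixed per-call coordination overhead standing in for remote calls. It does not reproduce the measurements described above.

```python
from dataclasses import dataclass

TOTAL_SLICES = 7  # assumption: the GPU is shared out in 7 equal slices


@dataclass(frozen=True)
class Step:
    model: str   # hypothetical model name
    work: float  # hypothetical seconds the step would take on a single slice


def end_to_end_latency(steps, slices, overhead_per_call=0.0):
    """Sequential latency: each step waits for the previous one to finish.

    Assumes latency is work / slices (linear scaling), plus a fixed
    coordination overhead for every model call.
    """
    return sum(s.work / slices[s.model] + overhead_per_call for s in steps)


def best_split(steps, model_a, model_b, total=TOTAL_SLICES):
    """Try every way to divide the slices between two models; keep the fastest."""
    candidates = ({model_a: k, model_b: total - k} for k in range(1, total))
    return min(candidates, key=lambda split: end_to_end_latency(steps, split))


# Two models, three steps: plan, generate, then check the result.
STEPS = [Step("planner", 2.0), Step("generator", 6.0), Step("planner", 1.0)]

split = best_split(STEPS, "planner", "generator")
local = end_to_end_latency(STEPS, split)
remote = end_to_end_latency(STEPS, split, overhead_per_call=0.5)
print(split, local, remote)
```

With these made-up numbers the search gives the planner three slices and the generator four, because the generator carries most of the work. Tuning each model in isolation would miss that trade-off, which is the point of optimizing for the whole pipeline rather than per model.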

Resources

ACCESS NEWSWIRE

AI for Business Users at Enterprise Scale

Aible, in collaboration with NVIDIA, will feature how the joint solution empowers Fortune 500 companies and government organizations.

NVIDIA AI PODCAST

CVS Health and Aible

Delivering Enterprise AI with Rapid Prototyping

BLOG

NVIDIA DGX Cloud Serverless Inference

NVIDIA DGX Cloud Serverless Inference is an auto-scaling AI inference solution