NVIDIA Dynamo Planner Brings SLO-Driven Automation to Multi-Node LLM Inference

Microsoft and NVIDIA have released Part 2 of their collaboration on running NVIDIA Dynamo for large language model inference on Azure Kubernetes Service (AKS). The first announcement aimed for a raw throughput of 1.2 million tokens per second on distributed GPU systems. Now, this latest release focuses on helping developers work faster and improving operational …

Microsoft Research Introduces AIOpsLab: A Framework for AI-Driven Cloud Operations

Microsoft Research unveiled AIOpsLab, an open-source framework designed to advance the development and evaluation of AI agents for cloud operations. The tool provides a standardized and scalable platform to address challenges in fault diagnosis, incident mitigation, and system reliability within complex cloud environments. As microservices and serverless architectures become standard in enterprise IT, their complexity …