Microsoft Build 2024: CEO Nadella Declares 'A Golden Age Of Systems'
'I still remember distinctly the first time Win32 was discussed … .Net, Azure. These are moments that I’ve marked my life with. And it just feels like we’re, yet again, at a moment like that,' Microsoft CEO Satya Nadella said in his keynote at Build 2024.
Microsoft Chairman and CEO Satya Nadella declared "a golden age of systems" in his Build 2024 keynote Tuesday, taking time to walk an audience of developers through the tech giant's ongoing innovations in artificial intelligence across infrastructure, data, tooling, applications and more.
The Redmond, Wash.-based vendor has made breakthroughs in making computers understand humans instead of needing laypeople to understand computers, plus making computers assist humans in reasoning, planning and taking actions, Nadella said.
"This is, like, maybe the golden age of systems," Nadella said. "What's really driving it? I always come back to the scaling laws. Just like Moore's Law helped drive the information revolution, the scaling laws of DNNs [deep neural networks] are really—along with the model architecture, interesting ways to use data, generate data—driving this intelligence revolution."
While Intel co-founder Gordon Moore said that the number of transistors on an integrated circuit should double every two years with minimal cost increases, the DNN scaling law sees "doubling every six months," Nadella said.
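To put those two doubling rates side by side, here is a minimal arithmetic sketch in Python. It assumes idealized exponential growth. The function name `growth_factor` and the 24-month horizon are illustrative choices, not figures from the keynote.

```python
def growth_factor(months_elapsed: float, doubling_period_months: float) -> float:
    """Return how many times larger a quantity becomes if it doubles
    once every `doubling_period_months` (idealized exponential growth)."""
    return 2 ** (months_elapsed / doubling_period_months)


# Illustrative horizon: two years.
horizon_months = 24

# Moore's Law as described above: doubling every two years (24 months).
moore = growth_factor(horizon_months, 24)

# The DNN scaling rate Nadella cited: doubling every six months.
dnn = growth_factor(horizon_months, 6)

print(f"Doubling every 24 months: {moore:.0f}x after {horizon_months} months")
print(f"Doubling every 6 months:  {dnn:.0f}x after {horizon_months} months")
```

Under these assumptions, a quantity that doubles every two years grows 2x over 24 months, while one that doubles every six months grows 16x over the same period.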
Nadella made multiple comparisons to tech history to illustrate the gravity of the AI moment. Microsoft's Windows Copilot Runtime will do for AI what 32-bit Windows did for GUIs, he said. The rollout of Copilot across organizations reminds Nadella of the start of the PC era, with both "democratizing expertise."
As for scaling, over the past six months Microsoft has added to Azure 30 times the supercomputing power of the AI supercomputer it announced last November, Nadella said. GPT-4, from Microsoft-backed OpenAI, has become 12 times cheaper and six times faster since its launch.
Looking ahead, Nadella predicted that the next exciting iteration of Copilot work will involve users making agents that can perform work in the background asynchronously. "That's one of the key things that's going to really change in the next year," he said.
Although the two men didn't share the stage, speaking after Nadella Tuesday was Sam Altman, CEO of OpenAI, the creator of ChatGPT, Dall-E and other popular generative AI tools.
Altman remarked that AI models keep getting smarter and that the hype has reached heights seen during the mobile phone revolution of the late 2000s when every company spoke about their mobile capabilities and whether they had a mobile app. "A few years later, no one said they were a mobile company because it was like table stakes," Altman said.
In an observation that could apply to not just developers but executives and solution providers as well, Altman said that adopting AI, like adopting other evolutionary technologies, "doesn't get you out of the hard work of building a great product or a great company or a great service."
"AI alone is a new enabler, but it doesn't automatically break the rules of business," Altman said. "You still have to figure out how you are going to build enduring value in whatever you are doing. And it is easy to lose sight of that in the excitement of the gold rush."
Here is more of what Nadella said during his Build 2024 keynote.
A 'Golden Age Of Systems'
I still remember distinctly the first time Win32 was discussed … .Net, Azure. These are moments that I've marked my life with.
And it just feels like we're, yet again, at a moment like that. It is just that the scale, the scope is so much deeper, so much broader this time around. Every layer of the tech stack is changing.
Everything—from the power draw and the cooling layer of the data center to the NPUs and the edge—are being shaped by these new workloads, these distributed, synchronous … workloads are reshaping every layer of the tech stack.
But if you think about even going all the way back to the beginning of modern computing … there have been two real dreams we've had.
First is can computers understand us instead of us having to understand computers? And second, in a world where we have this ever-increasing information of people, places and things, as you digitize more artifacts … can computers help us reason, plan and act more effectively on all that information? … And here we are, I think that we have real breakthroughs on both fronts. … This is maybe the golden age of systems.
What's really driving it? I always come back to the scaling laws. Just like Moore's Law helped drive the information revolution, the scaling laws of DNNs are really—along with the model architecture, interesting ways to use data, generate data—driving this intelligence revolution.
You could say Moore's Law was probably more stable in the sense that it was scaling at maybe 15 months, 18 months. We now have these things that are … doubling every six months.
What we have, though, with the scaling laws is a new natural user interface that is multimodal—that means it supports text, images, video as input and output.
We have memory that retains important context, recalls both our personal knowledge and data across our apps and devices. We have new reasoning and planning capabilities that help us understand very complex context and complete complex tasks while reducing the cognitive load on us.
But what stands out to me as I look back at this past year is how you all as developers have taken all of these capabilities and applied them, quite frankly, to change the world around us. … The rate of diffusion is unlike anything I've seen in my professional career. And it's just increasing.
Windows Copilot Runtime
What Win32 was to graphical user interface, we believe the Windows Copilot Runtime will be for AI. It starts with our Windows Copilot Library, a collection of these ready-to-use local APIs. … This includes no-code integrations for Studio Effects, things like creative filters, teleprompter, voice focus and much more.
But of course, if you want to access these models … you can directly call them through APIs. We have 40-plus models available out of the box, including Phi-Silica, our newest member of our small language family of models, which we specifically designed to run locally on your NPUs on Copilot+ PCs, bringing that lightning-fast local inference to the device.
The other thing is that Copilot Library also makes it easy for you to incorporate RAG [retrieval-augmented generation] inside of your applications with on-device data.
It gives you the right tools to build a vector store within your app. … We will be natively supporting PyTorch and the new WebNN [Web Neural Network] framework through Windows DirectML.
Native PyTorch support means thousands of OSS [open-source software] models will just work out of the box on Windows, making it easy for you to get started.
In fact, with WebNN, web developers finally have a web-native machine learning framework that gives them direct access to both GPUs and NPUs. … Both PyTorch and WebNN are available in developer preview today.
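As a rough illustration of the RAG pattern with an in-app vector store that Nadella describes above, the following Python sketch retrieves the most relevant on-device text and folds it into a prompt. It does not use the Windows Copilot Library, whose API is not shown in these remarks. The names `embed`, `VectorStore` and `build_prompt` are hypothetical, and the bag-of-words "embedding" is a toy stand-in for a real embedding model.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real app would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory vector store holding (text, vector) pairs."""

    def __init__(self):
        self._items = []

    def add(self, text):
        self._items.append((text, embed(text)))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(
            self._items, key=lambda item: cosine(q, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


def build_prompt(question, store, k=1):
    """Retrieval-augmented prompt: retrieved context followed by the question."""
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.add("Meeting notes: the budget review is on Friday")
store.add("Recipe: add two cups of flour and one egg")

prompt = build_prompt("when is the budget review", store)
print(prompt)
```

In a real application, the resulting prompt would be passed to a language model, and the store would hold vectors from an embedding model rather than word counts.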
The Copilot Stack
We've always been a platform company. And our goal is to build the most complete, end-to-end stack—from infrastructure to data to tooling to the application extensibility so that you can apply the power of this technology to build your own applications. … We have the most complete, scalable AI infrastructure that meets your needs in this AI era.
We're building Azure as the world's computer. We have the most comprehensive global infrastructure with more than 60-plus data center regions, more than any other cloud provider.
Over the past year we've expanded our data center regions and AI capacity from Japan to Mexico, from Spain to Wisconsin. We're making our best-in-class AI infrastructure available everywhere. … At the silicon layer, we are dynamically able to map workloads to the best accelerator AI hardware so that we have the best performance.
And our custom I/O [input/output] hardware and server designs allow us to provide dramatically faster networking, remote storage and local storage throughput.
This end-to-end approach is really helping us get to the unprecedented scale. In fact, last November, we announced the most powerful AI supercomputer in the cloud for training using just actually a small fraction of our infrastructure. And over the past six months, we've added 30 times that supercomputing power to Azure. It is crazy to see the scale.
Nvidia, AMD Partnerships
We are not just scaling training fleets. We are scaling our inference fleet around the world, quadrupling the number of countries where Azure AI Services are available. … We offer the most complete selection of AI accelerators, including from Nvidia and AMD as well as our own Azure Maia, all dynamically optimized for the workloads.
That means whether you're using Microsoft Copilot or building your own copilot apps, we ensure that you get the best accelerator performance and the best cost. … You see this in what has happened with GPT-4. It's 12X cheaper and 6X faster since it launched. … It all starts, though, with this very deep, deep partnership with Nvidia, which spans the entirety of the Copilot stack, across both all of the hardware innovation as well as the system software innovation.
Together, we offer Azure confidential computing on GPUs to really help you protect sensitive data around the AI models end to end. … We will be among the first cloud providers to offer Nvidia's Blackwell GPUs B100s as well as GB200 configurations.
And we're continuing to work with them to train and optimize both large language models like GPT-4o as well as small language models. … Beyond the hardware, we are bringing Nvidia's key enterprise platform offerings to our cloud, like the Omniverse Cloud and DGX Cloud to Azure, with deep integration with even the broader Microsoft Cloud.
For example, Nvidia recently announced that their DGX Cloud integrates natively with Microsoft Fabric. That means you can train those models using DGX Cloud with the full access to Fabric data. And Omniverse APIs will be available first on Azure for developers to build their industrial AI solutions.
We are the first cloud to deliver a general availability of VMs based on AMD MI300X AI accelerators. It's a big milestone for both AMD and Microsoft. … It offers the best price/performance on GPT-4 inference.
And we will continue to move forward with Azure Maia. In fact, our first clusters are live. And soon, if you're using Copilot or one of the Azure OpenAI Services, some of your prompts will be served using Maia hardware.
This article originally appeared on CRN.