Microsoft Takes "Shock And Awe" Approach To New Azure Custom And Merchant Silicon For AI And General Purpose Workloads
At #MSIgnite, Microsoft went all "shock and awe" in one of the biggest displays I have seen from a CSP (Cloud Service Provider) in a while.
Custom silicon:
1/ Microsoft Azure Maia AI Accelerator: Designed for ML and GAI training and inference. You have to assume today's and next-generation OpenAI models will be trained and inferenced on Maia. Sam Altman says this was a "co-collaboration" to produce "more capable" and "cheaper" models.
Racks are liquid-cooled
4x cards per server
ASIC, not a GPU (as expected)
no cluster or model size limit
"X86 host" (unclear if AMD or Intel)
TSMC 5nm (assuming high performance)
supports standard MX sub-8-bit data types
Ethernet connectivity (embedded by the way)
will power Microsoft Copilot and the Azure OpenAI Service
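The MX data types mentioned above are block formats: a group of values shares one power-of-two scale while each element is stored at low precision. A minimal Python sketch of that shared-scale idea (simplified for illustration: integer elements and a small block, whereas the actual OCP MX spec uses FP element types and 32-element blocks):

```python
import math

def mx_quantize_block(block, elem_bits=8):
    """Simulate MX-style block quantization: one shared power-of-two
    scale per block, low-bit integer elements.
    Simplified sketch; the real OCP MX spec uses FP element types
    (FP8/FP6/FP4) and 32-element blocks with an E8M0 shared scale."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)           # all-zero block
    qmax = 2 ** (elem_bits - 1) - 1          # e.g. 127 for 8-bit elements
    # Shared scale: smallest power of two so the largest value fits.
    scale_exp = math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** scale_exp
    q = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return scale_exp, q

def mx_dequantize_block(scale_exp, q):
    """Recover approximate floats from the shared exponent + elements."""
    scale = 2.0 ** scale_exp
    return [v * scale for v in q]

# Example: a 4-element block round-trips with small error.
block = [0.11, -1.9, 0.004, 0.75]
scale_exp, q = mx_quantize_block(block)
recon = mx_dequantize_block(scale_exp, q)
```

The win is storage and bandwidth: only one scale per block instead of per value, which is why sub-8-bit formats like these are attractive for inference on accelerators such as Maia.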
2/ Microsoft Azure Cobalt CPU: Arm-based SoC for general-purpose computing.
"up to 40% perf/core versus previous Arm server"
128 cores
12 DDR5 channels
Arm Neoverse N2-based core
Microsoft customizations for security and power management
Overall Custom details:
Custom Approach: full stack, including rack, fleet, cooling (liquid), and network optimizations; Microsoft did all chip and SoC engineering through tape-out
Custom Drivers: supply-chain resilience, cost, and highest performance
Custom Timeframe: early next year
Custom History: Custom console silicon in the 2010s, then Cerberus, then Azure Boost, and now Maia and Cobalt.
New merchant silicon:
NC H100 v5 VM Preview: NVIDIA H100-based VMs for mid-range AI training and GAI inference. H200-based VMs will follow next year.
ND MI300 VM: AMD Instinct MI300X-based VM.
The custom silicon announcement was more than I had imagined, covering both custom general-purpose compute and ML/GAI training and inference. Microsoft Azure is smart to roll this out to SaaS and PaaS first, followed by IaaS for everybody, as it lowers risk.
I wanted to see a vertical approach, and this was a ground-up, full-stack effort. I'm amazed more of this didn't leak, as the company says it went all the way from IP and SoC design to tape-out and test, working directly with TSMC. Hence, no SoC integrators, as it used with console SoCs (AMD) and as others do with Samsung Semiconductor custom design services and Marvell Technology custom capabilities.
Kudos for bringing out the MI300X, as it is GPU-based. AMD has been the "next in line" to NVIDIA for big business, and I hear about AMD everywhere. This is one of the first of many AMD MI300X announcements to come.
Looking forward to further information on:
Performance versus merchant silicon from AMD and Intel and custom silicon from AWS and Google Cloud
Pricing for the custom silicon and the AMD MI300X
Dates for full IaaS availability and for SaaS apps like M365, D365, etc.
Official GA dates
Overall, I was very impressed and didn't expect all this at once.