OpenAI and Broadcom to co-design custom AI chips, deploy 10 GW by 2029

It's official: OpenAI is designing its own AI chips. On Monday, the company announced a partnership with Broadcom to build 10 gigawatts of custom accelerators optimized for its models, joining Google, Amazon, Meta, and Microsoft in developing specialized silicon for AI workloads.

Key Points

  • 10 GW plan: first racks deploy H2 2026; finish by 2029.  
  • Ethernet-first architecture with Broadcom networking to scale out.  
  • OpenAI says its own models are already optimizing chip layouts.  

OpenAI and Broadcom have been working together for 18 months on chip designs tailored to OpenAI's inference workloads—the process of running AI models for the 800 million people who use ChatGPT weekly. OpenAI handles the chip design while Broadcom manages manufacturing, networking, and system integration. The first deployments are planned for late 2026, with the full 10 gigawatts completed by 2029.

The numbers are staggering. OpenAI currently runs on just over 2 gigawatts; the Broadcom deal alone roughly quintuples that, and the full 33 gigawatts OpenAI has committed to across recent partnerships represents a roughly 15x expansion. For context: 10 gigawatts is about the combined power draw of 8 million homes, and Nevada's entire peak load is 6.65 gigawatts.
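
As a quick sanity check, here is the arithmetic behind those figures. This is a minimal sketch: the deal and commitment numbers come from the announcement, while the ~1.25 kW average household draw is an assumption used only for the homes comparison.

```python
# Back-of-the-envelope check of the scale figures above.
# current_gw and the deal/commitment sizes are from the article;
# avg_home_kw is an assumed average household power draw.

current_gw = 2.2        # "just over 2 gigawatts" today
broadcom_gw = 10.0      # this deal
committed_gw = 33.0     # total across recent partnerships
avg_home_kw = 1.25      # assumption, not from the announcement

print(f"Broadcom deal vs. today: {broadcom_gw / current_gw:.1f}x")     # ~4.5x
print(f"All commitments vs. today: {committed_gw / current_gw:.0f}x")  # ~15x

homes = broadcom_gw * 1e6 / avg_home_kw  # 1 GW = 1e6 kW
print(f"Homes matched by 10 GW: ~{homes / 1e6:.0f} million")           # ~8 million
```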

During the partnership announcement, OpenAI emphasized vertical integration. Sam Altman described it as controlling everything "from etching the transistors all the way up to the token that comes out." By designing chips specifically for its models, OpenAI can optimize performance and reduce costs compared to general-purpose GPUs.

Greg Brockman, OpenAI's president, said the company used its own AI models to optimize the chip designs. "We've been able to get massive area reductions. You take components humans have already optimized and just pour compute into it, and the model comes out with its own optimizations." These are typically improvements human engineers would eventually have found, but the AI shaved weeks or months off the process.

The economics are brutal. A 1-gigawatt data center costs roughly $50 billion, with about $35 billion of that going to chips at NVIDIA's pricing. Custom accelerators optimized for specific workloads can squeeze out significantly more performance per watt and per dollar, which matters when you are serving more than 800 million users a week.
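
A rough sketch of that math, using the article's figures. The 20% savings rate below is purely hypothetical, chosen only to show the order of magnitude at stake:

```python
# Rough cost math behind the figures above.
# cost_per_gw and chips_per_gw are the article's numbers;
# the savings rate is an illustrative assumption.

cost_per_gw = 50e9       # ~$50B per 1-GW data center
chips_per_gw = 35e9      # ~$35B of that is chips at NVIDIA pricing
deal_gw = 10

total = cost_per_gw * deal_gw
chips = chips_per_gw * deal_gw
print(f"10 GW build-out: ${total / 1e9:,.0f}B total, ${chips / 1e9:,.0f}B in chips")

hypothetical_savings = 0.20  # assumed, not a disclosed number
print(f"Chip savings at 20%: ${chips * hypothetical_savings / 1e9:,.0f}B")
```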

OpenAI isn't pioneering here so much as playing catch-up. Google has run TPUs for years, Amazon built Trainium and Inferentia, Meta created MTIA, Microsoft developed Maia—all optimized for their workloads. Broadcom's now working with its fifth custom accelerator customer. What was Google's contrarian bet has become table stakes for frontier AI.

The shift reflects how compute-intensive AI has become. Altman traced OpenAI's journey from 2 megawatts to 200 megawatts to finishing this year "a little bit over 2 gigawatts." Even with those resources, demand outstrips supply. "If we had 30 gigawatts today with today's quality of models, I think you would still saturate that relatively quickly."

Here's the paradox: every time OpenAI makes models 10x more efficient, demand grows 20x. When ChatGPT could only write simple code, people used it despite clunky interfaces. Now, with Canvas, developers run entire projects through AI assistants. Features like Pulse, which runs in the background to generate personalized insights, are limited to the Pro tier precisely because of compute constraints. The vision is an AI agent working for everyone 24/7, but that requires unprecedented scale.
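
The arithmetic of that paradox is simple, as the sketch below shows. The 10x and 20x figures are from the article; the four-cycle horizon is illustrative, not from the announcement:

```python
# A 10x efficiency gain met by 20x demand growth still doubles
# net compute needs each cycle.

efficiency_gain = 10    # models get 10x cheaper to serve
demand_growth = 20      # usage grows 20x over the same period

net = demand_growth / efficiency_gain
print(f"Net compute needed per cycle: {net:.0f}x")     # 2x
print(f"After four such cycles: {net ** 4:.0f}x")      # 16x today's compute
```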

Broadcom's role extends beyond manufacturing. The systems use Ethernet and other Broadcom connectivity solutions with open standards rather than proprietary networking. Charlie Kawwas, president of Broadcom's Semiconductor Solutions Group, said this approach creates "cost and performance optimized next generation AI infrastructure" while avoiding vendor lock-in.

The collaboration covers chip architecture, rack design, and data center networking. Altman noted that as the project became more complex, full system design became necessary to maximize efficiency gains across the entire stack.

The economics of custom chips are compelling at scale. AWS claims its Trainium chips offer 30-40% better price-performance than comparable GPU instances. Google's TPU v5e costs around $11 per hour for 8 chips, compared with significantly more for 8 H100 GPUs. Companies with sufficient scale and engineering resources consistently achieve better economics with custom silicon than with general-purpose GPUs.
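
For a concrete sense of those rates, a small sketch. The H100 hourly price here is an assumed ballpark (on-demand cloud pricing varies widely), and raw hourly cost ignores throughput differences between the chips, so this is not a full price-performance comparison:

```python
# Comparing the hourly rates cited above.
# tpu_v5e_8chips_hr is the article's figure; h100_per_gpu_hr is assumed.

tpu_v5e_8chips_hr = 11.0     # article's figure for 8 TPU v5e chips
h100_per_gpu_hr = 4.0        # assumed ~$4/GPU-hour on demand

print(f"TPU v5e: ${tpu_v5e_8chips_hr / 8:.2f}/chip-hr "
      f"(${tpu_v5e_8chips_hr:.2f} for 8 chips)")
print(f"H100:    ${h100_per_gpu_hr:.2f}/GPU-hr "
      f"(${h100_per_gpu_hr * 8:.2f} for 8 GPUs)")
```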

NVIDIA isn't going anywhere. Its GPUs remain essential for research, offer unmatched flexibility, and benefit from a mature CUDA ecosystem. OpenAI will continue using NVIDIA chips, particularly for training new models. But for inference workloads where requirements are well understood, custom chips make financial sense.

The first chips are scheduled to arrive in late 2026. OpenAI will start with limited deployments to test performance before scaling to the full 10 gigawatts by 2029.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
