Qualcomm jumped 15% Monday after announcing it'll ship AI inference chips for data centers—in 2026 and 2027. It's the mobile chipmaker's third swing at the server market, and this time it's targeting inference workloads where NVIDIA's dominance is most at risk.
Key Points:
- The chips focus on cost and power efficiency for running large language models, where NVIDIA's training-optimized GPUs are overkill
- Qualcomm's first data center chip, Centriq, launched in 2017 and died in 2018 after burning hundreds of millions; its second attempt, built on Nuvia technology (acquired for $1.4 billion in 2021), hasn't shipped anything yet
- The AI250 uses a "near-memory computing" architecture that Qualcomm claims delivers 10x better memory bandwidth than competing accelerators
For two decades, Qualcomm has been synonymous with mobile silicon—Snapdragon chips, wireless modems, and on-device AI. Monday’s launch flips that script. The company is now chasing the heart of generative-AI infrastructure: data-center inference, where NVIDIA has held a near-monopoly.

The new chips target rack-scale deployments with liquid cooling—a serious play for hyperscale data centers. Qualcomm claims the AI250's "near-memory computing" architecture delivers 10x better memory bandwidth than competitors while using less power. That matters for inference, where you're running models constantly and power bills add up fast.
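To make the efficiency pitch concrete, here is a minimal back-of-envelope sketch in Python. Every figure in it (the wattages, fleet size, electricity price, and PUE overhead) is an illustrative assumption rather than a number from Qualcomm's or NVIDIA's spec sheets; the only point is that a lower per-chip power draw compounds quickly across an always-on fleet.

```python
# Back-of-envelope sketch: why power efficiency matters for always-on inference.
# All figures below are illustrative assumptions, not Qualcomm or NVIDIA specs.

def annual_power_cost(card_watts: float, num_cards: int,
                      usd_per_kwh: float = 0.10, pue: float = 1.3) -> float:
    """Rough yearly electricity cost for accelerators running 24/7.

    pue: power usage effectiveness, a multiplier for cooling and facility overhead.
    """
    hours_per_year = 24 * 365
    kwh = card_watts / 1000 * num_cards * hours_per_year * pue
    return kwh * usd_per_kwh

# Hypothetical comparison: a 700 W training-class GPU vs. a 350 W
# inference-oriented accelerator, deployed at rack scale (10,000 cards assumed).
print(f"700 W cards: ${annual_power_cost(700, 10_000):,.0f}/year")
print(f"350 W cards: ${annual_power_cost(350, 10_000):,.0f}/year")
```

Under those made-up assumptions, halving per-card power saves roughly $4 million a year in electricity alone, before counting cooling hardware or rack density.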
Qualcomm dominates mobile chips but has no track record of success in data centers. Its existing Cloud AI 100 inference chip, launched in 2020, has some niche deployments (AWS offers it, BMW uses it for autonomous driving development) but hasn't touched NVIDIA's position. The company shuttered its first server division in 2018, acquired Nuvia in 2021 specifically to re-enter the market, then spent three years fighting legal battles with Arm over chip licenses while producing nothing for data centers.
Now it's announcing a completely different product line: not the server CPUs it talked about earlier in 2025, but specialized AI inference accelerators. That suggests either the CPU strategy stalled or Qualcomm realized inference is the easier wedge.
The inference market is growing faster than training because once you've built a model, you need to run it millions of times. Companies like AWS, Google, and AMD are all building inference-specific chips because NVIDIA's training-optimized GPUs are overkill and expensive for that workload. Qualcomm's pitching efficiency and cost—it has to, since it can't match NVIDIA's performance or software ecosystem.
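A rough arithmetic sketch shows why serving costs come to dominate. The numbers below are invented purely for illustration, not figures from any vendor: with a steady stream of requests, cumulative inference spend passes even a large one-time training bill within a couple of years.

```python
# Illustrative sketch of why inference spend eventually dwarfs a one-time
# training cost. Every number here is a made-up assumption for arithmetic only.

training_cost_usd = 50_000_000        # one-time cost to train a model (assumed)
cost_per_million_queries_usd = 500    # serving cost per 1M inference requests (assumed)
queries_per_day = 200_000_000         # daily request volume for a popular service (assumed)

daily_inference_cost = queries_per_day / 1_000_000 * cost_per_million_queries_usd
days_to_match_training = training_cost_usd / daily_inference_cost

print(f"Daily inference cost: ${daily_inference_cost:,.0f}")
print(f"Days until inference spend equals the training bill: {days_to_match_training:,.0f}")
```

With those assumptions, serving costs match the training bill in about 500 days and keep growing, which is the workload Qualcomm is betting cheaper, lower-power chips can win.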
However, the 2026-2027 timeline means customers are committing to vaporware for now. Qualcomm's hoping its mobile AI experience and power efficiency chops translate to data centers. But so did Intel, AMD, Google, and a dozen startups, and NVIDIA still controls the overwhelming majority of the market.
The smart move would've been to show actual silicon and benchmark numbers. Instead, we got a press release and liquid cooling specs.