SAN DIEGO – Nvidia sells chips. What it actually sells is a reason not to buy anyone else’s chips. That reason has a name – CUDA – and for seventeen years it has been so deeply embedded in how AI models are built, trained, and deployed that switching away from it is less a technical choice than an act of institutional will. On Tuesday, Qualcomm agreed to pay $3.9 billion in stock to give developers an easier way out.
The acquisition target is Modular, a four-year-old software startup founded by Chris Lattner and Tim Davis. Lattner is the engineer behind LLVM, the open-source compiler infrastructure that replaced a generation of proprietary toolchains and now underpins everything from Apple’s Swift to Rust’s code generation. He created Swift itself, the programming language Apple uses across its entire ecosystem. At Modular, he spent four years building Mojo – a language designed to write AI inference code once and run it optimized across chips from Nvidia, AMD, Intel, Qualcomm, and Apple Silicon, without a hardware-specific rewrite for each.
Qualcomm announced the all-stock deal at its Investor Day in New York on Tuesday, June 24, confirming a transaction in which Qualcomm will issue up to 19.2 million shares to Modular shareholders, CNBC reported. The $3.9 billion valuation more than doubles Modular’s September 2025 financing round, which valued the company at $1.6 billion. The deal is expected to close in the second half of 2026, pending regulatory review.
The bet Qualcomm is making is not just about chips. It is about who controls the software layer developers have to write against. For seventeen years, the answer has been Nvidia’s CUDA. Developers who use CUDA to accelerate neural network training and inference buy into an ecosystem – code, libraries, developer tools – that runs on one company’s hardware. Modular’s MAX inference engine and Mojo language are Qualcomm’s answer to that lock-in. The pitch is portability: write the code once, run it anywhere.
Qualcomm CEO Cristiano Amon called the deal “a pivotal moment not just for Qualcomm, but for the AI industry,” stating in the company’s official announcement that the industry is moving toward “disaggregated, multi-vendor architectures that demand a more open and modern software foundation.” That framing is accurate as a description of what AI companies say they want. Whether they will rewrite existing CUDA workloads to get there is a different question, and no one at Tuesday’s investor event had an answer for it.

The deal does not exist in isolation. Earlier this week, OpenAI unveiled its first custom inference chip, Jalapeño, built with Broadcom and claiming 50 percent lower cost per inference token than Nvidia’s Blackwell GPUs. Apple announced it would fast-track the M7, an AI-focused chip built around on-device inference, bypassing two intermediate chip generations entirely. The pattern is visible: the AI industry is mounting a broad, expensive push to build around Nvidia from multiple directions at once.
Modular brings roughly 150 employees into Qualcomm, including co-founders Lattner and Davis, who the company says will stay on. What they are building alongside Mojo is MAX – a runtime framework that manages how AI models execute across different hardware. MAX already runs on Nvidia GPUs, AMD hardware, and Apple chips. The pitch to AI companies is that they can migrate inference workloads between those platforms – and eventually to Qualcomm’s own Oryon architecture in data centers – without rewriting model-execution code each time. At the same investor event, Qualcomm set a target of $15 billion in data center revenue by fiscal 2029.
Qualcomm also disclosed it is separately in advanced talks to acquire Tenstorrent, an AI chip startup led by chip architect Jim Keller, for $8 billion to $10 billion. Tenstorrent designs inference chips using the open RISC-V instruction set – hardware that, combined with Modular’s software, would give Qualcomm a complete alternative stack to Nvidia’s: open silicon plus portable software. If both deals close, Qualcomm will have committed roughly $12 billion to $14 billion in a single week to assembling an AI infrastructure play. The Tenstorrent talks are ongoing; neither company has commented publicly.
What Qualcomm cannot deliver with Tuesday’s announcement is the thing Nvidia’s moat is actually made of: seventeen years of developer habit. CUDA has millions of lines of code written against it, thousands of libraries optimized for it, and a generation of AI engineers trained to work inside it. Lattner’s compiler credentials are genuine – LLVM did change how a large part of the industry writes software, replacing proprietary toolchains from Sun, SGI, and IBM. But that transition took the better part of a decade, and the AI industry does not operate on that timescale.
The question is whether the financial pressure bearing down on Nvidia’s pricing creates a faster inflection. Oracle’s disclosure last week that it cut 21,000 jobs and named its own AI adoption as the cause is a reminder of how quickly the AI economy is restructuring around cost efficiency. Every company that built on Nvidia because it had no alternative is now watching competitors spend billions to create one. Modular is Qualcomm’s bet that when the alternative arrives, the software has to be ready first.
What the deal does not answer is the harder question of real-world performance. The target of $15 billion in data center revenue by fiscal 2029 implies Qualcomm winning a meaningful share of the AI inference workloads that currently run on CUDA. Whether Mojo and MAX can deliver at that scale – in production, at the latency and throughput that large AI deployments demand – has not been demonstrated outside Modular’s own benchmarks. Lattner has built compilers that worked before. This is a larger bet than any of them.

