OpenAI and Broadcom Unveil Custom AI Chip Jalapeño

SAN FRANCISCO – The most expensive part of running ChatGPT has never been the engineers or the office space. It is the chips, overwhelmingly chips made by Nvidia, purchased at Nvidia’s prices, consumed at a rate that has turned compute spending into one of OpenAI’s largest operating costs. On Wednesday, the company gave itself what it hopes is the beginning of an exit.

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI accelerator. The chip was designed from scratch by OpenAI’s engineering teams, manufactured by TSMC, and built to run inference workloads: the process of serving responses from a trained model to users of ChatGPT, Codex, OpenAI’s API, and what the company describes as the next generation of agentic AI products. It is not a training chip. It does not try to be general-purpose. It does one job, at the scale OpenAI operates, as efficiently as possible.

Early lab testing shows Jalapeño delivering performance on par with Nvidia’s Blackwell processors and Google’s own Tensor Processing Units, at roughly 50 percent lower cost per inference token. Those are significant numbers, provided they hold. The technical report with the underlying data has not been published. OpenAI says it is coming “in the months ahead.”

OpenAI President Greg Brockman, who received the first wafer alongside CEO Sam Altman in a handover ceremony with Broadcom CEO Hock Tan and President Charlie Kawwas, framed the announcement in terms of infrastructure control. “By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access,” Brockman told CNBC on Wednesday. The chip went from initial design to manufacturing tape-out in nine months, a timeline OpenAI and Broadcom describe as the fastest ever achieved in high-performance AI semiconductors. Parts of the design process were accelerated using OpenAI’s own AI models.

The pressure to build around Nvidia has been building for some time. AI inference workloads now account for two-thirds of computing in AI data centers, and the economics of that workload are what Jalapeño targets. AI memory chips have been in tight supply all year as demand from inference deployments has surged. Nvidia’s $25 billion bond sale last month, the company’s first debt offering in five years, signaled that even the largest beneficiary of the AI boom is racing to expand production capacity. Jalapeño is OpenAI’s most direct attempt yet to reduce its exposure to that supply chain.

The announcement puts OpenAI alongside companies that have been building custom silicon for years. Google has Tensor Processing Units, Amazon has Trainium, Meta is developing inference silicon of its own. But the comparison only partially captures what Jalapeño represents. Those chips were built by hyperscalers to serve external cloud customers. OpenAI is building a chip for its own products and its own users, a company that started as a research nonprofit now committing to one of the most capital-intensive engineering programs in the technology industry.

OpenAI Jalapeño custom ASIC chip package designed for LLM inference built on TSMC 3nm — The Jalapeño chip package, OpenAI’s first custom inference processor built on TSMC’s 3nm node with Broadcom. [Image Source: OpenAI]

Hock Tan said the collaboration with OpenAI is enabling “the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.” Gigawatt-scale is the unit of measurement the AI industry reaches for when describing the largest infrastructure buildouts. What it implies about Jalapeño’s production requirements, and the manufacturing yields TSMC will need to deliver, has not been disclosed. Board and rack integration is being handled by Celestica. The companies describe Jalapeño as the first step in a multi-generation compute platform, with initial deployment targeted for the end of 2026.

The chip’s architecture was designed specifically for inference on large language models rather than adapted from a training accelerator or a general-purpose processor. That design choice reflects the structural transition the AI industry is navigating: from building models to running them, cheaply, at enormous scale. OpenAI’s legal battles with Elon Musk’s xAI this year put in sharp relief how contested the company’s strategic decisions have become. This one is about the economics of operating at that scale not just this year but indefinitely, as TechCrunch reported.

What neither OpenAI nor Broadcom provided on Wednesday is anything a reader could use to independently evaluate the 50 percent cost claim. No published methodology. No third-party benchmark results. No yield data from TSMC. Jalapeño is a first-generation chip that has never run in a production environment, much less the gigawatt-scale one its makers were already describing. Bloomberg reported on the chip’s potential to run models faster and more cheaply. The potential, as of Wednesday, is what exists. The chip has been handed over. The evidence is still being written.