OpenAI and Broadcom Unveil Jalapeño, First Custom AI Chip for LLM Inference

OpenAI and Broadcom today unveiled Jalapeño, OpenAI's first Intelligence Processor designed for large language model inference. The chip was delivered to OpenAI CEO Sam Altman and President Greg Brockman by Broadcom President and CEO Hock Tan and President Charlie Kawwas. The accelerator represents the first AI chip in a multi-generation compute platform the companies are building together to make advanced AI faster, more reliable, and more accessible. OpenAI designed the chip from scratch around its understanding of LLM fundamentals, with Broadcom and Celestica helping industrialize the platform through chip implementation, board and rack system integration, high-performance networking, and scalable production systems.

OpenAI and Broadcom Deliver Jalapeño Chip to Company Leadership

The chip delivery marks an important step in OpenAI's strategy to build the full stack behind its models and products. Jalapeño was developed through collaboration between OpenAI, Broadcom, and Celestica, with each partner contributing specialized expertise to the platform.

OpenAI designed the chip architecture informed by its roadmap of models, kernels, serving systems, and product needs. Broadcom contributed chip implementation and networking technologies, including Tomahawk silicon, to bring the platform to large-scale production. Celestica provided board, rack system integration, and scalable production systems expertise.

Jalapeño Architecture Optimized for LLM Inference Workloads

Jalapeño is designed to be flexible enough to work with all LLMs, guided by OpenAI's insights into the inference needs of current and future AI models across the industry. Engineering samples are already running ML workloads, including GPT-5.3-Codex-Spark, in the lab at production target frequency and power.

Early testing shows that Jalapeño will deliver performance per watt substantially better than the current state of the art. The architecture reduces data movement and balances compute, memory, and networking resources so that realized utilization comes much closer to theoretical peak performance. A detailed technical report will be presented in the coming months.
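For readers unfamiliar with the two metrics named above, the short Python sketch below shows how they are commonly computed. It is an illustration only: the function names and every number in it are hypothetical, chosen to make the arithmetic easy to follow, and none of them are Jalapeño measurements or figures from OpenAI or Broadcom.

```python
def utilization(achieved_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak throughput that is actually realized."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return achieved_tflops / peak_tflops


def perf_per_watt(tokens_per_second: float, watts: float) -> float:
    """Inference throughput delivered per watt of power drawn."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts


# Hypothetical numbers for illustration only; not Jalapeño measurements.
example_utilization = utilization(achieved_tflops=400.0, peak_tflops=1000.0)
example_efficiency = perf_per_watt(tokens_per_second=5000.0, watts=500.0)

print(f"utilization: {example_utilization:.0%}")
print(f"tokens per second per watt: {example_efficiency:.1f}")
```

Under these definitions, a chip that wastes less time and energy moving data improves both ratios at once: more of the peak compute is used, and fewer watts are spent per token served.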

"Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers," said Richard Ho, who leads OpenAI's hardware program. "We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware's theoretical limits."

The chip is a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads. It is informed by the systems OpenAI runs every day across ChatGPT, Codex, the API, and future agentic products. The goal is to combine the power and throughput of today's leading AI accelerators with latency closer to the fastest specialized inference systems.

Development Completed in Nine-Month Timeline Using AI-Assisted Design

Jalapeño was co-developed from initial design to manufacturing tape-out in just nine months. The companies believe this represents the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors.

The accelerated timeline reflects deep software-hardware co-development with OpenAI's engineering teams, Broadcom's silicon implementation expertise, and the use of OpenAI models to accelerate parts of the design and optimization process. The same models served to users are helping improve the infrastructure used to run future models.

Multi-Generation Compute Platform Planned with Broadcom and Celestica

Jalapeño is the first step in a multi-generation compute platform combining OpenAI-designed accelerators with Broadcom silicon implementation, networking, and connectivity technologies, and Celestica's board, rack, and system expertise.

"The world is moving to a compute-powered economy," said Greg Brockman, President and Co-Founder of OpenAI. "Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access."

Deployment Scheduled by End of 2026

The multi-generation compute platform is designed for initial deployment by the end of 2026, with expansion in the years ahead. OpenAI operates across the full stack, including chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and product experience, with each layer optimized around making its models faster, more reliable, and more affordable for users.

FAQ

What is Jalapeño and when was it unveiled?

Jalapeño is OpenAI's first Intelligence Processor, an AI accelerator designed specifically for large language model inference. OpenAI and Broadcom unveiled the chip today.

How long did it take to develop Jalapeño?

Jalapeño was co-developed from initial design to manufacturing tape-out in nine months. The companies believe this represents the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors.

When will Jalapeño be deployed?

The multi-generation compute platform featuring Jalapeño is designed for initial deployment by the end of 2026, with expansion planned in the years ahead.
