Broadcom’s (AVGO) Role in the AI Accelerator Market and the AI Chip Market’s Evolution
Broadcom is the quiet power behind the AI boom. While GPU makers grab the headlines, Broadcom supplies the plumbing—custom accelerators for hyperscalers and the high-speed networking chips that knit thousands of processors into one AI supercomputer.
Broadcom has played an integral yet behind-the-scenes role in developing Google’s custom AI chips, the Tensor Processing Units (TPUs), serving as a key partner on Google’s TPU roadmap through co-design and production of these AI processors. Industry analysts note that Broadcom has co-developed all six generations of Google’s TPUs to date and is already involved in the upcoming TPU v7; Google itself has confirmed that it co-designs its AI chips with Broadcom and has lined the chipmaker up to supply its sixth-generation TPUs. In practical terms, Google provides the TPU architecture and specifications while Broadcom contributes its semiconductor design expertise and manages manufacturing (the chips are fabricated by third-party foundries such as TSMC). This close partnership means Broadcom effectively builds Google’s AI accelerators from Google’s blueprints, making it an indispensable enabler of Google’s AI infrastructure.
Broadcom’s involvement with Google’s TPUs dates back to the project’s early days and has grown with each generation. Broadcom’s CEO, Hock Tan, has highlighted the Google relationship as a cornerstone of its AI business – and investors have taken note. Broadcom is widely seen as the second-biggest winner of the generative AI boom (after Nvidia), precisely because of these custom-chip deals. By mid-2023, Tan was predicting that AI-related projects (largely driven by Google TPUs) could account for more than a quarter of Broadcom’s semiconductor revenue in 2024. This underscores how critical Google’s TPU program has been to Broadcom’s growth in the AI chip arena.
It’s worth noting that Google’s reliance on Broadcom has come under scrutiny. In late 2023, reports emerged that Google had internally discussed moving future TPU design fully in-house by 2027 to reduce costs. The talks followed a pricing standoff, and Google even explored alternate suppliers (like Marvell for certain server chips). However, Google publicly reaffirmed its partnership with Broadcom, with a spokesperson praising Broadcom as “an excellent partner” and stating that Google sees “no change” in the engagement. In other words, despite periodic pricing tensions, Google continues to lean on Broadcom’s custom silicon capabilities for the foreseeable future. This long-running collaboration highlights Broadcom’s unique role: it isn’t just a component vendor, but a strategic co-developer helping Google maintain an edge with homegrown AI hardware.
Custom AI Accelerators for Meta and Others
Broadcom’s custom-chip expertise extends beyond Google. The company has quietly become a go-to ASIC developer for other tech giants aiming to build their own AI accelerators. Meta Platforms, for example, also partners with Broadcom on its in-house AI chips. Meta has been developing custom MTIA (Meta Training and Inference Accelerator) chips to power AI workloads, and Broadcom has been deeply involved in those projects. Industry analysis indicates that Broadcom contributed front-end design and IP for Meta’s MTIA, similar to its work on Google’s TPUs. While Meta’s custom AI chips are still in early stages (the company hasn’t yet deployed them at the same scale as Google’s TPUs), the collaboration underscores Broadcom’s strategy of forming “deep partnerships” with top cloud players to design bespoke silicon. In essence, Broadcom functions as an outsourced silicon engineering arm for these firms – a role that has made it the world’s second-largest AI chip supplier by revenue, behind only Nvidia.
Broadcom’s newest marquee client may be OpenAI, the maker of ChatGPT – a relationship that could further reshape the AI semiconductor landscape. According to reports, OpenAI has embarked on a project to develop a custom AI accelerator for its own needs in partnership with Broadcom. Reuters reported that OpenAI’s first in-house AI chip is expected to arrive in 2026 and that Broadcom is co-designing and producing the silicon. OpenAI’s motive is to reduce its heavy reliance on Nvidia GPUs by getting its own tailor-made chips – and it tapped Broadcom to make it happen. In fact, Broadcom’s CEO disclosed in 2025 that a new, unnamed “large customer” had placed over $10 billion in orders for AI infrastructure chips, and industry insiders widely believe that customer is OpenAI. If so, it represents a massive design win for Broadcom and a clear sign that even the most cutting-edge AI firms trust Broadcom to translate their AI research into high-performance silicon. Beyond Google, Meta, and OpenAI, Broadcom has hinted that multiple other companies are engaging its custom-chip services as well. In summary, Broadcom is emerging as the chief custom-silicon supplier for a who’s-who of AI heavyweights – a fact often overlooked amid the focus on Nvidia’s off-the-shelf GPUs.
AI Semiconductor Market: Training vs. Inference Segments
Broadcom’s fortunes are tied to a broader explosion in the AI semiconductor market, which can be divided into two distinct segments: training chips and inference chips. Training chips are used to develop (or “train”) AI models by processing vast datasets – this segment has been dominated by high-end GPUs (like Nvidia’s A100/H100) and specialized accelerators (Google TPUs, etc.). Inference chips, on the other hand, run the trained models in real-time to handle tasks like answering queries or making predictions; they range from data-center ASICs to edge AI processors. Training was the first market segment to take off in the last decade’s AI wave, but inference is now quickly becoming the larger and more pivotal segment as AI models get deployed at scale in every industry. Once a neural network is trained, it might be queried millions or billions of times across the world – which translates to a much greater aggregate demand for inference computation than for training cycles. As a result, the industry’s center of gravity is shifting: hardware for running AI (inference) is expected to eclipse hardware for training AI in both units and value.
By some accounts, the market is on track to multiply several-fold in just the next five years. AMD, which competes in this space, projects the data-center AI accelerator total addressable market (TAM) to exceed $150 billion by 2027, up from roughly $30 billion in 2023. More recently, AMD’s CEO revised that outlook even higher – expecting the TAM to surpass $500 billion by 2028 (a stunning ~60% compound annual growth rate). While that is an aggressive forecast, it aligns with the consensus that AI chip demand is entering exponential growth. Gartner, for instance, predicts that AI inference compute demand will grow 40–50% annually through 2027, far outpacing most other data-center workloads. Similarly, IDC research indicates global spending on AI-specific hardware is on pace to reach $200+ billion by 2028.
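To see how those growth rates compare, here is a minimal back-of-envelope sketch in Python. The dollar figures are the ones quoted above; the cagr helper is purely illustrative, and the base-year value behind AMD’s ~60% CAGR claim may differ from the $30 billion cited for the earlier forecast.

```python
# Quick sanity check on the TAM growth rates quoted above. The dollar figures
# are the ones cited in the text; treat this as a rough illustration, since the
# base-year value behind AMD's ~60% CAGR claim may differ from the $30B figure.

def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by start_value growing to end_value over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1

# Mid-2023 forecast: ~$30B (2023) growing to $150B+ (2027)
print(f"$30B -> $150B over 2023-2027: {cagr(30, 150, 4):.0%} per year")

# Later forecast: TAM surpassing $500B by 2028 at a stated ~60% CAGR.
# Working backwards, a 60% rate sustained for five years implies a 2023 base of roughly:
print(f"Implied 2023 base for 60% CAGR to $500B in 2028: ${500 / 1.60**5:.0f}B")
```

Either way you anchor the figures, these forecasts imply the market compounding at roughly 50% or more per year through the late 2020s, which is the backdrop for the inference-versus-training discussion below.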
Training vs. inference – growth and outlook: Within this booming market, the inference segment is expected to grow faster than the training segment in the coming years. The rationale is straightforward: after a one-time training of a large AI model, companies will deploy that model widely, requiring many inference chips to serve end-users (whether in cloud servers, smartphones, or IoT devices). According to the International Data Corporation (IDC), by 2028 roughly 80–90% of all AI compute in the cloud will be devoted to inference, not training. This is a striking inversion of the early AI era, and it underscores that inference workloads (answering queries, generating content, powering AI features for billions of users) will utterly dwarf training workloads in volume. Industry leaders are already witnessing this shift. “We always believed that inference would be the driver of AI going forward,” AMD’s CEO Lisa Su remarked in 2025, noting an “inference inflection point” as new generative AI use cases take off. AMD now expects inference chip demand to grow over 80% annually in the near term – becoming the largest driver of AI compute spend, even as training continues to grow robustly. In practical terms, this means data centers will increasingly be filled with accelerator cards focused on serving AI models (and not just on training them), and chipmakers are adjusting their roadmaps accordingly (Nvidia’s inference-oriented product lines, for example, and a wave of startups targeting inference efficiency).
Crucially for investors, multiple analyses predict that inference silicon will overtake training silicon in market size soon. Some experts pinpoint 2026 as the crossover year when global spending on AI inference chips will exceed spending on training chips. By that time, the deployment of AI-powered services (from search engines to enterprise assistants) should reach enough scale that inference hardware purchases outstrip those for new model training. In fact, by 2028 the balance is expected to tilt decisively – with well over half of AI chip revenue coming from inference/serving processors. Gartner’s and IDC’s data support this trajectory: nearly every incremental dollar in AI hardware beyond the mid-2020s is likely to flow to inference-centric devices. It’s also a question of sheer volume: one trained GPT-style model might need thousands of inference chips distributed globally to meet user demand, whereas only a relatively small cluster of training chips was needed to build the model in the first place. As a Deloitte analysis put it, even if generative AI chips make up ~20% of semiconductor revenues, they represent an outsized portion of computing power but a tiny fraction of total chip unit volume – meaning growth will come from scaling out AI to millions of endpoints.
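To make the volume argument concrete, here is a rough, purely illustrative sketch. Every input below (daily queries, accelerator time per query, training cluster size, service lifetime) is a hypothetical round number chosen for the example, not a figure from this article or any vendor.

```python
# Purely illustrative back-of-envelope: why aggregate inference compute can dwarf
# training compute once a model is deployed at scale. Every number below is a
# hypothetical assumption chosen for round figures, not a measured statistic.

# --- One-time training run (assumed) ---
training_chips = 10_000          # accelerators in the training cluster
training_days = 90               # length of the training run
training_chip_days = training_chips * training_days

# --- Ongoing inference service (assumed) ---
queries_per_day = 1_000_000_000  # daily queries to the deployed model
chip_seconds_per_query = 2.0     # accelerator time to generate one response
service_years = 2                # how long this model generation stays in service

# Chips needed to serve the assumed load around the clock (ignoring redundancy,
# peak-vs-average traffic, batching efficiency, etc.).
inference_fleet = queries_per_day * chip_seconds_per_query / 86_400
inference_chip_days = inference_fleet * 365 * service_years

print(f"Training:  {training_chip_days:,.0f} chip-days on {training_chips:,} chips")
print(f"Inference: {inference_chip_days:,.0f} chip-days on ~{inference_fleet:,.0f} chips")
print(f"Inference-to-training compute ratio: ~{inference_chip_days / training_chip_days:.0f}x")
```

Even with these made-up inputs, the pattern described above emerges: the serving fleet is larger than the training cluster, and because it runs continuously for years, its cumulative compute dwarfs the one-time training run.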
Conclusion
Will inference surpass training in market size? All signs point to yes – and soon. In terms of unit volume, inference chips (which include not only cloud accelerators but also edge AI chips in phones, PCs, cars, etc.) already far outnumber the relatively niche pool of ultra-high-end training GPUs. The more interesting question is revenue: here, training accelerators (with their high price tags) have so far captured the larger share of dollars, but that gap is closing rapidly. As mentioned, forecasts from credible sources suggest inference will overtake training in revenue by 2025–2026 and then extend its lead. By the late 2020s, inference is expected to comprise the majority of the $100+ billion-per-year AI hardware market, reflecting a maturing of the AI industry from research-centric to deployment-centric. This doesn’t mean training will stagnate – on the contrary, training spend will keep growing as models get ever larger – but inference is set to grow even faster, driven by an explosion of AI-enabled services and products. For investors, the takeaway is clear: the next phase of the AI gold rush will be fueled not just by the systems that create AI models, but even more so by the silicon that runs those models everywhere, from cloud data centers to your pocket. Broadcom’s broadening portfolio of custom AI chips (for Google, Meta, OpenAI, and others) positions it well to capitalize on this inference-heavy future, even as it continues to supply the training workhorses that kick-start each new generation of AI technology.
Author

Investment manager, forged by many market cycles. Learned a lasting lesson: real wealth comes from owning businesses with enduring competitive advantages. At Qmoat.com I share my ideas.