The Five Labs and Their Compute
The three largest frontier labs and their two smaller but formidable rivals, told through their chip choices. OpenAI: Microsoft-Azure dependence, Stargate diversification, the Broadcom custom-silicon project, the AMD MI450 warrant deal, AWS $38B partnership. Anthropic: Google TPU primary, AWS Trainium2/3, the Colossus rental of May 2026. Google DeepMind: TPU vertical integration as the structural advantage. Meta: MTIA inference + Nvidia GPU training clusters; Llama as open-source play; Hyperion. xAI: Colossus 1 → 2; the Feb 2026 SpaceX fold-back; cofounder exodus through March 2026. Apple Project ACDC / Baltra and Tesla Dojo's August 2025 cancellation as side stories. Cerebras / Groq / Tenstorrent / SambaNova as the inference-specialist alternatives. → How the labs' silicon dependencies define their competitive paths.
On the afternoon of Wednesday, May 6, 2026, two short blog posts appeared within minutes of each other on the websites of two companies whose principals had spent the previous three years describing each other as enemies. Anthropic announced a compute partnership with SpaceX, the rocket company that had two months earlier absorbed Elon Musk’s xAI in an all-stock merger. The terms were modest in language and seismic in substance. Anthropic would gain access, within the month, to more than three hundred megawatts of capacity at Colossus, the converted Electrolux factory off the Mississippi River that Musk’s xAI had built to train Grok. Roughly two hundred and twenty thousand Nvidia GPUs were packed inside it. The deal would, on New Street Research’s accounting, generate three to four billion dollars a year in revenue for SpaceX with cash margins above two and a half billion. Anthropic also expressed interest in working with SpaceX on multiple gigawatts of orbital data centers, a phrase whose feasibility no one in the industry took seriously and whose presence in the announcement told you everything about whose vocabulary the contract had been negotiated in.
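The two headline figures in the announcement support a quick density check. A back-of-envelope sketch, using only the numbers stated above; the per-chip power figure is an assumed, publicly specced H100 value, not a number from the text:

```python
# Back-of-envelope: all-in power per GPU at Colossus, from the two
# figures in the announcement (>300 MW of capacity, ~220,000 GPUs).
capacity_mw = 300           # megawatts of leased capacity (source figure)
gpu_count = 220_000         # Nvidia GPUs on site (source figure)

kw_per_gpu = capacity_mw * 1_000 / gpu_count
print(f"~{kw_per_gpu:.2f} kW per GPU, all-in")   # ≈ 1.36 kW

# For context: an H100 SXM draws roughly 0.7 kW of chip power alone
# (an assumed spec, not a source figure), so the remainder covers the
# host server, networking, cooling, and facility overhead.
chip_kw = 0.7
print(f"facility-to-chip ratio ≈ {kw_per_gpu / chip_kw:.1f}x")
```

The ratio of roughly two-to-one between facility power and chip power is in the normal range for a dense GPU data center, which is one reason the 300-megawatt and 220,000-GPU figures read as mutually consistent.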
Three months earlier, Musk had described Anthropic on his social network as “evil” and “misanthropic,” accusing it of hating Western civilization. Anthropic, founded by Dario and Daniela Amodei in 2021 with a mission built around AI safety, had spent five years cultivating an identity as the un-Musk: deliberate, governance-obsessed, the company most likely to slow down. Some of its researchers had left OpenAI in part because they felt the original safety mission had been swallowed by commercial urgency. Now they were renting their cycles from a man who had spent two decades insisting AI was the most dangerous technology humans had ever built and then built one of the largest AI clusters on the planet anyway. Musk explained the reversal in a post that read almost like a hostage statement. He had spent time, he wrote, with senior members of the Anthropic team. “Everyone I met was highly competent and cared a great deal about doing the right thing. No one set off my evil detector. So long as they engage in critical self-examination, Claude will probably be good.”
What the two posts described was the defining image of compute scarcity in the spring of 2026. The Memphis cluster had been built faster than its builder could fill it. The lab that had spent five years architecting a multi-supplier portfolio had still, in its fifth year, run out of cycles. The two parties had transacted because the alternative, leaving compute idle while paying users went unserved, was economically intolerable for both sides. The contract reportedly included a clause giving Musk the right to reclaim compute if Anthropic’s models “engaged in actions that harm humanity,” a fig leaf that committed Anthropic to nothing it had not already promised in its own Responsible Scaling Policy. Within a week, Anthropic had hired Ross Nordeen, an xAI cofounder, to run its compute organization. The boundaries between the labs, never exactly bright, were now openly porous.
Two years earlier, in February 2024, the Wall Street Journal had reported that Sam Altman had been pitching investors on a project of cosmic ambition: between five and seven trillion dollars to build a global network of fabrication plants that would manufacture the AI accelerators the world needed and that the existing supply chain could not produce. The figure was larger than the GDP of every country except the United States, China, and Germany. It was, on its face, absurd. It was also a faithful reading of where Altman had concluded his company would end up if it tried to grow into demand using only the chips Nvidia and TSMC could supply. The seven-trillion-dollar network had not been built. Something stranger had been built instead. By 2026, the binding limit on a frontier lab’s growth was no longer talent or research throughput or even capital. It was compute. Each lab had developed a posture toward that constraint, and each posture had become a piece of the lab’s competitive identity.
OpenAI’s posture, by mid-2026, was the largest pile of overlapping compute commitments any company had ever assembled. Through the early ChatGPT years, OpenAI had run on Microsoft Azure under what was, in effect, an exclusivity agreement: Microsoft’s billion-dollar check in 2019, expanded to roughly thirteen billion by the early 2020s, came with the right to host every OpenAI model. By the end of 2023, the cumulative Microsoft commitment had reached the high tens of billions. ChatGPT and GPT-4 had trained on Azure’s H100 clusters in West Texas and Iowa. The dollars Microsoft handed to OpenAI flowed back as Azure compute credits, then onward to Nvidia as GPU purchases, a loop the partners’ shared finance teams had stopped trying to rationalize and had simply learned to operate. The arrangement had given OpenAI more compute than any other AI lab in the world. It had also, by Altman’s private admission to investors in early 2024, given OpenAI a single point of failure at the level of its hosting partner.
The break came on January 21, 2025, when Altman stood with Larry Ellison and Masayoshi Son in the East Room to announce Stargate. Underneath the headline number, the structural change was buried in a single sentence in the press release. Microsoft would no longer be OpenAI’s exclusive cloud provider. The exclusivity that had defined OpenAI’s compute strategy for six years had been replaced with a “right of first refusal” structure under which Microsoft could match new commitments but no longer monopolize them. Stargate was, in plain terms, OpenAI’s escape route from Redmond’s hand on the throttle.
By the end of 2025, the escape was structural. In September, OpenAI, Oracle, and SoftBank announced five additional Stargate sites in Texas, New Mexico, Ohio, and an unnamed Midwestern location, bringing the planned footprint to nearly seven gigawatts. On October 6, AMD and OpenAI announced a six-gigawatt agreement to deploy multiple generations of Instinct accelerators starting with the MI450 in the second half of 2026, sweetened by a warrant under which AMD granted OpenAI the right to purchase up to a hundred and sixty million shares, roughly ten percent of the company, at one cent per share, vesting against deployment milestones. A week later, OpenAI and Broadcom announced a ten-gigawatt custom XPU program. On November 3, AWS and OpenAI announced a thirty-eight-billion-dollar, seven-year partnership for hundreds of thousands of GB200 and GB300 GPUs and CPU capacity for agentic workloads. By the time Microsoft and OpenAI’s revised exclusivity terms officially expired on April 27, 2026, the policy change was a formality. The architecture had already changed.
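The AMD warrant's economics are easy to state concretely. A minimal sketch using the strike and share count from the announcement; the prevailing AMD market price is a hypothetical placeholder for illustration, not a figure from the text:

```python
# Payoff of the OpenAI warrant on AMD stock at full vesting.
shares = 160_000_000        # up to 160M shares (source figure)
strike = 0.01               # one cent per share (source figure)

# Hypothetical market price per AMD share -- an assumption for
# illustration, not a number from the text.
market_price = 200.00

cost_to_exercise = shares * strike
intrinsic_value = shares * (market_price - strike)
print(f"cost to exercise: ${cost_to_exercise:,.0f}")      # $1,600,000
print(f"intrinsic value:  ${intrinsic_value / 1e9:.1f}B") # ~$32.0B

# The warrant vests against deployment milestones, so the payoff is
# contingent on OpenAI actually standing up the six gigawatts.
```

At a penny a share, the exercise cost is a rounding error; the instrument is, in substance, a roughly ten-percent equity grant gated on deployment, which is why it was reported as a sweetener rather than a sale.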
The complete portfolio, by the spring of 2026, ran through Microsoft’s Azure capacity in West Texas and Iowa, Oracle’s Stargate Abilene buildout that Crusoe had brought from open dirt to operational racks in fifteen months, AWS under the November 2025 partnership, CoreWeave contracts grown by mid-2025 to roughly twenty-two billion dollars, AMD MI450 deployments scheduled for late 2026, and the OpenAI-Broadcom XPU starting on a TSMC node widely reported to be N3 or N3P. Cumulative committed compute across the portfolio exceeded a trillion dollars. OpenAI had spent six years inside one hyperscaler’s house and was now an industrial counterparty in its own right.
Anthropic’s posture had developed along a different axis. Its first frontier-scale compute commitment had come from Google: a two-billion-dollar investment, announced in late 2023, that carried the expectation Anthropic would train on Google’s TPU fleet. Claude 1 and Claude 2 had run substantially on TPU pods. AWS, which had opened with a four-billion-dollar commitment in September 2023, escalated to eight billion by March 2024, and Anthropic agreed to make AWS its primary cloud provider and, by late 2024, its primary training partner on Trainium. By the close of 2025, Project Rainier was running roughly half a million Trainium2 chips and was scaling toward a million by April 2026, when Amazon and Anthropic announced a follow-on commitment of more than a hundred billion dollars over ten years, structured around five gigawatts stretching from Trainium2 through Trainium4.
What made Anthropic’s posture distinctive was that none of the commitments displaced the others. By the close of 2025 the company was running production training across three accelerator architectures at once. Google TPUs on Google Cloud capacity in Mayes County, Oklahoma, under the October 2025 agreement that on The Information’s reporting scaled to roughly two hundred billion dollars over five years and granted access to up to a million Ironwood TPUs. AWS Trainium at New Carlisle, scaling toward five gigawatts. Nvidia GPUs through a thirty-billion-dollar Azure commitment announced on November 18, 2025, that pushed Anthropic’s valuation to roughly three hundred and fifty billion dollars and brought the upcoming Vera Rubin systems into the mix. Mike Krieger, the company’s chief product officer, spent much of late 2025 explaining the logic publicly. No single supplier could scale fast enough to absorb the company’s growth curve, which by April 2026 Dario Amodei was describing as roughly an eighty-times increase in revenue run-rate over two years. The blended cost of training across TPUs, Trainium, and Nvidia ran thirty to sixty percent below an all-Nvidia stack on Anthropic’s internal accounting. Compute allocation, Krieger said, had become existential. OpenAI had spent six years in a single hyperscaler’s house and was now scrambling out. Anthropic had been deliberately three-headed from the start, and the deliberation was now its moat. Even three heads, the May 6 lease made clear, were not enough.
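Amodei's "eighty-times over two years" figure translates into a compounding rate worth making explicit. A quick check, assuming smooth month-over-month compounding; the smoothness is an assumption, while the 80x-over-24-months endpoint is the source figure:

```python
# Implied monthly growth rate for an 80x increase in revenue
# run-rate over 24 months, assuming steady compounding.
growth_multiple = 80
months = 24

monthly_rate = growth_multiple ** (1 / months) - 1
print(f"~{monthly_rate:.1%} month-over-month")   # ≈ 20.0% per month
```

A curve compounding at a fifth per month is the context in which Krieger's claim, that no single supplier could scale fast enough, stops sounding like procurement rhetoric and starts sounding like arithmetic.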
Google’s compute geometry was the geometry of a company that had been preparing for this constraint since 2013. The TPU program had produced six successive generations by the time Trillium reached general availability in late 2024. The Google DeepMind merger of April 2023, in which Sundar Pichai had folded Google Brain into DeepMind under Demis Hassabis, had aligned the research engine with the silicon. Gemini 1.0, shipped in December 2023, trained on TPU v4 and v5p. Trillium trained Gemini 2.0. Ironwood, unveiled at Cloud Next in April 2025 and reaching general availability that November, was training and serving Gemini 3 at scale.
What made Google’s posture structurally different was that the silicon, the data centers, the model, and the customer relationships all sat on Alphabet’s balance sheet. Google designed the TPU with Broadcom on the silicon engineering. Google ordered wafers from TSMC against Alphabet’s own forecasts. Google built the data centers that ran the TPUs, in Mayes County and the Iowa sites and a string of new campuses across the South and Midwest. Google trained Gemini on those TPUs and sold Gemini, as an API and as the model behind Search’s AI Overviews and Workspace’s Duet, to its own customers. Anat Ashkenazi, Alphabet’s CFO, raised 2025 capex guidance three times during the year, from seventy-five billion dollars at Q1 to ninety-three billion by Q3. By early 2026 the company guided to between $180 billion and $190 billion of 2026 capex. The vertical integration created an externality the rest of the industry had to account for. Anthropic was running on Google TPUs at scale. Apple Intelligence, by Apple’s own July 2024 disclosure, had pretrained on TPU v4 and v5p clusters because Nvidia H100s had not been available in the volumes Cupertino’s schedule required. Frontier-lab competitors who depended on Nvidia could match Google’s training capacity only by buying it from Nvidia at margins Nvidia would not give up. Gemini’s competitive position came partly from architecture, partly from data, and partly, by 2025 and 2026, from the simple arithmetic of a vertically integrated supplier serving itself.
Meta’s posture was the loudest counter-example. Mark Zuckerberg had spent 2024 and 2025 making public commitments to AI capex on a scale unusual even by hyperscaler standards. Meta’s 2024 capex had landed at roughly forty billion dollars. The 2025 number rose from a sixty-billion January band to seventy-two billion, and Zuckerberg told analysts in early 2026 that 2026 capex would land between $115 billion and $145 billion, the largest single-year capex commitment ever made by a non-state actor. Inside that envelope, Meta had become Nvidia’s largest single buyer. An engineering blog post from Meta’s infrastructure team in early 2024 had disclosed a target of three hundred and fifty thousand H100 GPUs by year-end, with overall compute equivalent to roughly six hundred thousand H100s once A100s, custom inference silicon, and other accelerators were rolled in. Llama 3, released in April 2024, had trained on a sixteen-thousand-H100 cluster the team documented in detail, including a failure log averaging one component fault every three hours over the run’s fifty-four days. Llama 4 in 2025 had landed with less impact than expected and triggered an overhaul that put Alexandr Wang of Scale AI in charge of a new Meta Superintelligence Labs division and produced, in November 2025, the departure of Yann LeCun, the chief AI scientist who had built Meta’s research arm and remained skeptical that large language models would ever produce real intelligence.
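Meta's published failure log supports a simple reliability calculation. A sketch from the figures in the post; the cluster size, fault cadence, and run length are source figures, while the assumption that faults spread evenly across components is mine:

```python
# Fault arithmetic for the Llama 3 training run: a 16,000-GPU
# cluster, one component fault roughly every three hours, 54 days.
gpus = 16_000
hours_between_faults = 3
run_days = 54

run_hours = run_days * 24
total_faults = run_hours / hours_between_faults
print(f"~{total_faults:.0f} faults over the run")        # ≈ 432

# Mean time between failures for any one component, assuming faults
# are spread evenly across the fleet (an assumption, not a claim
# from the source).
component_mtbf_years = gpus * hours_between_faults / (24 * 365)
print(f"per-component MTBF ≈ {component_mtbf_years:.1f} years")
```

The striking result is the inversion of scale: hardware reliable enough to run for five-plus years between faults still interrupts a sixteen-thousand-GPU job every few hours, which is why checkpoint-and-restart engineering dominated the run's documented operational story.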
What made Meta’s posture distinctive was the open-source flank. Llama, in successive generations, had been released under permissive license terms that allowed almost any developer to download the weights and run them on their own hardware. The strategy deprived OpenAI and Anthropic of the price umbrella their closed-weights models would otherwise have commanded. It made Meta’s models the default substrate for the long tail of fine-tuning, distillation, and on-device deployment in the West, the same way Qwen and DeepSeek had become the default in the East. None of that shifted the underlying compute math. Meta still had to train Llama 4 and its successors on Nvidia clusters that consumed tens of billions of dollars of capex. Underneath the posture sat Hyperion, the four-million-square-foot campus Meta was building on twenty-two hundred acres in Richland Parish, Louisiana, designed for over five gigawatts and fed by an Entergy complex that would eventually run to ten gas-fired generators and seven and a half gigawatts of contracted capacity, a thirty-percent expansion of Louisiana’s entire grid.
xAI had taken the most extreme version of the Meta bet and run it harder. Founded by Musk in July 2023 with twelve cofounders and a stated mission to “understand the universe,” xAI had launched the Grok chatbot in late 2023, embedded in Musk’s X social network. Through 2024, the company’s compute story had been a Memphis story. xAI had bought a former Electrolux factory off the Mississippi River and made a hundred-thousand-GPU H100 supercomputer operational on a hundred-and-twenty-two-day timeline that, when Musk announced it in September 2024, several hyperscalers’ infrastructure leads said publicly was not credible. The first phase of Colossus went live on schedule. The portable gas turbines that powered the early operation, the lawsuits from Boxtown and the NAACP, and the EPA rule revisions they triggered all belonged to a separate story about the grid. By early 2025 Colossus had doubled to two hundred thousand H100s and H200s. By late 2025, xAI had begun a second cluster, Colossus 2, in Southaven, Mississippi, with plans for a million-plus Blackwell GPUs and a power footprint scaling toward two gigawatts.
The speed was real. The economics were fragile. xAI’s commercial revenue, even after Grok 3 and Grok 4 had improved through 2025, remained a fraction of OpenAI’s or Anthropic’s. Cluster utilization, on the best analyst estimates available, was well below what the capital structure required to amortize the build. Musk had financed the Memphis and Mississippi builds through a sequence of equity rounds that valued xAI at fifty billion in mid-2024, then eighty, then one hundred and fifty by mid-2025, then two hundred billion by year-end on the back of a six-billion-dollar Series E. The valuations were carrying the build, but the build was not yet generating the revenue the valuations implied. By late 2025 the cofounder ranks had begun to thin. Igor Babuschkin had left in August 2025 to start a venture firm. Greg Yang had stepped back in January 2026 citing Lyme disease.
Then, in early February 2026, the structure broke. On February 2, in a deal Bloomberg called the largest merger of all time, SpaceX acquired xAI in an all-stock transaction that valued xAI at two hundred and fifty billion dollars and the combined entity at one and a quarter trillion. Each xAI share converted to roughly 0.1433 SpaceX shares, packaged ahead of a SpaceX IPO Musk planned for later in the year at a target valuation as high as one and a half trillion. Musk’s blog post called the merger the most ambitious vertically integrated innovation engine on and off Earth. The strategic argument was that orbital data centers were the next frontier and that xAI would be a more credible builder of them as a SpaceX subsidiary than as a standalone.
The unmaking happened faster than the merger had. On February 10, 2026, Tony Wu, one of xAI’s most operationally central cofounders, resigned. Within twenty-four hours, Jimmy Ba, the University of Toronto researcher who had coauthored the 2014 Adam optimization paper and had run xAI’s research and safety efforts under direct report to Musk, also resigned. By February 11, Musk had reorganized what remained into four product areas: Grok, coding, the Imagine video product, and a new “Macrohard” line meant to be staffed by digital agents. Through March the rest of the cofounder bench departed. By month’s end, all eleven of xAI’s original cofounders other than Musk had left. Musk, in a March 13 post, said xAI had not been built right the first time and would be rebuilt from the foundations up. On May 6, in the same hour Anthropic’s announcement landed, Musk wrote on X that xAI would no longer exist as a separate company. The AI products would continue under a new banner, SpaceXAI, run as part of SpaceX. The Memphis supercomputer that xAI had built to train Grok would, within the month, be running Claude. The cofounders who had built it were gone. The cluster they had assembled was now a SpaceX rental property, paid for by the lab Musk had spent the previous three years calling evil.
Around the five frontier labs, two adjacent stories ran on slower clocks. Apple, having embarrassed its own silicon team in mid-2024 by disclosing the TPU origins of Apple Intelligence, was building its own answer in the Baltra program. Tesla had reached the opposite conclusion by the opposite path. The Dojo supercomputer that had been the cornerstone of Tesla’s autonomy strategy since 2021 was disbanded in mid-August 2025, when Musk called Dojo 2 “an evolutionary dead end” and announced that Tesla’s compute would run on the AI5 and AI6 chips manufactured by TSMC and Samsung respectively. Peter Bannon, Dojo’s lead, left the company. About twenty engineers from the team formed a new chip startup called DensityAI. The retreat said something the bull case for vertical integration could not conceal. Even a company with Tesla’s scale could conclude, when the math went against it, that the compute fight was best lost cheaply.
Beneath the named labs ran a smaller race among the inference specialists, the Cerebras and Groq and SambaNova and Tenstorrent challengers whose chips reshaped the inference market without ever displacing the frontier-lab map.
Each lab had developed a posture toward the constraint, and each posture had become a piece of the lab’s competitive identity. OpenAI’s was the largest pile of overlapping commitments anyone had ever assembled, with Microsoft exclusivity formally expiring on April 27, 2026 and the company’s restructuring into a public benefit corporation leaving it free to operate as a fully independent counterparty. Anthropic’s was the most architected portfolio, built deliberately across three accelerator generations and now one rented supercomputer. Google’s was vertical integration so complete it functioned as an externality the other labs had to plan around. Meta’s was balance-sheet brute force matched to an open-source distribution play that would either justify itself in the second half of the decade or become one of the larger industrial misallocations in technology history. xAI’s was speed traded off against everything else, and the trade had not, on the corporate plane, held.
Through the summer of 2026, Trainium2 was reaching saturation; Trainium3 racks were beginning to install; Ironwood pods were filling Mayes County. AMD MI450 capacity was scheduled for West Texas in the second half of the year. Broadcom’s first OpenAI XPU was on the calendar to begin production ramp. Apple’s Baltra was approaching tape-out. Hyperion’s first buildings were closing in on energization. Colossus 2 was approaching its first gigawatt. Whether the cumulative buildout would exceed demand or fall behind it was a question the participants, in private, did not pretend to know. What they knew was that the silicon under each lab’s name was the substance of the lab itself. The model on the screen was an artifact of the rack in the data center. The rack was an artifact of the wafer that had come off the line in Tainan or Hsinchu the previous quarter. The labs that had won the chip game were the labs that would still be running models in 2027. The lab that had played it fastest, by the spring of 2026, no longer existed.