The Fab visualizations
Part IX · Chapter 61

The October 2023 Reload

Oct 17, 2023 BIS expansion: closing the H800 / A800 loophole, performance-density metrics, advanced computing rules, gaming chip dual-use concerns. Nvidia's H20 China-specific chips. The ratchet pattern: every Beijing workaround answered, every six to twelve months. → Why export controls became a rolling negotiation rather than a single act.

On the morning of October 17, 2023, a year and ten days after the Bureau of Industry and Security had published the rule that broke open the chip war, Gina Raimondo’s department published the rule that admitted the first one had not been enough. The notice that landed on the Federal Register’s morning queue carried the bureaucratic title that BIS rules always carry, “Implementation of Additional Export Controls: Certain Advanced Computing Items; Supercomputer and Semiconductor End Use; Updates and Corrections,” and it ran to one hundred and thirty-eight pages of triple-column small type. A companion rule on semiconductor manufacturing equipment ran to another one hundred and twenty. A third rule, smaller and quieter, added thirteen Chinese firms to the Entity List, among them the GPU designers Biren Technology and Moore Threads and their subsidiaries. The combined publication amounted to roughly the volume of a master’s thesis. The industry it governed had spent the calendar year designing chips around the rule that this rule replaced.

What the new rule replaced was the bandwidth threshold. In October 2022, BIS had defined a controlled chip in part by its interconnect speed, on the engineering judgment that the most damaging applications, training large language models on clusters of thousands of accelerators, required not only powerful single chips but the wide pipes that let those chips talk to one another at speed. Nvidia’s product organization in Santa Clara had read that definition the way any reasonable engineering team would: as a parameter to design beneath. By March 2023, four months after the original rule took effect, the company had begun shipping into China a chip called the H800, which was identical to its flagship H100 in almost every respect except that its inter-GPU NVLink bandwidth had been clipped from nine hundred gigabytes per second to four hundred, and its double-precision floating-point unit, useful for scientific computing but largely irrelevant to AI training, had been disabled almost entirely. The H800 was the H100 with a wire cut. Beneath it, the A800 occupied the same niche relative to the older A100. Both chips had been engineered to fall, by the narrowest possible margin, beneath the October 7 thresholds.

Tencent, Alibaba, Baidu, and ByteDance had treated the new chips as a reprieve. By the spring of 2023, the four Chinese internet groups had collectively placed orders for what Reuters reporting at the time put at five billion dollars in A800 and H800 silicon, with the deliveries scheduled to spool out across 2023 and 2024. Tencent’s executives told analysts in earnings calls through the year that the company was sitting on what it called one of the largest AI accelerator inventories in China. Alibaba Cloud was already advertising H800 instances. Baidu had begun training what would become its Ernie series on stockpiled silicon. The Chinese hyperscalers had read the October 2022 rule as a leak and were filling their reservoirs as fast as Nvidia’s order book allowed. Nvidia, for its part, had read the new business as twenty percent of its data-center revenue and a substantial fraction of the year’s growth.

This was the world the October 2023 rule was written to undo. The drafting team inside BIS, working under Assistant Secretary Thea Rozman Kendler and the Under Secretary, Alan Estevez, had spent the spring and summer studying the H800 the way the original team had once studied the A100. They concluded that bandwidth had been the wrong metric. A chip with constrained interconnect could still, given enough copies, train any model the unconstrained chip could train; the workaround at the cluster level was as simple as buying more nodes. What mattered for AI training, the BIS team came to believe, was raw arithmetic throughput per unit of silicon. They proposed two new metrics to replace the bandwidth threshold. The first was Total Processing Performance, abbreviated TPP, defined as twice the chip’s peak multiply-accumulate operations per second multiplied by the bit length at which those operations were performed, summed across all processing elements on the die. The second was performance density, defined as TPP divided by the area, in square millimeters, of any logic die fabricated with non-planar transistors, which since 16 nanometers had meant essentially every leading-edge process. A chip’s TPP captured how much arithmetic it could deliver. Its performance density captured how much arithmetic it could pack into a fixed amount of die.

The thresholds the rule set were arithmetic and unsentimental. A chip with TPP of 4800 or more was controlled outright; a license would be required and the presumption was denial. A chip with TPP at or above 1600 and a performance density of 5.92 or higher was also controlled at the same level. Beneath that lay a gray zone, called Notified Advanced Computing in the rule’s vocabulary, in which a chip with intermediate TPP and density values could still ship under a license exception, but only after the exporter notified BIS twenty-five days in advance and waited for clearance. The aggregate effect of the new structure was to capture, in a single stroke, every Nvidia chip the company had been shipping into China in 2023. The H800, with the same Hopper die as the H100 and the same arithmetic peak, sailed past the 4800 TPP threshold by a wide margin. The A800 did the same against its smaller budget. Both chips were now subject to the same license-presumed-denial regime as the H100 and A100 they had been engineered to replace. The bandwidth-based loophole that Nvidia had spent six months designing through had been closed by deleting the parameter that defined it.
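The two metrics and their cut lines reduce to a few lines of arithmetic. A minimal sketch, assuming the published peak-throughput and approximate die-area figures in the comments; the helper names and the simplified three-way classification are mine, not the rule's, which tiers the gray zone more finely:

```python
# Sketch of the October 2023 TPP / performance-density test.
# Chip specs below are published approximations, not BIS data.

def tpp(mac_tops: float, bit_length: int) -> float:
    """Total Processing Performance: 2 x peak tera-MACs/sec x bit length,
    summed across processing units (a single unit shown here)."""
    return 2 * mac_tops * bit_length

def performance_density(tpp_value: float, die_area_mm2: float) -> float:
    """TPP divided by the logic-die area in square millimeters."""
    return tpp_value / die_area_mm2

def classify(tpp_value: float, density: float) -> str:
    if tpp_value >= 4800 or (tpp_value >= 1600 and density >= 5.92):
        return "controlled (license required, presumption of denial)"
    if tpp_value >= 1600:
        return "gray zone (Notified Advanced Computing exception)"
    return "below the thresholds"

# H100 and H800 share the GH100 die (~814 mm^2) and its FP8 peak of
# ~1,979 TFLOPS, i.e. ~989.5 tera-MACs/sec at 8 bits.
h100_tpp = tpp(989.5, 8)  # 15,832
print(classify(h100_tpp, performance_density(h100_tpp, 814)))

# RTX 4090 (AD102, ~608 mm^2): ~660.6 dense INT8 TOPS = 330.3 tera-MACs.
rtx4090_tpp = tpp(330.3, 8)  # ~5,285
print(classify(rtx4090_tpp, performance_density(rtx4090_tpp, 608)))
```

Both chips land in the controlled tier: the H800 because it keeps the H100's full arithmetic peak, and the consumer 4090 on both the TPP and the density prongs at once.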

The other surprise in the rule was a category nobody had expected: gaming graphics cards. Nvidia’s GeForce RTX 4090, the consumer flagship the company had launched a year earlier for use in high-end PCs and computer-aided design workstations, retailed in the United States for $1,599 and was sold mostly to gamers willing to spend three thousand dollars on a desktop. Inside the silicon, the 4090 carried 16,384 CUDA cores on the Ada Lovelace architecture, capable of more than 1,300 teraOPS at 8-bit integer precision. Run the TPP calculation, and the consumer card cleared the 4800 threshold without effort. BIS’s drafters had concluded, on the back of intelligence reporting and CSET-published academic analysis, that a sufficient number of 4090s wired together could train the same models a smaller cluster of H100s could train, and that Chinese AI firms had been quietly assembling exactly such clusters out of cards bought through Newegg and Amazon and the gray markets of Shenzhen’s Huaqiangbei electronics district. The new rule swept the 4090 into the same regulatory bucket as the H100. Beginning November 17, 2023, no new 4090s could ship to a Chinese customer without a license, and the license would be denied.

The reaction in China was immediate and almost theatrical. Within seventy-two hours of the rule’s publication, retail prices for the 4090 in mainland China doubled. Listings on JD.com that had quoted 12,999 yuan two weeks earlier reappeared at 26,000. Resellers in Shenzhen reported that scalpers had begun buying every available unit, often in pallet quantities, betting on a window in which the cards could still legally enter the country before the rule’s effective date. Photographs began circulating on Chinese social media of unmarked boxes, then of stacks of 4090s in airport luggage, then of warehouses outside Hong Kong with rows of palletized cards waiting to move across the border. Nvidia, watching its own retail channel turn into a logistics scrum, paused North American consumer-card shipments to several distributors and accelerated negotiations with the U.S. government over a downbinned variant. By late December, the company had taped out a China-specific SKU called the GeForce RTX 4090 D, with eleven percent fewer CUDA cores, calibrated to clear the new thresholds, and priced at 12,999 yuan to slot exactly into the original card’s launch tier. Jensen Huang’s spokesperson, asked about the new gaming product, said it had been designed in consultation with the U.S. government to comply with export controls. The careful phrasing was the kind of thing the company had not had to issue, before 2022, about a video card.

The country list expanded too. The October 2022 rule had drawn its perimeter around the People’s Republic of China, Hong Kong, and Macau. The October 2023 rule extended the controlled-destination list to twenty-two additional jurisdictions, the entire set of countries against which the United States maintained an arms embargo, ranging from Iran and North Korea to Venezuela and Zimbabwe. More consequentially, it added a sweep of Middle Eastern and Asian destinations under the rationale that they had become potential transshipment routes. Saudi Arabia and the United Arab Emirates, which had spent the previous year quietly accumulating thousands of H100s for ambitious sovereign AI projects, now found themselves needing case-by-case licenses. Vietnam, where Nvidia’s distributors had reported a curiously brisk recent business in 4090s, was added too. Inside Nvidia’s investor-relations team, the China exposure that had once been a single line of disclosure became a country-by-country licensing matrix that lawyers had to consult every time an order was opened.

The third change was the most legally significant and the least visible. The rule extended worldwide license requirements to any company that was headquartered in a controlled destination, regardless of where the actual customer was located. A subsidiary in Singapore controlled by a parent in Shanghai now required a license to receive a controlled chip. A company in Dubai owned by a Chinese investor was caught under the same scope. The drafters had been studying the patterns by which the original rule’s perimeter had been crossed, and they had concluded that the corporate structures of the global tech sector, with shell companies and intermediate holding entities and procurement subsidiaries scattered across half a dozen jurisdictions, had been routing controlled silicon around the rule with no particular sophistication. The new structure tried, with mixed success, to follow ownership rather than addresses.

In Santa Clara, the response came in stages. Nvidia’s earnings call following the announcement, on November 21, 2023, devoted unusual attention to the regulatory environment. The company disclosed that it expected its Chinese data-center business to decline materially in the fourth quarter of fiscal 2024 and into the first half of fiscal 2025. It declined to quantify the hit. Within the same month, however, the company began previewing to Chinese customers and to the financial press a new China-specific lineup. The flagship product was called the H20, built on the same Hopper die as the H100 but with only seventy-eight of the die’s one hundred and forty-four streaming multiprocessors left enabled and the FP8 throughput capped at 296 teraflops, a fraction of the H100’s 1,980. To compensate, the company kept the memory subsystem fully provisioned: the H20 carried 96 gigabytes of HBM3 with 4 terabytes per second of bandwidth, more memory and more bandwidth than the H100 itself. Beneath the H20 came the L20 and the L2, two inference-focused parts based on the smaller Ada Lovelace die, intended to capture the Chinese inference market the H20 was too expensive to serve at scale. The lineup represented a new species of product: chips designed not to compete in the global market, but to thread the maximum performance allowable into the regulatory envelope of a specific country.

The H20, in particular, was an unusual artifact. Most of the silicon on the die was inert. The interconnect was full-bandwidth NVLink, inherited from the H100. The HBM stack was the largest the company had ever shipped. The arithmetic, though, was so heavily disabled that the chip’s compute-to-memory ratio looked nothing like a training accelerator and everything like an inference engine, and a strange one at that. As Dylan Patel and his colleagues at SemiAnalysis pointed out within weeks of the chip’s specifications leaking, the H20’s particular constellation of strengths, vast memory bandwidth and modest arithmetic, made it unusually well suited to certain inference workloads, particularly large mixture-of-experts models where memory access dominated and matrix multiplies were sparse. For some inference patterns, the analysts wrote, the H20 was actually faster than the H100 it had been engineered to fall beneath. The rule had drawn its threshold in the dimension Nvidia could most cheaply contort. The result was a chip that, in narrow but commercially significant configurations, was better at the very task Chinese AI labs most cared about.
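The imbalance can be made concrete with a roofline-style ridge point, the arithmetic intensity (floating-point operations per byte of memory traffic) above which a chip is compute-bound rather than memory-bound. A back-of-the-envelope sketch from the published peak figures; the framing as ridge points is mine, not SemiAnalysis's:

```python
# Roofline ridge points from published peak specs. A workload whose
# arithmetic intensity falls below a chip's ridge point is memory-bound,
# so extra bandwidth helps it more than extra FLOPs would.

chips = {
    # name: (peak dense FP8 TFLOPS, memory bandwidth in TB/s)
    "H100": (1979, 3.35),
    "H20":  (296, 4.0),
}

for name, (tflops, tb_per_s) in chips.items():
    ridge = tflops / tb_per_s  # FLOPs per byte of memory traffic
    print(f"{name}: ridge point ~{ridge:.0f} FLOPs/byte")
```

The H100's ridge point sits near 590 FLOPs per byte, the H20's near 74. Memory-bound decode steps in large-model inference often run at well under 100 FLOPs per byte, which is why the H20's surplus bandwidth matters there while its missing arithmetic barely registers.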

The H20 entered mass production in the second quarter of 2024 at Wistron, with Nvidia’s Taiwanese ODM partners ramping shipments through the year. Initial sales projections were modest. They proved low. Through 2024, Chinese hyperscalers, having lost access to the H100 and now to the A800 and H800 they had stockpiled, accepted the H20 as the only Hopper-class part they could legally buy, and bought it in volumes that startled Nvidia’s own forecasting team. Industry estimates by year-end put H20 shipments to China at roughly one million units in calendar 2024, generating somewhere between twelve and fifteen billion dollars of Nvidia revenue. The H20 had become, in the awkward arithmetic of the chip war, a major product line that existed only because of the rule that had been written to constrain it.

For Jensen Huang, the company’s founder and chief executive, the rolling regulatory environment had become a public irritation by the back half of 2023 and a structural concern by the end of it. In an interview with the Financial Times that May, before the rule revision was even finalized, Huang had told the paper that the U.S. government had effectively asked Silicon Valley to compete in China with, in his words, “our hands tied behind our back.” On a quarterly earnings call later that year he repeated the formulation with more emphasis. Privately, executives at Nvidia and at AMD told reporters that the rule cycle had begun to remind them of nothing so much as a chess game in which the opposing player was permitted to revise the rules between every move. The view from inside Commerce was different. In a December 2023 onstage interview at the Reagan National Defense Forum, Raimondo addressed the chipmakers directly. “If you redesign a chip around a particular cut line that enables them to do AI,” she said, “I’m going to control it the very next day.” A few minutes later she described her department’s enforcement work as a game of whack-a-mole.

The whack-a-mole framing caught what the export-control regime had become. October 7, 2022 had been written and presented as a single, decisive intervention, the kind of regulatory blow that carved a permanent boundary into the world. By the autumn of 2023 the boundary had moved, and by the autumn of 2024 it would move again. Through 2024 BIS expanded the foundry due-diligence rules, tightened semiconductor manufacturing equipment controls in coordination with the Netherlands and Japan, added Huawei-affiliated fabs to the Entity List, and finalized rules around high-bandwidth memory that would, before the year was out, capture the H20’s particular profile and force a third generation of redesigns. Each iteration named the loophole created by the previous one; each iteration was answered by a corporate response engineered to fit beneath the new threshold; each response was studied, in turn, by the next drafting cycle. The cadence was roughly six to twelve months. Industry lawyers had begun calling it, half-seriously, the BIS update season.

What this implied for the industry’s strategic posture was difficult to absorb. A chip designer building a new architecture on a three-year R&D cycle could no longer assume that the rule against which it had designed at tape-out would be the rule in effect at first-customer-shipment. The constraint had stopped being an external boundary and become an internal optimization variable, jointly negotiated between the company’s engineering team, its government-affairs team, and the lawyers in Washington who tried to read the next rule from the comments BIS opened on the current one. Inside the United States, parallel debates erupted over whether the cycle was too aggressive, jeopardizing American chipmakers’ commercial position, or not aggressive enough, allowing repeated workarounds in the months between updates. CSIS analysts, including Gregory Allen, who had described the original October 2022 rule as an unprecedented act of economic statecraft, framed the 2023 update in less dramatic terms. It was, he wrote, the first iteration of what was now plainly a continuous regulatory regime. The strategic question had stopped being whether the United States would impose semiconductor export controls. It was how often, against what new metric, and at what cumulative cost.

The cost was being accumulated on multiple ledgers at once. American chipmakers were absorbing it in lost Chinese revenue, in design overhead, and in the operational expense of running parallel product lines for export-controlled and unrestricted markets. The U.S. government was absorbing it in regulatory complexity that BIS, by its own under-resourced admission, was struggling to enforce. By 2024 reports were beginning to emerge of organized circumvention networks. American distributors, mostly in California and Texas, were moving controlled chips through cutouts in Malaysia and Singapore. The most spectacular case of the era, when prosecutors finally brought it, would name three executives associated with Super Micro Computer, the San Jose server-builder whose Nvidia-loaded systems had been a staple of every U.S. hyperscaler’s data center, and would allege a 2.5-billion-dollar smuggling operation routing servers through a Southeast Asian pass-through company and into Chinese buyers. The case would not be unsealed for two more years. The pattern it represented had been visible to anyone reading port-of-entry data by 2024.

The Chinese side absorbed the cost differently. Through 2024 the Big Fund, which had announced its third tranche of 47.5 billion dollars in May, accelerated disbursements toward indigenous design tooling, equipment substitution, and fab capacity. Huawei’s Ascend program, building toward chips that would later be named 910B and 910C, was ramping production at SMIC’s 7-nanometer node in volumes that, by the end of the year, U.S. analysts would no longer be able to dismiss. Cambricon, the small Beijing AI-chip designer that had spent most of the decade as a curiosity, was reporting quarterly revenue growth measured in multiples. Domestic Chinese hyperscalers were running internal pilot programs to migrate critical workloads off Nvidia entirely, partly out of necessity, partly as insurance against the next BIS update. The thesis the rule had been written to defend, that depriving China of leading-edge accelerators would slow its frontier AI ambitions, was being tested in real time, and the answer was beginning to look more contingent than its drafters had hoped.

Inside the Commerce Department’s headquarters on Constitution Avenue, Raimondo’s team understood this. They also understood that no plausible domestic political coalition would tolerate a halt in the cycle. The October 2023 rule, like the October 2022 rule before it, had been bipartisan in a way almost nothing else in Washington was that decade. Republican hawks and Democratic industrial-policy partisans had found common ground on the proposition that letting any frontier AI capability reach Beijing was unacceptable, regardless of cost. The rule cycle could be slowed only by a political environment that did not exist. It could not be paused by an administration without paying a price the next administration would extract.

What the October 17, 2023 rule did, beyond the technical work of closing the bandwidth loophole and capturing the H800 and pulling the 4090 into the controlled bucket, was admit something the architecture of the original act had not. It admitted that export controls against an adversary as resourceful and well-capitalized as the People’s Republic of China would not be one act but a sequence of acts. The relationship between American regulators and Chinese chip ambitions had become, by the autumn of 2023, less a wall than a weather system, in which fronts moved in cycles and each side adapted to the prevailing pattern of the moment. The rule was no longer a rule. It was a process. The companies that built the world’s most valuable chips and the companies that bought them most desperately were now both, in different ways, tenants of that process. None of them, including the people writing it, knew where the cycle would end. The next rule was already being drafted, in the offices on Constitution Avenue, while the H20 wafers came off the Tainan line and were boxed for shipment to Shenzhen.