Apple Silicon
Apple becomes a chip designer. → Vertical integration's return; how custom silicon became a moat.
In the spring of 2008, Bob Mansfield placed a call to a manager at IBM Israel and asked him to come to California. Mansfield was Apple’s senior vice president of hardware engineering, the unassuming Texan who had quietly run Mac and iPhone hardware for years. The man on the other end of the line was Johny Srouji, a Haifa-born engineer with master’s-level training from the Technion who had spent the bulk of his career inside Intel’s Israeli design center, then jumped to IBM to manage the team building the POWER7 processor. Srouji was forty-four. He had two reasons to stay where he was. He had a senior post at IBM, and he had family roots in Haifa that he had no intention of pulling up.
Mansfield’s pitch, according to the Bloomberg account that surfaced eight years later when Srouji had become impossible to ignore, was structured around a sentence that almost no other company in the world could plausibly say. Apple, he told Srouji, was going to start designing its own processors. There was no team yet to speak of. Steve Jobs had decided, after the experience of building the first iPhone on Samsung’s reference silicon and watching Intel decline to be the iPhone’s chip supplier, that Apple could not let somebody else hold the part of the product that mattered most. Mansfield was offering Srouji the chance to build that team from scratch, to set the architecture, to hire the people, to ship a chip into a product that the world had not yet seen and that would, if Jobs was right, end up in hundreds of millions of pockets. Srouji took a few weeks. He moved his family to Cupertino in March of that year and took the badge.
He arrived to find roughly forty engineers working on chip-related problems inside Apple. By the end of the decade his organization would be in the high hundreds. By the time Apple launched the M1 twelve years later, the silicon group would number in the thousands, distributed across Cupertino, Austin, Munich, Herzliya, Haifa, and a half-dozen other sites. The largest in-house semiconductor design organization in any consumer electronics company had been built one quiet hire at a time, mostly out of public view, from the foundation that Mansfield laid that spring.
The deeper origin of that bet was a phone call Steve Jobs had placed the previous year. Sometime in the run-up to the original iPhone, Jobs had asked Paul Otellini, then chief executive of Intel, whether Intel would be willing to manufacture a custom processor for the device he was about to ship. Otellini was running the most successful semiconductor company in history. He had committees to answer to, margin targets to defend, and a roadmap dominated by x86 chips that had little overlap, technically or commercially, with what a smartphone needed. After internal review, he turned Apple down. Years later, in an interview with the journalist Alexis Madrigal at The Atlantic, Otellini would describe the rejection in the language of a regretted bet: the volume forecast at the time, he said, had not justified the investment, and the world would have been a lot different if Intel had said yes.
Rebuffed by Intel, Jobs settled in 2007 for a Samsung-built system-on-a-chip designated the S5L8900. Inside it sat a 32-bit ARM1176JZF-S core licensed from a small British design house in Cambridge, plus a PowerVR graphics block from Imagination Technologies, all stitched together by Samsung’s foundry on a 90-nanometer process. The phone shipped on June 29, 2007. It was a triumph, and it was a problem. Apple was now utterly dependent on Samsung for the most expensive component inside its most important product, and Samsung would soon be selling a flagship phone of its own, the Galaxy S, in direct competition with the iPhone. Tony Fadell, who ran the iPod and early iPhone hardware, had pushed Apple toward chip suppliers willing to do power-efficient parts; he could not, by himself, make Apple a chip company. Jobs concluded that Apple needed to become one.
The vehicle was a Santa Clara startup called P.A. Semi, originally Palo Alto Semiconductor, founded in 2003 by an engineer named Daniel Dobberpuhl. Dobberpuhl was the kind of designer that the people who designed processors talked about with a particular reverence. He had spent the 1980s and early 1990s at Digital Equipment Corporation in Hudson, Massachusetts, where he had led the design of the DEC Alpha 21064, the highest-performance microprocessor of its era, and then the StrongARM chip, an extraordinary piece of engineering that ran a CMOS RISC core at 160 megahertz while consuming roughly half a watt. The StrongARM, after DEC’s slow collapse, had been transferred to Intel as part of a 1997 patent settlement, and Dobberpuhl had moved on. He had founded a company called SiByte in 1998, sold it to Broadcom for over two billion dollars in stock, and in 2003 founded P.A. Semi to build a low-power Power-architecture processor called the PWRficient for embedded markets. He had recruited an unusually deep bench of architects to do it. Among them was Jim Keller, the engineer who had designed the K7 and K8 processors at AMD, the chips that for one brief period at the start of the 2000s had given AMD a real performance lead over Intel.
P.A. Semi had been close to a deal with Apple once before. Around 2005, the company had pitched its low-power PWRficient design for Apple’s laptops, a part that would have kept the Mac on the Power architecture, but Jobs had instead announced at WWDC that June that Apple would adopt Intel for the Mac, citing Intel’s stronger roadmap and the impossibility of getting the performance per watt Apple wanted out of the IBM and Freescale processors then powering the Mac line. P.A. Semi went back to its embedded customers. Apple went back to thinking about phones.
By April 2008, Jobs was ready. Apple announced it had acquired P.A. Semi for roughly $278 million in cash. Of the company’s roughly 150 engineers, most signed on with Apple. Dobberpuhl himself stayed only briefly. Keller would stay for four years, lead the early A-series work, and then leave in 2012 to return to AMD. The rest, the architects who knew how to build cores from scratch, formed the nucleus of what Srouji would soon be running. At a press event in June, Jobs was asked obliquely about the deal and declined to elaborate. Apple, he said, had long been involved in customizing chips for its handheld products and the P.A. Semi engineers would help that work continue. He did not mention that the work was about to expand into a sustained, decade-long campaign to take ownership of the entire silicon stack.
The first product was deliberately modest. Internally codenamed K48, it was a single-chip system combining an ARM Cortex-A8 CPU core, a PowerVR SGX 535 graphics core, and assorted memory controllers and accelerators on a single die. Most of the heavy architectural lifting was done not in Cupertino but in collaboration with Samsung and a small Austin firm called Intrinsity, which had developed a proprietary set of design tools called Fast14 capable of running stock ARM cores at clock speeds far above their reference implementations. The Intrinsity-tuned Cortex-A8, which Samsung shipped under the codename Hummingbird, ran at one gigahertz on a 45-nanometer process when most Cortex-A8 implementations were lucky to hit 600 megahertz. Apple built its chip around the same tuned core and, in April 2010, quietly acquired Intrinsity for about $121 million, taking the tooling off the open market. The press release was three sentences long. By then the chip itself had already shipped.
It was called the A4. Steve Jobs introduced it on January 27, 2010, when he walked onto a stage at the Yerba Buena Center for the Arts in San Francisco, settled into a leather chair holding an aluminum-and-glass slab a little larger than a notebook, and unveiled the iPad. The keynote was uneven, the product reception mixed. The press dwelled on the name. Inside the device, however, was something the press could not see and most of the audience would not have appreciated: a complete system-on-a-chip designed under Apple’s roof, fabricated by Samsung, running iOS without an intermediary and without a competitor knowing what was inside. By June, Apple had put the same chip in the iPhone 4. The first product Jobs had ever shipped with a processor he could plausibly call his own was the most successful smartphone to date.
The A4 was not architecturally radical. Its CPU core was an off-the-shelf design from ARM, however much it had been tuned. Its GPU came from Imagination. Its real significance lay in what it permitted Apple to do next. Once Cupertino owned the SoC, every subsequent generation could be tilted further toward what Apple’s software actually needed: more cache here, fewer cores but wider ones, an integrated image signal processor for the camera, dedicated blocks for video encoding, custom power management. Apple stopped buying chips and started commissioning them.
The next two generations, the A5 in 2011 and the A6 in 2012, walked up that ladder. The A5, which went into the iPad 2, doubled the cores and tripled the graphics throughput while staying on the same fabrication node. The A6, which appeared in the iPhone 5 in September 2012, was the moment Apple stopped using ARM’s reference cores at all. Its CPU, codenamed Swift, was an Apple-designed implementation of the ARMv7 instruction set, built from scratch by the team P.A. Semi had seeded. AnandTech, the boutique technical publication that for a decade was the closest thing the industry had to a chip-by-chip audit, pored over die shots of the A6 within days of the first teardowns and noted what stood out to a trained eye: this was no longer a tweaked Cortex. The cores were physically distinct. The microarchitecture was custom. Apple was now in the same business as Intel, AMD, and ARM itself, designing CPUs from first principles.
The proof point came thirteen months later, and it landed in the industry like a slap. On September 10, 2013, at the iPhone 5s launch, Phil Schiller introduced the A7 with a single line that few people in the audience parsed correctly the first time: it was the first 64-bit chip in a smartphone. The 64-bit transition had, on every previous platform, been a years-long affair coordinated through messy software ecosystems, with operating system vendors arguing with chip vendors about register layouts and pointer widths and binary compatibility. Apple, in contrast, simply turned it on. iOS 7, which shipped alongside the phone that month, came with a 64-bit kernel and 64-bit system frameworks. The compiler, LLVM, had been quietly extended to emit 64-bit ARM code. Developer documentation was already in place. The chip, internally codenamed Cyclone, had been in design for years. None of this had leaked.
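How small the visible seam was for developers is easy to sketch. The snippet below is illustrative only, written in present-day Swift (the language did not yet exist in 2013; the equivalent checks were then written in C and Objective-C): the split between the two instruction sets surfaces as a compile-time condition and a doubled machine word, and little else, with apps shipping both slices in a single fat binary.

```swift
// Illustrative sketch only: how the 32-/64-bit split surfaces to application code.
// On an A7-class (arm64) device, Int and pointers are 8 bytes wide; on the older
// 32-bit ARM devices they were 4.
#if arch(arm64)
print("64-bit ARM core; machine word is \(MemoryLayout<Int>.size) bytes")   // 8
#elseif arch(arm)
print("32-bit ARM core; machine word is \(MemoryLayout<Int>.size) bytes")   // 4
#else
print("Some other architecture entirely")
#endif
```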
The reaction was instructive. Anand Chandrasekher, then chief marketing officer at Qualcomm, dismissed the A7 to a reporter from the Korean outlet Geek as “a marketing gimmick” delivering “zero benefit” to users. Within weeks Qualcomm had publicly walked the comment back; within months Chandrasekher had been reassigned. The actual reaction inside Qualcomm’s CPU group was darker. An anonymous engineer there told the technology writer Hubert Nguyen, in an account widely cited at the time, that the A7 had “hit us in the gut” and left the team “slack-jawed, and stunned, and unprepared.” No major Android chip vendor had a 64-bit core on its near-term roadmap. Once Apple had shipped one, every Android licensee suddenly demanded one, and ARM Holdings, which had announced its 64-bit Cortex-A57 design but had yet to see it reach licensees as shipping silicon, had to scramble. The mobile industry’s unspoken assumption that Apple was operating on roughly the same chip cadence as everyone else turned out to have been wrong by about a generation.
It would not be the last time. From the A7 onward, every fall iPhone launch began with the chip slide. The A8 in 2014 was the first Apple silicon manufactured by TSMC rather than Samsung, an early step in a foundry transition that would, by 2016, become exclusive. The A9 in 2015 was briefly dual-sourced between Samsung and TSMC, an experiment that ended in the consumer-press episode known as chipgate after teardown sleuths discovered that the Samsung-fabbed parts ran slightly hotter and drained battery faster under sustained load than their TSMC twins. The TSMC version was built on a 16-nanometer FinFET process, the Samsung version on a denser 14-nanometer one. The functional gap was tiny in absolute terms and easily lost in measurement noise, but it taught Apple a lesson it had already half-learned. From the A10 onward, every Apple SoC would be made in Taiwan.
The next architectural beat came in September 2017, when the A11 Bionic shipped in the iPhone 8 and the iPhone X. It contained two performance cores, four efficiency cores, an Apple-designed graphics block that for the first time replaced the Imagination GPU Apple had used for a decade, and a small new block of silicon Apple called the Neural Engine. The Neural Engine was a dedicated accelerator for neural network inference, capable of six hundred billion operations per second, and it existed primarily to make Face ID work. Face ID required the iPhone to run a face-recognition model in real time, against an infrared dot projection of the user’s face, in low light, fast enough that the phone would unlock as the user picked it up. A general-purpose CPU could not have hit the energy budget. A GPU could have, but at the cost of every other graphics task on the phone. A custom inference accelerator solved the problem within a power budget of a few hundred milliwatts. It was the most visible early instance of Apple using its custom silicon to enable a feature that no off-the-shelf chip could plausibly have delivered.
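Developers, for their part, never program the Neural Engine directly; Core ML decides, layer by layer, where a model runs. The sketch below is illustrative rather than anything resembling Apple’s own Face ID code, and the “FaceEmbedding” model name is hypothetical: an app states a preference for the accelerator and lets the framework dispatch.

```swift
import CoreML

// "FaceEmbedding" is a hypothetical compiled model bundled with the app; the
// pattern, not the model, is the point. Core ML routes supported layers to the
// Neural Engine and falls back to GPU or CPU for the rest, keeping real-time
// inference inside a small power budget.
let configuration = MLModelConfiguration()
configuration.computeUnits = .all   // permit CPU, GPU, and Neural Engine

do {
    guard let url = Bundle.main.url(forResource: "FaceEmbedding", withExtension: "mlmodelc") else {
        fatalError("Compiled model not found in the app bundle")
    }
    let model = try MLModel(contentsOf: url, configuration: configuration)
    // Inference then runs through model.prediction(from:), on whichever
    // compute units Core ML selects for each layer.
    _ = model
    print("Model loaded with compute-unit preference: all")
} catch {
    print("Model failed to load: \(error)")
}
```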
The pattern, by then, was clear. Apple’s silicon team was not merely catching up to merchant chip vendors like Qualcomm; it was running ahead of them on the metrics that mattered. By the late 2010s, AnandTech’s reviews of the iPhone showed Apple’s CPU cores running at roughly the same clock frequency as ARM’s high-end designs, but extracting forty to fifty percent more instructions per cycle, because Apple was willing to spend transistor budget on a much wider issue width. The A14 core, codenamed Firestorm, decoded eight instructions per clock at a time when most competing designs decoded four or five. To support that width, Apple had built a re-order buffer well over six hundred entries deep. The chip was not faster because it was clocked higher. It was faster because it was structurally fatter, and it could be structurally fatter because Apple had no need to share the design with anyone, no licensee performance targets to hit, and no merchant-margin model to defend.
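The arithmetic behind that trade is simple enough to sketch. Sustained throughput is roughly clock frequency times instructions retired per cycle, so a wide core at a modest clock can outrun a narrower core at a higher one. The figures below are illustrative stand-ins, not measured numbers from any shipping chip.

```swift
// Back-of-envelope model, not measured data:
// throughput (billions of instructions/second) ≈ frequency (GHz) × average IPC.
struct Core {
    let name: String
    let frequencyGHz: Double
    let averageIPC: Double
    var throughputGIPS: Double { frequencyGHz * averageIPC }
}

// Illustrative stand-ins, loosely shaped like the late-2010s comparison above.
let wideAndSlow   = Core(name: "8-wide core at 3.0 GHz", frequencyGHz: 3.0, averageIPC: 4.5)
let narrowAndFast = Core(name: "4-wide core at 3.5 GHz", frequencyGHz: 3.5, averageIPC: 3.0)

for core in [wideAndSlow, narrowAndFast] {
    print("\(core.name): \(core.throughputGIPS) GIPS")
}
// 8-wide core at 3.0 GHz: 13.5 GIPS
// 4-wide core at 3.5 GHz: 10.5 GIPS — slower despite the higher clock
```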
What this added up to, by 2019, was something that Tim Cook and Johny Srouji, by then Apple’s senior vice president of hardware technologies, had begun to recognize as a strategic asymmetry of unusual depth. The chip in the iPhone 11 Pro that fall, the A13 Bionic, beat almost every laptop processor Intel had shipped in the prior eighteen months on single-threaded benchmarks, while drawing a fraction of the power. The chip in the iPad Pro of the same era was, in some workloads, faster than the chip Apple was buying from Intel for its own MacBook Pro. The phone was outperforming the laptop. The in-house part was outperforming the merchant one. From the inside, the question was no longer whether Apple should put its A-series silicon into the Mac. It was when, and how, and how quickly the resulting machines could be shipped before the contradiction became absurd.
The “when” was settled at WWDC on June 22, 2020, broadcast as a virtual keynote because the world was four months into a pandemic. Tim Cook came on screen from an empty Steve Jobs Theater, in front of an unlit auditorium, and announced what Apple had been preparing for the better part of a decade. The Mac, he said, was going to transition off Intel and onto Apple silicon over a two-year period. The first machines would ship before the end of the year. Cook framed it in the language of vertical integration that Apple had begun rehearsing publicly only a few quarters earlier. “We have a long-term strategy of owning and controlling the primary technologies behind the products that we make,” he said, in a phrase that he had used some version of in nearly every analyst call for years. The chip transition was that strategy made literal.
Behind the camera, the operational coordination was more elaborate than the keynote suggested. Tim Cook, ever the supply-chain operator, had been the one to greenlight the bet. Jeff Williams, by then chief operating officer, had been managing the multi-year ramp of TSMC’s five-nanometer process around Apple’s volume requirements, a relationship that had become, in everything but legal form, an exclusive partnership. John Ternus, who by then ran Mac hardware engineering day to day, had been overseeing the redesign of MacBook thermal envelopes around the assumption that Apple silicon would draw a fraction of what Intel chips drew. Craig Federighi’s software organization had spent two years rebuilding macOS for ARM and had developed Rosetta 2, a binary translator that would let Intel-compiled apps run on the new Macs unmodified, often faster than they had run natively on Intel. Srouji’s chip team had taken its A14 core, the Firestorm-Icestorm pairing already destined for the iPhone 12, and architected a Mac-class chip around it: more cores, more cache, more memory channels, an integrated GPU expanded from the A14’s four cores to eight, all wrapped around a unified memory architecture that put CPU, GPU, and Neural Engine on the same physical die, sharing the same pool of LPDDR4X. The chip was called the M1. It taped out in late 2019. It would be the most consequential single piece of silicon Apple had ever shipped.
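What “unified memory” means from the software side can be sketched with Metal’s public API, which is as close as outside code gets to the arrangement: on Apple silicon a single shared allocation is visible to CPU and GPU alike, with no staging copy across a bus. This is a rough illustration, not a description of Apple’s internal tooling.

```swift
import Metal

// On Apple silicon, the default device reports hasUnifiedMemory == true, and a
// buffer created with .storageModeShared is one allocation that both CPU and GPU
// touch directly; on an Intel Mac with a discrete GPU, the same data would have
// to be copied across PCIe before the GPU could read it.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("No Metal device available on this machine")
}
print("Unified memory: \(device.hasUnifiedMemory)")

let samples: [Float] = (0..<1024).map { Float($0) }
let byteCount = samples.count * MemoryLayout<Float>.stride

if let shared = device.makeBuffer(bytes: samples, length: byteCount, options: .storageModeShared) {
    // A compute pipeline could bind `shared` directly here; nothing was copied
    // into a separate GPU memory pool to make that possible.
    print("Allocated \(shared.length) bytes visible to both CPU and GPU")
}
```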
When Apple unveiled it on November 10, 2020, in a streamed keynote captioned simply “One More Thing,” the company did something unusual: it let Srouji do the chip slide himself. Srouji, who had been giving versions of this presentation internally for a decade, walked the camera through five-nanometer transistor counts and core layouts and unified-memory bandwidth like a man who had been waiting his entire career for the moment. The M1, he said, contained sixteen billion transistors, four high-performance Firestorm cores, four high-efficiency Icestorm cores, an eight-core GPU, and a sixteen-core Neural Engine, all on a single die fabricated at TSMC on the company’s first-generation five-nanometer process. The first three machines built around it, the MacBook Air, the entry-level thirteen-inch MacBook Pro, and the Mac mini, were unveiled in the same hour. The MacBook Air had no fan, the first Air ever to ship without one. Apple had taken silicon designed initially for a phone and put it into a laptop with no active cooling at all, because the chip drew so little power that air convection was sufficient.
The reviews came in quickly and, by the standards of Mac press, were strange. John Gruber, on Daring Fireball, called the new machines “astonishingly good,” and singled out a quality that benchmarks could not capture: the laptops never got hot, and they never got loud, and the battery just kept going. AnandTech, which had been the audit of record for chip reviews since the late 1990s, ran a deep analysis by the Romanian engineer Andrei Frumusanu under the headline “Apple Shooting for the Stars: x86 Incumbents Beware.” Frumusanu’s measurements showed the M1’s Firestorm cores matching or beating Intel’s contemporary Tiger Lake designs on integer single-thread performance while drawing roughly a third of the power. The Mac mini, sitting on a desk, was outperforming desktop Intel chips that had shipped six months earlier. The top of the Geekbench single-threaded leaderboard, in the days after the M1 launched, was suddenly populated by a thousand-dollar laptop and a seven-hundred-dollar desktop. Even Intel-compiled software, run through Rosetta 2 translation, ran faster on the new Macs than it had on the Intel-based Macs the M1 had replaced. The internal joke at Intel, repeated to Bloomberg reporters by people who had been there, was that Apple had managed to beat Intel at running its own instruction set.
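Rosetta 2’s translation was invisible to users, but not to software that asked. Apple documents a sysctl key, sysctl.proc_translated, that a process can query to learn whether it is running as a translated x86_64 binary; the sketch below is a minimal check along those lines, not the mechanism Rosetta itself uses.

```swift
import Darwin

// Returns true when the current process is an x86_64 binary being translated
// by Rosetta 2 on an Apple silicon Mac. The key reads 0 for native processes
// and does not exist at all on Intel Macs running older systems.
func runningUnderRosetta() -> Bool {
    var translated: Int32 = 0
    var size = MemoryLayout<Int32>.size
    let status = sysctlbyname("sysctl.proc_translated", &translated, &size, nil, 0)
    return status == 0 && translated == 1
}

print(runningUnderRosetta()
    ? "Intel-compiled binary, translated on the fly by Rosetta 2"
    : "Running natively on this machine")
```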
The implications spread outward in concentric rings. For Intel, the loss of the Mac account, which had been worth perhaps five percent of unit volume but a higher share of mindshare, was an early visible piece of a much larger problem the company would spend the next several years unable to fix. Pat Gelsinger, the engineer who had once worked on the 386 design team and would soon return to run the company, would later describe the Apple loss as one of the events that crystallized for him the depth of Intel’s manufacturing slippage. For TSMC, Apple’s Mac transition meant the Taiwanese foundry now had a single customer responsible for a quarter of revenue and the lion’s share of leading-edge wafer demand, an arrangement that gave the foundry the volume it needed to run the bleeding edge profitably and gave Apple the priority access that Samsung had always denied. For ARM Holdings, Apple’s growing dominance in custom cores was a complicated victory. Apple was the largest licensee of the ARM instruction set. Apple was also, demonstrably, more capable of designing high-performance ARM cores than ARM itself. The licensee had become the leader.
For the rest of the consumer-electronics industry, the M1 was a kind of public lecture. Vertical integration of design and software, married to outsourced manufacturing at TSMC, had produced a personal computer that beat Intel’s best on performance per watt by margins that x86 could not close at any thermal envelope Intel could realistically ship. The lesson was not that Apple had built a better chip. The lesson was that owning the chip, the operating system, the compiler, the application frameworks, and the device thermals together let Apple co-design across boundaries that nobody else in the industry controlled. Microsoft had Windows but not Surface volumes. Qualcomm had chips but not the OS. ARM had cores but not the device. Intel had x86 and fabs but not the application stack. Only Apple, after twelve years of patient hiring and acquisition that began with Mansfield’s call to Haifa, had assembled all of it under one roof.
The strategic move ran against a generation of conventional wisdom. The 1990s and 2000s had separated design from fabrication and architecture from product, on the premise that nobody could afford to do everything anymore. The cost curves had become too steep. The expertise had become too specialized.
Apple’s bet had been that the conventional wisdom was correct about manufacturing, where the cost curves had genuinely become impossible at consumer-electronics volumes, and wrong about everything upstream. Design, architecture, and integration could still be vertically owned, if a company was willing to spend a decade and several billion dollars assembling the team, and if the volumes underneath the team were large enough to amortize the investment. The iPhone had supplied those volumes. The acquisitions had supplied the team. TSMC had supplied the manufacturing. What came out the other end was a Mac that Intel could not match, an iPhone that Qualcomm could not match, and a moat that any competitor hoping to reproduce the feat would need a decade, and a willing foundry partner of its own, even to attempt to cross.
In the months after the M1 launch, when reporters from Bloomberg and the Wall Street Journal pressed Srouji on what had made the chip work, he kept returning to a phrase that appeared in nearly every interview. When you have control, he said, you can do things you cannot do by buying merchant silicon. It was a sentiment Steve Jobs had been articulating in different forms since 2008, and one Tim Cook had been quietly enacting since taking over in 2011. The chip Srouji was holding up at the 2020 keynote was the proof of it. Vertical integration, declared dead by every Silicon Valley analyst writing about the fabless revolution in the 1990s, had simply waited a generation, gathered its tools, and come back through a different door, carried in by an engineer who had moved his family from Haifa twelve years earlier on the strength of a phone call.