Processor Whispers: About TLAs and IPOs
by Andreas Stiller
Intel's Haswell processor is casting its shadow ahead, AMD is starting an open-standard IP initiative for SoCs, Apple partner Audience Inc is going public, and the end of the Itanium is on the horizon.
At a server workshop in Oregon at the beginning of February, Intel's head of development, Ronak Singhal, smiled meaningfully when asked about the "hardware transactional memory" (HTM) architecture extension for the company's generation-after-next Haswell processor, which an earlier issue of Processor Whispers had revealed. He called it a hypothetical possibility and said that, if it were real, things would become clear as soon as the necessary TM instructions were introduced. Intel documented those instructions only a few days later. He also dropped a hint about the implementation of fused multiply-add (FMA3) along the lines of "If you do something, do it right". This could be read as criticism of the current Bulldozer FPU design, and it suggests that Haswell will offer a much better FMA implementation, possibly with two parallel 256-bit FMA units instead of only one. A 256-bit register holds four double-precision values and each FMA counts as two operations, so two such units would deliver 16 flops/clock/core at double precision. With that, Haswell would draw level with the Chinese Loongson 3B processor, which had been expected last year but apparently got stuck somewhere on the long march.
Currently, 8 flops/clock/core is the status quo for performance, as delivered by Sandy Bridge, Power7, SPARC64 VIIIfx, Blue Gene/Q and ShenWei. Sandy Bridge manages that without FMA, using one port for 256-bit multiplication and another for 256-bit addition, which run in parallel. AMD's Bulldozer FPU offers FMA, but the two halves of a module have to share a single FPU pipeline for 256-bit operations, so only 4 flops/clock/core remain.
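The per-core figures quoted above follow from simple arithmetic, which the short Python sketch below reproduces. The function name and its parameters are made up for illustration. The model is deliberately simplified: it counts only the width of the vector unit, the number of units and how many cores share them, and it assumes an FMA counts as two floating-point operations per lane. The Haswell entry is the rumoured configuration, not a confirmed one.

```python
def peak_flops_per_clock(vector_bits, element_bits, flops_per_lane, units, cores_sharing=1):
    """Peak floating-point operations per clock and core (hypothetical helper).

    vector_bits    -- width of one vector unit, e.g. 256
    element_bits   -- width of one operand, 64 for double precision
    flops_per_lane -- 1 for a plain multiply or add, 2 for a fused multiply-add
    units          -- number of such units that can issue in parallel
    cores_sharing  -- number of cores that have to share these units
    """
    lanes = vector_bits // element_bits
    return lanes * flops_per_lane * units / cores_sharing


# Sandy Bridge: no FMA, but a multiply port and an add port working in parallel.
sandy_bridge = peak_flops_per_clock(256, 64, flops_per_lane=1, units=2)

# Bulldozer: FMA, but one 256-bit pipeline shared by both halves of a module.
bulldozer = peak_flops_per_clock(256, 64, flops_per_lane=2, units=1, cores_sharing=2)

# Haswell as speculated in the text: two parallel 256-bit FMA units per core.
haswell_rumoured = peak_flops_per_clock(256, 64, flops_per_lane=2, units=2)

for name, value in [("Sandy Bridge", sandy_bridge),
                    ("Bulldozer", bulldozer),
                    ("Haswell (rumoured)", haswell_rumoured)]:
    print(f"{name}: {value:g} flops/clock/core")
```

These are theoretical peaks at double precision. Sustained rates in benchmarks such as Linpack are lower.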
The Haswell version for desktop and mobile devices is ready. According to processor images circulating on the Internet, the die with four cores should measure about 185 mm² and thus have around 1.5 billion transistors. Right now, Singhal said, Intel is working on the details of the server version, Haswell EP. As with Sandy Bridge, Intel plans to launch new architectures in their desktop and mobile versions first, with the server versions following a year or more later. The bare processor cores are, as usual, mostly identical to those of their desktop counterparts, but their surroundings on the chip are quite different. The Xeons are designed for two, four or even more sockets and mostly come with more cores. They also offer different memory, PCI Express and QPI options, as the upcoming launch of the Xeon E5 server processor (Sandy Bridge EP) will soon show.
Many of Intel's partners received large shipments of Xeon E5 processors quite a while ago, particularly those in the supercomputing sector, so numerous performance results are already available, especially for the Linpack benchmark. Partner Hewlett-Packard had even published details of its next server generation, Gen 8, on its web site, but the product information pages, for instance for the ProLiant ML350p, DL380p and BL460c, all marked "Gen 8", were quickly removed again.
CCC in Flux
Meanwhile, competitor AMD hasn't been idle. At an Analyst Day in Sunnyvale, the company presented its new roadmap with many changes and a multitude of new codenames. The event was titled CCC, as Intel's developer forum IDF once was. With Intel, this TLA (three-letter acronym) stood for computing, communications, convergence. AMD has now reinterpreted it as consumerisation, cloud and convergence.
Of the three, convergence seems to be the most important, so AMD's HSA (heterogeneous system architecture) initiative will probably take centre stage. AMD has been talking to partners, but also to competitors, about the possibility of specifying a shared open interface standard for various on-chip modules, a kind of on-chip PCI with a cache-coherent memory model. An independent HSA consortium is supposed to watch over the standard and drive its further development.
AMD will then be able to make its weight – high-end processor and graphics on a single chip – count by linking it to HSA-compatible third-party modules. These could be ARM chips or even NVIDIA GPUs, if NVIDIA adopts HSA. According to rumours, at least one manufacturer of video game consoles is very interested in HSA for its next console generation, which would create a wide basis for this undertaking.
The question arises: which process and which manufacturing partner does AMD have in mind for the HSA systems-on-a-chip scheduled for 2014? In a breakout session about the fabless supply chain, AMD announced that it plans to skip the 22 nm step and instead move directly from 28 nm to 20 nm. In any case, both Globalfoundries and TSMC have 20 nm on their roadmaps.
Whether processor manufacturer Apple is interested in HSA is still unknown. What is clear now is that Apple has experience with employing third-party IP and used a DSP from Audience for noise cancellation in the A5 processor, as we had presumed some months ago. Audience is now planning to go public with shares worth around $75 million and thus had to give names and other details in a filing submitted to the US Securities and Exchange Commission (SEC). According to the filing, Apple – through its contract manufacturers Foxconn and Protek – accounted for about 80% of Audience's sales during the first nine months of 2011, while 17% came from Samsung. So far, Apple is Audience's only OEM that not only buys complete chips but also licenses IP. Payment is only made after the end of a quarter, so in light of the exorbitant Apple sales in the previous quarter, Audience should be drowning in money right now. This should make for an exciting IPO.
Itanium Bites the Dust
Oracle's fraud counter-claim against Hewlett-Packard was dismissed by a Californian Superior Court, but during the proceedings it became clear that the Itanium line really doesn't have much of a future any more. About two years after the eight-core chip Poulson, scheduled for this year, rolls out, the Kittson is supposed to follow, and we might still see the minimally improved Kittson+. Then, around 2016, the Itanium line will finally cease to be.