Processor Whispers: About memristors and other cannibals
by Andreas Stiller
The back-to-school season starts with many new processors, although IBM's large server chips probably don't have much to do with the end of the holidays. And there are going to be a few delays – maybe with AMD's Steamroller and certainly with HP's memristor.
They are out now: Intel's new Atom processor Z2760 for tablets with the Clover Trail platform including Clover View 2 processor, AMD's Trinity APUs for desktop PCs and, in the high-end sector, IBM's Power7+. Intel's Z2760 hasn't really revealed itself, though – as a pure OEM product it mostly remains a black box for the general public: no entry in the Intel database, no data sheets, no specification updates, no details on its inner workings, nothing about TDP, number of transistors, die size and so on. The die size, at least, will surely soon be measured by Chipworks once they have torn the chip down, as they already did with its predecessor, the Atom Z2460 (Medfield platform with Penwell-1 processor), for which they reported a modest die area of 63.4mm².
The Canadian company Chipworks, specialists in reverse engineering and examining chips for possible patent infringements, has recently established a subsidiary in Europe as well – so, if someone wants a chip taken apart and analysed ... Chipworks has been quite busy with the iPhone 5 processor A6. Now we know for sure that it is manufactured by Samsung, has two CPU cores and three PowerVR GPU cores and that the whole chip measures 97mm² – about the same as a mobile Ivy Bridge M2 with two cores. According to Chipworks, the ARM core itself measures 15.8mm², covering about 50% more area than the one inside its predecessor A5, even though the A5 was still manufactured for the iPhone 4S with the less space-saving 45nm process. As a rule of thumb, that would mean that the new core has about three times as many transistors – well, the increase in performance by a factor of two has to be based on something. The analysis by Chipworks has also revealed that Apple, in contrast to earlier chips, has apparently not relied solely on tools from Cadence and the like to design the chip layout this time, but has instead done some manual tuning.
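The rule of thumb can be retraced with a few lines of Python. This is only a rough sketch: it assumes that transistor density scales with the inverse square of the process node and that the A6 is built on a 32nm process, a figure that is not stated above. The function name transistor_ratio is purely illustrative.

```python
def transistor_ratio(area_ratio: float, old_node_nm: float, new_node_nm: float) -> float:
    """Estimate the transistor count ratio between two cores.

    Assumes density scales with 1 / node^2, which is only a rule of thumb.
    """
    density_gain = (old_node_nm / new_node_nm) ** 2
    return area_ratio * density_gain


# The new core covers about 50% more area than the old one.
# 45nm is taken from the text; 32nm for the A6 is an assumption.
ratio = transistor_ratio(area_ratio=1.5, old_node_nm=45, new_node_nm=32)
print(f"Estimated transistor ratio: {ratio:.2f}")
```

Under these assumptions the estimate comes out just under three, which matches the "about three times" in the text; real layouts deviate from ideal scaling, so the number is indicative at best.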
So the A6 is neither a default Cortex A9 nor a Cortex A15, but a proprietary Apple chip, Apple's own further development with ARMv7s architecture, somewhere halfway between the two ARM designs. The attached "s" is important because it signifies "vector floating point unit version 4" (VFPv4), which features fused multiply-add and Float16. And it is probably thanks to VFPv4 that, in GeekBench, the iPhone 5 scores three times as high for floating point performance as its predecessor. What is still uncertain is whether the floating point unit works with single precision or (probably) double precision and whether it has 16 or 32 registers.
It isn't necessary to tear apart AMD's Trinity: the quad-core chip is manufactured by Globalfoundries in Dresden, Germany, in the 32nm HKMG SOI process and, at 246mm², is more than twice as large as Apple's A6. Here, the open questions revolve around the future of its socket and AMD's APU roadmap as a whole. In any case, PCIe 3.0 is nowhere to be seen, and Trinity's successor Kaveri, with two to four Steamroller cores and HSA application support, which was announced last spring for 2013, seems to have gone into hiding for now. Apparently Richland, also referred to as Trinity 2.0, with Piledriver cores scaled down to 28nm technology, will be squeezed in first.
The really large chips, however, come from IBM. At the end of August, the zNext processor for mainframes was released, with a record-breaking clock speed of 5.5GHz. Now, the two P systems P770 and P780 have been updated with the Power7+, which reaches clock speeds of up to 4.2GHz with four and six cores and up to 3.8GHz with eight cores. Each processor additionally features 4-way SMT – and if that's not enough, there's also a dual-chip module with 16 cores available. In the single module with 8 cores, the processor is supposed to be about 40% faster in SPECint_rate2006 than its predecessor. Up to 64 cores can collaborate in the P770 and up to 128 in the P780. Only PCIe 3.0 is still missing from the systems' features.
No, it's not because of technical problems that the memristors promised for mid-2013 will only be released around the end of 2014, but because HP's manufacturing partner SK Hynix would otherwise cannibalise its own flash business. Memristor inventor and HP fellow Stan Williams rather offhandedly announced this at a Round Table of the Kavli Foundation. In recent years, scientists have repeatedly doubted that HP's memristors could work as described at all, and the narrow-minded complained that they were not true two-terminal circuit elements like capacitors, inductors or resistors. Williams jovially countered by saying that, as long as they work, he doesn't care about the scientific classification.
Now there are new reasons for HP and Hynix to pick up the pace, because otherwise someone else might snatch the new memory technology away. For instance, researchers at Oregon State University announced a breakthrough, having produced memristors with cheap zinc-tin oxide, whereas HP and Hynix are using significantly more expensive titanium dioxide.
At the aforementioned Round Table of the Kavli Foundation, which was held under the motto "How Atomic Scale Devices Are Transforming Electronics", Professor Michelle Simmons from the Australian Centre of Excellence for Quantum Computation and Communication Technology at the University of New South Wales reported progress being made in moving individual atoms with a scanning tunnelling microscope and using them as memory; keyword: single-dopant transistors. At the same time, colleagues at the same university attracted international attention with a paper in the journal Nature.
For the first time, they had managed to selectively set and then read a spin with such a single-dopant transistor, a phosphorus atom in high-purity silicon. So now you have a quantum bit (qubit) that differs from the common bit in that it can not only be 0 or 1, but also both at the same time, or rather a superposition of both states. What's still missing is a quantum-mechanical entanglement of as many of these qubits as possible in order to build a quantum computer with conventional silicon technology. That might all happen very quickly and would then be something of a quantum leap in quantum computing.
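For readers who prefer code to prose, the difference between a bit and a qubit can be sketched in Python. This is merely a textbook illustration, not a model of the silicon device described above: the qubit state is written as two complex amplitudes for the states 0 and 1, and the hypothetical helper measurement_probabilities returns the chance of reading each value.

```python
import math


def measurement_probabilities(alpha: complex, beta: complex) -> tuple[float, float]:
    """Return (P(0), P(1)) for the state alpha*|0> + beta*|1>.

    The amplitudes are normalised first, so any non-zero pair is accepted.
    """
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    if norm == 0:
        raise ValueError("at least one amplitude must be non-zero")
    return abs(alpha / norm) ** 2, abs(beta / norm) ** 2


# A classical bit is the special case where one amplitude is zero.
print(measurement_probabilities(1, 0))

# Equal amplitudes: the qubit is "both at the same time" until it is read.
print(measurement_probabilities(1, 1))
```

With equal amplitudes, both outcomes are equally likely when the qubit is read; only the read-out forces a definite 0 or 1.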