Processor Whispers: About remedies, losses and farewells
by Andreas Stiller
Once a year, at the International Electron Devices Meeting (IEDM), experts in the field meet to report on advances in transistor, memory and manufacturing technology. Intel is struggling with the C1 stepping of its newest processors and AMD miscounts.
New materials are supposed to enhance the mobility of the charge carriers in PMOS and NMOS transistors by a factor of 30, said Intel's production chief, Mark Bohr, at the IEDM in Washington DC. In the future, field effect transistors are expected to run on voltages as low as 0.3 V and thus consume only a tenth of the energy, and nanowires should make it possible to build even smaller transistors.
IBM also talked about carbon nanotube transistors, as well as about its so-called racetrack memory technology based on nanowires. The first prototypes, manufactured in standard CMOS technology, are still comparatively large, but IBM hopes to achieve a storage density two orders of magnitude greater than that of hard disks while undercutting the power consumption and costs of flash technology. IMEC and Samsung showed off new non-volatile memories: resistive random access memory (RRAM) with a switching time of only 10 ns.
Micron has teamed up with IBM to bring to market three-dimensional memory blocks, called Hybrid Memory Cubes (HMC), that can transfer up to 128 GB per second. With IBM's help, the cubes are supposed to be ready for market in the second half of 2013 rather than sometime around 2014/2015, as Micron had projected. The server manufacturers are already planning ...
SuVolta's Deeply Depleted Channel technology shifts the characteristic of the transistors by more than 0.2 V and thus saves 50 per cent or more energy.
However, it was a small start-up from California that managed to draw the most attention. SuVolta has developed a Deeply Depleted Channel (DDC) transistor technology that runs on only 0.425 V, works with current CMOS technology and, at equal performance, only consumes half as much power as conventional transistors. Fujitsu has already produced the first promising test chips with DDC transistors.
Meanwhile, Intel has reached 14 nm, with the first test chips running in the labs. This is where Intel's future lies, but first it will have to roll out the postponed Sandy Bridge EP Xeons in current 32 nm technology. Anyone who has a copy of Intel's Processor Sighting Report #452856, which you can only get under a non-disclosure agreement (NDA), has known for a while now that one of the 91 known bugs in the C1 stepping of the processors for the LGA2011 socket causes problems in connection with the I/O virtualisation technology VT-d. But now this bug is also officially listed, under BS90, in the specification update for the recently released Sandy Bridge-E (Core i7-3960X and i7-39xxK).
There is an easy remedy: do without VT-d altogether or at least refrain from using the affected function (queued invalidation). But that's not necessarily an appropriate solution for servers, which is probably one of the main reasons why the Sandy Bridge EP Xeons haven't been launched yet. The VT-d bug is supposed to have been fixed in the current C2 stepping. For desktop systems, VT-d should rarely be of importance, but still, the new i7 processors for LGA2011 have apparently been shipped rather reluctantly. "Due to limited supply", it says in Intel's product change notification (PCN 111178-00), the intention is to quickly switch to processors with C2 stepping.
Losses
In the meantime, competitor AMD released a very curious specification update of a different kind. Apparently, the intern responsible slightly miscalculated when counting the number of transistors on the Bulldozer chip Zambezi, ending up with 800 million too many. Now, instead of 2 billion transistors, there are supposedly only 1.2 billion. And this in turn means that the performance-per-transistor analysis now gives somewhat more reasonable values in comparison with the quad-core Sandy Bridge with its 1 billion transistors – with the former number this comparison suggested that AMD's new processor was awfully inefficient.
With only 1.2 billion transistors, the Bulldozer die seems rather large at 315 mm²; the Sandy Bridge die measures only 216 mm². However, AMD talks about "active transistors" – apparently, the automatic design tool must have added lots of passive transistors. Or maybe the chip has lots of additional functions and units, officially planned only for later generations, included for test purposes. Such secret test features have always been present on Intel's chips too. For instance, the first Pentium P5 prototypes featured physical address extension (PAE), but it was disabled for the final production version. It was still documented by mistake in the first user's manual, but was then quickly removed. Some years later, the Pentium Pro finally came with PAE. And the first Pentium 4 Willamette was already internally equipped with Hyper-Threading, while Foster and Northwood, which officially featured it, arrived two years later. So who knows what's lying dormant in the Bulldozer's lost 800 million transistors?
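The die-size remark above can be checked with a little arithmetic. The following is only a rough sketch using the transistor counts and die areas quoted in this column; the dictionary and function names are made up for illustration, and transistor density says nothing about how the vendors actually count transistors.

```python
# Figures as quoted in the column: (transistor count, die area in mm²).
# All names here are illustrative, not taken from any vendor document.
CHIPS = {
    "Zambezi (old count)": (2.0e9, 315.0),
    "Zambezi (corrected)": (1.2e9, 315.0),
    "Sandy Bridge quad-core": (1.0e9, 216.0),
}


def density_millions_per_mm2(transistors, area_mm2):
    """Return the transistor density in millions of transistors per mm²."""
    return transistors / area_mm2 / 1e6


def density_table(chips=CHIPS):
    """Map each chip name to its density in millions of transistors per mm²."""
    return {
        name: density_millions_per_mm2(count, area)
        for name, (count, area) in chips.items()
    }


if __name__ == "__main__":
    for name, density in density_table().items():
        print(f"{name:24s} {density:5.2f} M transistors/mm²")
```

With the corrected count, Zambezi comes out at roughly 3.8 million transistors per mm², against roughly 4.6 million for the Sandy Bridge die, which is why the Bulldozer die looks large for what it officially contains.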
AMD announced yet another loss: Torrenza has fallen by the wayside. The concept, announced in the summer of 2006, was meant to link coprocessors more closely to the main processor. Part of it was the Torrenza Innovation Socket, which was supposed to give the coprocessor cache-coherent access to the shared main memory via HyperTransport, much like a CPU. There was hardly any hardware for it, though.
In an interview with the journalist Anna Filatova of Xbit Labs, AMD's director of ISV Relationship, Neal Robison, said that they have bid farewell to this concept and are now favouring APUs that are directly integrated with the CPU in a socket. It also seems certain that AMD will go for integrated PCIe 3.0 for its CPUs in the not-too-distant future; Intel's LGA2011 processors will come equipped with it. The PCIe train is moving fast: at the start of December, the PCI SIG announced PCIe 4.0 for 2014/15, which is supposed to be twice as fast as PCIe 3.0, running at 16 GT/s per lane.
PCIe 3.0 is expected to find its way soon into the discrete high-end graphics and GPGPU cards from AMD and NVIDIA – but not Intel, whose planned coprocessor card, Knights Corner, is said to come equipped with the leisurely PCIe 2.0. It's rather funny though: Intel has the compatible CPUs, the others have the GPUs.
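To put the transfer rates mentioned above into perspective, here is a minimal sketch that converts GT/s per lane into usable bandwidth per lane and direction. Only the 16 GT/s figure for PCIe 4.0 comes from the announcement; the 5 and 8 GT/s rates and the 8b/10b and 128b/130b line codes are the published values for PCIe 2.0 and 3.0, and the assumption that PCIe 4.0 keeps the 128b/130b code is just that, an assumption. The names are illustrative.

```python
# (transfer rate in GT/s, payload bits, total bits on the wire per code group)
# The PCIe 4.0 encoding is an assumption carried over from PCIe 3.0.
GENERATIONS = {
    "PCIe 2.0": (5.0, 8, 10),
    "PCIe 3.0": (8.0, 128, 130),
    "PCIe 4.0": (16.0, 128, 130),
}


def lane_bandwidth_mb_per_s(gt_per_s, payload_bits, total_bits):
    """Usable bandwidth of one lane in one direction, in MB/s (10^6 bytes)."""
    raw_bits_per_s = gt_per_s * 1e9  # one bit per transfer and lane
    usable_bits_per_s = raw_bits_per_s * payload_bits / total_bits
    return usable_bits_per_s / 8 / 1e6


def bandwidth_table(lanes=1):
    """Map each generation to its usable bandwidth in MB/s for `lanes` lanes."""
    return {
        name: lanes * lane_bandwidth_mb_per_s(*params)
        for name, params in GENERATIONS.items()
    }


if __name__ == "__main__":
    for name, mb_per_s in bandwidth_table(lanes=16).items():
        print(f"{name}: {mb_per_s / 1000:.1f} GB/s per direction (x16)")
```

Under these assumptions a single lane carries about 500 MB/s with PCIe 2.0, about 985 MB/s with PCIe 3.0 and about 1969 MB/s with PCIe 4.0, so each generation roughly doubles its predecessor.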
(djwm)



















