Intel Core i7 965 Extreme Edition runs up against the peak performance of the x86
The first three processors in Intel's new Nehalem generation weren't really supposed to come out until mid-November, but Intel obviously wanted to forestall AMD's first 45-nanometre server processors, which online dealers are already listing. At any rate, the market leader in semiconductors has allowed the world's press to start reporting today on the test kits that were distributed a few weeks ago. These contain two 45-nm quad-core processors developed under the codename Bloomfield, the Core i7 965 Extreme Edition – 3.2 GHz – and the Core i7 920 – 2.66 GHz; the DX58SO motherboard – codenamed "Smackover" – with the X58 chipset – Tylersburg 36S/I10R, the LGA1366 processor socket and three DDR3-1066/PC3-8500 memory channels; two processor coolers; and one of the X25-M solid-state disks (SSDs) presented some time ago.
As with the Atom, Intel has again reactivated hyper-threading (HT, Intel's implementation of simultaneous multi-threading, SMT) in the Nehalem generation processors. HT was introduced with the Pentium 4 and later abandoned. Each of the four Core i7 cores thus reports a second "logical" or virtual core to the operating system, so that in certain situations better use is made of the available arithmetic and logical units. Hyper-threading is just one of many Nehalem innovations, however. Another is that Intel has now definitively turned away from the front side bus architecture. The memory controller is now in the processor itself, no longer in the Northbridge of the chipset. This is intended to shorten latency appreciably when accessing RAM. As in the K10 generation of the AMD64 processors, all four cores of the Core i7 – each of which has 256 kilobytes of L2 cache – are now housed on a single chip together with a memory controller with three DDR3 channels, an 8-MB L3 cache shared by all cores, and a QuickPath Interconnect (QPI, up to 25.6 gigabytes/s), with 731 million transistors jostling each other in an area of 263 square millimetres. For comparison, in its Phenoms and quad-core Opterons fabricated in 65-nm technology in Dresden, AMD currently squeezes 450 million transistors on to an SOI die with an area of 285 square millimetres.
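The bandwidth figures quoted for the memory channels and the QPI can be checked with simple arithmetic. The following Python sketch is a back-of-the-envelope calculation, not Intel's specification. It assumes a 64-bit (8-byte) data path per DDR3 channel and, for the QPI, a 6.4 GT/s link that moves 2 bytes per transfer in each of two directions; these three parameters do not appear in the text above, and the function names are purely illustrative.

```python
# Back-of-the-envelope peak bandwidth figures (theoretical, not measured).
# Assumptions not stated in the article: 8-byte DDR3 channel width,
# QPI at 6.4 GT/s with 2 bytes per transfer in each of two directions.

DDR3_1066_MT_S = 1066.67   # mega-transfers per second for DDR3-1066
DDR3_CHANNEL_BYTES = 8     # assumed 64-bit data path per channel
MEMORY_CHANNELS = 3        # from the article

QPI_GT_S = 6.4             # assumed link rate in giga-transfers per second
QPI_BYTES_PER_TRANSFER = 2 # assumed payload width per direction
QPI_DIRECTIONS = 2         # assumed: one link in each direction


def ddr3_channel_gb_s(mt_per_s: float) -> float:
    """Peak bandwidth of one DDR3 channel in gigabytes per second."""
    return mt_per_s * DDR3_CHANNEL_BYTES / 1000


def memory_gb_s(mt_per_s: float, channels: int) -> float:
    """Peak bandwidth of all memory channels combined."""
    return ddr3_channel_gb_s(mt_per_s) * channels


def qpi_gb_s() -> float:
    """Peak QPI bandwidth, both directions added together."""
    return QPI_GT_S * QPI_BYTES_PER_TRANSFER * QPI_DIRECTIONS
```

Under these assumptions, one DDR3-1066 channel comes to roughly 8.5 gigabytes/s, which is where the PC3-8500 module name comes from, and both the three memory channels together and the QPI work out at about 25.6 gigabytes/s.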
Although we are talking here about Intel's handpicked test specimens of its new processors, the first benchmark results do nevertheless show their enormous potential. In SPEC CPU2006, with code highly optimized by Intel's latest C/C++ and Fortran compilers in version 11 beta, which already use SSE4.2 instructions, a Core i7 965 Extreme Edition scored 110 points in integer throughput – int_rate_base_2006 – and 85.1 points in floating-point throughput – fp_rate_base_2006, measured under 32-bit Windows Vista in each case. This first representative of Nehalem thus overtakes not only all previous x86 and x64 processors, but also most of the tandems made from two quad-core Opterons – 2360 SE: 92.7/84.7 points – and, in floating-point throughput, approaches two 3.2-GHz Xeons. Unusually, we had to carry out the CPU2006 tests under 32-bit Windows instead of 64-bit Linux, because the 64-bit code of the benchmark suite requires 2 gigabytes of RAM per core, thus a total of 16 gigabytes for the eight logical cores, but the Core i7 processors with 2-gigabyte DIMMs on boards with six slots can only drive a maximum of 12 gigabytes of RAM. Unbuffered DDR3 SDRAM DIMMs with a capacity of 4 gigabytes are not yet available.
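The memory shortfall that forced the switch to 32-bit Windows is easy to reproduce. The short Python sketch below uses only the figures given above; the constant and function names are illustrative and not part of any benchmark tooling.

```python
# Memory needed for the 64-bit CPU2006 rate run versus what the board can hold.
# All figures come from the article; names are illustrative only.

PHYSICAL_CORES = 4
THREADS_PER_CORE = 2          # hyper-threading doubles the logical core count
GB_PER_LOGICAL_CORE = 2       # requirement of the 64-bit benchmark code
DIMM_SLOTS = 6
LARGEST_DIMM_GB = 2           # largest unbuffered DDR3 DIMM available


def required_gb(cores: int, threads_per_core: int, gb_per_core: int) -> int:
    """RAM needed when every logical core runs its own benchmark copy."""
    return cores * threads_per_core * gb_per_core


def installable_gb(slots: int, dimm_gb: int) -> int:
    """Maximum RAM with every slot populated by the given DIMM size."""
    return slots * dimm_gb


needed = required_gb(PHYSICAL_CORES, THREADS_PER_CORE, GB_PER_LOGICAL_CORE)
available = installable_gb(DIMM_SLOTS, LARGEST_DIMM_GB)
shortfall = needed - available  # positive means the run does not fit
```

With 4-gigabyte DIMMs, `installable_gb(DIMM_SLOTS, 4)` would exceed the requirement, which is why the availability of those modules matters here.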
In more practical benchmarks, the Core i7 965 Extreme Edition can't outdo its predecessor, the Core 2 Extreme QX9770, which also has a clock frequency of 3.2 GHz, quite so clearly, particularly in applications that compute with a single thread or with only a few threads in parallel. The Core i7 965 was in any case just 8 per cent faster than the Core 2 Extreme QX9770 on an X48 motherboard with PC3-12800 memory – DDR3-1600 – in the BAPCo SYSmark 2007 benchmark, and in 3D games the Nehalem's lead was negligible most of the time. Only World in Conflict, which obviously exploits several cores, ran somewhat faster on the Core i7 965. With some other games, even a Core 2 Duo E8600 – 3.33 GHz – held the lead.
Multi-threaded applications fared better: compiling a Linux kernel ran 26 per cent faster, and the Cinebench R10 rendering benchmark ran 34 per cent faster. Hyper-threading yielded marked advantages in compiling – 22 per cent – and rendering – 11 per cent, and HT slowed down BAPCo SYSmark 2007 only minimally.
For our benchmarks, we had activated the new Turbo Mode, in which the processor overclocks itself unless all cores are working to full capacity. Depending on the CPU version, Turbo Boost raises the clock frequency by one or more steps of 133 MHz each. That is the base clock of the processor, from which the higher clock frequencies of its arithmetic and logical units, the L3 cache, the memory controller, the memory modules, and the QPI are derived. In our measurements, both the Core i7 965 and the Core i7 920 could be overclocked by one step in each case, which gave a performance boost of at best 5 per cent – but this makes the computer's power consumption rise markedly under full load. At 194 watts with the CPU under full load and 82 watts when idle, the system with the Core i7 965 was nevertheless still somewhat thriftier than the comparison system with the Core 2 Extreme QX9770, which was fitted out identically as far as possible – the graphics card being a Radeon HD 4550 in each case. By the way, we used a standard but fairly fast SATA hard disk instead of the Intel SSD for our measurements.
With a very powerful cooler and on motherboards with overclocking functions, you can set a higher thermal design power (TDP) for the Core i7 than it nominally has – 130 watts. If you then enable even higher Turbo Boost multipliers, automatic overclocking will reach the 4-GHz mark with the expensive Core i7 965 Extreme Edition.
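The clock arithmetic behind Turbo Boost can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a description of how the processor decides when to step up: the base clock is taken as 133.33 MHz (the text rounds it to 133 MHz), and the multipliers are derived here by dividing the nominal frequency by that base clock rather than taken from an Intel data sheet. The function names are hypothetical.

```python
# Turbo Boost clock arithmetic: each step adds one base-clock multiple.
# Assumption: base clock of 133.33 MHz; multipliers are derived, not official.

BCLK_MHZ = 133.33


def multiplier(nominal_ghz: float) -> int:
    """Derive the nominal multiplier from the nominal clock frequency."""
    return round(nominal_ghz * 1000 / BCLK_MHZ)


def turbo_ghz(nominal_ghz: float, steps: int) -> float:
    """Clock frequency after raising the multiplier by `steps`."""
    return (multiplier(nominal_ghz) + steps) * BCLK_MHZ / 1000


def clock_gain_percent(nominal_ghz: float, steps: int) -> float:
    """Relative clock increase, an upper bound on the performance gain."""
    return 100 * steps / multiplier(nominal_ghz)


def steps_to_reach(nominal_ghz: float, target_ghz: float) -> int:
    """Number of steps needed to get within 10 MHz of a target frequency."""
    steps = 0
    while turbo_ghz(nominal_ghz, steps) < target_ghz - 0.01:
        steps += 1
    return steps
```

With these assumptions, one step lifts the Core i7 920 from 2.66 to about 2.8 GHz, a clock gain of 5 per cent, and the Core i7 965 from 3.2 to about 3.33 GHz, a gain of roughly 4.2 per cent, which fits the measured boost of at best 5 per cent. Reaching the 4-GHz mark from 3.2 GHz takes six steps in this model.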
Besides the Core i7 965 Extreme Edition – list price $999 – and the Core i7 920 – $284, Intel also intends to release a Core i7 940 – 2.93 GHz, $562. The Lynnfield – quad-core, possibly without HT – and Havendale – dual-core plus graphics – versions of Nehalem that are intended for medium-range boards with an LGA1160 socket won't come out until the third or fourth quarter of 2009. Besides Intel itself, at least Asus, EVGA, Gigabyte, Foxconn, and MSI intend to release LGA1366 boards with an X58 chipset, some with the SLI function. Any such board will cost more than 200 euros. Of course, Intel must now prove it can deliver the Core i7 and the X58 chipset as planned. Data sheets for the new products are not expected until mid-November.